Overlapping text replacement using a Perl regular expression

I have a text file containing a bunch of sentences. The sentences use whitespace (spaces, tabs, newlines) to separate words composed of letters and/or numbers.
I want to find the word “123” or “-123” and insert a dot (.) before the start of the number, so that every occurrence of “123” and “-123” is converted to “.123” and “-.123” respectively.

I tried the following method:

$line =~ s/(\s+-*123\s+)/getNewWord($1)/ge;

where $line contains the line read from the file, and the function getNewWord will put the dot (.) in the appropriate position in the matched word.

But this does not work when two target words are adjacent, such as “123 123”. When the first “123” is replaced by “.123”, the whitespace after it has already been consumed by that match, so the second “123” does not match: the regular expression engine has no leading whitespace left to match before it.
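
To make the failure concrete, here is a minimal sketch that reproduces it and contrasts it with a lookahead for the trailing whitespace. The body of getNewWord is not shown in the question, so the version below is a hypothetical stand-in that simply inserts a dot before "123". The sample line is padded with spaces on both sides (an assumption) so that only the overlap problem is in play.

```perl
use strict;
use warnings;

# Hypothetical stand-in for getNewWord (its real body is not shown above):
# insert a dot immediately before "123" in the matched text.
sub getNewWord {
    my ($word) = @_;
    $word =~ s/(?=123)/./;
    return $word;
}

my $line = " 123 123 ";

# Original pattern: the trailing \s+ consumes the space that the second
# "123" would need as its leading whitespace.
(my $consuming = $line) =~ s/(\s+-*123\s+)/getNewWord($1)/ge;

# Same idea, but the trailing whitespace is only checked with a lookahead,
# so it stays available for the next match.
(my $lookahead = $line) =~ s/(\s+-*123)(?=\s)/getNewWord($1)/ge;

print "[$consuming]\n";   # prints "[ .123 123 ]"
print "[$lookahead]\n";   # prints "[ .123 .123 ]"
```

The second substitution still requires whitespace before the word, so a "123" at the very start of the line would be skipped by both patterns.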

Can anyone help me with this? Thanks!

I agree with MRAB (and gave his/her answer +1), but there is no real need for the getNewWord function. I would change the entire statement to one of the following, which are all similar:

$line =~ s/((?:^|\s)-?)(123)(?=\s|$)/$1.$2/g;

$line =~ s/(?:^|(?<=\s))(-?)(123)(?=\s|$)/$1.$2/g;

$line =~ s/(?:^|(?<=\s)|(?<=\s-))(?=123(?:\s|$))/./g;
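
As a quick check, the sketch below runs the three substitutions on one sample line. The input string and the variable names are assumptions made for illustration; the line mixes adjacent targets, a tab separator, and near misses such as "abc123" and "1234" that should be left alone.

```perl
use strict;
use warnings;

# Sample input (an assumption): adjacent targets, a tab, and near misses.
my $input = "123 -123 123\tabc123 1234 123";

# Variant 1: capture the optional leading whitespace and minus sign.
(my $first = $input) =~ s/((?:^|\s)-?)(123)(?=\s|$)/$1.$2/g;

# Variant 2: check the leading whitespace with a lookbehind instead.
(my $second = $input) =~ s/(?:^|(?<=\s))(-?)(123)(?=\s|$)/$1.$2/g;

# Variant 3: match only the zero-width position where the dot belongs.
(my $third = $input) =~ s/(?:^|(?<=\s)|(?<=\s-))(?=123(?:\s|$))/./g;

print "$first\n";    # prints ".123 -.123 .123<TAB>abc123 1234 .123"
print "$second\n";   # same result
print "$third\n";    # same result

# Edge case: "-123" at the very start of the string.
my $leading = "-123";
(my $lead_first = $leading) =~ s/((?:^|\s)-?)(123)(?=\s|$)/$1.$2/g;
(my $lead_third = $leading) =~ s/(?:^|(?<=\s)|(?<=\s-))(?=123(?:\s|$))/./g;

print "$lead_first\n";   # prints "-.123"
print "$lead_third\n";   # prints "-123" (unchanged)
```

All three agree on the sample line because none of them consumes the whitespace after a match. They differ when "-123" is the first thing in the string: the third variant's (?<=\s-) lookbehind needs whitespace before the minus sign, so it leaves that occurrence unchanged, while the first two handle it.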
