Subject: bin/650: possible bug with regular expressions in awk
To: None <gnats-admin@sun-lamp.cs.berkeley.edu>
From: None <ram@cs.arizona.edu>
List: netbsd-bugs
Date: 12/20/1994 08:35:05
>Number: 650
>Category: bin
>Synopsis: backslash does not escape *
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people (Utility Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Dec 20 08:35:03 1994
>Originator:
>Organization:
" "
>Release:
>Environment:
System: netbsd 1.0 i386
>Description:
The sub function in awk does not treat a backslash-escaped * in its
pattern as a literal asterisk.
>How-To-Repeat:
BTW -- I "ported" send-pr to my Solaris machine to send the email report.
Sorry if I left something out.
The file bar contains:
janis:ram {58} cat bar
MOVIE RATINGS REPORT
New Distribution Votes Rank Title
.0.03020.. 11 5.5 $
.121100... 11 4.0 $1,000,000 Duck
0011211000 142 5.0 'burbs, The
0000122100 418 6.4 'Crocodile' Dundee
.2.224.... 5 4.6 After School
..2.24.... 7 4.9 After the Fall of New York
...0522... 14 5.6 After the Fox
* ...242.2.. 5 5.6 After the Rehearsal
janis:ram {59}
Use this awk program to see the problem:
#!/usr/bin/awk -f
BEGIN {
    in_report=0
}
/MOVIE RATINGS REPORT/ {
    in_report=1
}
in_report==1 && /[0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.]/ {
    new_line = $0;
    sub( "^ \* ", " X ", new_line);
    print new_line
}
Running "cat bar | filter_ratings" gives:
X .0.03020.. 11 5.5 $
X .121100... 11 4.0 $1,000,000 Duck
X 0011211000 142 5.0 'burbs, The
X 0000122100 418 6.4 'Crocodile' Dundee
X .2.224.... 5 4.6 After School
X ..2.24.... 7 4.9 After the Fall of New York
X ...0522... 14 5.6 After the Fox
X * ...242.2.. 5 5.6 After the Rehearsal
Apparently sub is matching all leading white space, when the backslash
should force the regexp to match a space, an asterisk, and a space.
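A possible explanation, offered as a guess rather than a diagnosis: sub() is
given a string here, not a regexp constant, and awk scans string escapes before
the regexp is compiled. POSIX leaves "\*" inside a string undefined, and some
implementations (gawk, for one) reduce it to a plain *, so the regexp actually
compiled would be "^ * ", which matches any run of leading spaces. The shell
sketch below uses made-up sample lines and two spellings that should reach the
regexp engine as a literal asterisk: a regexp constant and a string with a
doubled backslash. The variable names and the spacing of the sample lines are
assumptions, and this has not been tried against the NetBSD 1.0 awk.

```shell
# Two illustrative lines: one begins with " * ", the other with plain spaces.
# (The exact column spacing of the original file is not known; this is a guess.)
input=$(printf '%s\n' \
    ' * ...242.2.. 5 5.6 After the Rehearsal' \
    '   ..2.24.... 7 4.9 After the Fall of New York')

# Regexp constant: the backslash goes straight to the regexp engine,
# so \* means a literal asterisk.
with_regex_constant=$(printf '%s\n' "$input" |
    awk '{ sub(/^ \* /, " X "); print }')

# String with a doubled backslash: the string scanner turns \\ into \,
# so the regexp engine again sees ^ \* followed by a space.
with_doubled_backslash=$(printf '%s\n' "$input" |
    awk '{ sub("^ \\* ", " X "); print }')

printf '%s\n' "$with_regex_constant"
printf '%s\n' "$with_doubled_backslash"
```

With either spelling, only the line beginning with " * " should be changed;
the line with plain leading spaces should come through untouched.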
>Fix:
I have no idea.
>Audit-Trail:
>Unformatted: