NetBSD-Bugs archive
bin/39002: harmful AWK extension: non-portable escaped character
>Number: 39002
>Category: bin
>Synopsis: harmful AWK extension: non-portable escaped character
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: change-request
>Submitter-Id: net
>Arrival-Date: Fri Jun 20 20:10:00 +0000 2008
>Originator: cheusov%tut.by@localhost
>Release: NetBSD 4.0_STABLE
>Organization:
>Environment:
System: NetBSD chen.chizhovka.net 4.0_STABLE NetBSD 4.0_STABLE (GENERIC) #3:
Wed Apr 23 00:58:08 EEST 2008
cheusov%chen.chizhovka.net@localhost:/srv/obj/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
Portability is one of the main goals of the NetBSD project.
NetBSD itself is portable to a huge number of hardware platforms.
It is declared that the NetBSD base system can be cross-compiled on other systems.
The pkgsrc project is portable to many operating systems, etc.
All this sounds amazing, but sometimes reality is different.
http://www.opengroup.org/onlinepubs/009695399/utilities/awk.html:
Lexical Conventions
The token STRING shall represent a string constant. A string
constant shall begin with the character '"'. Within a string
constant, a backslash character shall be considered to begin an
escape sequence as specified in the table in the Base
Definitions volume of IEEE Std 1003.1-2001, Chapter 5, File
Format Notation ( '\\', '\a', '\b', '\f', '\n', '\r', '\t', '\v'
).
...
The problem with NetBSD awk is that it accepts extra escape
sequences: it treats \<other_char> as plain <other_char>.
Example:
0 ~>/usr/bin/awk 'BEGIN {print "\."}'
.
0 ~>/usr/bin/awk 'BEGIN {print "\$"}'
$
0 ~>/usr/bin/awk 'BEGIN {print "\z"}'
z
0 ~>
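To make the distinction concrete, here is a minimal C sketch (not taken from any awk source) that checks whether the body of a string constant uses only the escape sequences quoted from the standard above. The function name is hypothetical, and the set is limited to the eight sequences quoted; the elided part of the specification is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of nawk: return the first character that
 * follows a backslash but is not one of the eight escapes quoted above
 * (\\ \a \b \f \n \r \t \v), or 0 if every escape in s is in that set.
 * s is the text between the double quotes of an awk string constant.
 */
int
first_nonportable_escape(const char *s)
{
	static const char portable[] = "\\abfnrtv";

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		if (*s != '\\')
			continue;
		s++;			/* look at the escaped character */
		if (*s == '\0')
			return '\\';	/* trailing backslash: nothing to escape */
		if (strchr(portable, *s) == NULL)
			return (unsigned char)*s;
	}
	return 0;
}
```

With this check, the three string bodies from the example (`\.', `\$' and `\z') are all reported as non-portable, while a body such as a\tb is accepted.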
I know of at least two problems in NetBSD code caused by this extension.
kern/38766: makesyscalls.sh breaks build if mawk is used
Here the kernel build failed under Linux with mawk in use, because
mawk treats \$ as \$ (not as $).
pkg/33410: pkgsrc problem with posix awk
Here pkgsrc passed `\.', `\$' and `\/' to the awk interpreter,
so pkgsrc might fail with mawk or other awk implementations.
Note: in those days pkgsrc used the native version of awk.
What others do:
mawk: treats \<other_char> as \<other_char>
gawk: prints a warning message and treats \<other_char> as plain <other_char>
HP-UX /usr/bin/awk: treats \<other_char> as \<other_char>
0 ~>/usr/bin/awk 'BEGIN {print "\$"}'
$
0 ~>/usr/pkg/bin/mawk 'BEGIN {print "\$"}'
\$
0 ~>/usr/pkg/bin/gawk 'BEGIN {print "\$"}'
gawk: warning: escape sequence `\$' treated as plain `$'
$
0 ~>
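The three behaviours shown above can be sketched side by side in C. This is only an illustration under stated assumptions: the function, the enum and its member names are hypothetical and are not the code of mawk, gawk or nawk, and only the eight escapes quoted from the standard are treated as known.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* How to handle \c when c is not one of the escapes quoted from the standard. */
enum unknown_escape_policy {
	KEEP_BACKSLASH,	/* like mawk and HP-UX awk above: \$ stays \$ */
	DROP_SILENTLY,	/* like NetBSD awk above: \$ becomes $ */
	DROP_AND_WARN	/* like gawk above: \$ becomes $, plus a warning */
};

/*
 * Hypothetical sketch: copy the body of a string constant from in to out,
 * translating escapes. out must have room for strlen(in) + 1 bytes.
 * Returns the number of unknown escape sequences seen.
 */
int
unescape(const char *in, char *out, enum unknown_escape_policy policy)
{
	static const char names[]  = "\\abfnrtv";
	static const char values[] = "\\\a\b\f\n\r\t\v";
	int unknown = 0;

	assert(in != NULL && out != NULL);
	while (*in != '\0') {
		const char *p;
		char c = *in++;

		if (c != '\\' || *in == '\0') {
			*out++ = c;	/* ordinary char or trailing backslash */
			continue;
		}
		c = *in++;		/* the escaped character */
		p = strchr(names, c);
		if (p != NULL) {
			*out++ = values[p - names];
			continue;
		}
		unknown++;
		if (policy == KEEP_BACKSLASH)
			*out++ = '\\';
		else if (policy == DROP_AND_WARN)
			fprintf(stderr, "warning: escape sequence `\\%c' "
			    "treated as plain `%c'\n", c, c);
		*out++ = c;
	}
	*out = '\0';
	return unknown;
}
```

The same input therefore yields different strings depending on the policy, which is exactly why a script that relies on `\$' is not portable. A caller that prefers a hard failure could treat a non-zero return value as an error.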
I think gawk does the right thing here, and I'd like to see the same in NetBSD.
Even better - exit with an error in this case ;) I personally vote for this.
In this case NetBSD code will be even more portable.
And programs developed under NetBSD will have better portability.
>Fix:
Index: lex.c
===================================================================
RCS file: /pub/NetBSD-CVS/src/dist/nawk/lex.c,v
retrieving revision 1.7.4.1
diff -u -r1.7.4.1 lex.c
--- lex.c 3 Feb 2008 00:23:16 -0000 1.7.4.1
+++ lex.c 20 Jun 2008 20:04:46 -0000
@@ -431,7 +431,8 @@
break;
}
- default:
+ default:
+ WARNING ("warning: escape sequence \\`%c' treated as plain `%c'", c, c);
*bp++ = c;
break;
}