Subject: bin/15412: join doesn't deal with '-e' option properly
To: None <gnats-bugs@gnats.netbsd.org>
From: Duncan McEwan <duncan@mcs.vuw.ac.nz>
List: netbsd-bugs
Date: 01/29/2002 15:29:50
>Number: 15412
>Category: bin
>Synopsis: join doesn't deal with '-e' option properly
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Jan 28 18:30:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: Duncan McEwan
>Release: NetBSD 1.5ZA, join.c,v 1.19 2000/06/10 19:21:05
>Organization:
Victoria University of Wellington, New Zealand
>Environment:
System: NetBSD shed11.mcs.vuw.ac.nz 1.5ZA NetBSD 1.5ZA (GEN_X) #0: Fri Jan 4 12:56:58 NZDT 2002 mark@turakirae.mcs.vuw.ac.nz:/mnt/SAVE/build.obj/sys/arch/i386/compile/GEN_X i386
Architecture: i386
Machine: i386
>Description:
The join command with options '-a1 -a2 -eZZZ' doesn't output the string
ZZZ in place of the missing fields when a line from one file has no
line with a matching join field in the other.
>How-To-Repeat:
Given "file1" containing
line1
line2
line4
and "file2" containing
line1
line3
The command "join -1 1 -2 1 -a1 -a2 -e ZZZ -o1.1,2.1 file1 file2"
produces
line1 line1
line2
line3
line4
On Solaris 2.8 and OSF 4.0 the same command produces what I believe
is the correct output:
line1 line1
line2 ZZZ
ZZZ line3
line4 ZZZ
>Fix:
There are two problems in join.c 1.19. Firstly, the code in
outoneline() doesn't call outfield() if olist[cnt].fileno != F->number,
so nothing at all is printed for output fields that belong to the
other file. The patch below fixes this by defining a "constant" noline
(a LINE with no fields) and calling outfield() with it, which will
always output the string contained in the variable "empty" (if it's
not NULL).
With this change the output produced becomes
line1 line1
line2 ZZZ
line3 ZZZ
line4 ZZZ
which is still not quite right :-(
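To illustrate why passing a field-less line works, here is a minimal
sketch of the idea. It is not the join.c code: SKETCH_LINE,
sketch_noline and sketch_outfield are made-up names, the output goes
to a buffer instead of stdout, and the separator is assumed to be a
single space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for join.c's LINE; only the members used here. */
typedef struct {
	char **fields;          /* field pointers */
	unsigned long fieldcnt; /* number of fields */
} SKETCH_LINE;

/* A line with no fields: every field lookup on it is out of range. */
static SKETCH_LINE sketch_noline = { NULL, 0 };

/*
 * Append one output field to buf.  A field number past the end of the
 * line yields the -e replacement string, or nothing if empty is NULL.
 */
static void
sketch_outfield(char *buf, size_t cap, const SKETCH_LINE *lp,
    unsigned long fieldno, const char *empty)
{
	size_t used = strlen(buf);

	assert(used < cap);
	if (used > 0 && used + 1 < cap) {
		/* Separator before every field except the first. */
		buf[used++] = ' ';
		buf[used] = '\0';
	}
	if (lp->fieldcnt <= fieldno) {
		if (empty != NULL)
			snprintf(buf + used, cap - used, "%s", empty);
	} else
		snprintf(buf + used, cap - used, "%s", lp->fields[fieldno]);
}
```

Because sketch_noline has a field count of 0, any field number passed
with it takes the "out of range" branch, so the caller needs no
separate code path for the -e string.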
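The wrong field order on the "line3" line turned out to be due to the
variable "input2" being initialised incorrectly (the "number" field
was being set to 1 rather than 2), so lines from file2 were treated
as if they had come from file1.
The patch below fixes both of these problems.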
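The effect of the wrong "number" value can be reproduced with a small
model of the unpaired-line case. This is only a sketch under
assumptions: SKETCH_OLIST and sketch_outoneline are hypothetical
names, each line has just its join field, and a space is used as the
separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One -o entry: which file (1 or 2) the output field comes from. */
typedef struct {
	unsigned long fileno;
} SKETCH_OLIST;

/*
 * Format an unpaired line's join field according to an output list,
 * for an input whose "number" member is file_number.  Entries that
 * name the other file get the -e string instead.
 */
static void
sketch_outoneline(char *buf, size_t cap, unsigned long file_number,
    const char *field, const SKETCH_OLIST *olist, size_t olistcnt,
    const char *empty)
{
	size_t cnt, used = 0;

	assert(cap > 0);
	buf[0] = '\0';
	for (cnt = 0; cnt < olistcnt; ++cnt) {
		const char *s =
		    olist[cnt].fileno == file_number ? field : empty;
		int n = snprintf(buf + used, cap - used, "%s%s",
		    cnt > 0 ? " " : "", s);

		if (n < 0 || (size_t)n >= cap - used)
			break;	/* output truncated; stop appending */
		used += (size_t)n;
	}
}
```

With the output list for -o1.1,2.1, a file_number of 1 for the file2
line "line3" gives "line3 ZZZ", while the correct value 2 gives
"ZZZ line3".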
After working all this out I decided (too late!) to check the FreeBSD
CVS repository. They have already fixed both problems, and their fix
for the first differs from mine, so you may prefer to use theirs
for compatibility.
*** join.c.prev Sun Jun 11 07:21:05 2000
--- join.c Tue Jan 29 15:14:49 2002
***************
*** 76,81 ****
--- 76,83 ----
u_long fieldalloc; /* line field(s) allocated count */
} LINE;
+ LINE noline = {"", 0, 0, 0, 0}; /* arg to outfield if no line to output */
+
typedef struct {
FILE *fp; /* file descriptor */
u_long joinf; /* join field (-1, -2, -j) */
***************
*** 88,94 ****
u_long setalloc; /* set allocated count */
} INPUT;
INPUT input1 = { NULL, 0, 0, 1, NULL, -1, 0, 0, },
! input2 = { NULL, 0, 0, 1, NULL, -1, 0, 0, };
typedef struct {
u_long fileno; /* file number */
--- 90,96 ----
u_long setalloc; /* set allocated count */
} INPUT;
INPUT input1 = { NULL, 0, 0, 1, NULL, -1, 0, 0, },
! input2 = { NULL, 0, 0, 2, NULL, -1, 0, 0, };
typedef struct {
u_long fileno; /* file number */
***************
*** 433,438 ****
--- 435,450 ----
for (cnt = 0; cnt < olistcnt; ++cnt) {
if (olist[cnt].fileno == F->number)
outfield(lp, olist[cnt].fieldno);
+ else
+ /*
+ * because of the way "noline" is initialised
+ * this call to outfield will either produce
+ * no output or the contents of the variable
+ * "empty" (set by the -e option). I did it
+ * this way to avoid duplicating the code
+ * from outfield() here.
+ */
+ outfield(&noline, 1);
}
else
for (cnt = 0; cnt < lp->fieldcnt; ++cnt)
>Release-Note:
>Audit-Trail:
>Unformatted: