Subject: 1.3.2 seg fault in shared lib?
To: None <current-users@netbsd.org>
From: Simon J. Gerraty <sjg@quick.com.au>
List: current-users
Date: 03/08/1999 23:35:16
This is a bit odd, I'd appreciate a sanity check.
A tool linked against shared libs gets a seg fault on return from a
function, yet the same tool linked statically works fine.
I've done the obvious things: a make clean and make of all the libs
involved (other than libc) to ensure that the static and shared libs
contain the same code, bumped my stack limit up, run ldconfig,
...
Running under gdb and doing a stack trace just prior to return shows
no sign of stack corruption, but we get a seg fault at exactly the
same point every time...
If someone can think of another obvious avenue to check I'd appreciate
it.
In case it is of interest, ktrace output from the static tool shows:
17561 noid CALL read(0x6,0x32008,0x2000)
17561 noid GIO fd 6 read 8192 bytes
"
-- Extended Mosy format from
-- SMIC version 1.0.9, July 23, 1992.
...
[hmm must fix that, it's actually a much enhanced SMIC :-)]
...
-- Extended Mosy OID tree
-- From: SNMPv2-SMI
ccitt 0 regPt
zeroDotZero ccitt.0 regPt
iso 1 regPt
org iso.3 regPt
dod org.6 regPt
internet dod.1 regPt
directory internet.1 regPt
mgmt internet.2 regPt
mib-2 mgmt.1 regPt
-- From: SNMPv2-MIB
system mib-2.1 regPt
...
17561 noid RET read 8192/0x2000
17561 noid CALL break(0x9f800)
17561 noid RET break 0
whereas in the dynamic version we get (for the last bit):
16935 noid RET read 8192/0x2000
16935 noid PSIG SIGSEGV SIG_DFL
16935 noid NAMI "noid.core"
$ ldd obj/noid
obj/noid:
-lsnmp2.0 => /usr/lib/libsnmp2.so.0.2 (0x4001a000)
-lsjg.1 => /usr/lib/libsjg.so.1.2 (0x40030000)
-ldmalloc.2 => /usr/lib/libdmalloc.so.2.0 (0x40044000)
-lc.12 => /usr/lib/libc.so.12.20 (0x4004a000)
$ obj/noid
Memory fault (core dumped)
: sjg:508; gdb obj/noid noid.core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-netbsd), Copyright 1996 Free Software Foundation, Inc...
Core was generated by `noid'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/libexec/ld.so...done.
Reading symbols from /usr/lib/libsnmp2.so.0.2...done.
Reading symbols from /usr/lib/libsjg.so.1.2...done.
Reading symbols from /usr/lib/libdmalloc.so.2.0...done.
Reading symbols from /usr/lib/libc.so.12.20...done.
#0 0x40026f7b in yyparse ()
at /u0/share/arch/NetBSD/i386/src/sjg/snmp/lib/snmp2/mosy.y:165
165 free($1);
(gdb) l 160
155 ;
156
157 mib_objid
158 : STRING STRING '.' NUMBER NL {
159 addNode($1,$2,$4,"OBJID","","");
160 free($1);
161 free($2);
162 }
163 | STRING STRING '.' NUMBER REGPT NL {
164 addNode($1,$2,$4,"OBJID","","");
[ here is where we die]
165 free($1);
166 free($2);
167 }
168 | STRING NUMBER NL {
169 addNode($1,"",$2,"OBJID","","");
170 free($1);
171 }
172 | STRING NUMBER REGPT NL {
173 MosyVersion = SMIC_EMOSY;
174 addNode($1,"",$2,"OBJID","","");
Setting a breakpoint in addNode shows it is on return from dealing
with:
mgmt internet.2 regPt
that we die each time. libdmalloc is a fascist beast that picks up
things like free()ing non-allocated mem or already free()'d mem
(except on Solaris where that is expected :-), and also checks for
overruns etc.
By the time we die, the same bit of code has been exercised for
zeroDotZero, iso, org, dod, internet and directory before being called
for mgmt.
A stack overrun is all I can think of, and I cannot find any evidence
for it. Input welcome.
BTW, the actual C code from y.tab.c is:
addNode(yyvsp[-5].s,yyvsp[-4].s,yyvsp[-2].i,"OBJID","","");
free(yyvsp[-5].s);
free(yyvsp[-4].s);
but even when no optimizer is used, gdb says:
Address of symbol "yyvsp" is unknown.
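As a sanity check on those indices: yacc rewrites $k in an end-of-rule
action to yyvsp[k - n], where n is the number of symbols on the rule's
right-hand side. A minimal sketch of that arithmetic follows;
yyvsp_offset is a hypothetical helper written for illustration, not
something found in y.tab.c.

```c
#include <assert.h>

/*
 * Offset into yyvsp that $k refers to in an end-of-rule action, for a
 * rule with n right-hand-side symbols. yyvsp[0] is the last symbol ($n),
 * so earlier symbols sit at negative offsets.
 * (Hypothetical helper, for illustration only.)
 */
int yyvsp_offset(int k, int n)
{
    assert(k >= 1 && k <= n);   /* $k must name a symbol of the rule */
    return k - n;
}
```

For the six-symbol rule STRING STRING '.' NUMBER REGPT NL this gives -5
for $1, -4 for $2 and -2 for $4, which matches the generated code above,
so the action itself appears to index the value stack correctly.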
--sjg