Subject: Re: 1.3.2 seg fault in shared lib?
To: Simon J. Gerraty <sjg@quick.com.au>
From: Michael Graff <explorer@flame.org>
List: current-users
Date: 03/08/1999 19:25:36
Every time I've seen this happen, the program really was stomping on
memory somewhere; in the static version, the memory being stomped on
just happens not to matter.
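
As a rough illustration of how such a stomp can be caught at a known
point rather than at some later free() or return, here is a minimal
sketch of the guard-byte idea. It is similar in spirit to the overrun
checks a debugging malloc performs, but it is not taken from any of the
libraries in this thread; guard_alloc(), guard_ok(), GUARD_LEN and
GUARD_BYTE are names made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* hypothetical: number of guard bytes after the buffer */
#define GUARD_BYTE 0xA5   /* hypothetical: pattern written into the guard area */

/* Allocate n usable bytes followed by GUARD_LEN bytes of a known pattern. */
void *guard_alloc(size_t n)
{
    unsigned char *p = malloc(n + GUARD_LEN);

    if (p == NULL)
        return NULL;
    memset(p + n, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the guard bytes after the n usable bytes are intact, else 0. */
int guard_ok(const void *ptr, size_t n)
{
    const unsigned char *p = ptr;
    size_t i;

    assert(ptr != NULL);
    for (i = 0; i < GUARD_LEN; i++)
        if (p[n + i] != GUARD_BYTE)
            return 0;
    return 1;
}
```

Checking guard_ok() on a suspect buffer right after each call that
writes to it narrows down which call overran it. The classic case is
copying a 4-character name plus its terminating NUL into a 4-byte
buffer: the copy appears to work, but the first guard byte is gone.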

--Michael

"Simon J. Gerraty" <sjg@quick.com.au> writes:

> This is a bit odd, I'd appreciate a sanity check.
> 
> A tool using shared libs gets a seg fault on return from a
> function, yet the same tool using static linking works fine.
> 
> I've done the obvious things: a make clean and make of all the libs
> involved (other than libc) to ensure that the static/shared libs
> contain the same code, bumped my stack limit up, run ldconfig,
> ...
> 
> Running under gdb and doing a stack trace just prior to return shows
> no sign of stack corruption, but we get a seg fault at exactly the
> same point every time...
> 
> If someone can think of another obvious avenue to check I'd appreciate
> it.
> 
> In case it is of interest, ktrace output from the static tool shows:
> 
>  17561 noid     CALL  read(0x6,0x32008,0x2000)
>  17561 noid     GIO   fd 6 read 8192 bytes
>        "
>         -- Extended Mosy format from
>         --   SMIC version 1.0.9, July 23, 1992.
> ...
> [hmm, must fix that, it's actually a much enhanced SMIC :-)]
> ...
>         -- Extended Mosy OID tree
>         -- From:   SNMPv2-SMI
>         ccitt                0                regPt
>         zeroDotZero          ccitt.0          regPt
>         iso                  1                regPt
>         org                  iso.3            regPt
>         dod                  org.6            regPt
>         internet             dod.1            regPt
>         directory            internet.1       regPt
>         mgmt                 internet.2       regPt
>         mib-2                mgmt.1           regPt
>         -- From:   SNMPv2-MIB
>         system               mib-2.1          regPt
> ...
>  17561 noid     RET   read 8192/0x2000
>  17561 noid     CALL  break(0x9f800)
>  17561 noid     RET   break 0
> 
> whereas in the dynamic version we get (for the last bit):
> 
>  16935 noid     RET   read 8192/0x2000
>  16935 noid     PSIG  SIGSEGV SIG_DFL
>  16935 noid     NAMI  "noid.core"
> 
> $ ldd obj/noid
> obj/noid:
>         -lsnmp2.0 => /usr/lib/libsnmp2.so.0.2 (0x4001a000)
>         -lsjg.1 => /usr/lib/libsjg.so.1.2 (0x40030000)
>         -ldmalloc.2 => /usr/lib/libdmalloc.so.2.0 (0x40044000)
>         -lc.12 => /usr/lib/libc.so.12.20 (0x4004a000)
> $ obj/noid
> Memory fault (core dumped) 
> : sjg:508; gdb obj/noid noid.core 
> GDB is free software and you are welcome to distribute copies of it
>  under certain conditions; type "show copying" to see the conditions.
> There is absolutely no warranty for GDB; type "show warranty" for details.
> GDB 4.16 (i386-netbsd), Copyright 1996 Free Software Foundation, Inc...
> Core was generated by `noid'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /usr/libexec/ld.so...done.
> Reading symbols from /usr/lib/libsnmp2.so.0.2...done.
> Reading symbols from /usr/lib/libsjg.so.1.2...done.
> Reading symbols from /usr/lib/libdmalloc.so.2.0...done.
> Reading symbols from /usr/lib/libc.so.12.20...done.
> #0  0x40026f7b in yyparse ()
>     at /u0/share/arch/NetBSD/i386/src/sjg/snmp/lib/snmp2/mosy.y:165
> 165                     free($1);
> (gdb) l 160
> 155             ;
> 156
> 157     mib_objid
> 158             : STRING STRING '.' NUMBER NL {
> 159                     addNode($1,$2,$4,"OBJID","","");
> 160                     free($1);
> 161                     free($2);
> 162             }
> 163             | STRING STRING '.' NUMBER REGPT NL {
> 164                     addNode($1,$2,$4,"OBJID","","");
> 
> [here is where we die]
> 
> 165                     free($1);
> 166                     free($2);
> 167             }
> 168             | STRING NUMBER NL {
> 169                     addNode($1,"",$2,"OBJID","","");
> 170                     free($1);
> 171             }
> 172             | STRING NUMBER REGPT NL {
> 173                     MosyVersion = SMIC_EMOSY;
> 174                     addNode($1,"",$2,"OBJID","","");
> 
> Setting a breakpoint in addNode shows it is on return from dealing
> with:
> 
>         mgmt                 internet.2       regPt
> 
> that we die each time.  libdmalloc is a fascist beast that picks up
> things like free()ing non-allocated mem or already free()'d mem
> (except on Solaris where that is expected :-), and also checks for
> overruns etc.
> 
> By the time we die, the same bit of code has been exercised for
> zeroDotZero, iso, org, dod, internet and directory before being called
> for mgmt.  
> 
> A stack overrun is all I can think of, and I cannot find any evidence
> for it.  Input welcome.
> 
> BTW, the actual C code from y.tab.c is:
> 
> 		addNode(yyvsp[-5].s,yyvsp[-4].s,yyvsp[-2].i,"OBJID","","");
> 		free(yyvsp[-5].s);
> 		free(yyvsp[-4].s);
> 
> but even when no optimizer is used, gdb says:
> 
> Address of symbol "yyvsp" is unknown.
> 
> --sjg