Subject: port-powerpc/12938: shlib incompatibility 1.5.head vs 20010408-1.5.1_BETA
To: None <gnats-bugs@gnats.netbsd.org>
From: None <cagney@tpgi.com.au>
List: netbsd-bugs
Date: 05/14/2001 11:00:59
>Number: 12938
>Category: port-powerpc
>Synopsis: shlib incompatibility 1.5.head vs 20010408-1.5.1_BETA
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-powerpc-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 14 08:55:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:
>Release: NetBSD localhost 1.5V NetBSD 1.5V (NETLUX)
>Organization:
>Environment:
System: NetBSD localhost 1.5V NetBSD 1.5V (NETLUX) #0: Sat May 12 23:09:17 EDT 2001 boor@localhost:/usr/trunk.src/sys/arch/macppc/compile/NETLUX macppc
G4 Ti, the middle of the range version.
>Description:
When trying to upgrade a powerpc system from
20010408-1.5.1_BETA to the head of the 1.5 branch it develops
apparently random SIGSEGV (11) failures (see how-to-repeat).
I've seen the effect with cc, ranlib, ar, printf and nm, but that
is probably because they are all used when building.
Once it has started happening for ar and ranlib, it is
reproducible, viz. (using how-to-repeat):
# rm /tmp/obj/lib/libcrypto/*.a
# make build-install-lib (don't ask)
(cd /usr/src/lib && make MKSHARE=no dependall && make MKSHARE=no install)
....
dependall ===> libcrypto
building standard crypto library
ranlib libcrypto.a
*** Signal 11
However, for cc it is more sensitive. Given it has something
to do with exec and shared library loading I'm not surprised.
It doesn't appear to be affected by machine load.
The cause of the SIGSEGV is always the same (see below).
If you cd to the relevant directory and run the commands that
were dumping core from there, the problem goes away, viz.:
bash-2.04# cd lib/libcrypto/
bash-2.04# rm obj/*.a
bash-2.04# make dependall
building standard crypto library
ranlib libcrypto.a
building profiled crypto library
ranlib libcrypto_p.a
building shared object crypto library
ranlib libcrypto_pic.a
Examining the core dump.
In the below I'm looking at an unstripped ranlib built /
installed / linked against the head-of-1.5. The same behaviour
occurs using a 20010408-1.5.1_BETA ranlib dynamically linked
against head-of-1.5. This was simply the easiest way to get
an unstripped binary.
# /home/scratch/WIP/mi/gdb/gdb ranlib /tmp/obj/lib/libcrypto/ranlib.core
(again don't ask - I've a fix to FSF GDB I need to check in :-)
GNU gdb 5.0 (MI_OUT)
This GDB was configured as "powerpc-apple-netbsd1.5V"...
Core was generated by `ranlib'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/libexec/ld.elf_so...done.
Loaded symbols for /usr/libexec/ld.elf_so
Reading symbols from /usr/lib/libbfd.so.3...done.
Loaded symbols for /usr/lib/libbfd.so.3
Reading symbols from /usr/lib/libc.so.12...done.
Loaded symbols for /usr/lib/libc.so.12
#0 0x419ab150 in _init ()
at /usr/src/lib/csu/powerpc/../common_elf/crtbegin.c:106
106 if (!initialized) {
(gdb) x/i $pc
0x419ab150 <_init+16>: lwz r0,0(r9)
(gdb) disassemble
Dump of assembler code for function _init:
0x419ab140 <_init>: stwu r1,-16(r1)
0x419ab144 <_init+4>: mflr r0
0x419ab148 <_init+8>: stw r0,20(r1)
0x419ab14c <_init+12>: lis r9,0
0x419ab150 <_init+16>: lwz r0,0(r9)
0x419ab154 <_init+20>: cmpwi r0,0
0x419ab158 <_init+24>: bne 0x419ab174 <_init+52>
0x419ab15c <_init+28>: li r0,1
0x419ab160 <_init+32>: stw r0,0(r9)
0x419ab164 <_init+36>: lis r9,0
0x419ab168 <_init+40>: lwz r0,0(r9)
0x419ab16c <_init+44>: mtlr r0
0x419ab170 <_init+48>: blrl
0x419ab174 <_init+52>: lwz r0,20(r1)
0x419ab178 <_init+56>: mtlr r0
0x419ab17c <_init+60>: addi r1,r1,16
0x419ab180 <_init+64>: blr
End of assembler dump.
Note the sequence:
0x419ab14c <_init+12>: lis r9,0
0x419ab150 <_init+16>: lwz r0,0(r9)
and compare that to the executable:
# /home/scratch/WIP/mi/gdb/gdb ranlib
(gdb) disassemble _init
...
0x18059c4 <_init+12>: lis r9,388
0x18059c8 <_init+16>: lwz r0,30900(r9)
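The difference between the two sequences can be made concrete with a
little arithmetic. The following is a minimal sketch, assuming the usual
PowerPC semantics (lis places its 16-bit immediate in the upper half of
the register, lwz adds a sign-extended 16-bit displacement to it); the
function names are illustrative only and come from no tool. It computes
the address each lis/lwz pair reads from.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed displacement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(lis_imm, lwz_disp):
    """Address read by 'lis rX,lis_imm' followed by 'lwz rY,lwz_disp(rX)'."""
    high = (lis_imm & 0xFFFF) << 16          # lis: immediate shifted left by 16
    return (high + sign_extend_16(lwz_disp)) & 0xFFFFFFFF


if __name__ == "__main__":
    # Pair seen in the core dump: lis r9,0 / lwz r0,0(r9)
    print(hex(effective_address(0, 0)))          # 0x0
    # Pair seen in the executable: lis r9,388 / lwz r0,30900(r9)
    print(hex(effective_address(388, 30900)))    # 0x18478b4
```

A load from address 0 would be consistent with the signal 11 reported at
_init+16. Why the immediates are zero in the core image is not
established by this calculation.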
>How-To-Repeat:
The obvious thing to do is to give the below a whirl and if it
works fine on a similar system conclude that it is something
to do with my hardware and not the kernel et al.
# gzcat base.tgz | ( cd / && tar --unlink -xpf - )
# gzcat comp.tgz | ( cd / && tar --unlink -xpf - )
# cd /usr/src && make build
....
building standard crypto library
ranlib libcrypto.a
*** Signal 1
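Since the failure looks random, it may help to count how often a given
command dies. The following is a minimal sketch, assuming a Python 3
interpreter is available on the test machine; count_segv is a
hypothetical helper, not part of the build. Note that running a command
directly rather than under make may not trigger the problem at all, as
described above.

```python
import signal
import subprocess
import sys


def count_segv(command, runs):
    """Run command `runs` times; return how many runs died with SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX a negative return code is the number of the killing signal.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python3 count_segv.py 20 ranlib libcrypto.a
    print(count_segv(sys.argv[2:], int(sys.argv[1])))
```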
>Fix:
Workaround: per how-to-repeat, revert to the old
libraries/binaries.
>Release-Note:
>Audit-Trail:
>Unformatted: