Subject: sh core dumps
To: None <port-sparc@netbsd.org>
From: Valeriy E. Ushakov <uwe@ptc.spbu.ru>
List: port-sparc
Date: 10/20/2005 00:47:00
[Starting a new thread to disentangle this from the -mcpu discussion]
It seems that I can reliably reproduce the problem with devel/gmake:
after running make there I can cd to work/make-3.80 and trigger the
bug by running ./config.status. That gives me one or sometimes two sh
core files (I run with kern.defcorename=%n.%p.core). Both are from a
backticked invocation of sed. Since we get sh.core, not sed.core, the
crash must happen in the vforked child before exec.
[I don't have the older cores around, but IIRC they were similar in
that the sh.core was from a child vforked to run a backticked command
or a parens subshell]
<root@krups:/usr/pkgsrc/devel/gmake/work/make-3.80> (1042) ./config.status
config.status: creating Makefile
config.status: creating glob/Makefile
config.status: creating po/Makefile.in
config.status: creating config/Makefile
config.status: creating doc/Makefile
config.status: creating build.sh
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
[1] Segmentation fault (core dumped) sed -n -e "/^DEP...
[1] Segmentation fault (core dumped) sed -n -e "/^DEP...
config.status: executing default-1 commands
config.status: creating po/POTFILES
config.status: creating po/Makefile
I've instrumented the memfault handler to print some additional info,
and for those cores I always get
sh[1116]: SEGV(map): addr=0xe804068c type=9
> sfsr=38e<PERR=0,LVL=3,AT=4,FT=3,FAV,OW>
AT=4 - store user data
FT=3 - privilege violation
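
For anyone who wants to check the decoding by hand, here is a minimal
sketch in C. It assumes the field positions of the SPARC V8 reference
MMU fault status register (LVL in bits 9:8, AT in bits 7:5, FT in bits
4:2, FAV in bit 1); the macro and function names are made up for this
illustration and are not the kernel's own. The single-bit OW flag is
left out of the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Field positions assumed from the SPARC V8 reference MMU layout. */
#define SFSR_LVL(x) (((x) >> 8) & 0x3u) /* page table level of the fault */
#define SFSR_AT(x)  (((x) >> 5) & 0x7u) /* access type */
#define SFSR_FT(x)  (((x) >> 2) & 0x7u) /* fault type */
#define SFSR_FAV(x) (((x) >> 1) & 0x1u) /* fault address valid */

/* Access type names, indexed by the AT field. */
static const char *const sfsr_at_names[8] = {
    "load from user data",
    "load from supervisor data",
    "load/execute from user instruction",
    "load/execute from supervisor instruction",
    "store to user data",
    "store to supervisor data",
    "store to user instruction",
    "store to supervisor instruction",
};

/* Fault type names, indexed by the FT field. */
static const char *const sfsr_ft_names[8] = {
    "none",
    "invalid address",
    "protection error",
    "privilege violation",
    "translation error",
    "access bus error",
    "internal error",
    "reserved",
};

/* Format the decoded fields of an SFSR value into buf. */
static void sfsr_describe(uint32_t sfsr, char *buf, size_t len)
{
    snprintf(buf, len, "LVL=%u AT=%u (%s) FT=%u (%s) FAV=%u",
        (unsigned)SFSR_LVL(sfsr),
        (unsigned)SFSR_AT(sfsr), sfsr_at_names[SFSR_AT(sfsr)],
        (unsigned)SFSR_FT(sfsr), sfsr_ft_names[SFSR_FT(sfsr)],
        (unsigned)SFSR_FAV(sfsr));
}
```

Calling sfsr_describe(0x38e, buf, sizeof buf) yields
"LVL=3 AT=4 (store to user data) FT=3 (privilege violation) FAV=1",
which agrees with the fields printed above.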
The fault address 0xe804068c is in the kernel (this is krups, so the
kernel is at e8000000). That address is in the middle of nd6_ioctl().
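
The address classification can be sketched as a simple range check.
This assumes that everything at or above the kernel base is kernel
address space on this machine; KERNBASE_ASSUMED and the helper names
are hypothetical, chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Kernel base on krups, as stated above; an assumption elsewhere. */
#define KERNBASE_ASSUMED 0xe8000000u

/* True if va lies at or above the assumed kernel base. */
static bool in_kernel_va(uint32_t va)
{
    return va >= KERNBASE_ASSUMED;
}

/* Offset of a kernel address from the assumed kernel base. */
static uint32_t kernel_offset(uint32_t va)
{
    return va - KERNBASE_ASSUMED;
}
```

With these, in_kernel_va(0xe804068c) is true and kernel_offset() gives
0x4068c, while the sp from the core (0xe7ffde28) and the pc (0x17414)
both come out as user addresses.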
The backtrace and register contents are always the same.
The instruction at pc looks totally innocent.
(gdb) x/7i $pc-20
0x17400 <argstr+324>: cmp %l6, 0
0x17404 <argstr+328>: sethi %hi(0x31000), %l3
0x17408 <argstr+332>: be 0x17434 <argstr+376>
0x1740c <argstr+336>: sethi %hi(0x30c00), %l7
0x17410 <argstr+340>: ld [ %l3 + 0x310 ], %g1
0x17414 <argstr+344>: add %g1, -1, %g1 # <-- pc
0x17418 <argstr+348>: cmp %g1, 0 # <-- npc
(gdb) bt
#0 0x00017414 in argstr ()
#1 0x000171c4 in expandarg ()
#2 0x0001461c in evalcommand ()
#3 0x0001396c in evaltree ()
#4 0x000142ec in evalbackcmd ()
#5 0x00017c70 in expbackq ()
#6 0x000174ac in argstr ()
#7 0x000171c4 in expandarg ()
#8 0x00014738 in evalcommand ()
#9 0x0001396c in evaltree ()
#10 0x00013954 in evaltree ()
#11 0x00013904 in evaltree ()
#12 0x00013904 in evaltree ()
#13 0x00013904 in evaltree ()
#14 0x00013904 in evaltree ()
#15 0x00013cc0 in evalfor ()
#16 0x00013a90 in evaltree ()
#17 0x00013954 in evaltree ()
#18 0x00013e20 in evalcase ()
#19 0x00013aa4 in evaltree ()
#20 0x00013954 in evaltree ()
#21 0x00013cc0 in evalfor ()
#22 0x00013a90 in evaltree ()
#23 0x0001edb8 in cmdloop ()
#24 0x0001eaa4 in main ()
#25 0x00011954 in ___start ()
(gdb) i r
g0 0x0 0
g1 0xe804068c -402389364
g2 0xe7ffde88 -402661752
g3 0x140 320
g4 0xe7ffde90 -402661744
g5 0xff6c606a -9674646
g6 0x0 0
g7 0x0 0
o0 0x426af 272047
o1 0x344 836
o2 0x192a8 103080
o3 0xf2d2bfb0 -221069392
o4 0x44 68
o5 0x0 0
sp 0xe7ffde28 3892305448
o7 0x196e0 104160
l0 0x81000000 -2130706432
l1 0x81 129
l2 0x81 129
l3 0x31000 200704
l4 0x0 0
l5 0x0 0
l6 0x1 1
l7 0x30c00 199680
i0 0x3d726 251686
i1 0x3 3
i2 0xfffffffc -4
i3 0xf24443fc -230407172
i4 0x0 0
i5 0x1 1
fp 0xe7ffde90 3892305552
i7 0x171bc 94652
y 0xb773 46963
psr 0x4900087 76546183 icc:N--C, pil:0, s:1, ps:0, et:0, cwp:7
wim 0x0 0
tbr 0x0 0
pc 0x17414 95252
npc 0x17418 95256
fpsr 0x0 0 rd:N, tem:0, ns:0, ver:0, ftt:0, qne:0, fcc:=, aexc:0, cexc:0
cpsr 0x0 0
g1 looks suspicious (== fault address). The preceding instruction that
loads g1 picks up the data from
(gdb) p/x $l3+0x310
$1 = 0x31310
(gdb) x/x $l3+0x310
0x31310 <sstrnleft>: 0x0000013f
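
To spell out why g1 looks wrong, here is a small sketch of the address
arithmetic for the sethi/ld pair. The helper names are hypothetical,
and it assumes the word at 0x31310 seen in the core is the same one the
load at 0x17410 would have read.

```c
#include <stdbool.h>
#include <stdint.h>

/* sethi %hi(imm), rd leaves imm with its low 10 bits cleared in rd. */
static uint32_t sethi_hi(uint32_t imm)
{
    return imm & ~0x3ffu;
}

/* Effective address of ld [base + simm13], rd. */
static uint32_t ld_effective_address(uint32_t base, int32_t simm13)
{
    return base + (uint32_t)simm13;
}

/* True if the register holds what the load should have fetched. */
static bool reg_matches_memory(uint32_t reg, uint32_t word)
{
    return reg == word;
}
```

Here ld_effective_address(sethi_hi(0x31000), 0x310) is 0x31310, the
address of sstrnleft. Since the add at pc has not executed yet, g1
would be expected to hold 0x13f, but reg_matches_memory(0xe804068c,
0x13f) is false.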
Any attempt to ktrace config.status or to run it under gdb makes the
bug hide.
Any ideas? I have no theory as to what might possibly cause this. On
one hand, it seems like a timing issue, as I sometimes get a second
core and sometimes I don't. On the other hand, the backtrace and registers
are always the same (from invocation to invocation, and in both cores
from the same invocation if there are two).
SY, Uwe
--
uwe@ptc.spbu.ru | Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/ | Ist zu Grunde gehen