Subject: mmap not working right on NetBSD/sparc 1.5? - segfault on write
To: None <port-sparc@netbsd.org>
From: Greg Troxel <gdt@ir.bbn.com>
List: port-sparc
Date: 12/08/2000 19:24:47
I am trying to run coda and having trouble with liblwp, their thread
package.  The message below addresses 1.5_BETA2, but the problem is
still present with 1.5 release and the GENERIC kernel from the
release.  I've now had the problem consistently on two machines - the
other is an IPX that has never had any sign of flakiness.

Short summary:  lwp mmap's some memory for a stack, and then tries to
write into it.  The first write segfaults.  However, if I break in gdb
before the write, I can read/write the mmap'd stack from gdb with no
trouble.
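
For anyone who wants to try this outside of lwp, here is a minimal
sketch of the same pattern: map a stack-sized region, then store to its
first byte.  The helper names map_stack and touch_first_byte are
hypothetical, and the protection and flags are assumptions (the
leftover o3/o4/o5 values in the register dump further down, 0x1000, 0x3
and 0x1002, would be consistent with a 4096-byte PROT_READ|PROT_WRITE,
MAP_ANON|MAP_PRIVATE mapping, but that is only an inference).  This is
not the lwp code itself.

```c
#define _DEFAULT_SOURCE   /* glibc only: exposes MAP_ANON; harmless elsewhere */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `size` bytes of anonymous read/write memory.
 * The flags are an assumption, not taken from the lwp source. */
static char *map_stack(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}

/* Hypothetical helper: store 0 to the first byte (the store that
 * faults in the report), then store 1 and read it back (what the gdb
 * session does by hand).  Returns 0 if both stores are visible. */
static int touch_first_byte(volatile char *stack)
{
    stack[0] = 0;
    if (stack[0] != 0)
        return -1;
    stack[0] = 1;
    return stack[0] == 1 ? 0 : -1;
}
```

Calling touch_first_byte(map_stack(4096)) from a small main(), after
checking map_stack's result for NULL, should return 0 on a working
system; a segfault inside it would point at the kernel or hardware
rather than at lwp.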

Could this have anything to do with the recent register window issues?

Any help/clues would be appreciated.

------- Forwarded Message

Message-Id: <200011281704.MAA31241@telemann.coda.cs.cmu.edu>
From: Greg Troxel <gdt@ir.bbn.com>
To: codalist@telemann.coda.cs.cmu.edu
Subject: lwp broken on NetBSD/sparc 1.5_BETA2 ?
Date: Tue, 28 Nov 2000 12:04:27 -0500

I have a sparc ELC running NetBSD 1.5_BETA2.  I have a slight
suspicion that the hardware is not 100% ok (POST failure with no
explanation, occasional cc core dumps when compiling really huge
files), but I have built cvs, emacs, perl, kth-krb4, arla etc. and am
running X, so it is at least 99.999% ok.  (I can use arla to
read/write afs servers at MIT, etc.)  This problem is repeatable,
which none of the other weirdnesses are.  I can try this on another
sparc sometime.  [I have, and it has the same symptoms.]

I am trying to build coda, using the latest CVS.  I am having a number
of problems that have not occurred when doing the same under NetBSD/i386
1.4.2 or FreeBSD/i386 {3.3,4.2-betaish}.

[trimmed]

1) testlwp-static dumps core.  The problem is in Initialize_Stack.  It
appears to have successfully mmap()d a stack at 0x45000000, and with
gdb I can read and write this memory space.  I have appended a bunch
of gdb output.  I recompiled that file without -O2, but I see no
important differences.  The instruction which loses is a stb trying to
write a 0 to 0x45000000.  I'm not enough of a sparc weenie to know
whether I'm getting a delayed segfault from a prior instruction, but
it loses on this same instruction with or without -O2.
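
Read back into C, the disassembly below appears to correspond to
something like the following.  This is a reconstruction from the
assembler output only, not a copy of lwp.c: the function name
initialize_stack_sketch is hypothetical, and the local definition of
lwp_stackUseEnabled merely stands in for the global the disassembly
loads from 0x25820.

```c
/* Stand-in for the global flag seen in the disassembly (assumption:
 * nonzero selects the fill loop, as the be at +24 suggests). */
static int lwp_stackUseEnabled = 1;

/* Reconstruction of the logic in the disassembly, not the lwp source. */
static void initialize_stack_sketch(char *stackptr, int stacksize)
{
    int i;

    if (lwp_stackUseEnabled) {
        /* +32..+104: fill the stack with the low byte of the index.
         * The stb at +84 is this store; with i == 0 it is the very
         * first write to the new mapping. */
        for (i = 0; i < stacksize; i++)
            stackptr[i] = (char)(i & 0xff);
    } else {
        /* +116..+128: store a single magic word at the stack base. */
        *(unsigned int *)stackptr = 0xbadbadbaU;
    }
}
```

If that reading is right, nothing has touched the mapping before the
faulting store, so the fault happens on the first user-mode write to a
freshly mapped page.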

[trimmed]
 

(gdb) disass Initialize_Stack 
Dump of assembler code for function Initialize_Stack:
0x14a70 <Initialize_Stack>:     save  %sp, -112, %sp
0x14a74 <Initialize_Stack+4>:   st  %i0, [ %fp + 0x44 ]
0x14a78 <Initialize_Stack+8>:   st  %i1, [ %fp + 0x48 ]
0x14a7c <Initialize_Stack+12>:  sethi  %hi(0x25800), %o0
0x14a80 <Initialize_Stack+16>:  ld  [ %o0 + 0x20 ], %o1     ! 0x25820 <lwp_stackUseEnabled>
0x14a84 <Initialize_Stack+20>:  cmp  %o1, 0
0x14a88 <Initialize_Stack+24>:  be  0x14ae4 <Initialize_Stack+116>
0x14a8c <Initialize_Stack+28>:  nop 
0x14a90 <Initialize_Stack+32>:  clr  [ %fp + -12 ]
0x14a94 <Initialize_Stack+36>:  ld  [ %fp + -12 ], %o0
0x14a98 <Initialize_Stack+40>:  ld  [ %fp + 0x48 ], %o1
0x14a9c <Initialize_Stack+44>:  cmp  %o0, %o1
0x14aa0 <Initialize_Stack+48>:  bl  0x14ab0 <Initialize_Stack+64>
0x14aa4 <Initialize_Stack+52>:  nop 
0x14aa8 <Initialize_Stack+56>:  b  0x14adc <Initialize_Stack+108>
0x14aac <Initialize_Stack+60>:  nop 
0x14ab0 <Initialize_Stack+64>:  ld  [ %fp + 0x44 ], %o0
0x14ab4 <Initialize_Stack+68>:  ld  [ %fp + -12 ], %o1
0x14ab8 <Initialize_Stack+72>:  add  %o0, %o1, %o0
0x14abc <Initialize_Stack+76>:  ldub  [ %fp + -9 ], %o1
0x14ac0 <Initialize_Stack+80>:  and  %o1, -1, %o2
0x14ac4 <Initialize_Stack+84>:  stb  %o2, [ %o0 ]
0x14ac8 <Initialize_Stack+88>:  ld  [ %fp + -12 ], %o0
0x14acc <Initialize_Stack+92>:  add  %o0, 1, %o1
0x14ad0 <Initialize_Stack+96>:  st  %o1, [ %fp + -12 ]
0x14ad4 <Initialize_Stack+100>: b  0x14a94 <Initialize_Stack+36>
0x14ad8 <Initialize_Stack+104>: nop 
0x14adc <Initialize_Stack+108>: b  0x14af4 <Initialize_Stack+132>
0x14ae0 <Initialize_Stack+112>: nop 
0x14ae4 <Initialize_Stack+116>: ld  [ %fp + 0x44 ], %o0
0x14ae8 <Initialize_Stack+120>: sethi  %hi(0xbadbac00), %o2
0x14aec <Initialize_Stack+124>: or  %o2, 0x1ba, %o1     ! 0xbadbadba
0x14af0 <Initialize_Stack+128>: st  %o1, [ %o0 ]
0x14af4 <Initialize_Stack+132>: ret 
0x14af8 <Initialize_Stack+136>: restore 
End of assembler dump.
(gdb) i local
i = 0
# this was done after the segfault.  Note that o2 is 0 and o0 is stackbase.
(gdb) i reg
g0             0x0      0
g1             0x100d4eec       269307628
g2             0x0      0
g3             0x0      0
g4             0x0      0
g5             0x0      0
g6             0x0      0
g7             0xffffffff       -1
o0             0x45000000       1157627904
o1             0x0      0
o2             0x0      0
o3             0x1000   4096
o4             0x3      3
o5             0x1002   4098
sp             0xeffff580       -268438144
o7             0x25844  153668
l0             0x90400087       -1874853753
l1             0x100c5ed8       269246168
l2             0x100c5edc       269246172
l3             0xfc1    4033
l4             0x1      1
l5             0x1      1
l6             0xf1cb3000       -238342144
l7             0x100f3158       269431128
i0             0x45000000       1157627904
i1             0x1000   4096
i2             0x1      1
i3             0x0      0
i4             0x0      0
i5             0x1000   4096
fp             0xeffff5f0       -268438032
i7             0x13350  78672
y              0x3000   12288
psr            0x90900086       -1869610874
wim            0x0      0
tbr            0x0      0
pc             0x14ac4  84676
npc            0x14ac8  84680
fpsr           0x0      0
cpsr           0x0      0

(gdb) bt
#0  0x14ac4 in Initialize_Stack (stackptr=0x45000000 "", stacksize=4096)
    at lwp.c:1111
#1  0x13358 in LWP_CreateProcess (ep=0x10da0 <OtherProcess>, stacksize=4096, 
    priority=0, parm=0x0, name=0x257f0 "OtherProcess", pid=0xeffff6d0)
    at lwp.c:606
#2  0x10e74 in main (argc=1, argv=0xeffff7c4) at testlwp.c:82
#3  0x10a58 in ___start ()

(gdb) print stackptr 
$4 = 0x45000000 ""
(gdb) print stackptr[0]
$5 = 0 '\000'
(gdb) set stackptr[0] = 1
(gdb) print stackptr[0]
$6 = 1 '\001'


------- End of Forwarded Message