Subject: kern/8383: LFS can reproducibly be wedged in post-1.4.1 release branch
To: None <gnats-bugs@gnats.netbsd.org>
From: None <Havard.Eidnes@runit.sintef.no>
List: netbsd-bugs
Date: 09/12/1999 07:44:38
>Number: 8383
>Category: kern
>Synopsis: LFS can reproducibly be wedged in post-1.4.1 release branch
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Sep 12 07:20:01 1999
>Last-Modified:
>Originator: Havard Eidnes
>Organization:
RUNIT AS
>Release: netbsd-1-4 branch as of Sat Sep 11 1999
>Environment:
AlphaPC164, netbsd-1-4 branch (i.e. post-1.4.1)
System: NetBSD albatross.runit.sintef.no 1.4.1 NetBSD 1.4.1 (ALBATROSS) #3: Sat Sep 11 20:41:18 MEST 1999 he@albatross.runit.sintef.no:/usr/src/sys/arch/alpha/compile/ALBATROSS alpha
>Description:
Relatively simple use of LFS can cause the system to wedge
solidly. In my test case repeated use of "bonnie" on a
freshly constructed LFS partition causes this problem after
a few runs.
In one of my earlier tests, the I/O on the LFS file system
stopped, but I could still execute other commands on the other
ssh sessions I had open towards the target host, and when that
happened the output of "ps axlww" looked like this:
albatross% ps axlww
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
0 0 0 0 -18 0 0 31104 schedu DLs ?? 0:00.00 (swapper)
0 1 0 0 10 0 592 328 wait Is ?? 0:00.01 init
0 2 0 0 -18 0 0 31104 daemon DL ?? 0:00.00 (pagedaemon)
0 3 0 0 -18 0 0 31104 reaper DL ?? 0:00.00 (reaper)
0 118 1 0 2 0 176 648 select Ss ?? 0:00.10 syslogd
0 122 1 0 2 0 104 688 select Ss ?? 0:00.23 portmap -l
0 126 1 0 10 0 100664 336 mfsidl Is ?? 0:00.02 mount_mfs -s 200000 -o nosuid -o nodev /dev/wd0b /tmp
0 159 1 0 -5 0 32 504 lfs se Ds ?? 0:02.94 update 30
0 161 1 0 10 0 392 808 nanosl Is ?? 0:00.02 cron
0 164 1 0 2 0 176 800 select Ss ?? 0:00.16 inetd -l
13841 184 1 36 104 20 14896 15568 - RN ?? 38:49.93 ./setiathome
0 186 1 0 2 0 640 1040 select Is ?? 0:00.37 /usr/pkg/sbin/sshd (sshd1)
0 193 186 0 2 0 848 1352 select I ?? 0:00.24 /usr/pkg/sbin/sshd (sshd1)
0 199 186 0 2 0 848 1352 select S ?? 0:00.52 /usr/pkg/sbin/sshd (sshd1)
0 213 1 0 10 0 88 584 wait Is ?? 0:00.00 /usr/libexec/lfs_cleanerd /spare3
0 214 213 0 -5 0 376 872 getblk D ?? 0:06.97 /usr/libexec/lfs_cleanerd /spare3
0 236 186 0 2 0 848 1352 select S ?? 0:00.15 /usr/pkg/sbin/sshd (sshd1)
13841 195 193 0 18 0 1056 1472 pause Is p0 0:00.08 -tcsh
0 205 195 0 18 0 888 448 pause I p0 0:00.08 -csh
0 367 205 0 -5 0 80 696 lfs th D+ p0 1:24.17 bonnie -s 1000
13841 201 199 0 10 0 1056 1504 ppwait Ds p1 0:00.06 -tcsh
13841 443 201 0 28 0 736 352 - R+ p1 0:00.00 ps -axlww
13841 444 201 0 28 0 1056 1504 - RV p1 0:00.00 -tcsh
13841 238 236 4 18 0 1056 1496 pause Is p2 0:00.05 -tcsh
13841 242 238 0 3 0 656 1328 ttyin S+ p2 0:01.08 systat vm
0 188 1 7 3 0 112 824 ttyin Is+ C0 0:00.02 /usr/libexec/getty std.9600 console
albatross%
During the above test the file system under test was on an
ncr860 controller, and eventually the disk driver got sick and
started spewing
assertion "cp" failed: file "../../../../dev/pci/ncr.c", line 6704
assertion "cp" failed: file "../../../../dev/pci/ncr.c", line 6704
sd1(ncr0:1:0): COMMAND FAILED (4 28) @0xfffffe00000b3000.
sd1(ncr0:1:0): COMMAND FAILED (4 28) @0xfffffe00000b3800.
messages all over the console.
In a repeat test where I used a Qlogic ISP1040B controller and
another disk on this same system, I did not get to the period
of "no I/O on LFS file system but rest of system still works",
but rather the whole system just wedged. However, the driver
did not stuff up as above.
I took a crash dump in both instances, and after the second
crash dump I did this digging (as partly directed by Konrad
Schröder, partly doing some trial/error):
Script started on Sun Sep 12 14:30:07 1999
snylteveps% ssh albatross
Last login: Sun Sep 12 14:32:35 1999 from 129.241.196.66
NetBSD 1.4.1 (ALBATROSS) #3: Sat Sep 11 20:41:18 MEST 1999
Welcome to NetBSD!
No mail.
albatross% su
albatross# cd /var/crash
albatross# ps axlww -Mnetbsd.1.core -N netbsd.1
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
0 0 0 0 -18 0 0 0 schedu DLs ?? 0:00.01 (swapper)
0 1 0 0 10 0 608 0 wait Is ?? 0:00.02 (init)
0 2 0 0 -18 0 0 0 daemon DL ?? 0:00.00 (pagedaemon)
0 3 0 0 -18 0 0 0 reaper DL ?? 0:00.01 (reaper)
0 118 0 0 2 0 176 0 - Rs ?? 0:00.85 (syslogd)
0 122 0 0 -14 0 104 0 inode Ds ?? 0:04.35 (portmap)
0 126 0 0 10 0 100664 0 mfsidl Is ?? 0:00.08 (mount_mfs)
0 158 0 0 -5 0 32 0 getnew Ds ?? 0:15.68 (update)
0 160 0 0 -14 0 392 0 inode Ds ?? 0:00.16 (cron)
0 163 0 0 2 0 176 0 select Is ?? 0:03.16 (inetd)
13841 184 0 27 -5 20 14912 0 getnew DN ?? 948:44.79 (setiathome)
0 185 0 3 2 0 640 0 select Is ?? 0:01.74 (sshd1)
0 188 0 0 2 0 848 0 select I ?? 0:00.11 (sshd1)
0 239 0 0 2 0 848 0 select I ?? 0:01.49 (sshd1)
0 246 0 0 2 0 848 0 select I ?? 0:00.22 (sshd1)
0 1151 0 0 2 0 848 0 select I ?? 0:00.08 (sshd1)
0 5298 0 0 2 0 848 0 select I ?? 0:00.26 (sshd1)
0 5307 0 0 2 0 848 0 select I ?? 0:00.22 (sshd1)
0 5331 0 0 10 0 88 0 wait Is ?? 0:00.00 (lfs_cleanerd)
0 5332 0 0 28 0 1032 0 segmen I ?? 0:21.74 (lfs_cleanerd)
0 5361 0 0 2 0 848 0 select I ?? 0:00.49 (sshd1)
0 5675 0 0 -5 0 128 0 getnew D ?? 0:00.00 (portmap)
0 5676 0 0 -5 0 176 0 getnew D ?? 0:00.00 (inetd)
0 5678 0 0 -14 0 112 0 inode D ?? 0:00.00 (portmap)
13841 190 0 0 18 0 1056 0 pause Is p0 0:00.07 (tcsh)
0 252 0 6 3 0 888 0 ttyin I+ p0 0:00.05 (csh)
13841 241 0 0 18 0 1056 0 pause Is p1 0:00.05 (tcsh)
13841 245 0 0 -5 0 656 0 getnew D+ p1 0:22.37 (systat)
13841 248 0 0 18 0 1072 0 pause Is p2 0:00.08 (tcsh)
0 288 0 7 3 0 936 0 ttyin I+ p2 0:00.04 (csh)
13841 1153 0 2 3 0 1056 0 ttyin Is+ p3 0:00.06 (tcsh)
13841 5309 0 0 18 0 1056 0 pause Is p4 0:00.05 (tcsh)
0 5313 0 0 18 0 888 0 pause I p4 0:00.07 (csh)
0 5623 0 8 -5 0 80 0 getnew D+ p4 3:11.84 (bonnie)
13841 5300 0 1 18 0 1056 0 pause Is p5 0:00.05 (tcsh)
13841 5304 0 0 -14 0 656 0 inode D+ p5 0:01.63 (systat)
13841 5363 0 0 10 0 1056 0 ppwait Ds p6 0:00.10 (tcsh)
13841 5677 0 0 -5 0 1056 0 getnew DV+ p6 0:00.00 (tcsh)
0 5360 0 0 3 0 112 0 ttyin Is+ C0 0:00.01 (getty)
albatross# gdb -q netbsd.1
(no debugging symbols found)...(gdb)
(gdb) target kcore netbsd.1.core
#0 0xfffffc0000473958 in dumpsys ()
(gdb) where
#0 0xfffffc0000473958 in dumpsys ()
#1 0xfffffc00004735a0 in cpu_reboot ()
#2 0xfffffc0000311b9c in db_reboot_cmd ()
#3 0xfffffc0000311614 in db_command ()
#4 0xfffffc00003118cc in db_command_loop ()
#5 0xfffffc0000315ad4 in db_trap ()
#6 0xfffffc000048091c in ddb_trap ()
#7 0xfffffc000047b5fc in trap ()
#8 0xfffffc0000300484 in XentIF ()
#9 0xfffffc000030c9c8 in comintr ()
warning: Hit beginning of text section without finding
warning: enclosing function for address 0x4
This warning occurs if you are debugging a function without any symbols
(for example, in a stripped executable). In that case, you may wish to
increase the size of the search with the `set heuristic-fence-post' command.
Otherwise, you told GDB there was a function where there isn't one, or
(more likely) you have encountered a bug in GDB.
(gdb) p curproc
$1 = 0
(gdb) help proc
Set current context from proc address
(gdb) ^Z
Suspended
albatross# locate gdbscri
/usr/src/sys/gdbscripts
/usr/src/sys/gdbscripts/bdump
/usr/src/sys/gdbscripts/CVS
/usr/src/sys/gdbscripts/CVS/Entries
/usr/src/sys/gdbscripts/CVS/Repository
/usr/src/sys/gdbscripts/CVS/Root
/usr/src/sys/gdbscripts/CVS/Tag
/usr/src/sys/gdbscripts/kdump
/usr/src/sys/gdbscripts/pgrpdump
/usr/src/sys/gdbscripts/stats
/usr/src/sys/gdbscripts/vchain
/usr/src/sys/gdbscripts/vdump
/usr/src/sys/gdbscripts/xps
albatross# fg
gdb /var/crash/netbsd.1
(gdb)
(gdb) source xps
(gdb) xps
Attempt to extract a component of a value that is not a structure.
(gdb) p allproc
$1 = 133653960
(gdb) p allproc.lh_first
Attempt to extract a component of a value that is not a structure.
(gdb) quit
albatross# pwd
/usr/src/sys/gdbscripts
albatross# cd conf
/sys/arch/alpha/conf
albatross# ed ALBATROSS
9373
...
a
makeoptions DEBUG="-g"
.
w
9396
q
albatross# config ALBATROSS
Don't forget to run "make depend"
albatross# cd compile/!$
cd compile/ALBATROSS
/sys/arch/alpha/compile/ALBATROSS
albatross# ls -l list*
-rw-r--r-- 1 root wheel 84186 Aug 10 10:42 list.1
-rw-r--r-- 1 root wheel 3268 Aug 22 15:58 list.2
-rw-r--r-- 1 root wheel 14632 Sep 11 19:25 list.3
-rw-r--r-- 1 root wheel 89736 Sep 11 20:41 list.4
albatross# (nice make clean all; time) >& list.5 & tail -f list.5
[2] 309
...
ld -N -Ttext fffffc0000300000 -e __transfer -G 4 -X -o netbsd ${SYSTEM_OBJ} vers.o
text data bss dec hex filename
1854492 142960 223936 2221388 21e54c netbsd
cp netbsd netbsd.gdb
strip -g -X -x netbsd
453.7u 62.9s 9:39.00 89.2% 0+0k 4433+17199io 544pf+0w
^C
[2] - Done ( nice make clean all; time ) >& list.5
albatross# pwd
/usr/src/sys/arch/alpha/compile/ALBATROSS
albatross# ls -l netbsd*
-rwxr-xr-x 1 root wheel 2181479 Sep 12 14:56 netbsd*
-rwxr-xr-x 1 root wheel 12640465 Sep 12 14:56 netbsd.gdb*
albatross# cd /usr/src/sys/gdbscripts
albatross# gdb -q /sys/arch/alpha/compile/ALBATROSS/netbsd.gdb
(gdb) target kcore /var/crash/netbsd.1.core
#0 0xfffffc0000473958 in dumpsys ()
at ../../../../arch/alpha/alpha/machdep.c:1444
1444 savectx(&dumppcb);
(gdb) source xps
(gdb) xps
proc pid flag st wchan comm
0xfffffc0007f765c8 5678 4 3 0xfffffc0006379c68 portmap (inode)
0xfffffc0007f76010 5677 16 3 0xfffffc000050b8e8 tcsh (getnewbuf)
0xfffffc0007f76998 5676 4 3 0xfffffc000050b8e8 inetd (getnewbuf)
0xfffffc0007f767b0 5675 4 3 0xfffffc000050b8e8 portmap (getnewbuf)
0xfffffc00064e5ca0 5623 4006 3 0xfffffc000050b8e8 bonnie (getnewbuf)
0xfffffc0007f761f8 5363 4006 3 0xfffffc0007f761f8 tcsh (ppwait)
0xfffffc0007f763e0 5361 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc000237e5b8 5360 4086 3 0xfffffc000235e010 getty (ttyin)
0xfffffc00064e56e8 5332 84 3 0xfffffe0000128840 lfs_cleanerd (segment)
0xfffffc00064e5ab8 5331 84 3 0xfffffc00064e5ab8 lfs_cleanerd (wait)
0xfffffc00064e5318 5313 4086 3 0xfffffe00107761a0 csh (pause)
0xfffffc00064e47a8 5309 4086 3 0xfffffe00107661a0 tcsh (pause)
0xfffffc00064e5500 5307 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc00064e4d60 5304 4106 3 0xfffffc00049cdb08 systat (inode)
0xfffffc00064e58d0 5300 4086 3 0xfffffe001076e1a0 tcsh (pause)
0xfffffc00064e4b78 5298 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc00064e4f48 1153 4086 3 0xfffffc000235e6c8 tcsh (ttyin)
0xfffffc00064e5130 1151 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc00064e4990 288 4086 3 0xfffffc000235e570 csh (ttyin)
0xfffffc00064e45c0 252 4086 3 0xfffffc000235e2c0 csh (ttyin)
0xfffffc00064e43d8 248 4086 3 0xfffffe001074e1a0 tcsh (pause)
0xfffffc00064e4008 246 84 3 0xfffffc000050a7c8 sshd1 (select)
---Type <return> to continue, or q <return> to quit---
0xfffffc00064e41f0 245 4106 3 0xfffffc000050b8e8 systat (getnewbuf)
0xfffffc000237f6e0 241 4086 3 0xfffffe00107421a0 tcsh (pause)
0xfffffc000237fc98 239 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc000237fab0 190 4086 3 0xfffffe00107321a0 tcsh (pause)
0xfffffc000237ed58 188 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc000237f8c8 185 84 3 0xfffffc000050a7c8 sshd1 (select)
0xfffffc000237f4f8 184 4005 3 0xfffffc000050b8e8 setiathome (getnewbuf)
0xfffffc000237f310 163 84 3 0xfffffc000050a7c8 inetd (select)
0xfffffc000237f128 160 104 3 0xfffffc000237c058 cron (inode)
0xfffffc000237e988 158 4 3 0xfffffc000050b8e8 update (getnewbuf)
0xfffffc000237ef40 126 84 3 0xfffffc00056f2388 mount_mfs (mfsidl)
0xfffffc000237e7a0 122 4 3 0xfffffc0005f38648 portmap (inode)
0xfffffc000237eb70 118 84 2 0 syslogd (select)
0xfffffc000237e3d0 3 20204 3 0xfffffc0000509a38 reaper (reaper)
0xfffffc000237e1e8 2 20204 3 0xfffffc000051c5e8 pagedaemon (daemon_slp)
0xfffffc000237e000 1 4084 3 0xfffffc000237e000 init (wait)
0xfffffc00005074f8 0 20204 3 0xfffffc00005074f8 swapper (scheduler)
proc pid flag st wchan comm
(gdb) proc 0xfffffc00064e5ab8
(gdb) where
#0 0xfffffc0000336948 in mi_switch () at ../../../../kern/kern_synch.c:632
#1 0xfffffc0000335fc0 in tsleep (ident=0x0, priority=288,
wmesg=0xfffffc000048d0e5 "wait", timo=0)
at ../../../../kern/kern_synch.c:370
#2 0xfffffc000032b45c in sys_wait4 (q=0xfffffc00064e5ab8,
v=0xfffffe0010785e88, retval=0xfffffe0010785ed8)
at ../../../../kern/kern_exit.c:508
#3 0xfffffc000047bbcc in syscall (code=7, framep=0xfffffe0010785ef8)
at ../../../../arch/alpha/alpha/trap.c:664
#4 0xfffffc0000300540 in XentSys ()
at ../../../../arch/alpha/alpha/locore.s:581
warning: Hit heuristic-fence-post without finding
warning: enclosing function for address 0x16024bd88
This warning occurs if you are debugging a function without any symbols
(for example, in a stripped executable). In that case, you may wish to
increase the size of the search with the `set heuristic-fence-post' command.
Otherwise, you told GDB there was a function where there isn't one, or
(more likely) you have encountered a bug in GDB.
(gdb) proc 0xfffffc00064e56e8
(gdb) where
#0 0xfffffc0000336948 in mi_switch () at ../../../../kern/kern_synch.c:632
#1 0xfffffc0000335fc0 in tsleep (ident=0x0, priority=306,
wmesg=0xfffffc0000494209 "segment", timo=307377)
at ../../../../kern/kern_synch.c:370
#2 0xfffffc000040dcc0 in lfs_segwait (p=0x0, v=0x0, retval=0x0)
at ../../../../ufs/lfs/lfs_syscalls.c:862
#3 0xfffffc000047bbcc in syscall (code=187, framep=0xfffffe001077def8)
at ../../../../arch/alpha/alpha/trap.c:664
#4 0xfffffc0000300540 in XentSys ()
at ../../../../arch/alpha/alpha/locore.s:581
warning: Hit heuristic-fence-post without finding
warning: enclosing function for address 0x120001c18
(gdb) up
#1 0xfffffc0000335fc0 in tsleep (ident=0x0, priority=306,
wmesg=0xfffffc0000494209 "segment", timo=307377)
at ../../../../kern/kern_synch.c:370
370 mi_switch();
(gdb) up
#2 0xfffffc000040dcc0 in lfs_segwait (p=0x0, v=0x0, retval=0x0)
at ../../../../ufs/lfs/lfs_syscalls.c:862
862 error = tsleep(addr, PCATCH | PUSER, "segment", timeout);
(gdb) l
857 timeout = hzto(&atv);
858 splx(s);
859 } else
860 timeout = 0;
861
862 error = tsleep(addr, PCATCH | PUSER, "segment", timeout);
863 return (error == ERESTART ? EINTR : 0);
864 }
865
866 /*
(gdb) up
#3 0xfffffc000047bbcc in syscall (code=187, framep=0xfffffe001077def8)
at ../../../../arch/alpha/alpha/trap.c:664
664 error = (*callp->sy_call)(p, args + hidden, rval);
(gdb) l
659 scdebug_call(p, code, args + hidden);
660 #endif
661 if (error == 0) {
662 rval[0] = 0;
663 rval[1] = 0;
664 error = (*callp->sy_call)(p, args + hidden, rval);
665 }
666
667 switch (error) {
668 case 0:
(gdb) up
#4 0xfffffc0000300540 in XentSys ()
at ../../../../arch/alpha/alpha/locore.s:581
../../../../arch/alpha/alpha/locore.s:581: No such file or directory.
Current language: auto; currently asm
(gdb) quit
albatross# ^Dexit
albatross% ^Dlogout
Connection to albatross closed.
snylteveps% ^Dexit
Script done on Sun Sep 12 15:10:06 1999
The session doing the test follows:
Script started on Sun Sep 12 13:11:21 1999
snylteveps% ssh albatross
Last login: Sun Sep 12 13:14:41 1999 from 129.241.196.66
NetBSD 1.4.1 (ALBATROSS) #3: Sat Sep 11 20:41:18 MEST 1999
Welcome to NetBSD!
No mail.
albatross% su
albatross# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/wd0a 49711 27202 20023 57% /
/dev/wd0d 1485749 926050 485411 65% /usr
/dev/wd0f 507750 2932 479430 0% /var
mfs:126 96655 2 91820 0% /tmp
/dev/wd0g 6059635 1169479 4587174 20% /local
kernfs 1 1 0 100% /kern
/dev/sd0g 6122066 1 5815961 0% /spare1
/dev/sd2c 4058244 1 3855330 0% /spare2
/dev/sd0a 49840 20543 26805 43% /altroot
/dev/sd0d 1488395 146002 1267973 10% /altroot/usr
/dev/sd0f 507928 200 482331 0% /altroot/var
albatross# ed /etc/fstab
361
/spare1/
/dev/sd0g /spare1 ffs rw 1 2
s/^/# /p
# /dev/sd0g /spare1 ffs rw 1 2
w
363
q
albatross# umount /spare1
albatross# newfs_lfs /dev/rsd0g
Using 1 single indirect block(s) for inode 1
albatross# mount -t lfs /dev/sd0g /spare1
albatross# cd /spare1
albatross# df .
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/sd0g 6317056 233 5685117 0% /spare1
albatross# mount | egrep spare1
/dev/sd0g on /spare1 type lfs (local)
albatross# bonnie -s 1000
File './Bonnie.5338', size: 1048576000
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
1000 4267 34.7 4354 20.1 3234 17.1 9895 53.8 10035 13.5 111.1 4.2
albatross# df .
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/sd0g 6317056 9662 5675688 0% /spare1
albatross# ls -l
total 129
-rw------- 1 root wheel 122880 Sep 12 13:35 ifile
drwx------ 2 root wheel 512 Sep 12 13:16 lost+found/
albatross# du .
1 ./lost+found
130 .
albatross# bonnie -s 2000
File './Bonnie.5450', size: 2097152000
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
-------Sequential Output-------- ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
2000 3819 30.9 4402 20.6 2861 15.4 10369 58.2 10405 14.4 103.3 3.4
albatross# df .
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/sd0g 6317056 8052 5677298 0% /spare1
albatross# du .
1 ./lost+found
130 .
albatross# ls -ls
total 129
128 -rw------- 1 root wheel 122880 Sep 12 14:11 ifile
1 drwx------ 2 root wheel 512 Sep 12 13:16 lost+found/
albatross# bonnie -s 3000
File './Bonnie.5619', size: -1149239296
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...^C
albatross# bonnie -s 2000
File './Bonnie.5623', size: 2097152000
Writing with putc()...done
Rewriting...^C^C^CRead from remote host albatross: Connection reset by peer
Connection to albatross closed.
snylteveps% ^Dexit
Script done on Sun Sep 12 14:27:20 1999
and the console output was
Script started on Sun Sep 12 13:10:06 1999
snylteveps% ssh junk1
Last login: Fri Sep 10 18:23:02 1999 from vader.runit.sint
NetBSD 1.3H (GENERIC) #4: Tue Oct 6 23:45:26 MEST 1998
Welcome to NetBSD!
No mail.
% tip tty1
can't open log file /var/log/aculog.
connected
NetBSD/alpha (albatross.runit.sintef.no) (console)
login: Sep 12 13:15:14 albatross su: he to root on /dev/ttyp4
root
Last login: Thu Aug 12 22:46:44 1999 on console
Sep 12 13:20:55 albatross login: ROOT LOGIN (root) ON console
Copyright (c) 1996, 1997, 1998, 1999
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1980, 1983, 1986, 1988, 1990, 1991, 1993, 1994
The Regents of the University of California. All rights reserved.
NetBSD 1.4.1 (ALBATROSS) #3: Sat Sep 11 20:41:18 MEST 1999
Welcome to NetBSD!
You have new mail.
Sep 12 13:20:55 albatross login: ROOT LOGIN (root) ON console
Terminal type? [vt100]
Don't login as root, use the su command.
albatross# Sep 12 13:21:06 albatross portmap[5358]: connect from 129.241.100.134 to getport(rstatd)
albatross# exit
albatross# logout
NetBSD/alpha (albatross.runit.sintef.no) (console)
login: Sep 12 13:30:54 albatross su: he to root on /dev/ttyp6
NetBSD/alpha (albatross.runit.sintef.no) (console)
login: de0: abnormal interrupt: transmit underflow (raising TX threshold to 96|256)
~Stopped at Debugger+0x4: ret zero,(ra)
db> trace
db> reboot 104
dumping to dev 4,1 offset 525039
dump 256 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 succeeded
rebooting...
halted CPU 0
halt code = 5
HALT instruction executed
PC = fffffc0000300140
CPU 0 booting
(boot dqa0.0.0.11.0 -flags a)
block 0 of dqa0.0.0.11.0 is a valid boot block
reading 14 blocks from dqa0.0.0.11.0
bootstrap code read in
base = 180000, image_start = 0, image_bytes = 1c00
initializing HWRPB at 2000
initializing page table at 172000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
NetBSD/Alpha 1.4 FFS Primary Bootstrap
Jumping to entry point...
NetBSD/alpha 1.4 Secondary Bootstrap, Revision 1.10
(ross@netbsd.org, May 8 11:34:00 PDT 1999)
VMS PAL rev: 0x1000900010114
OSF PAL rev: 0x1000800020116
Switch to OSF PAL code succeeded.
Boot flags: a
Loading netbsd...
1997456+223936+125472+[66833]
Entering netbsd at 0xfffffc0000301100...
[ preserving 193296 bytes of netbsd ELF symbol table ]
Copyright (c) 1996, 1997, 1998, 1999
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 1.4.1 (ALBATROSS) #3: Sat Sep 11 20:41:18 MEST 1999
he@albatross.runit.sintef.no:/usr/src/sys/arch/alpha/compile/ALBATROSS
Digital AlphaPC 164 500 MHz
8192 byte page size, 1 processor.
real mem = 268435456 (2531328 reserved for PROM, 265904128 used by NetBSD)
avail mem = 230752256
using 3245 buffers containing 26583040 bytes of memory
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21164A-2 (pass 2)
cia0 at mainbus0: DECchip 2117x Core Logic Chipset (ALCOR/ALCOR2), pass 3
cia0: extended capabilities: 21<DWEN,BWEN>
cia0: using BWX for PCI config access
pci0 at cia0 bus 0
pci0: i/o enabled, memory enabled
isp0 at pci0 dev 5 function 0
isp0: interrupting at eb164 irq 2
isp0: Ultra Mode Capable
isp0: Board Revision 1040B, loaded F/W Revision 7.55
isp0: Last F/W revision was 7.55
isp0: driver initiated bus reset
scsibus0 at isp0: 16 targets, 8 luns per target
isp0: Target 0 Async Mode
sd0 at scsibus0 targ 0 lun 0: <IBM, DGHS09U, 0350> SCSI3 0/direct fixed
sd0: 8748MB, 8152 cyl, 10 head, 219 sec, 512 bytes/sect x 17916240 sectors
ncr0 at pci0 dev 7 function 0: ncr 53c860 fast20 scsi
ncr0: interrupting at eb164 irq 1
ncr0: minsync=12, maxsync=137, maxoffs=8, 16 dwords burst, normal dma fifo
ncr0: single-ended, open drain IRQ driver
ncr0: restart (scsi reset).
scsibus1 at ncr0: 8 targets, 8 luns per target
sd1 at scsibus1 targ 1 lun 0: <DEC, RZ29B (C) DEC, 0007> SCSI2 0/direct fixed
sd1(ncr0:1:0): 10.0 MB/s (100 ns, offset 8)
sd1: 4091MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
sd2 at scsibus1 targ 3 lun 0: <DEC, RZ29B (C) DEC, 0007> SCSI2 0/direct fixed
sd2(ncr0:3:0): 10.0 MB/s (100 ns, offset 8)
sd2: 4091MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
sio0 at pci0 dev 8 function 0: Intel 82378ZB System I/O (SIO) (rev. 0x43)
de0 at pci0 dev 9 function 0
de0: interrupting at eb164 irq 3
de0: 21140 [10-100Mb/s] pass 1.2
de0: address 00:c0:f0:16:11:ca
de0: enabling 10baseT port
pciide0 at pci0 dev 11 function 0: CMD Technology PCI0646
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-370910>
wd0: drive supports 16-sector pio transfers, lba addressing
wd0: 8693MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 17803440 sectors
wd0: 32-bits data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4
pciide0: secondary channel wired to compatibility mode
pciide0: secondary channel ignored (disabled)
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
isa0 at sio0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
lpt0 at isa0 port 0x3bc-0x3bf irq 7
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
root on wd0a dumps on wd0b
root file system type: ffs
swapctl: adding /dev/wd0b as swap device at priority 0
Automatic boot in progress: starting file system checks.
/dev/rwd0a: 884 files, 27202 used, 22509 free (245 frags, 2783 blocks, 0.5% fragmentation)
/dev/rwd0a: MARKING FILE SYSTEM CLEAN
/dev/rsd0a: 861 files, 20543 used, 29297 free (57 frags, 3655 blocks, 0.1% fragmentation)
/dev/rsd0a: MARKING FILE SYSTEM CLEAN
isp0: Target 0 at 20MHz Max Offset 8, 16 bit wide, Tagged Queueing Enabled
/dev/rsd0d: 8986 files, 146002 used, 1342393 free (6489 frags, 166988 blocks, 0.4% fragmentation)
/dev/rsd0d: MARKING FILE SYSTEM CLEAN
/dev/rsd0f: 87 files, 200 used, 507728 free (184 frags, 63443 blocks, 0.0% fragmentation)
/dev/rsd0f: MARKING FILE SYSTEM CLEAN
/dev/rsd2c: 1 files, 1 used, 4058243 free (11 frags, 507279 blocks, 0.0% fragmentation)
/dev/rsd2c: MARKING FILE SYSTEM CLEAN
/dev/rwd0d: 95090 files, 926050 used, 559699 free (35587 frags, 65514 blocks, 2.4% fragmentation)
/dev/rwd0d: MARKING FILE SYSTEM CLEAN
/dev/rwd0f: 295 files, 2913 used, 504837 free (349 frags, 63061 blocks, 0.1% fragmentation)
/dev/rwd0f: MARKING FILE SYSTEM CLEAN
/dev/rwd0g: UNREF FILE I=7944 OWNER=he MODE=100644
/dev/rwd0g: SIZE=406 MTIME=Sep 12 05:35 1999 (CLEARED)
/dev/rwd0g: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
/dev/rwd0g: BLK(S) MISSING IN BIT MAPS (SALVAGED)
/dev/rwd0g: SUMMARY INFORMATION BAD (SALVAGED)
/dev/rwd0g: 64 files, 1169478 used, 4890157 free (69 frags, 611261 blocks, 0.0% fragmentation)
/dev/rwd0g: MARKING FILE SYSTEM CLEAN
setting tty flags
starting network
hostname: albatross.runit.sintef.no
configuring network interfaces:de0: enabling Full Duplex 100baseTX port
de0.
add net default: gateway 129.241.249.7
adding interface aliases:
starting system logger
checking for core dump...
savecore: reboot
savecore: system went down at Sun Sep 12 14:27:07 1999
savecore: writing core to /var/crash/netbsd.1.core
Sep 12 14:29:49 albatross savecore: reboot
savecore: writing kernel to /var/crash/netbsd.1
starting rpc daemons: portmap.
starting nfs daemons:.
setting securelevel: kern.securelevel: 0 -> 1
checking quotas: done.
building databases...
clearing /tmp
updating motd.
standard daemons: update cron.
starting network daemons: inetd.
starting local daemons: sshd.
Sun Sep 12 14:30:36 MEST 1999
NetBSD/alpha (albatross.runit.sintef.no) (console)
login: ~.
Connection to junk1 closed.
snylteveps% ^Dexit
Script done on Sun Sep 12 14:28:33 1999
>How-To-Repeat:
See above.
>Fix:
Sorry, too much "Black Magic" for me. However, I'll be happy
to assist in digging out any additional information from the
crash dumps and the file systems (which are still untouched
after the tests), and to test kernel modifications to fix this
problem.
>Audit-Trail:
>Unformatted: