Subject: kern/5916: running a kernel with memory disk support spoils subsequent kernel on sparc
To: None <gnats-bugs@gnats.netbsd.org>
From: None <jbernard@ox.mines.edu>
List: netbsd-bugs
Date: 08/05/1998 13:54:16
>Number: 5916
>Category: kern
>Synopsis: running a kernel with memory disk support spoils subsequent kernel on sparc
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Aug 5 13:05:00 1998
>Last-Modified:
>Originator: Jim Bernard
>Organization:
Speaking for myself
>Release: July 26, 1998
>Environment:
System: NetBSD spud 1.3F NetBSD 1.3F (SPUD) #0: Tue Jul 28 15:43:32 MDT 1998 jim@roo:/home/local/compile/sys/arch/sparc/compile/SPUD sparc
>Description:
If a kernel with memory disk support is booted on a sparc (1),
other kernels booted afterwards misbehave, at least with respect
to execution of shell scripts. A workaround is to cycle power
before booting the other kernel (a PROM "reset" is not
sufficient).
The specific failure seen is that scripts (including shell startup
scripts) evidently attempt to execute some (but not necessarily
all) variable-setting statements as commands (seen with both sh
and bash), which then fail.
This is likely to be more than a little disturbing to users who
install using a floppy boot disk and then expect the subsequently
installed system to work correctly (unless they happen to turn off
the power before booting the installed system).
>How-To-Repeat:
* For the system on which this was observed, dmesg reports
(though I expect the problem is not unique to this system):
real mem = 8335360
avail mem = 6238208
using 101 buffers containing 413696 bytes of memory
bootpath: /sbus0/esp0/sd@0,0
mainbus0 (root): Sun 4/60
cpu0 at mainbus0: MB86900/1A or L64801 @ 20 MHz, WTL3170/2 FPU
cpu0: 64K byte write-through, 16 bytes/line, sw flush: cache enabled
memreg0 at mainbus0 ioaddr 0xf4000000
clock0 at mainbus0 ioaddr 0xf2000000: mk48t02 (eeprom)
timer0 at mainbus0 ioaddr 0xf3000000 ipl 10 delay constant 7
auxreg0 at mainbus0 ioaddr 0xf7400000
zs0 at mainbus0 ioaddr 0xf1000000 ipl 12 softpri 6
zstty0 at zs0 channel 0
zstty1 at zs0 channel 1
zs1 at mainbus0 ioaddr 0xf0000000 ipl 12 softpri 6
kbd0 at zs1 channel 0 (console)
ms0 at zs1 channel 1
fdc0 at mainbus0 ioaddr 0xf7200000 ipl 11 softpri 4: chip 82072
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
audioamd0 at mainbus0 ioaddr 0xf7201000 ipl 13 softpri 4
audio0 at audioamd0
sbus0 at mainbus0 ioaddr 0xf8000000: clock = 25 MHz
dma0 at sbus0 slot 0 offset 0x400000: rev 1
esp0 at sbus0 slot 0 offset 0x800000 level 3: ESP100, 25MHz, SCSI ID 7
scsibus0 at esp0: 8 targets
sd0 at scsibus0 targ 3 lun 0: <HITACHI, DK515C, CP15> SCSI1 0/direct fixed
sd0: 639MB, 1361 cyl, 14 head, 68 sec, 512 bytes/sect x 1309896 sectors
st0 at scsibus0 targ 4 lun 0: <ARCHIVE, VIPER 1500 21247, 2.2G> SCSI2 1/sequential removable
st0: drive empty
le0 at sbus0 slot 0 offset 0xc00000 level 5: address 08:00:20:08:b8:e7
le0: 8 receive buffers, 2 transmit buffers
bwtwo0 at sbus0 slot 3 offset 0x0 level 7: SUNW,501-1455, 1152 x 900 (console)
bwtwo0: attached to /dev/fb
root on sd0a dumps on sd0b
root file system type: ffs
* Boot any kernel containing memory disk support (with or without
an actual ramdisk image present); an INSTALL kernel will do (I
can provide a floppy image, compiled from July 26, 1998 sources),
or just add the MD support to your usual kernel config file by
adding:
options MEMORY_DISK_HOOKS
options MEMORY_DISK_IS_ROOT # force root on memory disk
options MEMORY_DISK_SERVER=0 # no userspace memory disk support
options MINIROOTSIZE=3168 # 1.44M * 1.1
pseudo-device md 1 # memory disk device (ramdisk)
and build a kernel with that.
* Shut down the MD kernel and boot a normal kernel without cycling
power (i.e., do a PROM "reset" or just "boot").
* Log in as root with shell /bin/csh, no .cshrc or .login files
present (they can be present, but removing them simplifies the
situation). I also moved .profile out of the way.
* Try to execute a script, e.g.:
#! /bin/sh
date
* Watch that fail with, e.g.:
SHELL=/bin/csh: Can't open SHELL=/bin/csh
or
HOME=/root: Can't open HOME=/root
(The exact error may vary, depending on unknown factors, but
it always refers to an environment variable assignment, and it
always appears to be complaining about opening the assignment.)
* To show what's happening more clearly, here are ktrace outputs
from execution of the script above (named "xxx") after booting
a regular kernel from a power-off state, and after booting the
same kernel after having previously booted a kernel generated
from the same config file but with the MD-support lines shown
above added. The first difference is flagged.
[After boot from power-off state:]
210 ktrace RET ktrace 0
210 ktrace CALL execve(0xeffffce7,0xeffffcb8,0xeffffcc0)
210 ktrace NAMI "./xxx"
210 ktrace NAMI "/bin/sh"
210 sh EMUL "netbsd"
210 sh RET execve JUSTRETURN
210 sh CALL getpid
210 sh RET getpid 210/0xd2
210 sh CALL geteuid
210 sh RET geteuid 0
210 sh CALL __sysctl(0xeffffa00,0x2,0x5dd30,0xeffff9fc,0,0)
210 sh RET __sysctl 0
210 sh CALL break(0x5e9ac)
210 sh RET break 0
210 sh CALL break(0x5effc)
210 sh RET break 0
210 sh CALL break(0x5fffc)
210 sh RET break 0
210 sh CALL open(0xeffffce8,0,0x3d)
                 ^^^^^^^^^^
210 sh NAMI "./xxx"
210 sh RET open 3
210 sh CALL fcntl(0x3,0,0xa)
210 sh RET fcntl 10/0xa
(remainder omitted)
[After booting same kernel after previously running MD kernel:]
237 ktrace RET ktrace 0
237 ktrace CALL execve(0xeffffce7,0xeffffcb8,0xeffffcc0)
237 ktrace NAMI "./xxx"
237 ktrace NAMI "/bin/sh"
237 sh EMUL "netbsd"
237 sh RET execve JUSTRETURN
237 sh CALL getpid
237 sh RET getpid 237/0xed
237 sh CALL geteuid
237 sh RET geteuid 0
237 sh CALL __sysctl(0xeffffa00,0x2,0x5dd30,0xeffff9fc,0,0)
237 sh RET __sysctl 0
237 sh CALL break(0x5e9ac)
237 sh RET break 0
237 sh CALL break(0x5effc)
237 sh RET break 0
237 sh CALL break(0x5fffc)
237 sh RET break 0
237 sh CALL open(0xeffffcf9,0,0x3d)
                 ^^^^^^^^^^
237 sh NAMI "SHELL=/bin/csh"
237 sh RET open -1 errno 2 No such file or directory
237 sh CALL break(0x60ffc)
237 sh RET break 0
237 sh CALL write(0x2,0x60000,0x2a)
237 sh GIO fd 2 wrote 42 bytes
"SHELL=/bin/csh: Can't open SHELL=/bin/csh
"
237 sh RET write 42/0x2a
237 sh CALL exit(0x2)
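The comparison of the two listings above can be sketched as a small
shell helper. This is not part of the original report: the function
names first_trace_diff and ptr_delta are hypothetical, and the sketch
assumes that each kdump listing was saved to a text file and that
comparing only CALL and NAMI records (with the PID column dropped) is
enough to locate the divergence. RET records are skipped because some
of them, such as the getpid return value, differ between runs anyway.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two saved kdump listings.
# Assumption: each record looks like "<pid> <command> <type> <details>".

# Print the first CALL or NAMI record that differs between two listings.
# $1 = listing from the working boot, $2 = listing from the failing boot.
# Prints nothing if the second listing matches the first up to its end.
first_trace_diff() {
    awk '
        # Drop leading whitespace and the PID column from a record.
        function norm(line) {
            sub(/^[ \t]*[0-9]+[ \t]+/, "", line)
            return line
        }
        # First file: remember every CALL and NAMI record in order.
        FNR == NR {
            if ($3 == "CALL" || $3 == "NAMI") good[++n] = norm($0)
            next
        }
        # Second file: compare record by record, stop at the first mismatch.
        $3 == "CALL" || $3 == "NAMI" {
            m++
            cur = norm($0)
            if (cur != good[m]) {
                printf "< %s\n> %s\n", good[m], cur
                exit
            }
        }
    ' "$1" "$2"
}

# Print the distance in bytes between two addresses, e.g. the two
# pointers passed to open() in the listings.
ptr_delta() {
    echo $(( $1 - $2 ))
}
```

For example, with the two listings saved as good.txt and bad.txt
(hypothetical file names), "first_trace_diff good.txt bad.txt" prints
the two open() calls flagged above, and
"ptr_delta 0xeffffcf9 0xeffffce8" prints 17: the pointer passed to
open() in the failing run lies 17 bytes above the one passed in the
working run. The trace alone does not show why.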
>Fix:
None known, but cycling power is a workaround.
>Audit-Trail:
>Unformatted: