NetBSD-Users archive
fssconfig/raidframe/dump-related crashes
My 7-stable/amd64 server crashes nearly every night while my backup
routine is in progress. There's no backtrace and no crash dump is saved,
but the console reads:
> ohci1: 1 scheduling overruns
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: WARNING: addr 0x012c0000 not found
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: WARNING: addr 0x01d50000 not found
> ohci1: WARNING: addr 0x01d70000 not found
> ohci1: WARNING: addr 0x01d60000 not found
> ohci1: WARNING: addr 0x012c0000 not found
> ohci1: WARNING: addr 0x01cf0000 not found
> ohci1: 44 scheduling overruns
> [more of these]
> ohci1: 46 scheduling overruns
before it reboots (sometimes hangs instead).
These messages are probably /not/ traces of the root cause: the machine
also crashes when running a kernel with no ohci support compiled in at
all, and the crashes are then silent.
It happens while dump(8)ing an in-filesystem fss(4) snapshot of a nearly
empty FFSv1 (fslevel 4) filesystem sitting on a two-component RAID 1
raid(4) set. Conceptually, there should not be a problem with dumping an
fss device, right?
The command my script runs to create the snapshot is
# fssconfig -cx fss0 /stor /stor/snapshot
and the dump
# dump -$lvl -uant -h 0 -L "$nam" -f - /dev/rfss0 >/tmp/dumpfifo
where /tmp/dumpfifo is a fifo from which
# gzip -1 </tmp/dumpfifo >/var/tmp/dump.gz
reads. (I don't remember the reason for going via a fifo, but there
was one...)
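For reference, the way the dump and gzip steps fit together can be
sketched roughly as below. This is a minimal sketch and not my actual
script: the helper name stream_via_fifo is made up for illustration, and
the fssconfig and dump invocations are only shown in comments, copied
from above, since they need root and the /stor setup.

```sh
#!/bin/sh
# Sketch: run a producer command with its stdout going into a named pipe,
# while gzip reads from that pipe and writes the compressed output file.
# stream_via_fifo is a hypothetical helper name, not part of any tool.
#
# usage: stream_via_fifo FIFO OUTFILE CMD [ARGS...]
stream_via_fifo() {
    fifo=$1
    out=$2
    shift 2

    rm -f "$fifo"
    mkfifo "$fifo" || return 1

    # Reader side: compress whatever arrives on the fifo.
    gzip -1 <"$fifo" >"$out" &
    reader=$!

    # Writer side: the producer's stdout goes into the fifo.
    "$@" >"$fifo"
    status=$?

    # Wait for gzip to finish; report its failure if the producer succeeded.
    wait "$reader" || status=$?

    rm -f "$fifo"
    return "$status"
}

# With the commands from this mail (as root), the nightly run would
# look roughly like this:
#
#   fssconfig -cx fss0 /stor /stor/snapshot
#   stream_via_fifo /tmp/dumpfifo /var/tmp/dump.gz \
#       dump -"$lvl" -uant -h 0 -L "$nam" -f - /dev/rfss0
```

The reader is started first and in the background because opening a fifo
blocks until the other end is opened as well.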
Any suggestions where I could start looking? So far, I've tried running
a DEBUG kernel but that didn't provide additional information.
The filesystem is clean as far as fsck_ffs is concerned, too.
Here's some information on the filesystem:
# mount -v | grep /stor
/dev/raid0g on /stor type ffs (log, noatime, local, fsid: 0x1206/0x78b, reads: sync 8489 async 0, writes: sync 0 async 1791)
# df -h /stor
Filesystem Size Used Avail %Cap Mounted on
/dev/raid0g 416G 19G 376G 4% /stor
# dumpfs -s /stor
file system: /dev/rraid0g
format FFSv1
endian little-endian
magic 11954 time Fri Mar 11 05:55:54 2016
superblock location 8192 id [ 564b7b58 793a9223 ]
cylgrp dynamic inodes 4.4BSD sblock FFSv2 fslevel 4
nbfree 12996876 ndir 56002 nifree 26708703 nffree 4718
ncg 580 size 109891568 blocks 109028517
bsize 32768 shift 15 mask 0xffff8000
fsize 4096 shift 12 mask 0xfffff000
frag 8 shift 3 fsbtodb 3
bpg 23684 fpg 189472 ipg 47104
minfree 5% optim time maxcontig 2 maxbpg 8192
symlinklen 60 contigsumsize 2
maxfilesize 0x004002001005ffff
nindir 8192 inopb 256
avgfilesize 16384 avgfpdir 64
sblkno 8 cblkno 16 iblkno 24 dblkno 1496
sbsize 4096 cgsize 32768
csaddr 1496 cssize 12288
cgrotor 0 fmod 0 ronly 0 clean 0x02
wapbl version 0x1 location 2 flags 0x0
wapbl loc0 439587072 loc1 131072 loc2 512 loc3 3
flags wapbl
fsmnt /stor
volname swuid 0
# raidctl -sv raid0
Components:
/dev/wd0a: optimal
/dev/wd1a: optimal
No spares.
Component label for /dev/wd0a:
Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
Version: 2, Serial Number: 2015111701, Mod Counter: 1213
Clean: No, Status: 0
sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 913211264
RAID Level: 1
Autoconfig: Yes
Root partition: Force
Last configured as: raid0
Component label for /dev/wd1a:
Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
Version: 2, Serial Number: 2015111701, Mod Counter: 1213
Clean: No, Status: 0
sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 913211264
RAID Level: 1
Autoconfig: Yes
Root partition: Force
Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
# exit