NetBSD-Bugs archive
bin/38471: Raidframe crashes on reconstruction of RAID5 (5 disks @ 298GB)
>Number: 38471
>Category: bin
>Synopsis: Raidframe crashes on reconstruction of RAID5 (5 disks @ 298GB)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 20 19:20:00 +0000 2008
>Originator: Thomas Feddersen
>Release: 4.0_BETA2
>Organization:
Dipl.-Ing Thomas Feddersen, Beratender Ingenieur
>Environment:
NetBSD bremen 4.0_BETA2 NetBSD 4.0_BETA2 (XEN3_DOM0) #0: Thu Mar 1 04:57:05 PST 2007 builds@wb33:/home/builds/ab/netbsd-4/i386/200703010002Z-obj/home/builds/ab/netbsd-4/src/sys/arch/i386/compile/XEN3_DOM0 i386
>Description:
One disk of the RAID set has pending sectors, as reported by smartd:
Apr 20 18:09:26 bremen smartd[201]: Device: /dev/wd5d, 1 Currently unreadable (pending) sectors
Apr 20 18:09:26 bremen smartd[201]: Device: /dev/wd5d, 1 Offline uncorrectable sectors
I attempted an in-place reconstruction of the drive in question, but the
reconstruction drops the system into the kernel debugger.
A second attempt, adding a spare component and reconstructing to the spare, had
the same effect.
The problem occurs with both the XEN3 kernel (128 MB total memory) and the i386
kernel (1 GB total memory).
I need assistance to restore the redundancy of the RAID set.
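The commands used for the second attempt (reconstructing to a spare) are not shown in this report. As a rough sketch only, this is how a hot spare is usually added and used with raidctl(8). The spare partition /dev/wd6a is a hypothetical name, not taken from the report, and the run helper is illustrative: it only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical spare partition; the report does not name the one that was used.
spare=/dev/wd6a
# Failed component and RAID device as shown in the report.
failed=/dev/wd5a
raid=raid0

# Print each command; execute it only when the caller sets DRY_RUN=0.
run() {
    echo "$@"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run raidctl -a "$spare" "$raid"    # add the partition as a hot spare
run raidctl -F "$failed" "$raid"   # fail the component and reconstruct onto the spare
run raidctl -S "$raid"             # report reconstruction progress
```

According to the description, this path ends in the same panic as the in-place reconstruction, so the sketch documents the procedure and is not a workaround.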
>How-To-Repeat:
Set up a RAID 5 set with 5 drives of 298 GB each and fail one drive:
bremen# raidctl -s raid0
Components:
/dev/wd1a: optimal
/dev/wd2a: optimal
/dev/wd3a: optimal
/dev/wd4a: optimal
/dev/wd5a: failed
No spares.
Component label for /dev/wd1a:
Row: 0, Column: 0, Num Rows: 1, Num Columns: 5
Version: 2, Serial Number: 2007020501, Mod Counter: 1543
Clean: No, Status: 0
sectPerSU: 16, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 625142320
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid0
Component label for /dev/wd2a:
Row: 0, Column: 1, Num Rows: 1, Num Columns: 5
Version: 2, Serial Number: 2007020501, Mod Counter: 1543
Clean: No, Status: 0
sectPerSU: 16, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 625142320
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid0
Component label for /dev/wd3a:
Row: 0, Column: 2, Num Rows: 1, Num Columns: 5
Version: 2, Serial Number: 2007020501, Mod Counter: 1543
Clean: No, Status: 0
sectPerSU: 16, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 625142320
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid0
Component label for /dev/wd4a:
Row: 0, Column: 3, Num Rows: 1, Num Columns: 5
Version: 2, Serial Number: 2007020501, Mod Counter: 1543
Clean: No, Status: 0
sectPerSU: 16, SUsPerPU: 1, SUsPerRU: 1
Queue size: 100, blocksize: 512, numBlocks: 625142320
RAID Level: 5
Autoconfig: Yes
Root partition: No
Last configured as: raid0
/dev/wd5a status is: failed. Skipping label.
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
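The sizes quoted in the synopsis can be cross-checked against the component labels above. The following is a small sketch, assuming that numBlocks counts blocks of blocksize bytes on each component and that the shell uses 64-bit arithmetic; the variable names are illustrative.

```shell
#!/bin/sh
# Values copied from the component labels in the raidctl -s output.
num_blocks=625142320   # numBlocks
block_size=512         # blocksize, in bytes
sect_per_su=16         # sectPerSU
columns=5              # Num Columns

# Size of one component in bytes and in whole GiB (integer division).
component_bytes=$((num_blocks * block_size))
component_gib=$((component_bytes / 1073741824))

# Stripe unit size in bytes, and number of stripe units per component.
su_bytes=$((sect_per_su * block_size))
su_per_component=$((num_blocks / sect_per_su))

# RAID 5 spends one column's worth of space on parity.
data_gib=$(( (columns - 1) * component_bytes / 1073741824 ))

echo "component: ${component_bytes} bytes (${component_gib} GiB)"
echo "stripe unit: ${su_bytes} bytes, ${su_per_component} per component"
echo "usable data capacity: ${data_gib} GiB"
```

The per-component result agrees with the 298 GB given in the synopsis, and the stripe unit count gives an idea of how many units a full reconstruction has to process.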
Assign plenty of swap space:
bremen# swapctl -lh
Device              Size  Used  Avail  Capacity  Priority
/dev/wd0b           129M    0B   129M        0%         0
/home/fed/swapfile  4.9G    0B   4.9G        0%         0
Total               5.0G    0B   5.0G
Give the command for in-place reconstruction:
bremen# raidctl -R /dev/wd5a raid0
The system then drops into the kernel debugger (green letters on the console):
raid0: initiating in-place reconstruction on column 4
panic: malloc: out of space in kmem_map
stopped in pid 885.1 (raid_reconip) at netbsd:cpu_Debugger+0x4: popl %ebp
db>
The only way out is to reboot the system and operate the RAID set in degraded
mode.
>Fix:
Not known.
According to the thread "NetBSD-users: Problem with raidframe under NetBSD-3 and
NetBSD-4", the system must be taken out of operation.