Subject: Re: Why my life is sucking.
To: Herb Peyerl <hpeyerl@beer.org>
From: Greywolf <greywolf@starwolf.com>
List: current-users
Date: 01/15/2001 13:29:10
On Mon, 15 Jan 2001, Herb Peyerl wrote:
# So, I've been experiencing some "problems" in my life as owner of NetBSD
# systems. First things first, 'lager', my life-on-disk machine is a Sparc
# running 1.4C. It's had impressive uptimes and hasn't had many problems
# except for dying disks. Since I can't afford to be buying replacement
# SCSI disks, I decided to buy a PC with cheap IDE disks.
#
# I bought an ABIT with 800MHz Athlon and 2 45G IBM IDE disks that I
# intended to raid1 together. In conjunction with my DLT drive on
# an Adaptec controller for backups. Coupled with a PC Weasel, it was
# supposed to improve my quality of life.
#
# This whole situation has turned bad on me and I have no idea how to
# proceed and I'm losing confidence in NetBSD as a system.
Coming from one of the long-time users of NetBSD, this strikes me as a bad
sign.
This means that either more and more fringe cases are coming into the
mainstream, or stuff is getting broken and not thoroughly tested.
Either way, it's not favourable.
# The first problem came when I raid1'd the two partitions together.
# Everything performs admirably except when it comes to extracting something
# like pkgsrc.tar.gz. mkdir(2) calls consume 3500 I/O's and take 17 seconds
# to complete, most of the time:
#
# nlager# time mkdir foo
# 0.0u 0.6s 0:17.72 3.8% 0+0k 3687+7io 0pf+0w
Gah!
Have you, perchance, tried running with the ccd instead, or do we intend
to phase out ccd in favour of raid?
[I thought ccd required less overhead than raid did, but I guess you can't
boot off a ccd...]
# I've ruled out 'bad disk' because the work is all being done on either one
# or the other of the two disks, i.e. it makes no difference.
# The kernel in question is a GENERIC 1.5 with the raid stuff linked in. The
# userland is also generic 1.5 from ftp.netbsd.org.
#
# I've discussed this with Greg and he has no further ideas. He has
# sanity-checked my configuration, however.
#
# The second problem came when I installed a 10G disk and tried to duplicate
# the OS and /home onto the disk using "dump | restore". I actually used
# "restore -i" because I wanted to exclude my massive mp3 library.
#
# Here's a transcript. Note: /mnt is a freshly newfs'd 10G partition.
I'll assume this has already been send-pr'd...
[snip]
# Changing volumes on pipe input?
# abort? [yn] n
# Changing volumes on pipe input?
# abort? [yn] n
# Changing volumes on pipe input?
# abort? [yn] n
# Changing volumes on pipe input?
# abort? [yn]
# ^Crestore > setmodes
I can't tell if the bug is in dump or restore or in the ffs code in general.
I'm absolutely baffled as to why a mkdir would require 3500 I/O calls!
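To see whether the slow mkdir is consistent or only bites "most of the time",
something like the rough sketch below might help. It times a handful of
mkdir(2) calls one by one. The function name time_mkdirs and the "probe"
directory names are made up for illustration, and it only measures wall-clock
time, not the I/O counts that time(1) reports.

```python
import os
import tempfile
import time


def time_mkdirs(parent, count=10):
    """Create `count` empty directories under `parent`, one at a time.

    Returns the wall-clock seconds each mkdir call took. Each probe
    directory is removed again right after it has been timed.
    """
    timings = []
    for i in range(count):
        path = os.path.join(parent, "probe%d" % i)  # hypothetical name
        start = time.perf_counter()
        os.mkdir(path)
        timings.append(time.perf_counter() - start)
        os.rmdir(path)
    return timings


if __name__ == "__main__":
    # A temporary directory is used here; point `parent` at the
    # filesystem under suspicion to measure that one instead.
    with tempfile.TemporaryDirectory() as parent:
        t = time_mkdirs(parent)
        print("min %.4fs  max %.4fs" % (min(t), max(t)))
```

Running it once on the raid and once on a plain partition of the same disk
would at least show whether the delay follows the raid layer.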
[snip]
# I've duplicated this 3 times and each time the same files don't get copied. I
# illustrate with /sbin as an example but the lossage is everywhere. My 1G
# /home partition is different by about 100MB between /home and /mnt/home.
#
# Also, from dmesg, every time I mount a filesystem:
#
# Non-unique normal route, mask not entered<3>Non-unique normal route, mask not entered<3>Non-unique normal route, mask not entered
That looks almost as though, for some reason, you're trying to do a loopback
mount on the fs, which I'm sure isn't what you had in mind.
# Not sure what that's about.
Lots of shots in the dark; I'm hoping one of them rings a bell with
somebody.
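On the files that restore keeps dropping: a list of exactly which paths are
missing might make the pattern easier to spot than comparing sizes. Here is a
minimal sketch, assuming both trees are mounted and readable. The names
relative_paths and missing_from_copy are my own invention, and it compares
path names only, not contents or modes.

```python
import os


def relative_paths(root):
    """Return every file and directory under `root`, relative to `root`."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_from_copy(source, copy):
    """Return a sorted list of paths present under `source` but not `copy`."""
    return sorted(relative_paths(source) - relative_paths(copy))


if __name__ == "__main__":
    # /sbin and /mnt/sbin are the example pair from the original report.
    for path in missing_from_copy("/sbin", "/mnt/sbin"):
        print(path)
```

If the missing names are the same on all three runs, a sorted list like this
would be a handy attachment for the PR.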
I've got my own sun4m woes, but I'm about to sanity-check my kernel config
first.
In any case, I'd like to see this stuff fixed and for NetBSD to continue.
It'll be a sad day when I have to relegate my SS5 to Linux.
--*greywolf;