Subject: Re: softdep crash - more info
To: Brian Gregor <bgregor@BUPHY.bu.edu>
From: enami tsugutomo <enami@sm.sony.co.jp>
List: port-i386
Date: 10/04/2001 14:42:29
> The PC has the sources for 1.5.2 installed too. I mounted
> them over NFS on a Sparc Classic running 1.5.2 kernel,
> 1.5.1 world for a 'make build'. Here's the debugger message
> and stack trace:
>
> panic: softdep_pageiodone: resid < 0, vp 0xd7f97944 lbn 0x0 pcbp
> 0xd812e000
> Stopped in pid 129 (nfsd) at cpu_Debugger+0x1: ret
> db> t
> cpu_Debugger(c0290420,d7f97944,0,d812e000,c003ef08) at cpu_Debugger+0x1
> softdep_pageiodone(c083ef08,c08ef08,1,d7be6a50,c056bb3c) at
> softdep_pageiodone+0x159
The story is:
1. Some data is appended to a file, and as a result EOF moves past
the fragment boundary.
2. The ffs code extends the fragment and tells the softdep code the
new size of the fragment.
3. The ffs code flushes up to the old end of the fragment. Upon I/O
completion, the softdep code notices that.
4. The ffs code copies the data (i.e., modifies the page) and flushes
it. Upon I/O completion, the softdep code notices that as well.
5. Now the softdep code panics, since ``the size told at step 2'' !=
``the actual data transferred at steps 3 and 4''. They usually
differ, since there is overlap around the old end of the
fragment.
I'm not sure why this began to occur only recently, even though the
code at step 3 has existed for a while.
enami.