Re: CVS commit: src/sys

To: Jaromír Doleček <jaromir.dolecek%gmail.com@localhost>
Subject: Re: CVS commit: src/sys
From: "J. Hannken-Illjes" <hannken%eis.cs.tu-bs.de@localhost>
Date: Mon, 7 Nov 2016 10:25:35 +0100

> On 07 Nov 2016, at 00:11, Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
> 
>> On Fri, Nov 04, 2016 at 04:44:10PM +0100, J. Hannken-Illjes wrote:
>>> - This change results in "panic: ffs_blkfree_common: freeing free block"
>>>  if I put a file system under stress (*1).
>>> 
>>> - I suppose not zeroing the blocks to be freed before freeing them
>>>  makes the life of fsck harder.
>>> 
>>> - Running "brelse(bp, BC_INVAL)" doesn't look OK.
> 
> The brelse(bp, BC_INVAL) call was there before as well, but the
> condition changed and is wrong.
> 
> I can repeat the problem with your script and the packaged fsx (thanks
> Thomas). Whipped up a patch to what looked wrong there, and it no
> longer panics for me. Patch is attached. I'll test further and commit
> it tomorrow.

Index: ffs_inode.c
===================================================================
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_inode.c,v
retrieving revision 1.118
diff -u -p -r1.118 ffs_inode.c
--- ffs_inode.c	28 Oct 2016 20:38:12 -0000	1.118
+++ ffs_inode.c	6 Nov 2016 23:09:15 -0000
@@ -675,18 +675,18 @@ ffs_indirtrunc(struct inode *ip, daddr_t
 	 * Recursively free blocks on the now last partial indirect block.
 	 */
 	if (level > SINGLE && lastbn >= 0) {
-		nb = RBAP(ip, last);
+		last = lastbn % factor;
+		nb = RBAP(ip, i);
 		if (nb != 0) {
 			error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
-					       lastbn % factor, level - 1,
-					       countp);
+					       last, level - 1, countp);
 			if (error)
 				goto out;
 		}
 	}
 
 out:
-	if (RBAP(ip, 0) == 0) {
+	if (lastbn < 0 && error == 0) {
 		/* all freed, release without writing back */
 		brelse(bp, BC_INVAL);
 	} else {

The first part should not be necessary.  After the loop we should have
"i == last" -- from a quick look "i < last" is impossible.

The second part is right and responsible for most of the panics.

Your patch still doesn't address my second observation, if the
machine crashs inside ffs_truncate we leave the file system in a 
state fsck can't handle.  The blocks we freed get attached to other
nodes before they get cleared from our on-disk copy leading to
duplicate blocks.

The attached diff:

- Brings back the non-wapbl behaviour to first clear the allocations
  in the on-disk node and then deallocating them.

- Fixes the condition to brelse() on exit like your diff does.

- Makes sure a block that is completely deallocated and NOT written
  to disk gets deallocated even if this deallocation would fail.

Please review.

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)

Attachment: ffs_inode.c.diff
Description: Binary data

Follow-Ups:
- Re: CVS commit: src/sys
  - From: Jaromír Doleček

References:
- Re: CVS commit: src/sys
  - From: J. Hannken-Illjes
- Re: CVS commit: src/sys
  - From: Thomas Klausner
- Re: CVS commit: src/sys
  - From: Jaromír Doleček

Prev by Date: re: CVS commit: src/external/gpl3/gdb/dist/gdb
Next by Date: Re: CVS commit: src/external/gpl3/gdb/dist/gdb
Previous by Thread: Re: CVS commit: src/sys
Next by Thread: Re: CVS commit: src/sys
Indexes:

Home | Main Index | Thread Index | Old Index