tech-userlevel: mv(1) and signals

Subject: mv(1) and signals
To: None <tech-userlevel@netbsd.org>
From: None <jklowden@schemamania.org>
List: tech-userlevel
Date: 10/17/2006 16:55:33

Hi,

Today I tried to move a directory across file systems with mv(1).
The source filesystem was ffs; the target was smbfs.  The directory
tree had hundreds of large files.

Then I changed my mind and pressed ^C.

I did that because I noticed that it was copying one subtree of
useless information (the .OLD directory).  To save time, I deleted
the source files.  A moment later, I had an inkling of trouble when
I got messages from mv, which I thought I had already killed with
my ^C:

  mv: dat/.OLD/f_company.1.dat.err: No such file or directory 
  mv: dat/.OLD/f_security.1.asc: No such file or directory

It's not every day I get messages from a dead process....

When I tried to delete the target .OLD directory that mv had
created, I got a permission error; when I tried to delete it
from Windows, it said the file [sic] was in use by another
process.  Hmm.

Back in NetBSD, ps(1) told me this:

$ ps | grep mv
13718 p7 D      0:23.08 mv -PRp dat /usr/users/home[...]

Hmm.  Those aren't documented options, and I hadn't typed them.

No amount of 'kill -9' had any effect. In fact, process 13718 kept
right on trucking, moving the other files.  I saw them appear on
the target.

Now, I understand that mv(1) is supposed to be atomic, and I know
from the manpage that moving files across filesystems employs magic
under the covers.  But I think atomic is different from irreversible,
and I think irreversible is suboptimal.

Without looking at the code, I would guess that mv(1) is masking
signals, perhaps in an effort to achieve atomicity.  Yet of course
it can't make a non-atomic action (moving a set of files over a
network) atomic, and preventing kill(1) from terminating the task
looks to be more annoying than anything else.

From my reading of the standard
(http://www.opengroup.org/onlinepubs/009695399/utilities/mv.html),
there's no atomic requirement.  It says only that errors should be
handled in such a way that either the source or target tree should
be intact (as in, don't delete the source tree until it's been
copied completely).
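
The ordering described there can be sketched in C as follows.  This is
only an illustration under stated assumptions, not how mv(1) is
implemented: move_files(), copy_file() and the `interrupted` flag are
hypothetical names, and the sketch handles a flat list of regular
files rather than a directory tree.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Set by the SIGINT handler; checked between files. */
static volatile sig_atomic_t interrupted = 0;

static void on_sigint(int signo)
{
    (void)signo;
    interrupted = 1;
}

/* Copy one file; returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
    char buf[8192];
    size_t n;
    int rc = 0;
    FILE *in, *out;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;
    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/* Copy every file first; remove the sources only if every copy
 * succeeded and no SIGINT arrived.  Returns 0 if the move completed,
 * -1 if it stopped early and left the sources untouched. */
static int move_files(const char *src[], const char *dst[], size_t count)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    signal(SIGINT, on_sigint);

    for (i = 0; i < count; i++) {
        if (interrupted || copy_file(src[i], dst[i]) != 0)
            return -1;
    }
    if (interrupted)
        return -1;

    for (i = 0; i < count; i++)
        unlink(src[i]);
    return 0;
}
```

In this sketch a ^C stops the work at the next file boundary and the
source files survive, so an interrupted move leaves the source intact
and a partial copy on the target.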

As far as I can tell, the only effect of my ^C was to prevent
deletion of the source tree.  When the process finished, I seemed
to have two complete copies.  Sort of an undocumented feature, in
a way.

Does anyone else think ^C should kill mv(1), first time, every
time?

--jkl