Re: CVS commit: src/sys/kern

To: Robert Elz <kre%munnari.oz.au@localhost>
Subject: Re: CVS commit: src/sys/kern
From: Christos Zoulas <christos%zoulas.com@localhost>
Date: Tue, 04 Oct 2022 10:09:35 -0400

On 2022-10-01 3:39 pm, Robert Elz wrote:

Date:        Sat, 1 Oct 2022 13:00:04 -0400

[stuff deleted]

Even when it is called, the code is:

	fp->f_flag = flags & FMASK;

where FMASK is (from fcntl.h)

#define FMASK           (FREAD|FWRITE|FCNTLFLAGS|FEXEC)

and

                         FDIRECT|FNOSIGPIPE)

which all looks exactly as it should be to me - and note that O_CLOEXEC

(which has no Fxxxx equivalent name - it doesn't need one) is notthere.


So, fp->f_flag isn't set at all (is probably 0), and even if it were
O_CLOEXEC would not be there, and should not be.


Thanks for pointing that out, I always forget the O_CLOEXEC is special
in that regard. I wish it was not, but it is difficult to fix.


For the vnode actually being opened, none of this matters, as the open
lasts as long (actually not as long) as the open() sys call - when that
returns, the device being opened has been closed again, so what the
file that refers to it looks like (or would have looked like) really
doesn't matter at all, it never becomes visible.   That's my guess as
to why the open_setfp() call is missing in that case.

But what got forgotten (or deliberately was not done) was anything to
affect the modes of the clone which is opened.

  | What does it mean when the open specifies O_CLOEXEC
  | and ff->ff_exclose is false? Can that happen? Is that desirable?

It is what is currently happening whenever we open a cloning device.
(Or that is what it looks like to me).   Desirable, no idea, I didn't
define the semantics of cloning device opens, nor which of the open
flags should apply to the clone that is created.

| I am fine with the locking to stay where it is. I guess it isprobably

  | not needed after dup/clone, since the underlying vnode is shared...

Assuming you mean dup(2) (and dup2()) and clone(2) then no - those have
no way to pass the relevant locking flags, if you have a fd and want to
apply a lock, fcntl() is what does that (and the fcntl operations that
duplicate fds do not also apply locks).

The only issue is O_EXLOCK and O_SHLOCK (and O_CLOEXEC) for cloneddevices.


I suspect it makes sense for O_CLOEXEC to be applied in that case, it
makes little sense for open("/dev/ptmx", O_RDWR|O_CLOEXEC) to succeed,

returning an open fd which does not have cloexec set (which is theissue,

along with O_NONBLOCK) which started all of this discussion.

The locking flags I am less sure about.   I don't see how they can fail

to succeed if applied, as the vnode for the device has just beencreated,nothing else can possibly have any kind of lock on it. Whetherthere's

any benefit in applying the lock so that the node is locked for later,
I don't know - but it certainly should do no harm to do that.

It seems clear to me that what we need is (something like)

Index: vfs_syscalls.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.555
diff -u -r1.555 vfs_syscalls.c
--- vfs_syscalls.c	12 Feb 2022 15:51:29 -0000	1.555
+++ vfs_syscalls.c	1 Oct 2022 19:27:15 -0000
@@ -1763,6 +1763,9 @@
 		error = fd_dupopen(dupfd, dupfd_move, flags, &indx);
 		if (error)
 			return error;
+		error = open_setfp(l, fp, XXXvp, indx, flags);
+		if (error)
+			return error;
 		*fd = indx;
 	} else {
 		error = open_setfp(l, fp, vp, indx, flags);

where XXXvp needs to be extracted from somewhere (it isn't vp, asvp==NULL)

except that what follows in the else case is ...

                if (error)
                        return error;
                VOP_UNLOCK(vp);
                *fd = indx;

That VOP_UNLOCK(vp) is what is bothering me, It tells me thatopen_setfp()is expecting to be called with vp locked - but in the first case (thecloningcase) there is no VOP_UNLOCK() call (and what's more, fd_dupopen()cannotdo it, as it has no vp arg). That means, I believe, that whenvn_open()returns in the normal case, vp is returned locked, but in the cloningcase

the vnode that was created for the clone is not locked.

I'm not sure what is the right way to find the vnode, or how it should
properly be locked so open_setfp() can do its thing.   If I knew all of
that I would have made an attempt at fixing this already.   We need
someone who really understands what is happening here, and the right

way to handle it all (which very likely is nothing like I justsuggested).


The question is how to find the vnode? Perhaps it is easiest to fail the

open call if O_EXLOCK or O_SHLOCK are specified in a cloning open? Atleast

we will not silently ignore them?

Best,

christos

References:
- Re: CVS commit: src/sys/kern
  - From: Christos Zoulas
- Re: CVS commit: src/sys/kern
  - From: Christos Zoulas
- Re: CVS commit: src/sys/kern
  - From: Christos Zoulas
- Re: CVS commit: src/sys/kern
  - From: Christos Zoulas
- Re: CVS commit: src/sys/kern
  - From: Robert Elz
- Re: CVS commit: src/sys/kern
  - From: Robert Elz
- Re: CVS commit: src/sys/kern
  - From: Robert Elz
- Re: CVS commit: src/sys/kern
  - From: Robert Elz
- Re: CVS commit: src/sys/kern
  - From: Robert Elz

Prev by Date: Re: CVS commit: src/sys/dev
Next by Date: Re: CVS commit: src/sys/kern
Previous by Thread: Re: CVS commit: src/sys/kern
Next by Thread: Re: CVS commit: src/sys/kern
Indexes:

Home | Main Index | Thread Index | Old Index