OK, I now have a third way of handling the problem.
To recap the three options (refer to the attachments):
1. fd_getfile
This option calls fd_getfile() each time it needs to access the
activity log, and calls fd_putfile() after each use. This way,
the additional reference on the file descriptor lasts only for a
short period of time, and is never held at a point where the
descriptor can be passed to fd_close().
The biggest drawback here is that the user-land application can
close() this fd and then open() a different file that reuses the
same descriptor number; filemon will not notice this, and will write
new event records to whatever file happens to be open on this fd.
2. exithook
This option extends the existing exithook() mechanism to have
multiple "phases", one of which can happen before the process
exit code calls fd_free() (which in turn calls fd_close() for all
open file descriptors). The exithook registered by filemon finds
any usages of filemon and resets the activity-log, which releases
the extra reference to the log fd.
This option works well for normal process exit (including exit on a signal),
but does not resolve the problem if the application itself calls
close() on the log fd. In that situation, the process will still
hang.
Additionally, setting this up correctly is awkward, due to the
order in which kernel components are initialized. (Modules of
CLASS_DRIVER get loaded and initialized before exec_init() can
set up the hook mechanisms, so we need to use a config_finalizer
to establish the exit hook.)
3. filemon-fd_close
This solution introduces a new, filemon-specific callback in
fd_close() (but only if the filemon module is loaded or built-in).
Each time a file descriptor is passed to fd_close, the callback
is invoked. The callback checks each usage of /dev/filemon and
if that usage is logging activity to the file being closed, the
activity-log is reset, releasing the extra reference. Thus,
after we return to fd_close() the reference count is normal and
the file gets properly closed.
The only drawback I see here is the additional overhead of the
callback, on every call to fd_close(). The code catches every
occurrence of the "hang" that I can find, and handles it cleanly.
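
To make the reference counting in option 3 concrete, here is a rough
user-space model of the idea. It is not the attached patch and not NetBSD
kernel code: every name prefixed with model_ is hypothetical, only a single
filemon instance is tracked, and the real "hang" is represented simply by
model_fd_close() returning false because a reference is still outstanding.

```c
/* User-space model only; all model_* names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

struct model_file {
    int  refcnt;   /* references held on the open file */
    bool closed;   /* set once the last reference is dropped */
};

struct model_filemon {
    struct model_file *log;   /* activity log; holds one extra reference */
};

/* Hook pointer, non-NULL only when the modelled filemon is present. */
static void (*model_close_hook)(struct model_file *);

/* A single filemon instance, for brevity. */
static struct model_filemon *model_active;

/* Drop one reference; the file is really closed when none remain. */
static void model_file_release(struct model_file *fp)
{
    if (--fp->refcnt == 0)
        fp->closed = true;
}

/* filemon starts logging to fp: take the extra reference. */
static void model_filemon_set_log(struct model_filemon *fm,
    struct model_file *fp)
{
    fp->refcnt++;
    fm->log = fp;
}

/* Hook body: if the file being closed is the activity log, reset it. */
static void model_filemon_close_hook(struct model_file *fp)
{
    if (model_active != NULL && model_active->log == fp) {
        model_active->log = NULL;
        model_file_release(fp);   /* release the extra reference */
    }
}

/*
 * Model of the close path.  Returns true if the file was really closed,
 * false if a reference is still outstanding (standing in for the hang).
 */
static bool model_fd_close(struct model_file *fp)
{
    if (model_close_hook != NULL)
        model_close_hook(fp);
    model_file_release(fp);       /* drop the descriptor's own reference */
    return fp->closed;
}
```

With the hook installed, closing the log file drops both references and the
close completes; with the hook left NULL, the extra reference remains and
the model reports that the file is not closed.
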
I still need to think about the fd_getfile2()/closef() approach to see if it
meets our needs, but the comments on fd_getfile2() that prohibit calling
fork() would seem to preclude this mechanism.
The "detach file from userland" approach suggested by kre would also likely
work, but I'm reluctant to change the semantics of filemon.
+------------------+--------------------------+------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+