NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: kern/57259: ucom serial ports cannot be re-opened "too quickly" with O_NONBLOCK
The following reply was made to PR kern/57259; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: thorpej%me.com@localhost
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/57259: ucom serial ports cannot be re-opened "too quickly" with O_NONBLOCK
Date: Sat, 4 Mar 2023 22:15:31 +0000
The problem you're encountering is unrelated to the d_cancel logic.
By the time close(2) returns, sc_closing is set to false. The logic
you quoted is only applicable when open(2) and close(2) happen
concurrently. You could rip that logic back out if you want a crashy
kernel that makes you twitch at the thought of removing a USB device
or running two programs that touch a tty at the same time, but it
wouldn't help with the problem at hand.
The problem at hand is that a delay which used to happen in close(2)
(with no opportunity for a nonblocking variant),
/*
* Hang up if necessary. Wait a bit, so the other side has time to
* notice even if we immediately open the port again.
*/
if (ISSET(tp->t_cflag, HUPCL)) {
ucom_dtr(sc, 0);
/* XXX will only timeout */
(void) kpause(ttclos, false, hz, NULL);
}
is now implemented by sleeping only for the _remainder_ of the 1sec
delay in open(2):
/*
* If the previous use had set DTR on close, wait up to one
* second for the other side to notice we hung up. After
* sleeping, the tty may have been revoked, so restart the
* whole operation.
*
* XXX The wchan is not ttclose but maybe should be.
*/
if (timerisset(&sc->sc_hup_time)) {
struct timeval now, delta;
int ms, ticks;
microuptime(&now);
if (timercmp(&now, &sc->sc_hup_time, <)) {
timersub(&sc->sc_hup_time, &now, &delta);
ms =3D MIN(INT_MAX - 1000, delta.tv_sec*1000);
ms +=3D howmany(delta.tv_usec, 1000);
ticks =3D MAX(1, MIN(INT_MAX, mstohz(ms)));
error =3D cv_timedwait_sig(&sc->sc_statecv, &sc->sc_lock,
ticks);
mutex_exit(&sc->sc_lock);
return error ? error : ERESTART;
}
}
When cv_timedwait_sig times out, it returns EWOULDBLOCK. This is a
little confusing, both for cv_timedwait_sig(9) (which needs to be
replaced by a better API) and for open(2) on a tty (with or without
O_NONBLOCK). Perhaps it should map EWOULDBLOCK to ERESTART, or be
moved back to the close path -- I don't have strong feelings about it.
But this logic is not unique to ucom(4) -- it was first introduced in
com(4) and plcom(4) a year and a half ago:
Module Name: src
Committed By: jmcneill
Date: Mon Oct 11 18:39:06 UTC 2021
Modified Files:
src/sys/dev/ic: com.c
Log Message:
com: speed up close with HUPCL set
Instead of incurring a 1s penalty on close of a com device with HUPCL se=
t,
defer the sleep until the next open, and only sleep if necessary.
This has a side effect of making `ttyflags -a` with a default install not
pause for 1s for every non-console com device, which happens every boot
via /etc/rc.d/ttys.
https://mail-index.netbsd.org/source-changes/2021/10/11/msg132966.html
Module Name: src
Committed By: jmcneill
Date: Sun Oct 17 22:34:17 UTC 2021
Modified Files:
src/sys/arch/evbarm/dev: plcom.c plcomvar.h
Log Message:
plcom: speed up close with HUPCL set
Instead of incurring a 1s penalty on close of a plcom device with HUPCL =
set,
defer the sleep until the next open, and only sleep if necessary.
https://mail-index.netbsd.org/source-changes/2021/10/17/msg133107.html
I later later copied it in ucom(4) while doing other renovations to
make ucom(4) safe to yank concurrently with open/close and I/O:
Module Name: src
Committed By: riastradh
Date: Mon Mar 28 12:42:37 UTC 2022
Modified Files:
src/sys/dev/usb: ucom.c
Log Message:
ucom(4): Rework open/close/attach/detach logic.
- Defer sleep after hangup until open.
No need to make close hang; we just need to make sure some time has
passed before we next try to open.
This changes the wchan for the sleep. Oh well.
- Use .d_cfdriver/devtounit/cancel to resolve races between attach,
detach, open, close, and revoke.
- Use a separate .sc_closing flag instead of a UCOM_CLOSING state.
ucomcancel/ucomclose owns this flag, and it may be set in any state
(except UCOM_DEAD). UCOM_OPENING remains owned by ucomopen, which
might be interrupted by cancel/close.
- Rework error branches in ucomopen. Much simpler this way.
- Nix unnecessary reference counting.
https://mail-index.netbsd.org/source-changes/2022/03/28/msg137744.html
If you want to move the sleep back to close(2), I have no objection,
but it would be nice to find another way to avoid the unnecessary
boot-time delay that jmcneill had made the change to avoid.
If you want more information about the d_cancel/close split and why it
is so important for making things safe to yank and/or revoke
concurrently with open/close, I gave a talk last year at EuroBSDcon
about it:
https://www.NetBSD.org/gallery/presentations/riastradh/eurobsdcon2022/opend=
etach.pdf
(I acknowledge the medium is suboptimal for reference purposes, in
contrast to, say, fixing everything in the driver(9) man page, but my
round tuit supply for rototilling driver(9) has run short and there
have been other more pressing issues in netbsd-10 which have been
consuming my spoons.)
There is a large class of bugs in tty drivers, leading to potential
deadlocks and/or crashes if open and close happen concurrently, that
could likely be eliminated by just by setting .d_cancel =3D ttycancel in
the cdevsw. I haven't committed that change mostly because I don't
have hardware for anything other than com(4), plcom(4), and ucom(4),
and I got sidetracked by trying to make tty locking less of a
catastrophe -- which ideally would also allow us to remove the code
you quoted and leave ucom(4) with just .d_cancel =3D ttycancel.
Home |
Main Index |
Thread Index |
Old Index