Hello Taylor, On Sat, Jan 20, 2024 at 07:07:57PM +0000, Taylor R Campbell wrote: > PR lib/57828 (https://gnats.NetBSD.org/57828) proposes changing an > edge case of the strtoi(3) function so that if ERANGE and ENOTSUP are > both applicable then it should fail with ERANGE rather than ENOTSUP. > > This is the case when the number is out of range, and there's trailing > garbage -- e.g., s="42z", min=3, max=7. > > > The rationale is that the caller can trivially detect the ENOTSUP > condition (test end[0] != '\0'), but not the ERANGE condition (short > of parsing the number again, which would defeat the purpose). And > applications may find the out-of-range case more significant than the > trailing garbage case. > > For example, libutil's pidfile_read(3) tries to parse a pid on a > single line, so any file whose content matches /[0-9]+\n/ with the > resulting number in the interval [1, INT_MAX] is accepted, and one > might hope that anything else should be rejected: > > https://nxr.netbsd.org/xref/src/lib/libutil/pidfile.c?r=1.16#132 > > But if the pidfile contains, e.g., `999999999999999999999999999\n', > pidfile_read(3) as implemented will accept this -- and interpret it as > pid 2147483647. If it's `-999999999999999999999999999\n', > pidfile_read(3) will accept it as pid 1. > > > Counterarguments: > - We've had this semantics since NetBSD 7, so changing it is risky -- > we'd have to audit all the existing code that uses it. I have audited all existing code in NetBSD and in Debian. That should cover a large amount of free software. Small personal projects and closed source projects are not covered, though. From my investigation, this change would fix many bugs, and it would only break one specific case: the implementation of strtonum(3) --which BTW, has an easy fix, and we anticipated it--. > - There are copies of the code in various places and they might > diverge. > - I searched github for /[^a-zA-Z0-9_]strtoi[^a-zA-Z0-9_]/ and came up > with almost 16k results, so auditing that myself isn't going to > happen unless we can narrow it down a lot. What I did on NetBSD and Debian was to search projects that contain 'strto[iu]\>' _and_ either ERANGE or ENOTSUP in the same file. That should be good enough. It might forget some corner case, like a wrapper around strto[iu], but that's all I could do. > - If you need to act on the trailing stuff, maybe it's better to just > use a real lexer instead of relying on edge cases of error cases. > - If you rely on this change, your code won't work in any of the > deployed implementations of strtoi anyway. How many widespread implementations of strtoi(3) are there? I know of NetBSD's and libbsd's. I have a patch for libbsd already prepared. > > > Thoughts? > > The attached patch implements the proposed change in libc and the > automatic tests. > From a64349a5ac209d9967ba3c5d6a62aa243f44f603 Mon Sep 17 00:00:00 2001 > From: Taylor R Campbell <riastradh%NetBSD.org@localhost> > Date: Sat, 20 Jan 2024 18:17:27 +0000 > Subject: [PATCH 1/2] strtoi(3): Test for ERANGE before ENOTSUP. > > Expect failure because that's not what we implemented. > > PR lib/57828 > --- > tests/lib/libc/stdlib/t_strtoi.c | 14 ++++++-------- > 1 file changed, 6 insertions(+), 8 deletions(-) > > diff --git a/tests/lib/libc/stdlib/t_strtoi.c b/tests/lib/libc/stdlib/t_strtoi.c > index 3301058a5be1..093b71abe04a 100644 > --- a/tests/lib/libc/stdlib/t_strtoi.c > +++ b/tests/lib/libc/stdlib/t_strtoi.c > @@ -313,14 +313,9 @@ ATF_TC_BODY(strtoi_erroredges, tc) > { "+5z", 5, 0, "z", 3, 7, ENOTSUP }, > { "+6z", 6, 0, "z", 3, 7, ENOTSUP }, > { "+7z", 7, 0, "z", 3, 7, ENOTSUP }, > - /* > - * PR lib/57828 suggests these should yield ERANGE, not > - * ENOTSUP, because ENOTSUP can be discovered > - * separately by the caller anyway. > - */ > - { "-42z", 3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ > - { "2z", 3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ > - { "42z", 7, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ > + { "-42z", 3, 0, "z", 3, 7, ERANGE }, > + { "2z", 3, 0, "z", 3, 7, ERANGE }, > + { "42z", 7, 0, "z", 3, 7, ERANGE }, > }; > > intmax_t rv; > @@ -330,6 +325,9 @@ ATF_TC_BODY(strtoi_erroredges, tc) > > for (i = 0; i < __arraycount(t); i++) { > > + if (t[i].rstatus == ERANGE) > + atf_tc_expect_fail("PR lib/57828"); > + > errno = 0; > rv = strtoi(t[i].str, &end, t[i].base, t[i].lo, t[i].hi, &e); > > > From ee1b6083ea09f569e95ae5137b5fe870beb9d017 Mon Sep 17 00:00:00 2001 > From: Taylor R Campbell <riastradh%NetBSD.org@localhost> > Date: Sat, 20 Jan 2024 18:19:34 +0000 > Subject: [PATCH 2/2] strtoi(3): Return ERANGE before ENOTSUP if both are > applicable. > > Caller of strtoi can trivially detect the ENOTSUP case themselves by > checking whether end[0] == '\0', but they can't trivially detect the > ERANGE case (short of doing their own parsing which would defeat the > purpose of using strtoi(3)). > > PR lib/57828 Isn't an alternative patch already installed? It's the first time I look at a CVS project, but I see this: <http://cvsweb.netbsd.org/bsdweb.cgi/src/common/lib/libc/stdlib/_strtoi.h?only_with_tag=MAIN> Have a lovely day, Alex > --- > common/lib/libc/stdlib/_strtoi.h | 9 ++++++--- > tests/lib/libc/stdlib/t_strtoi.c | 3 --- > 2 files changed, 6 insertions(+), 6 deletions(-) -- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
Attachment:
signature.asc
Description: PGP signature