tech-userlevel archive


Re: sh(1) read: add LINE_MAX safeguard and "-n" option



    Date:        Tue, 24 Sep 2024 09:09:29 -0400
    From:        Greg Troxel <gdt%lexort.com@localhost>
    Message-ID:  <rmih6a5nqpi.fsf%s1.lexort.com@localhost>

  | Sure, but the problem is that if you have a file which is e.g. one line
  | (single \n at end) that is 10 MB, reading from it is unreasonable, and
  | it's difficult to deal with this in portable code.

Yes.   That's just a limitation of what portable code (using sh's read anyway)
can provide.   If it weren't for the cost of running non-builtin processes,
there would be other easy ways to handle this.

  | If there were a limit which was well under 1 MB, but well over anything
  | reasonably in a bona fide text file, it would finesse the issue.

That's what the -n option is supposed to achieve, though it is not portable
(and it seems it cannot be, as different shells have different definitions
of what it actually does -- even ignoring zsh, where it is a different
thing entirely).   It's reasonable to add, and once we work out what it
should really mean, that can happen.

Then on any system with our (once updated) shell, or bash, or mksh, or ksh93,
which all have -n options similar enough for your purpose, you can implement
whatever limit you like, without it affecting other uses of read in sh.
On other systems just fall back to sed/dd/... and accept the cost (I'd
expect that script isn't often used on such systems).
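
As a rough sketch of that approach (not taken from the script in question):
probe once for a usable "read -n", and otherwise fall back to dd and sed.
The function name first_line_capped and its FILE and LIMIT arguments are
made up for illustration.   The sketch assumes FILE is a regular file, so
that a single dd block read returns up to LIMIT bytes, and it glosses over
the fact that some shells count characters for -n where others count bytes.

```sh
# Hypothetical helper: print the first line of FILE ($1), capped at
# LIMIT ($2) bytes (or characters, depending on the shell's read -n).

# Probe: does this shell's read accept -n and stop after 3 characters?
if [ "$(printf 'abcdef\n' |
        { read -r -n 3 probe 2>/dev/null; printf %s "${probe-}"; })" = abc ]
then
    # Builtin path: no extra processes are started.
    first_line_capped() {
        # read returns non-zero if EOF arrives before a newline;
        # whatever was read is still stored in $line.
        IFS= read -r -n "$2" line < "$1" || :
        printf '%s\n' "$line"
    }
else
    # Fallback path: pay for two external processes.
    first_line_capped() {
        # One block of at most LIMIT bytes, then keep only the first line.
        line=$(dd if="$1" bs="$2" count=1 2>/dev/null | sed 1q)
        printf '%s\n' "$line"
    }
fi
```

Something like "first_line_capped /some/file 4096" would then behave the
same way whichever branch was selected, at least for single-byte text.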

Part of the reason that things are costly now is that read is required
(normally anyway) to read 1 byte at a time, so if you're doing a read
which consumes a MB, that's a million (plus) system calls...   That's not
cheap!   That's kind of inherent in the definition of read and how it is
required to leave the state of the fd it reads from.
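
A small illustration of that fd-state requirement, using a pipe (which
cannot be seeked backwards): after read has taken one line, the next
process reading the same fd must see everything after that newline, so
read cannot buffer ahead.   The function name demo_fd_state is made up
for this sketch.

```sh
# Hypothetical demo: read consumes exactly one line from the pipe and
# leaves the remainder for the next reader of the same fd.
demo_fd_state() {
    printf 'one\ntwo\nthree\n' | {
        IFS= read -r first      # must stop right after the first newline
        rest=$(cat)             # cat inherits the fd and gets what is left
        printf 'read got: %s\n' "$first"
        printf 'cat got: %s\n' "$rest"
    }
}
```

If read had pulled in a larger block, cat would find the pipe empty (or
partly consumed), which is why the shell ends up reading byte by byte.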

kre

ps: in that script (the one in question) the read calls (or at least the
ones related to this) should certainly be using read's -r option.
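
For anyone unsure what -r changes, a minimal sketch (the function name
show_read_modes and the sample input are made up): without -r, read
treats a backslash as an escape character and removes it, and a trailing
backslash joins the next line; with -r the input is stored as written.

```sh
# Hypothetical demo: the same input line read without and with -r.
show_read_modes() {
    printf '%s\n' 'C:\temp\new' | {
        read plain              # backslashes are consumed as escapes
        printf 'without -r: %s\n' "$plain"
    }
    printf '%s\n' 'C:\temp\new' | {
        IFS= read -r raw        # backslashes are kept literally
        printf 'with -r: %s\n' "$raw"
    }
}
```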



