tech-userlevel archive
Re: sh(1) read: add LINE_MAX safeguard and "-n" option
Date: Fri, 27 Sep 2024 15:04:18 +0200
From: tlaronde%kergis.com@localhost
Message-ID: <ZvatUpCvWtk7LrDS%kergis.com@localhost>
| If I understand correctly your view, the explanation could be
| something around this (I mean for the idea; for the way it is
| expressed in some kind of english...):
This is what I came up with (no -N option has been implemented, I don't
see the point at the minute - that can be revisited later if someone
can demonstrate a meaningful use for it).
In the description of the -z option, either just the brackets, or the
brackets and all contained text, will end up being deleted, depending
upon which way the option ends up working.
I did add the -b option (turns out to be easy, and actually helpful to
avoid the tty needing to be put into raw mode, losing erase/kill
processing in most cases).
I also added the PS2 output (required by POSIX) when obtaining a
continuation line from stdin as a terminal, which we never bothered
with before.
Comments appreciated (other than about it being just ascii, with no
extra formatting visible - the actual man page doesn't have that
limitation). I am not particularly happy with the wording for -n.
The final paragraph (just slightly modified) is about all that remains
from the existing sh(1) man page description of read.
kre
read [-brz] [-d delim] [-n max] [-p prompt] variable [...]
The read command reads a record from its standard input (by
default one line), splits that record as if by field splitting,
and assigns the results to the named variable arguments, as
detailed below.
The options are as follows:
-b Do buffered reads, rather than reading one byte
at a time. Use of this option might result in
reading more bytes from standard input than the
read utility actually processes, causing some
data from standard input to be unavailable to any
subsequent utility that expects to obtain them.
-d delim End the read when the first byte of delim is
obtained from standard input. Specifying "" as
delim causes the nul character (`\0') to be the
end delimiter. The default is <newline> (`\n').
-n max read will read no more than max bytes from stan-
dard input. The default is unlimited. If the
end delim has not been encountered within max
bytes, read will act as if one immediately fol-
lowed the max'th byte, without attempting to
obtain it. However, even if the -r option is not
given and the final byte actually read is the
escape character (not itself escaped), no more
bytes will be read, and that escape character
is simply removed as described below.
-p prompt If the standard input is a terminal, then prompt
is written to standard error before the read
commences. If more lines of data are required in
that case, the normal PS2 prompt is written as
each subsequent line is to be obtained.
-r Reduced processing of the input. No escape
characters are recognised, and line continuation
is not performed. See below.
-z If a nul character (`\0') is found in the input,
other than when acting as the delimiter, an error
is [normally] generated. [This option disables
that error, the nul is simply ignored.]
If the read is from a terminal device, and the -p option was
given, prompt is printed on standard error. Then a record, termi-
nated by the first character of delim if the -d option was given,
or a <newline> (`\n') character otherwise, but no longer than max
bytes if the -n option was given, is read from the standard input.
If the -b option is not given, no data from standard input beyond
the end delimiter, or the max bytes that may be read, are
obtained.
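The unbuffered default described above can be observed with a small pipeline. This is only a sketch using POSIX features (it does not use -b, -d or -n, which belong to this proposal): read takes exactly one line from the pipe, and cat, standing in here for any subsequent utility, still receives the remainder.

```sh
# read consumes only the first line from the pipe; cat sees the rest.
# The '|' is just a visible marker between the two parts of the output.
out=$(printf 'line1\nline2\n' | { read -r first; printf '%s|' "$first"; cat; })
printf '%s\n' "$out"
```

With -b, as drafted, the second line might already have been consumed by read and so be lost to cat.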
If the -r option was not given, and the two character sequence `\'
`\n' is encountered, those two characters are simply deleted, and
provided that max bytes have not yet been obtained, and the end
delimiter has yet to be encountered, more input is obtained, with
the first character of the following line placed in the input
where the deleted `\' had been. This allows logical lines longer
than the maximum line length permitted for text files to be pro-
cessed. The two removed characters are still counted for the pur-
poses of the max input limit.
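Line continuation as described above can be tried with a here-document. A minimal sketch, assuming only POSIX read behaviour; the variable names joined and raw are purely illustrative. The quoted 'EOF' delimiter keeps the backslash in the data that read sees.

```sh
# Without -r: the backslash-newline pair is deleted and reading continues
# on the next line, so both physical lines form one record.
read joined <<'EOF'
first part \
second part
EOF

# With -r: no continuation is performed; the record ends at the first
# newline and the backslash is kept as ordinary data.
read -r raw <<'EOF'
first part \
second part
EOF

printf '[%s]\n[%s]\n' "$joined" "$raw"
```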
If the -r flag was not given, the <backslash> character (`\')
is then treated as an escape character: the character
following it is always treated as a normal, insignificant, data
character, and is never treated as the end delimiter nor as an IFS
character for field splitting.
After field splitting has completed, but before data has been
assigned to any variables, all escape characters are removed.
Note that the two character sequence `\' `\' can be used to enter
the escape character as data: the first acts as the escape
character, the second becomes just a normal data character.
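The escape handling can be illustrated with a few here-documents. This is a sketch relying only on POSIX read behaviour; the variable names are illustrative, and the default IFS (space, tab, newline) is assumed.

```sh
# An escaped space is plain data, so it does not split the first field.
read a b <<'EOF'
one\ two three
EOF

# Two backslashes: the first escapes the second, leaving one as data.
read c <<'EOF'
back\\slash
EOF

# With -r the backslash is ordinary data and the space still splits.
read -r d e <<'EOF'
one\ two three
EOF

printf '[%s] [%s] [%s] [%s] [%s]\n' "$a" "$b" "$c" "$d" "$e"
```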
The ending delimiter, if encountered, and not escaped, is deleted
from the record which is then split as described in the field
splitting section of the Word Expansions section above. The
pieces are assigned to the variables in order. If there are more
pieces than variables, the remaining pieces (along with the char-
acters in IFS that separated them) are all assigned to the last
variable. If there are more variables than pieces, the remaining
variables are assigned the null string. The read built-in utility
indicates success unless EOF or a read error is encountered
on input, or there is a usage error (unknown option, etc.), in which
case failure is returned.
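The assignment rules and the exit status can be checked with a short sketch. It uses only POSIX read behaviour with the default IFS; the variable names, and the use of /dev/null to provoke an immediate EOF, are illustrative choices rather than anything taken from the man page.

```sh
# More pieces than variables: the last variable receives the remainder,
# including the separator between "gamma" and "delta".
read -r first second rest <<'EOF'
alpha beta gamma delta
EOF

# More variables than pieces: the extra variables become null strings.
read -r x y z <<'EOF'
one
EOF

# EOF before any data: read reports failure.
if read -r none </dev/null; then status=0; else status=1; fi

printf '[%s] [%s] [%s]\n' "$first" "$second" "$rest"
printf '[%s] [%s] [%s] status=%s\n' "$x" "$y" "$z" "$status"
```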