tech-userlevel archive


Re: patch(1) max line length



    Date:        Fri, 12 Jul 2024 01:17:57 +0000
    From:        Emmanuel Dreyfus <manu%netbsd.org@localhost>
    Message-ID:  <ZpCERabNWYm4CQgN%homeworld.netbsd.org@localhost>

  | I just encountered a patch(1) limitation when using it on minified JSON
  | files from WordPress. The lines can span more than the maximum of what
  | patch(1) can cope with, which is INT16_MAX. Here is a test for that:

While I don't object to the change (and core dumps are never good) what
you're doing is actually unspecified behaviour.

POSIX (the latest edition; apart from the section number, I think this
has been the same for a long time) says (from XBD):

3.387 Text File

            A file that contains characters organized into zero or more lines.
            The lines do not contain NUL characters and none can exceed
            {LINE_MAX} bytes in length, including the <newline> character.
            Although POSIX.1-2024 does not distinguish between text files
            and binary files (see the ISO C standard), many utilities only
            produce predictable or meaningful output when operating on text
            files. The standard utilities that have such restrictions always
            specify ``text files'' in their STDIN or INPUT FILES sections.

LINE_MAX is typically 2048, the minimum that POSIX permits
({_POSIX2_LINE_MAX}).
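
The limit that actually applies on a given system can be queried at run
time. The following is only a minimal sketch, not code from patch(1): the
function names query_line_max() and line_within_line_max() are made up for
illustration, while sysconf() and _SC_LINE_MAX are the standard POSIX
interfaces.

```c
#include <stddef.h>
#include <unistd.h>

/* Return the run-time LINE_MAX limit, or -1 if it is indeterminate. */
long
query_line_max(void)
{
	return sysconf(_SC_LINE_MAX);
}

/*
 * Return nonzero if a line of len bytes (newline included) stays within
 * LINE_MAX, i.e. if the file can still be a "text file" in the POSIX sense.
 */
int
line_within_line_max(size_t len)
{
	long max = query_line_max();

	if (max == -1)
		return 1;	/* no determinate limit reported */
	return len <= (size_t)max;
}
```

A line longer than INT16_MAX bytes, as in the report quoted above, would
fail this check on any system where the limit is a few thousand bytes.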

The XCU specification for the patch utility says:

INPUT FILES
            Input files shall be text files.


I'd also note that if you're going to change things, "long" probably
isn't really big enough: it is just 32 bits on many ports (at most 2GiB
for a signed value), and files, and hence "lines" in files, can be much
larger than that, so all you've really done is moved the goalpost.
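
To make the width problem concrete, here is a small sketch under the
assumption that a line length is carried around as a 64-bit count. The
helper name fits_in_long() is hypothetical. It only reports whether a
given length is representable in a "long" on the platform it is compiled
for.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Return nonzero if the 64-bit length n can be stored in a long without
 * loss. On ports where long is 32 bits this fails for anything at or
 * above 2^31.
 */
int
fits_in_long(int64_t n)
{
	return n >= LONG_MIN && n <= LONG_MAX;
}
```

On a port with a 32-bit long, fits_in_long(INT64_C(1) << 32) is 0, while
on an LP64 port it is 1.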

Better would be to use size_t -- nothing (in memory) can be bigger than
that, by definition.   But you'd need to be aware of the switch from a
signed type to unsigned, which can affect how some code works.
Using ssize_t is also not really big enough.
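
One common way the signed-to-unsigned switch bites is a backwards loop
whose exit test is "index >= 0". The sketch below is illustrative only and
is not taken from patch(1); trimmed_length() is a made-up helper that
shows an idiom which stays correct once the length is a size_t.

```c
#include <stddef.h>

/*
 * Return the length of line[0..len) with trailing blanks removed.
 *
 * With a signed index one might write
 *	for (i = len - 1; i >= 0; i--)
 * but with size_t that condition is always true and i wraps around to a
 * huge value instead of going negative. Testing "len > 0" before looking
 * at line[len - 1] avoids the wrap.
 */
size_t
trimmed_length(const char *line, size_t len)
{
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
		len--;
	return len;
}
```

For example, trimmed_length("abc  ", 5) returns 3, and an all-blank or
empty line returns 0 rather than running off the front of the buffer.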

kre


