tech-userlevel archive
Re: tar issue on netbsd-5
>> anything writing standard tar format has this limit, because [...]
> As I said in the last discussion of this topic and the corresponding
> PR, it strongly depends on which standard you are talking about. For
> ustar the limit is 8GB, for POSIX Interchange Format it is no
> problem.
Are you talking about the "pax Extended Header" spec, such as is found
in http://www.opengroup.org/onlinepubs/009695399/utilities/pax.html?
(Normally, I wouldn't take a pax spec as relevant to tar at all, but I
have a copy of that page saved alongside my tar source and marked as
being standards-relevant, and the format looks like an extension to
least-common-denominator tar format.)
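For concreteness, here is a minimal Python sketch of where the 8GB figure comes from and how an extended header record gets around it. It assumes the 12-byte octal size field of the ustar header (11 digits plus a terminator) and the "%d %s=%s\n" record layout given in the pax spec. The name pax_record is a hypothetical helper for illustration, not part of any tar implementation.

```python
# ustar stores the file size as octal text in a 12-byte field:
# 11 octal digits plus a terminating NUL or space.
USTAR_SIZE_FIELD = 12
USTAR_MAX_SIZE = 8 ** (USTAR_SIZE_FIELD - 1) - 1  # 8589934591, i.e. 8 GiB - 1


def pax_record(keyword: str, value: str) -> bytes:
    """Build one extended header record: b"<len> <keyword>=<value>\\n".

    <len> is the decimal length in bytes of the whole record,
    including the digits of <len> itself.
    """
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The length counts its own digits, so iterate until it stops changing.
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


# A size one byte past the ustar limit still fits in a decimal record.
too_big = USTAR_MAX_SIZE + 1
record = pax_record("size", str(too_big))
print(record)  # b'19 size=8589934592\n'
```

The decimal value in the record has no fixed width, which is why the size limit does not apply to it.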
Strictly, it's unimplementable on NetBSD, and probably assorted other
OSes, since it demands UTF-8 encoding for paths and link-to strings,
which means it demands that names be character strings, but NetBSD file
names are octet strings, not character strings. They look like
character strings, but aren't; the actual name is the octet string,
with conversion between characters and octets, when it happens,
happening elsewhere. If you doubt this, suppose you call readdir() and
find that d_name[] contains 0xe1, 0xa5, 0xac, 0x00. Is that the single
Unicode character TAI LE LETTER AUE encoded in UTF-8, or is that the
three ISO-8859-1 characters a-acute, yen-sign, not-sign? Or is it due
to some bit of software generating file names based on some
non-character mechanism (such as representing the number 14428640, or
maybe 11009345, in base 254)? Or maybe something else? Without some
way to tell, there's no way to generate UTF-8 for it - and, in the last
case, it's not even clear what the correct UTF-8 would be. (Anyone
happen to know what existing implementations do? I'm curious, but not
quite curious enough to find and build one to see.)
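The ambiguity can be seen in a few lines of Python. This is only a sketch under stated assumptions: the octets used are 0xe1, 0xa5, 0xac (the UTF-8 encoding of U+196C), and the two candidate charsets are the ones named above. Nothing here reflects what any particular pax or tar implementation does.

```python
import unicodedata

# The example octets, without the terminating NUL.
raw = bytes([0xE1, 0xA5, 0xAC])

# Reading 1: the octets are UTF-8, giving a single character.
as_utf8 = raw.decode("utf-8")
# Reading 2: the octets are ISO-8859-1, where every octet is one character.
as_latin1 = raw.decode("iso-8859-1")

for label, text in (("UTF-8", as_utf8), ("ISO-8859-1", as_latin1)):
    names = [unicodedata.name(ch, "?") for ch in text]
    # The bytes a writer would store in a UTF-8 path record under this reading.
    print(label, len(text), names, text.encode("utf-8").hex(" "))
```

Under the first reading the path record would hold the original three octets. Under the second it would hold six (c3 a1 c2 a5 c2 ac). Both decodes succeed, so the octets alone do not say which is right.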
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B