Subject: Re: fmt and high bit characters
To: David Brownlee <abs@netbsd.org>
From: None <itojun@iijlab.net>
List: tech-userlevel
Date: 09/28/2000 23:55:00
> Will this help for the case when you are sharing files with
> someone from a different locale? - eg fix it for the current
> locale so isprint() does the right thing, but allow a '-8' override
> (or similar) to allow all highbit characters through...
please also note that there are text encodings with multibyte 8bit chars
(like euc-jp).
for the moment, i think allowing 8bit chars to go through (either with -8
or without any option) is a sufficient workaround for western 8bit encodings.
stripping off all 8bit chars is, i think, not useful.
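
a minimal sketch of that workaround, to make the idea concrete. keep_byte()
and its allow8bit argument are hypothetical names (allow8bit stands in for
the proposed -8 flag); this is not taken from the fmt source. it assumes the
caller has already done setlocale(LC_CTYPE, "") if it wants isprint() to
follow the user's locale.

```c
#include <ctype.h>

/*
 * hypothetical helper, not taken from the fmt source.
 * returns 1 if byte c should be passed through, 0 if it should be dropped.
 * allow8bit models the proposed -8 option.
 */
int
keep_byte(int c, int allow8bit)
{
	unsigned char uc = (unsigned char)c;

	/* whitespace, and anything the current locale calls printable */
	if (isspace(uc) || isprint(uc))
		return 1;
	/* with -8, pass every byte that has the high bit set */
	if (allow8bit && (uc & 0x80))
		return 1;
	return 0;
}
```

note that this works byte by byte, so it only passes multibyte encodings
like euc-jp through untouched; it does not understand them.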
the right way is to have full locale support and use that.
even in this case, we have some issues:
- if the locale setting ($LANG) and the file encoding do not match,
strange output may result (though i would call that pilot error).
- for non-western languages, how should fmt behave?
for example, japanese/chinese/korean text does not have spaces between
chars, so there are no obvious word boundaries at which to break lines.
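
as a rough illustration of what "use the locale" could mean, and of the
first issue above: the sketch below walks a buffer with mbrtowc(), which
decodes according to the LC_CTYPE locale in effect. count_chars() is a
hypothetical name, not something in fmt. it assumes the caller has called
setlocale(LC_CTYPE, "") and it treats an embedded NUL as a single byte.
it does not attempt to answer the line breaking question for
japanese/chinese/korean text.

```c
#include <string.h>
#include <wchar.h>

/*
 * hypothetical helper, not taken from the fmt source.
 * counts the characters in the n bytes at s, decoded according to the
 * LC_CTYPE locale in effect.  returns -1 if the bytes are not valid in
 * that encoding, which is how a $LANG/file mismatch can show up.
 */
long
count_chars(const char *s, size_t n)
{
	mbstate_t st;
	wchar_t wc;
	size_t len;
	long count = 0;

	memset(&st, 0, sizeof(st));	/* start in the initial shift state */
	while (n > 0) {
		len = mbrtowc(&wc, s, n, &st);
		if (len == (size_t)-1 || len == (size_t)-2)
			return -1;	/* invalid or truncated sequence */
		if (len == 0)
			len = 1;	/* embedded NUL: step over one byte */
		s += len;
		n -= len;
		count++;
	}
	return count;
}
```

under a euc-jp locale (if one is installed) a two-byte character would
count as one here, where a byte-wise loop sees two. a -1 return is a hint
that the file and $LANG disagree, and fmt could then fall back to passing
the bytes through unchanged.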
itojun