Subject: Re: wc: filename: invalid byte sequence
To: None <tls@rek.tjls.com>
From: Roland Dowdeswell <elric@imrryr.org>
List: tech-userlevel
Date: 08/26/2007 17:37:29
On 1188094481 seconds since the Beginning of the UNIX epoch
Thor Lancelot Simon wrote:
>
>On Sat, Aug 25, 2007 at 04:14:20PM -0700, John Nemeth wrote:
>> On Jan 15, 4:13am, Thomas Klausner wrote:
>> }
>> } On the attached file, wc(1) on 4.99.30/amd64 reports "invalid byte
>> } sequence" quite often.
>> } I don't see why it should do that, the byte sequence is perfectly
>> } valid (for an mp3 file). I guess it's a bug in the wide character
>> } library or its usage by wc. Should I send-pr?
>>
>> wc is designed to work with text files, not binary files.
>
>Really? How fascinating! When did the designer of wc tell you this?
The man page says that it can count either bytes or characters. One
presumes that is the difference:
     -c      The number of bytes in each input file is written to the
             standard output.

     -m      The number of characters in each input file is written to the
             standard output.
So, use wc -c.
Presumably, when it is counting words, lines or characters, it has to
interpret the bytes as characters in the current locale, and that is where
an invalid sequence can turn up.
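
To illustrate the distinction, here is a rough sketch in Python. It is not
how wc(1) is implemented; the function names are made up for the example,
UTF-8 stands in for "the current locale", and the binary sample is just a
few arbitrary bytes rather than the attached file.

```python
def count_bytes(data: bytes) -> int:
    """Roughly what wc -c does: every byte counts, nothing is decoded."""
    return len(data)


def count_chars(data: bytes, encoding: str = "utf-8") -> int:
    """Roughly what wc -m does: the bytes must first decode as characters.

    The encoding parameter is an assumption standing in for the locale.
    Raises UnicodeDecodeError if the input is not valid in that encoding.
    """
    return len(data.decode(encoding))


# Valid UTF-8 text: two of the characters take two bytes each.
text = "naïve café\n".encode("utf-8")

# Hypothetical binary sample: 0xff can never start a UTF-8 sequence.
binary = b"\xff\xfb\x90\x00"

print(count_bytes(text), count_chars(text))  # byte count vs character count
print(count_bytes(binary))                   # counting bytes always works

try:
    count_chars(binary)
except UnicodeDecodeError as exc:
    # Analogous to the "invalid byte sequence" complaint from wc.
    print("invalid byte sequence:", exc.reason)
```

Counting bytes never looks at what the bytes mean, so it cannot fail this
way; counting characters has to decode first, and binary data generally
will not decode.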
--
Roland Dowdeswell http://www.Imrryr.ORG/~elric/