UTF-8 [was Re: New Version Notification - draft-sgtatham-secsh-iutf8-05.txt]

To: ietf-ssh%NetBSD.org@localhost
Subject: UTF-8 [was Re: New Version Notification - draft-sgtatham-secsh-iutf8-05.txt]
From: Mouse <mouse%Rodents-Montreal.ORG@localhost>
Date: Fri, 16 Dec 2016 07:38:38 -0500 (EST)

> What’s inherently broken in using UTF-8...?

Different characters occupy different amounts of space.

(Some) characters are larger than one addressing unit (on most machines).
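
To make the two points above concrete, here is a minimal Python sketch
that counts the octets each of four sample characters occupies once
encoded as UTF-8.  The sample characters and the variable names are
chosen purely for illustration.

```python
# Four sample code points, written as escapes so the source stays ASCII:
# LATIN CAPITAL LETTER A, LATIN SMALL LETTER E WITH ACUTE, EURO SIGN,
# and GRINNING FACE.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of octets in its UTF-8 encoding.
sizes = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in sizes.items():
    print(f"U+{ord(ch):04X} takes {n} octet(s) in UTF-8")
```

On a machine whose addressing unit is one octet, only the first of
these fits in a single unit; the others take two, three and four.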

There are octet sequences which are not valid UTF-8 character
sequences.  This results in text tools that break on small amounts of
non-UTF-8 text mixed into the text they're handling.  (This is not
really a problem with UTF-8 proper - there are also octets that are not
valid 8859-1 text, for example - but a problem with how it's
implemented; in my experience UTF-8 text tools break when faced with
non-UTF-8 octet sequences, whereas single-octet text tools usually
don't break when faced with invalid octets.)
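
A minimal Python sketch of that difference, assuming a short octet
string that is valid 8859-1 but not valid UTF-8.  The helper name
try_decode is made up for this example, and Python's codecs stand in
for "text tools" in general; other tools may behave differently.

```python
# 0xE9 is e-acute in 8859-1.  In UTF-8 it announces a three-octet
# sequence, but the next octet (a space) is not a continuation octet.
data = b"caf\xe9 au lait"

def try_decode(octets, encoding):
    """Decode strictly; report the failing offset instead of raising."""
    try:
        return octets.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error at offset {exc.start}"

utf8_result = try_decode(data, "utf-8")      # strict UTF-8 decoding fails
latin1_result = try_decode(data, "latin-1")  # single-octet decoding succeeds

# A UTF-8 decoder can be told to substitute U+FFFD and carry on instead.
lenient = data.decode("utf-8", errors="replace")

print(utf8_result)
print(latin1_result)
print(lenient)
```

The strict decoder is the default here, which fits the point that this
is a matter of how tools are implemented rather than of UTF-8 itself:
the lenient mode exists, but the tool has to ask for it.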

Some characters have multiple distinct encodings.  (Okay, that too is
not really UTF-8 proper - it's actually Unicode.)
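
One common instance is a precomposed accented letter versus the base
letter followed by a combining accent.  A minimal Python sketch, using
e-acute as an assumed example:

```python
import unicodedata

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Both render as the same glyph, but the UTF-8 octets differ.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# A plain comparison treats them as different strings...
same_raw = composed == decomposed

# ...and they compare equal only after Unicode normalization.
same_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(composed_bytes, decomposed_bytes, same_raw, same_nfc)
```

Any tool that compares or searches text octet by octet will treat the
two forms as different unless it normalizes first.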

I've seen it said (by the git documentation) that transcoding from some
character sets like 8859-1 to UTF-8 is not a reversible operation.
This seems dubious to me, but, if true, it would be another, and fairly
strong, strike against UTF-8 in my opinion.
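
The claim can at least be checked against one implementation.  The
sketch below round-trips all 256 octet values through Python's
"latin-1" codec, which maps each octet to the code point of the same
value (control ranges included).  This says nothing about what the git
documentation had in mind, and a converter that treats 8859-1 input
differently (for instance, as Windows-1252) could give another answer.

```python
# Every possible octet value, 0x00 through 0xFF.
original = bytes(range(256))

# 8859-1 -> UTF-8: decode the octets as Latin-1, re-encode as UTF-8.
as_utf8 = original.decode("latin-1").encode("utf-8")

# UTF-8 -> 8859-1: the reverse direction.
round_tripped = as_utf8.decode("utf-8").encode("latin-1")

print(len(original), len(as_utf8), round_tripped == original)
```

With this codec the round trip is lossless; the UTF-8 form is simply
longer, since the upper 128 values each take two octets.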

That's just what comes to mind immediately.  I don't use UTF-8 myself if
I can help it (when I run into something using it my major concern is
how to make it stop doing so), so it's entirely possible there are
others I'm just not aware of.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

References:
- FW: New Version Notification - draft-sgtatham-secsh-iutf8-05.txt
  - From: Daniel Migault
- Re: FW: New Version Notification - draft-sgtatham-secsh-iutf8-05.txt
  - From: Mouse
- Re: New Version Notification - draft-sgtatham-secsh-iutf8-05.txt
  - From: denis bider (Bitvise)
