IETF-SSH archive

Re: How to treat utf8 text with overlong utf8 sequences?



Niels Möller <nisse%lysator.liu.se@localhost> wrote:
> What do you think about sending overlong / "non-minimum form" utf8
> sequences in various utf8 strings in the protocol?

Should be illegal, definitely.

> RFC 2279 does not address these questions, as far as I can see.

I think it does, if only obliquely. Section 2 says:

|   1) Determine the number of octets required from the character value
|      and the first column of the table above.  It is important to note
|      that the rows of the table are mutually exclusive, i.e. there is
|      only one valid way to encode a given UCS-4 character.

I think that adequately justifies considering any overlong sequence
to be completely invalid.
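
To make the argument concrete, here is a minimal sketch of a decoder that enforces the "mutually exclusive rows" rule, assuming the six-row table from RFC 2279 section 2. The names MIN_VALUE, sequence_length and decode_strict are made up for illustration and are not taken from any SSH implementation. It decodes each sequence, then checks that the resulting value really needs the number of octets that were used.

```python
# Smallest character value that needs each sequence length, taken from
# the first column of the table in RFC 2279 section 2.
MIN_VALUE = {1: 0x00, 2: 0x80, 3: 0x800, 4: 0x10000, 5: 0x200000, 6: 0x4000000}


def sequence_length(lead):
    """Return the length announced by a lead octet, or 0 if it cannot start a sequence."""
    if lead < 0x80:
        return 1
    if lead < 0xC0:
        return 0  # 10xxxxxx is a continuation octet, not a lead octet
    if lead < 0xE0:
        return 2
    if lead < 0xF0:
        return 3
    if lead < 0xF8:
        return 4
    if lead < 0xFC:
        return 5
    if lead < 0xFE:
        return 6
    return 0  # 0xFE and 0xFF never appear in UTF-8


def decode_strict(data):
    """Decode octets to a list of character values, rejecting overlong forms.

    Hypothetical helper: raises ValueError on any malformed or
    non-minimal sequence.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        n = sequence_length(lead)
        if n == 0 or i + n > len(data):
            raise ValueError("invalid or truncated sequence at offset %d" % i)
        if n == 1:
            value = lead
        else:
            # Keep only the payload bits of the lead octet.
            value = lead & (0x7F >> n)
            for octet in data[i + 1:i + n]:
                if octet & 0xC0 != 0x80:
                    raise ValueError("bad continuation octet at offset %d" % i)
                value = (value << 6) | (octet & 0x3F)
        # The rows of the table are mutually exclusive: a value that fits
        # in a shorter row must not be accepted from a longer one.
        if value < MIN_VALUE[n]:
            raise ValueError("overlong sequence at offset %d" % i)
        out.append(value)
        i += n
    return out
```

With this sketch, decode_strict(b"/") yields [0x2F], while the two-octet form b"\xc0\xaf" of the same character raises ValueError. Note that it follows RFC 2279 only; it does not add the further restrictions (such as the U+10FFFF upper limit) that the later RFC 3629 imposes.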

> I'm tempted to treat any use of overlong or otherwise invalid utf8
> strings that I receive from the remote end as a protocol error.
> 
> * Do you think that is a reasonable thing to do?

Definitely.

> * Does it violate the ssh specification?

I don't think so.

> * Will it cause any interoperability problems in practice?

It _shouldn't_; but failure to disallow overlong sequences could
cause security problems, therefore it's reasonable to consider any
implementation currently generating them to require fixing.
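
As an illustration of the kind of security problem I mean, here is a hedged sketch. Both functions are hypothetical: lax_decode is a deliberately permissive decoder that handles only one- and two-octet sequences, and looks_safe stands in for any check done on the raw octets before decoding. The check looks for a literal "/" octet, but the lax decoder later produces one from an overlong form.

```python
def lax_decode(data):
    """Deliberately permissive decoder that accepts overlong forms.

    For illustration only; handles just 1- and 2-octet sequences.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            out.append(chr(lead))
            i += 1
        elif 0xC0 <= lead < 0xE0 and i + 1 < len(data):
            # No minimum-value check here: this is the bug being shown.
            out.append(chr(((lead & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def looks_safe(raw):
    """Hypothetical byte-level filter: refuse anything containing '/'."""
    return b"/" not in raw


payload = b"..\xc0\xafetc"          # 0xC0 0xAF is an overlong encoding of '/'
passed_filter = looks_safe(payload)  # True: no 0x2F octet is present
decoded = lax_decode(payload)        # "../etc": the slash appears after decoding
```

A decoder that rejects overlong sequences closes this gap, because the payload fails to decode at all instead of turning into a string the filter never saw.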

All IMO, of course.
-- 
Simon Tatham         "Happiness is having a large, warm, loving,
<anakin%pobox.com@localhost>    caring, close-knit family in another city."
