IETF-SSH archive


Re: How to treat utf8 text with overlong utf8 sequences?

Niels Möller wrote:
> What do you think about sending overlong / "non-minimum form" utf8
> sequences in various utf8 strings in the protocol?

Should be illegal, definitely.

> RFC 2279 does not address these questions, as far as I can see.

I think it does, if only obliquely. Section 2 says:

|   1) Determine the number of octets required from the character value
|      and the first column of the table above.  It is important to note
|      that the rows of the table are mutually exclusive, i.e. there is
|      only one valid way to encode a given UCS-4 character.

I think that adequately justifies considering any overlong sequence
to be completely invalid.
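
To make the "only one valid way" rule concrete, here is a minimal Python
sketch of a minimum-form check. The limits are the ranges in the first
column of the RFC 2279 table; the function names (octets_required,
lenient_decode_one, is_overlong) are hypothetical, invented for this
illustration and not taken from any SSH implementation.

```python
# Upper bound of each row of the RFC 2279 table, paired with the number
# of octets that row requires.
_LIMITS = [
    (0x7F, 1),
    (0x7FF, 2),
    (0xFFFF, 3),
    (0x1FFFFF, 4),
    (0x3FFFFFF, 5),
    (0x7FFFFFFF, 6),
]


def octets_required(value: int) -> int:
    """Return the one valid encoded length for a UCS-4 value."""
    if value < 0:
        raise ValueError("negative character value")
    for limit, length in _LIMITS:
        if value <= limit:
            return length
    raise ValueError("value outside the UCS-4 range")


def lenient_decode_one(seq: bytes) -> int:
    """Decode exactly one sequence, deliberately skipping the
    minimum-form check (hypothetical helper for illustration)."""
    if not seq:
        raise ValueError("empty sequence")
    first = seq[0]
    if first < 0x80:
        if len(seq) != 1:
            raise ValueError("trailing octets after a one-octet character")
        return first
    # The number of leading 1 bits in the lead octet is the claimed length.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length < 2 or length > 6 or length != len(seq):
        raise ValueError("malformed lead octet or wrong sequence length")
    value = first & (mask - 1)  # payload bits of the lead octet
    for octet in seq[1:]:
        if octet & 0xC0 != 0x80:
            raise ValueError("bad continuation octet")
        value = (value << 6) | (octet & 0x3F)
    return value


def is_overlong(seq: bytes) -> bool:
    """True if seq uses more octets than the table row for its value."""
    return octets_required(lenient_decode_one(seq)) < len(seq)
```

With this sketch, is_overlong(b"\xc0\xaf") is true because the two octets
carry the value 0x2F, which belongs to the one-octet row, while
is_overlong(b"/") is false.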

> I'm tempted to treat any use of overlong or otherwise invalid utf8
> strings that I receive from the remote end as a protocol error.
> * Do you think that is a reasonable thing to do?


> * Does it violate the ssh specification?

I don't think so.

> * Will it cause any interoperability problems in practice?

It _shouldn't_; but failure to disallow overlong sequences could
cause security problems, therefore it's reasonable to consider any
implementation currently generating them to require fixing.
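
One way such a problem could arise, as an assumed scenario rather than
a report of any real implementation: a check that looks for a forbidden
character at the octet level can be bypassed if a later stage decodes
overlong forms leniently. The Python sketch below uses hypothetical
names (naive_has_slash, strict_decode) and relies only on the standard
behaviour of Python's built-in UTF-8 codec, which rejects overlong
sequences.

```python
from typing import Optional


def naive_has_slash(raw: bytes) -> bool:
    """Octet-level filter: only sees the one-octet encoding of '/'."""
    return b"/" in raw


def strict_decode(raw: bytes) -> Optional[str]:
    """Return the decoded text, or None if raw is not valid UTF-8.

    Python's codec treats overlong forms as errors, so a caller can map
    None to a protocol error.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# 0xC0 0xAF is an overlong two-octet encoding of '/' (0x2F).
payload = b".." + b"\xc0\xaf" + b"etc"

if not naive_has_slash(payload) and strict_decode(payload) is None:
    print("filter saw no '/', but the strict decoder rejected the string")
```

A decoder that accepted 0xC0 0xAF as '/' would hand "../etc" to code
that believed the string contained no slash; rejecting the string at
decode time removes that gap.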

All IMO, of course.
Simon Tatham         "Happiness is having a large, warm, loving,
                      caring, close-knit family in another city."
