Subject: Re: WWW query engine bug (was Query-PR)
To: None <current-users@NetBSD.ORG>
From: der Mouse <mouse@Collatz.McRCIM.McGill.EDU>
List: current-users
Date: 02/21/1996 06:55:30
>>> But the question is: how can you tell 'intentional' html from
>>> something that just looks like HTML?
>> Force people to insert <HTML>...</HTML> around their text if they
>> don't want tags to be converted to &amp;, &lt; and &gt;.
> You didn't answer the second question: what impact does that have?
> i don't think that's workable, for several reasons:
> (1) PRs are sent as e-mail messages, and for the most part look
> like e-mail messages. How can you put that before your
> headers, so it will do the right thing with HTML in the
> headers? (e.g. an X-Organization: header...)
> (2) the PR machinery appears to mangle some submissions in ways
> that are not obvious to me, e.g. reordering some headers,
> etc. How are people supposed to set things up so that they
> work right?
Don't put <HTML> in the headers, then.
> (3) if the user does a 'long-range' <html>, perhaps one which
> is never closed, how does the scanner deal with that? some
> of the PRs are gigantic, and i think it's unreasonable to
> have to have it parse them completely before it processes
> any of them.
I don't see why there's any need to. Your scanner just has to keep a
bit saying whether it's inside an unclosed <HTML>...</HTML>, and if
it's not, just do mindless mapping of < to &lt;, etc.
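
A rough sketch of that one-bit scanner, in Python, might look like the following. It is only an illustration under assumptions: the name escape_stream is made up, the markers are matched case-insensitively, and the markers themselves are dropped from the output. None of this is taken from the actual PR machinery.

```python
import html
import re

# Hypothetical marker syntax: a literal <HTML> opens a pass-through region
# and </HTML> closes it. Case-insensitive matching is an assumption.
_MARKER = re.compile(r"</?html>", re.IGNORECASE)


def escape_stream(lines):
    """Yield each line with text outside <HTML>...</HTML> escaped."""
    inside = False  # the single bit of state carried from line to line
    for line in lines:
        out = []
        pos = 0
        for m in _MARKER.finditer(line):
            chunk = line[pos:m.start()]
            # Outside a marked region: mindless mapping of &, < and >.
            out.append(chunk if inside else html.escape(chunk, quote=False))
            # An opening marker sets the bit, a closing marker clears it.
            inside = not m.group().startswith("</")
            pos = m.end()
        tail = line[pos:]
        out.append(tail if inside else html.escape(tail, quote=False))
        yield "".join(out)
```

Because the only state is that one flag, each line can be emitted as soon as it is read. A never-closed <HTML> simply leaves the flag set until the end of the input, so nothing has to be parsed in full before output starts.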
> (4) this still doesn't solve the problem! the user can _still_
> supply bad html!
Oh, sure. And within the marked-as-HTML portion of the text, you can
still do all the checks you used to do. This just solves the
text-that-happens-to-look-like-HTML problem. It doesn't do anything
about any others.
(Imagine what your code will do with a PR that includes C code to
generate HTML...it'll _really_ come out mangled. This sort of thing is
why I want some way to get just and only the PR, as little modified
from the bits on disk as possible. Perhaps I'm just weird.)
der Mouse
mouse@collatz.mcrcim.mcgill.edu