pkgsrc-Changes archive
CVS commit: pkgsrc/www/htmlcxx
Module Name: pkgsrc
Committed By: wiz
Date: Sun Feb 16 22:58:51 UTC 2014
Added Files:
pkgsrc/www/htmlcxx: DESCR Makefile PLIST buildlink3.mk distinfo
pkgsrc/www/htmlcxx/patches: patch-html_CharsetConverter.cc
patch-html_ci__string.h
Log Message:
Import htmlcxx-0.85 as www/htmlcxx.
htmlcxx is a simple non-validating CSS1 and HTML parser for C++.
Although there are several other HTML parsers available, htmlcxx
has some characteristics that make it unique:
* STL-like navigation of the DOM tree, using the excellent tree.hh
  library from Kasper Peeters
* It is possible to reproduce exactly, character by character, the
  original document from the parse tree
* Bundled CSS parser
* Optional parsing of attributes
* C++ code that looks like C++ (not so true anymore)
* Offsets of tags/elements in the original document are stored in
  the nodes of the DOM tree
The parsing policy of htmlcxx was designed to mimic Mozilla Firefox
behavior, so you should expect parse trees similar to those created
by Firefox. However, unlike Firefox, htmlcxx does not insert
nonexistent elements into your HTML. Therefore, serializing the DOM
tree gives exactly the same bytes contained in the original HTML
document.
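The byte-exact round trip described above can be illustrated with a
minimal, self-contained sketch. This is not htmlcxx's API or its
implementation: the names Span, split_spans, and serialize are
hypothetical and exist only for this example. It assumes a toy
tokenizer that records an offset and a length for each tag or text
run and never normalizes or inserts anything, so concatenating the
recorded spans reproduces the input exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical span type for this sketch; not htmlcxx's node class.
struct Span {
    std::size_t offset;  // byte offset of the span in the original input
    std::size_t length;  // number of bytes covered by the span
    bool is_tag;         // true for "<...>" runs, false for text between them
};

// Split the input into tag and text spans without normalizing anything.
std::vector<Span> split_spans(const std::string& html) {
    std::vector<Span> spans;
    std::size_t pos = 0;
    while (pos < html.size()) {
        const bool is_tag = (html[pos] == '<');
        std::size_t end;
        if (is_tag) {
            const std::size_t close = html.find('>', pos);
            // An unterminated tag runs to the end of input, so no bytes are dropped.
            end = (close == std::string::npos) ? html.size() : close + 1;
        } else {
            const std::size_t open = html.find('<', pos);
            end = (open == std::string::npos) ? html.size() : open;
        }
        spans.push_back(Span{pos, end - pos, is_tag});
        pos = end;
    }
    return spans;
}

// Rebuild the document purely from the stored offsets and lengths.
std::string serialize(const std::string& html, const std::vector<Span>& spans) {
    std::string out;
    for (const Span& s : spans) {
        assert(s.offset + s.length <= html.size());
        out.append(html, s.offset, s.length);
    }
    return out;
}
```

Because every byte of the input belongs to exactly one span, calling
serialize(input, split_spans(input)) returns the input unchanged,
including mismatched case or unclosed tags. A real parser such as
htmlcxx builds a tree rather than a flat list, but the same idea of
keeping offsets instead of rewritten text is what makes exact
reproduction possible.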
To generate a diff of this commit:
cvs rdiff -u -r0 -r1.1 pkgsrc/www/htmlcxx/DESCR pkgsrc/www/htmlcxx/Makefile \
pkgsrc/www/htmlcxx/PLIST pkgsrc/www/htmlcxx/buildlink3.mk \
pkgsrc/www/htmlcxx/distinfo
cvs rdiff -u -r0 -r1.1 \
pkgsrc/www/htmlcxx/patches/patch-html_CharsetConverter.cc \
pkgsrc/www/htmlcxx/patches/patch-html_ci__string.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.