A fast implementation of the HTML 5 parsing spec for Python. Parsing is done in
C using a variant of the gumbo parser. The gumbo parse tree is then transformed
into an lxml tree, also in C, yielding parse times that can be a thirtieth of
the html5lib parse times.
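
The speed comparison with html5lib can be checked locally with a small timing sketch like the one below. It assumes both packages expose a module-level `parse()` function that accepts a string, and it skips any parser that cannot be loaded. The names `SAMPLE`, `available_parsers` and `time_parsers` are made up for this illustration, and the timings will vary with the machine and the input document.

```python
import importlib
import timeit

# Synthetic test document (an assumption for illustration; use real pages
# for a meaningful comparison).
SAMPLE = (
    "<!DOCTYPE html><html><head><title>t</title></head><body>"
    + "<p>hello <b>world</b></p>" * 1000
    + "</body></html>"
)


def available_parsers():
    """Return {module_name: parse_function} for the parsers that load."""
    parsers = {}
    for module_name in ("html5_parser", "html5lib"):
        try:
            module = importlib.import_module(module_name)
        except (ImportError, RuntimeError):
            continue  # not installed, or failed to load; skip it
        parsers[module_name] = module.parse
    return parsers


def time_parsers(html, repeat=3):
    """Best-of-`repeat` wall-clock seconds for one parse of `html`."""
    results = {}
    for name, parse in available_parsers().items():
        timings = timeit.repeat(lambda: parse(html), number=1, repeat=repeat)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    for name, seconds in time_parsers(SAMPLE).items():
        print(f"{name}: {seconds * 1000:.2f} ms")
```

Taking the minimum over several runs reduces noise from other processes. The ratio between the two results is what corresponds to the "thirtieth" figure above, and it should be expected to differ between documents.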