A fast implementation of the HTML 5 parsing spec for Python. Parsing is done in
C using a variant of the gumbo parser. The gumbo parse tree is then transformed
into an lxml tree, also in C, yielding parse times that can be a thirtieth of
the html5lib parse times.
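
A minimal usage sketch, assuming the package described here is the one published on PyPI as html5-parser (importable as html5_parser) and that its parse() function returns the root lxml element. The helper name first_title is hypothetical and exists only for illustration. The import is guarded so the snippet still runs where the package is not installed.

```python
try:
    # Assumption: the html5-parser package from PyPI is installed.
    from html5_parser import parse
except ImportError:
    parse = None  # package not available in this environment


def first_title(html):
    """Return the text of the first <title> element, or None.

    Hypothetical helper: parses with html5_parser and queries the
    resulting lxml tree using the standard ElementTree find() API.
    """
    if parse is None:
        return None
    root = parse(html)  # root <html> element of an lxml tree
    title = root.find(".//title")
    return None if title is None else title.text


# Malformed input is still parsed into a tree, per the HTML 5 rules.
print(first_title("<title>Hello</title><p>unclosed paragraph"))
```

Because the result is an ordinary lxml tree, the usual lxml tools (find, xpath, tostring) should work on it without any conversion step.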