Tuesday, June 26, 2007

TODO: HTML Parser

Requirement: A robust, continuation-based HTML parser that doesn't choke on all the crappy web documents out there. It should let me pass a qualifying filter to its generator method and then iterate over only the nodes that match. The filter would be similar to an XPath expression, if not equivalent.
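To make the idea concrete, here is a rough sketch of the shape I have in mind. It is in Python rather than C#, purely because Python generators are a cheap stand-in for continuations. It leans on the standard library's lenient html.parser module, and the names iter_nodes, keep, and chunk_size are my own placeholders, not an existing API. The filter here is a plain predicate function, not an XPath expression, so this only approximates the requirement.

```python
from collections import deque
from html.parser import HTMLParser


class _EventCollector(HTMLParser):
    """Queues start-tag events as the lenient stdlib parser encounters them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = deque()

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; a dict is easier to filter on.
        self.events.append((tag, dict(attrs)))


def iter_nodes(html, keep=lambda tag, attrs: True, chunk_size=1024):
    """Yield (tag, attrs) for each start tag accepted by `keep` (hypothetical API).

    The input is fed to the parser one chunk at a time, so the caller can stop
    iterating early without the rest of the document being parsed.
    """
    parser = _EventCollector()
    for start in range(0, len(html), chunk_size):
        parser.feed(html[start:start + chunk_size])
        while parser.events:
            tag, attrs = parser.events.popleft()
            if keep(tag, attrs):
                yield tag, attrs
    # Flush whatever the parser was still buffering at end of input.
    parser.close()
    while parser.events:
        tag, attrs = parser.events.popleft()
        if keep(tag, attrs):
            yield tag, attrs


# Sloppy markup: unclosed <p> and <a>, a stray </b>, and no closing </html>.
messy = "<html><body><p>unclosed <a href='/one'>one</b><a href=\"/two\">two</body>"
links = [attrs["href"]
         for tag, attrs in iter_nodes(messy, lambda t, a: t == "a" and "href" in a)]
```

The predicate only sees one start tag at a time, so path-style filters (say, "every a inside a div") would need the collector to track a stack of open elements. That is the part where a real XPath-like filter would have to do more work.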

References:
Parsing HTML in Microsoft C#

Continuations for Curmudgeons (SAX section)