It would be nice if the parser were generic over a "sink" that lets users have a function called whenever a tag is visited (a streaming parser). The sink could then decide what to do with each tag it receives.
Sometimes one does not need to parse an entire HTML document, or to do the other things that tl does by default.
We could provide default implementations, for example a sink that keeps track of ids and classes and remembers them in a map, so that ID lookups run in constant time. This already exists and can be enabled through ParserOptions::track_ids(), but a sink could be nicer.
AFAICT parsers like html5ever seem to do this.
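To make the idea concrete, here is a rough sketch of what such a sink could look like. Nothing in it is tl's actual API: `Tag`, `Sink`, `NullSink`, `IdClassSink` and `drive` are all hypothetical names, and `drive` just stands in for the parser by walking a pre-built list of tags.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified tag event (not tl's real node type).
#[derive(Debug, Clone, PartialEq)]
pub struct Tag<'a> {
    pub name: &'a str,
    pub id: Option<&'a str>,
    pub classes: Vec<&'a str>,
}

/// Hypothetical sink trait: the parser would call `on_tag` for every tag it visits.
pub trait Sink<'a> {
    fn on_tag(&mut self, index: usize, tag: &Tag<'a>);
}

/// A sink that does nothing, for callers who want no bookkeeping at all.
pub struct NullSink;

impl<'a> Sink<'a> for NullSink {
    fn on_tag(&mut self, _index: usize, _tag: &Tag<'a>) {}
}

/// A possible default sink that remembers ids and classes in hash maps,
/// so lookups by id take constant time on average.
#[derive(Default)]
pub struct IdClassSink<'a> {
    ids: HashMap<&'a str, usize>,
    classes: HashMap<&'a str, Vec<usize>>,
}

impl<'a> Sink<'a> for IdClassSink<'a> {
    fn on_tag(&mut self, index: usize, tag: &Tag<'a>) {
        if let Some(id) = tag.id {
            // Assumption: the first tag carrying a given id wins.
            self.ids.entry(id).or_insert(index);
        }
        for class in &tag.classes {
            self.classes.entry(*class).or_default().push(index);
        }
    }
}

impl<'a> IdClassSink<'a> {
    /// Index of the first tag seen with this id, if any.
    pub fn by_id(&self, id: &str) -> Option<usize> {
        self.ids.get(id).copied()
    }

    /// Indices of all tags seen with this class, in visit order.
    pub fn by_class(&self, class: &str) -> &[usize] {
        self.classes.get(class).map(|v| v.as_slice()).unwrap_or(&[])
    }
}

/// Stand-in for the parser: feeds each tag to the sink in document order.
pub fn drive<'a, S: Sink<'a>>(tags: &[Tag<'a>], sink: &mut S) {
    for (index, tag) in tags.iter().enumerate() {
        sink.on_tag(index, tag);
    }
}
```

With something like this, `drive(&tags, &mut IdClassSink::default())` would build the id and class maps as a side effect of parsing, while `NullSink` would skip that work entirely. A real design would also need events for text, comments and closing tags.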
What do you think of adding a ParserOptions::skip_whitespace()? I noticed when parsing there were quite a few Raw(Bytes("\n\n")) that I'd like to ignore.
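Until an option like that exists, the same effect could be had by filtering after parsing. The sketch below assumes a simplified, hypothetical `Node` enum rather than tl's real node type, and `is_whitespace_only` and `skip_whitespace` are made-up helper names.

```rust
/// Hypothetical, simplified node type (not tl's real `Node`).
#[derive(Debug, PartialEq)]
pub enum Node<'a> {
    Tag(&'a str),
    Raw(&'a [u8]),
}

/// True for raw text made up only of ASCII whitespace, e.g. b"\n\n".
/// An empty raw node also counts as whitespace-only.
pub fn is_whitespace_only(node: &Node<'_>) -> bool {
    matches!(node, Node::Raw(bytes) if bytes.iter().all(u8::is_ascii_whitespace))
}

/// Drops whitespace-only raw nodes and keeps everything else in order.
pub fn skip_whitespace<'a>(nodes: Vec<Node<'a>>) -> Vec<Node<'a>> {
    nodes
        .into_iter()
        .filter(|node| !is_whitespace_only(node))
        .collect()
}
```

Doing this inside the parser would avoid allocating those nodes in the first place, which is probably the stronger argument for a built-in option.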