Make parser generic over sink #21

Open
y21 opened this issue Feb 1, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

y21 (Owner) commented Feb 1, 2022

It would be nice if the parser were generic over a "sink": a user-supplied handler that the parser calls whenever it visits a tag (a streaming parser). The sink could then decide what to do with each tag it receives.
Sometimes one doesn't need to parse an entire HTML document, or to do the other work that tl does by default.
We could provide default implementations, for example a sink that records ids and classes in a map so that ID lookups run in constant time (this already exists and can be enabled through ParserOptions::track_ids(), but a sink could be nicer).
AFAICT parsers like html5ever do this.
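
To make the idea concrete, here is a rough sketch of what a sink-driven design could look like. Everything in it is hypothetical: the `Event` enum, the `Sink` trait, the toy `scan` function and `IdSink` are invented for illustration and are not part of tl's API (or html5ever's). The scanner is deliberately naive (no comments, no quoted `>` inside attributes) and only exists to show the control flow: the parser stays generic over `S: Sink` and hands each event to it.

```rust
use std::collections::HashMap;

/// Hypothetical event type (not tl's API).
#[derive(Debug, PartialEq)]
pub enum Event<'a> {
    /// An opening tag; `attrs` is the raw text after the tag name.
    Open { name: &'a str, attrs: &'a str },
    /// Text between tags.
    Raw(&'a str),
}

/// Hypothetical sink trait: the parser calls `visit` for every event.
pub trait Sink<'a> {
    fn visit(&mut self, event: Event<'a>);
}

/// Toy scanner, generic over the sink. It builds no tree; it only
/// forwards opening tags and raw text to the sink. Closing tags are skipped.
pub fn scan<'a, S: Sink<'a>>(input: &'a str, sink: &mut S) {
    let mut rest = input;
    while !rest.is_empty() {
        match rest.find('<') {
            Some(0) => {
                // A tag runs up to the next '>' (or to the end if unterminated).
                let end = rest.find('>').unwrap_or(rest.len());
                let inner = &rest[1..end];
                if !inner.starts_with('/') {
                    let (name, attrs) = match inner.find(char::is_whitespace) {
                        Some(i) => (&inner[..i], inner[i..].trim()),
                        None => (inner, ""),
                    };
                    sink.visit(Event::Open { name, attrs });
                }
                rest = &rest[(end + 1).min(rest.len())..];
            }
            Some(i) => {
                sink.visit(Event::Raw(&rest[..i]));
                rest = &rest[i..];
            }
            None => {
                sink.visit(Event::Raw(rest));
                rest = "";
            }
        }
    }
}

/// Example "default" sink: remembers which tag carries each id,
/// so lookups by id are a single hash-map access.
#[derive(Default)]
pub struct IdSink<'a> {
    pub ids: HashMap<&'a str, &'a str>, // id -> tag name
}

impl<'a> Sink<'a> for IdSink<'a> {
    fn visit(&mut self, event: Event<'a>) {
        if let Event::Open { name, attrs } = event {
            // Naive attribute handling: assumes double-quoted values without spaces.
            let id = attrs
                .split_whitespace()
                .find_map(|a| a.strip_prefix("id=\"")?.strip_suffix('"'));
            if let Some(id) = id {
                self.ids.insert(id, name);
            }
        }
    }
}
```

With this shape, a caller would create `IdSink::default()`, pass it to `scan(html, &mut sink)`, and then query `sink.ids`. A sink that does nothing with `Raw` events pays no cost for storing them, which is the appeal of the approach.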

y21 added the enhancement label Feb 1, 2022
dist1ll commented Feb 4, 2022

What do you think of adding a ParserOptions::skip_whitespace()? While parsing, I noticed quite a few Raw(Bytes("\n\n")) nodes that I'd like to ignore.

Or should I rather create a sink in that case?
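
For comparison, here is a sketch of how whitespace skipping might look if it were done as a sink rather than as a parser option. All names here (`Node`, `NodeSink`, `Collect`, `SkipWhitespace`) are made up for the example and do not exist in tl; `Node::Raw` only loosely stands in for the `Raw(Bytes(..))` nodes mentioned above. The wrapper drops whitespace-only raw nodes and forwards everything else to an inner sink.

```rust
/// Hypothetical node type (not tl's API).
#[derive(Debug, PartialEq)]
pub enum Node<'a> {
    Tag(&'a str),
    Raw(&'a str),
}

/// Hypothetical sink trait: receives each node as the parser produces it.
pub trait NodeSink<'a> {
    fn visit(&mut self, node: Node<'a>);
}

/// A sink that simply stores every node it receives.
pub struct Collect<'a>(pub Vec<Node<'a>>);

impl<'a> NodeSink<'a> for Collect<'a> {
    fn visit(&mut self, node: Node<'a>) {
        self.0.push(node);
    }
}

/// Adapter sink: filters out raw nodes that contain only whitespace,
/// then delegates to the wrapped sink.
pub struct SkipWhitespace<S>(pub S);

impl<'a, S: NodeSink<'a>> NodeSink<'a> for SkipWhitespace<S> {
    fn visit(&mut self, node: Node<'a>) {
        if let Node::Raw(text) = &node {
            if text.trim().is_empty() {
                return; // e.g. "\n\n" between tags
            }
        }
        self.0.visit(node);
    }
}
```

Wrapping is then a one-liner, `SkipWhitespace(Collect(Vec::new()))`, and the same adapter composes with any other sink. Whether this is preferable to a dedicated option is a design question for the maintainer; the sketch only shows that both routes are possible.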
