-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
query_selector not selecting children? #22
Comments
You're right, the query selector implementation is incomplete. In particular, tl/src/queryselector/selector.rs Lines 18 to 19 in 9a1538a
But Selector::matches doesn't handle that case (as well as .foo > .bar ) and simply returns false.tl/src/queryselector/selector.rs Line 72 in 9a1538a
I believe this was skipped in the implementation because of complexity at the time, and should be mentioned in the docs. Nodes don't have a reference or a handle to the parent, so implementing .foo > .bar is difficult. Also, the way the query selector works probably needs to be changed in some way to make this work, because right now there's not really a way to return a matched subnode from within the method.I guess a workaround for now until that's implemented is to run a separate query selector on those matches (or if a query selector is overkill, one could iterate over the children and find the a tag using query.children().all().find(...) with a tag name comparison).
On a more-or-less unrelated note, I found an off-by-one bug in the query selector API for HTML tags while looking into this. We've got lots of tests for all of the different methods of the public API and I'm surprised they didn't catch this. I just published 0.6.1 with a quick fix for this (see #23). |
As a temporary workaround, I currently have the following code: let items = dom.query_selector("li.example-class").unwrap();
let links: Vec<_> = items
.filter_map(|e| e.get(&parser).and_then(tl::Node::children))
.filter_map(|e| e.all(&parser)[1].as_tag())
.map(tl::HTMLTag::attributes)
.filter_map(|e| e.get("href").flatten().and_then(tl::Bytes::try_as_utf8_str))
.collect(); What I want to do is find the |
Ran also into the same problem for a fun project of mine. So I took a look at the code, thinking about maybe I can contribute a PR. But I'm not sure the current CSS selector parsing is prepared for this. E.g. the "OR" in the current implementation seems to 'local' while (at least in my understanding) in CSS it's an OR for the whole of the selector list. So, in my understanding it should be:
But I think the code would parse it as:
Correct me if I'm wrong. In the end I decided to put an index on top of your tl VDom, which caches all those relationships (https://github.com/kelko/html-streaming-editor/blob/main/src/html/mod.rs) and an own CSS selector parsing (using PEG, https://github.com/kelko/html-streaming-editor/blob/main/src/parsing/mod.rs#L66) & execution (https://github.com/kelko/html-streaming-editor/blob/main/src/css/mod.rs) I don't say it's pretty, but it seems to do the job - at least for my project. And maybe there is an idea or two in there which might help you as well. |
What's the status on this? |
Same problem. Wish there was a fix |
I'm pretty new to this so I might be misunderstanding how the query selector works.
I have the following HTML code:
I'd now expect the selector
li.test a
to return the innera
element.However, it returns nothing.
Rust code:
Opening the same HTML code in Firefox and running
$$("li.test a")
in the console returns a one-element array with the a node.The text was updated successfully, but these errors were encountered: