Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored grammar to remove need for second transformer #385

Conversation

sognetic
Copy link
Contributor

@sognetic sognetic commented Nov 6, 2023

Hi,

as promised (with some delay) I've had another think about the feature introduced in #100 which slowed down processing quite a bit by introducing an unnecessary transformer. I've removed this transformer again and refactored the grammar such that the MODEL_NAME terminal doesn't need to include the white space / semicolon anymore but just matches on a word boundary (\b). This simplifies the grammar quite a bit and speeds up parsing of the LHCb / Belle II decay files by 15%.
I've removed the tests corresponding to the removed transformer, I'm not sure if / how I should replace them with something in this PR. All other tests pass.

Let me know what you think!

EDIT: I don't know why the pre_commit check in the CI doesn't work, everything is okay locally.

@eduardo-rodrigues eduardo-rodrigues self-assigned this Nov 8, 2023
@eduardo-rodrigues eduardo-rodrigues added the enhancement New feature or request label Nov 8, 2023
@eduardo-rodrigues
Copy link
Member

Hello @sognetic, nice to hear back from you and thank you very much already for this new contribution 👍!

This PR pushed me to continue on task #357, see the latest #387 from today. One model in particular, rather unique in the way it span many lines, needed to be tested.

I will get to your PR early next week at the very latest. In the meantime can you rebase to (1) get some more important tests run and (2) get a fix for the pre-commit hook CI (failing for you as well abvove)? Thanks.

@sognetic
Copy link
Contributor Author

sognetic commented Nov 8, 2023

Hi @eduardo-rodrigues!
Wow, #357 seems like a huge effort but also extremely useful to corral the EvtGen model zoo. I've merged in the current master branch so hopefully that fixes the failing CI thing. Everything still passes locally.

I've also worked on a second feature which removes the need to maintain two lists of models in both dec/enums.py and the grammar definition itself. I'll open a PR with this once this one is merged since I've based it on the grammar changes already included here.

@eduardo-rodrigues
Copy link
Member

Sounds great!

Yeah, task #357 is a bit of a pain and repetitive work, but worth to ensure the parsing not only succeedes but also gives correctly parsed info. So far I've found 2-3 issues and made several improvements along the way. It does take time.

Copy link
Member

@eduardo-rodrigues eduardo-rodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant 👍!

I'm requesting trivial changes but that's trivial.

Copy link
Member

@eduardo-rodrigues eduardo-rodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this!

@eduardo-rodrigues eduardo-rodrigues merged commit d31b68a into scikit-hep:master Nov 9, 2023
9 checks passed
@sognetic sognetic deleted the model_alias_transformer_replacement branch November 9, 2023 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants