Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump llguidance to 0.6.5 #1092

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Bump llguidance to 0.6.5 #1092

wants to merge 3 commits into from

Conversation

hudson-ai
Copy link
Collaborator

Performance optimizations, multipleOf in JSON

Performance optimizations, multipleOf in JSON
@hudson-ai hudson-ai requested a review from mmoskal January 7, 2025 20:36
@mmoskal
Copy link
Collaborator

mmoskal commented Jan 7, 2025

We could do 0.6.2 - I added inline JSON in lark syntax, not really relevant for usage inside of guidance but doesn't hurt

@hudson-ai
Copy link
Collaborator Author

We could do 0.6.2 - I added inline JSON in lark syntax, not really relevant for usage inside of guidance but doesn't hurt

Sure! Currently getting a failure with gpt2 and phi2 in the tool call test though -- will take a look tomorrow:

FAILED tests/model_integration/library/test_gen.py::test_tool_call - ValueError: parser error: Too many items (limit 50000; mask); try avoiding single-byte/short lexemes

@mmoskal
Copy link
Collaborator

mmoskal commented Jan 8, 2025

Well, the error message seems right

    @guidance(
        stateless=True, dedent=False
    )  # stateless=True indicates this function does not depend on LLM generation state
    def number(lm):
        n = one_or_more(select(["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]))
        # Allow for negative or positive numbers
        return lm + select(["-" + n, n])

Maybe we can slap regex() or however it's called around it?

See also guidance-ai/llguidance#9

@hudson-ai
Copy link
Collaborator Author

@mmoskal just out of curiosity, how do we hit 50_000 items so quickly? It seems to happen after only a few tokens have been generated, e.g. Calculator(1

@mmoskal
Copy link
Collaborator

mmoskal commented Jan 16, 2025

Do the Tokenizers here have lots of number tokens? That would do it

@hudson-ai hudson-ai changed the title Bump llguidance to 0.6.1 Bump llguidance to 0.6.5 Jan 16, 2025
@hudson-ai
Copy link
Collaborator Author

I can change the test to use a regex, but I am reluctant to do so given that this test is almost directly from our docs. We (1) should change the docs and (2) should automatically simplify grammars into lexemes when we can -- will leave this broken for a minute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants