Hugging Face Transformers 4.34, which is quite new, adds support for "chat templates" and can also report the size of a chat in tokens.

However, many models don't ship the required chat template (yet?), and obtaining the chat template for some models (e.g., Llama 2) requires special permission, even when a derived quantized model is not behind a sign-up wall.

Use this tech, or something like it, to replace the hard-coded query formatting in the llama_cpp backend and to size the query by its token count instead of the current hard-coded limit of 5 messages. A sketch of the idea follows below.
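A minimal sketch of what that could look like, assuming a tokenizer that ships a chat template. The model name `HuggingFaceH4/zephyr-7b-beta` and the 512-token budget are illustrative placeholders, not anything specified in this issue:

```python
from transformers import AutoTokenizer

# Illustrative model choice: any tokenizer that ships a chat template works.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a chat template?"},
    {"role": "assistant", "content": "A Jinja template that renders a chat into a prompt."},
    {"role": "user", "content": "How many tokens does this conversation use?"},
]

def chat_token_count(msgs):
    # With tokenize=True, apply_chat_template returns the token IDs of the
    # rendered prompt, so len() gives the chat size in tokens.
    return len(tokenizer.apply_chat_template(
        msgs, tokenize=True, add_generation_prompt=True))

# Trim the oldest non-system messages until the prompt fits a token budget
# (512 is an arbitrary example), instead of keeping a fixed 5 messages.
TOKEN_BUDGET = 512
while len(messages) > 1 and chat_token_count(messages) > TOKEN_BUDGET:
    messages.pop(1)  # index 1 preserves the system message at index 0

# Render the chat into the model's expected prompt format as a string.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)
print(f"{chat_token_count(messages)} tokens:\n{prompt}")
```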