Perplexity and memory use comparisons would be useful #9

Open
JohannesGaessler opened this issue Aug 17, 2023 · 4 comments

@JohannesGaessler

Currently the README does not necessarily provide a like-for-like comparison because 4-bit quantizations can be of different quality depending on the implementation details. For example, in llama.cpp q4_0 is faster than q4_K_M, but the quantization format is less efficient in terms of size. So it would be useful to include measurements of memory usage as well as a measure of output quality (e.g. perplexity on a large corpus of text) to put the speed numbers into context.
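
For reference, a minimal sketch of the two proposed measurements (not tied to any particular runtime; where the corpus tokenization and per-token log-likelihoods come from is assumed to be handled by whatever evaluation harness is used):

```python
import math
import os

def perplexity(token_log_likelihoods):
    """Perplexity = exp of the mean negative log-likelihood over the corpus tokens."""
    n = len(token_log_likelihoods)
    return math.exp(-sum(token_log_likelihoods) / n)

def model_size_gib(path):
    # Size of the quantized weights on disk; runtime memory additionally
    # includes the KV cache and activations.
    return os.path.getsize(path) / 2**30
```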

@junrushao
Member

Will do

@shwu-nyunai

+1 on perplexity. Any timeline on this?
Thanks.

@JohannesGaessler
Author

I don't know about the timeline, but by now llama.cpp has support for calculating the KL divergence relative to FP16, see ggerganov/llama.cpp#5076. This would be a better metric for comparison than perplexity.
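
For anyone wanting to reproduce this outside llama.cpp: the idea is to save the FP16 model's logits for a fixed evaluation text and then compare the quantized model's token distributions against them. A rough sketch of the underlying computation (assuming both runs dump logits of shape `[num_tokens, vocab_size]` to hypothetical `.npy` files; the exact llama.cpp flags are in the linked PR):

```python
import numpy as np
from scipy.special import log_softmax

def mean_kl_divergence(fp16_logits, quant_logits):
    """Mean KL(p_fp16 || p_quant) over all token positions."""
    logp = log_softmax(fp16_logits, axis=-1)   # reference (FP16) distribution
    logq = log_softmax(quant_logits, axis=-1)  # quantized-model distribution
    kl = np.sum(np.exp(logp) * (logp - logq), axis=-1)
    return float(kl.mean())

fp16 = np.load("logits_fp16.npy")   # hypothetical dump from the FP16 run
quant = np.load("logits_q4.npy")    # hypothetical dump from the quantized run
print("mean KL divergence:", mean_kl_divergence(fp16, quant))
```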

@shwu-nyunai

Sure. I am using the perplexity scores for a paper, hence I need PPL values.
Also, how should I go about actually calculating the scores? I doubt I'll be able to run the llama.cpp script directly to get the scores on MLC models. I have been trying to find a way to change mlc_chat, but no progress so far.

If you have the scripts that you used on MLC-quantised models, they would be of great help.

I'm trying to capture the generated logits for a prompt input, but no luck so far.
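
In case it helps once the logits can be captured: computing perplexity only needs the per-position logits and the corresponding target token ids. A generic sketch (the hypothetical `.npy` dumps stand in for whatever export mechanism ends up working with mlc_chat):

```python
import numpy as np
from scipy.special import log_softmax

def perplexity_from_logits(logits, target_ids):
    """logits: [num_tokens, vocab_size]; target_ids: [num_tokens] next-token ids."""
    logp = log_softmax(logits, axis=-1)
    # Per-token negative log-likelihood of the true next token.
    nll = -logp[np.arange(len(target_ids)), target_ids]
    return float(np.exp(nll.mean()))

logits = np.load("mlc_logits.npy")    # hypothetical per-position logits dump
targets = np.load("target_ids.npy")   # hypothetical ground-truth next-token ids
print("perplexity:", perplexity_from_logits(logits, targets))
```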
