Releases · ggerganov/llama.cpp
b2868
[SYCL] rm wait() (#7233)
b2867
llama : rename jina tokenizers to v2 (#7249)
* refactor: rename jina tokenizers to v2
* refactor: keep refactoring non-breaking
b2865
change default temperature of OAI compat API from 0 to 1 (#7226)
* change default temperature of OAI compat API from 0 to 1
* make tests explicitly send temperature to OAI API
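Because this build changes the OpenAI-compatible server default from 0 to 1, clients that relied on the old deterministic default should now send temperature explicitly. Below is a minimal sketch of such a request, assuming a llama.cpp server listening locally on port 8080 with its /v1/chat/completions endpoint; the host, port, and model name are placeholders, not values from these release notes.

```python
# Minimal sketch: pin the sampling temperature explicitly instead of relying on
# the server default, which this release changes from 0 to 1.
# Assumes a llama.cpp server is reachable at localhost:8080; host, port, and
# model name below are placeholders.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; single-model servers typically ignore it
        "messages": [{"role": "user", "content": "Say hello."}],
        "temperature": 0.0,      # former default; omit this field to get the new default of 1
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```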
b2864
[SYCL] Add oneapi runtime dll files to win release package (#7241)
* add oneapi runtime dlls to release package
* fix path
* fix path
* fix path
* fix path
* fix path
---------
Co-authored-by: Zhang <[email protected]>
b2862
CUDA: add FP32 FlashAttention vector kernel (#7188)
* CUDA: add FP32 FlashAttention vector kernel
* fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
b2861
cmake : fix version cmp (#7227)
b2860
remove convert-lora-to-ggml.py (#7204)
b2859
metal : fix warnings (skipme) (#0)
b2854
fix system prompt handling (#7153)
b2852
sync : ggml
ggml-ci