Releases · ggerganov/llama.cpp
b2868
[SYCL] rm wait() (#7233)
b2867
llama : rename jina tokenizers to v2 (#7249)
* refactor: rename jina tokenizers to v2
* refactor: keep refactoring non-breaking
b2865
change default temperature of OAI compat API from 0 to 1 (#7226)
* change default temperature of OAI compat API from 0 to 1
* make tests explicitly send temperature to OAI API
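Because this build changes the OpenAI-compatible server default from 0 to 1, clients that relied on the old deterministic default should now send temperature explicitly. Below is a minimal sketch of such a request, assuming a llama.cpp server listening locally on port 8080 with its /v1/chat/completions endpoint; the host, port, and model name are placeholders, not values from these release notes.

```python
# Minimal sketch: pin the sampling temperature explicitly instead of relying on
# the server default, which this release changes from 0 to 1.
# Assumes a llama.cpp server is reachable at localhost:8080; host, port, and
# model name below are placeholders.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; single-model servers typically ignore it
        "messages": [{"role": "user", "content": "Say hello."}],
        "temperature": 0.0,      # former default; omit this field to get the new default of 1
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```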
b2864
[SYCL] Add oneapi runtime dll files to win release package (#7241)
* add oneapi runtime dlls to release package
* fix path
* fix path
* fix path
* fix path
* fix path
---------
Co-authored-by: Zhang <[email protected]>
b2862
CUDA: add FP32 FlashAttention vector kernel (#7188)
* CUDA: add FP32 FlashAttention vector kernel
* fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
b2861
cmake : fix version cmp (#7227)
b2860
remove convert-lora-to-ggml.py (#7204)
b2859
metal : fix warnings (skipme) (#0)
b2854
fix system prompt handling (#7153)
b2852
sync : ggml
ggml-ci