Pull requests: ggml-org/llama.cpp
ggml-cpu: optimise s390x multiply extend instructions
ggml
changes relating to the ggml tensor library for machine learning
#20032
opened Mar 2, 2026 by
taronaeo
cann: support flash attention for head dim not multiple of 16
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#20031
opened Mar 2, 2026 by
noemotiovon
gguf-py: add type validation to GGUFWriter.add_key_value
python
python script changes
#20023
opened Mar 1, 2026 by
Scottcjn
json-schema: handle typeless schema nodes as any-value
testing
Everything test related
#20021
opened Mar 1, 2026 by
Scottcjn
feat: add --cache-only flag to skip model re-download
#20010
opened Mar 1, 2026 by
lonnie08
common : fix common_chat_peg_parse for incomplete utf-8 sequence tail
testing
Everything test related
#19992
opened Feb 28, 2026 by
akreal
build: fix various compiler warnings on Windows MinGW
examples
testing
Everything test related
#19990
opened Feb 28, 2026 by
jonathanjacksonswe
vulkan: tune MMVQ for Intel Windows
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#19988
opened Feb 28, 2026 by
0cc4m
ggml : add GGML_OP_ADD1 for metal
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#19987
opened Feb 28, 2026 by
aisk
Add file existence and type checks for imatrix
examples
#19974
opened Feb 28, 2026 by
geoffmunn
feat: add --threads-all option to llama-bench
examples
#19971
opened Feb 28, 2026 by
hobostay
4 tasks done
server: batch checkpoints to support kvcache context truncation
examples
server
#19970
opened Feb 28, 2026 by
aagit
Fix logic for retrieving schema items in json_schema_to_grammar.py
examples
python
python script changes
#19968
opened Feb 28, 2026 by
RayXu14
ggml webgpu: fix workgroup dispatch limit for large batch sizes
ggml
changes relating to the ggml tensor library for machine learning
#19965
opened Feb 28, 2026 by
abhijitramesh
Use fp32 in cuBLAS V100 to avoid overflows, env variables to override cuBLAS compute type
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19959
opened Feb 27, 2026 by
wallentri88
tools : enable kvu in perplexity for hellaswag, winogrande, multiple-choice
examples
#19954
opened Feb 27, 2026 by
angt
scripts : improve get-wikitext-2.sh
script
Script related
#19952
opened Feb 27, 2026 by
angt
[New quant] Q3_PT
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
testing
Everything test related