ggml-org / llama.cpp Public

Pull requests: ggml-org/llama.cpp

742 Open 9,129 Closed

Pull requests list

ggml-cpu: optimise s390x multiply extend instructions ggml

changes relating to the ggml tensor library for machine learning

#20032 opened Mar 2, 2026 by taronaeo

cann: support flash attention for head dim not multiple of 16 Ascend NPU

issues specific to Ascend NPUs

ggml

changes relating to the ggml tensor library for machine learning

#20031 opened Mar 2, 2026 by noemotiovon

gguf-py: add type validation to GGUFWriter.add_key_value python

python script changes

#20023 opened Mar 1, 2026 by Scottcjn

json-schema: handle typeless schema nodes as any-value testing

Everything test related

#20021 opened Mar 1, 2026 by Scottcjn

gguf: add big-endian magic "FUGG" for explicit endianness detection ggml

changes relating to the ggml tensor library for machine learning

python

python script changes

testing

Everything test related

#20019 opened Mar 1, 2026 by Scottcjn

vulkan: add UMA zero-copy async transfers and fix event_record deferred memcpy handling ggml

changes relating to the ggml tensor library for machine learning

testing

Everything test related

Vulkan

Issues specific to the Vulkan backend

#20018 opened Mar 1, 2026 by neilopet

vulkan: add sparse OOM fallback for large UMA allocations and chunked staging fallback ggml

changes relating to the ggml tensor library for machine learning

testing

Everything test related

Vulkan

Issues specific to the Vulkan backend

#20017 opened Mar 1, 2026 by neilopet

feat: add --cache-only flag to skip model re-download

#20010 opened Mar 1, 2026 by lonnie08

server: add Qwen3-Reranker instruction support examples python

python script changes

server

#20009 opened Mar 1, 2026 by schwebke

webui: add PWA support examples server

#19995 opened Feb 28, 2026 by matous-volf

common : fix common_chat_peg_parse for incomplete utf-8 sequence tail testing

Everything test related

#19992 opened Feb 28, 2026 by akreal

build: fix various compiler warnings on Windows MinGW examples testing

Everything test related

#19990 opened Feb 28, 2026 by jonathanjacksonswe

vulkan: tune MMVQ for Intel Windows ggml

changes relating to the ggml tensor library for machine learning

Vulkan

Issues specific to the Vulkan backend

#19988 opened Feb 28, 2026 by 0cc4m

ggml : add GGML_OP_ADD1 for metal Apple Metal

https://en.wikipedia.org/wiki/Metal_(API)

ggml

changes relating to the ggml tensor library for machine learning

#19987 opened Feb 28, 2026 by aisk

cli : add command and file auto-completion examples

#19985 opened Feb 28, 2026 by CISC

Re-enable manual LoRA adapter free

#19983 opened Feb 28, 2026 by PopFlamingo

Add file existence and type checks for imatrix examples

#19974 opened Feb 28, 2026 by geoffmunn

feat: add --threads-all option to llama-bench examples

#19971 opened Feb 28, 2026 by hobostay

4 tasks done

server: batch checkpoints to support kvcache context truncation examples server

#19970 opened Feb 28, 2026 by aagit

Fix logic for retrieving schema items in json_schema_to_grammar.py examples python

python script changes

#19968 opened Feb 28, 2026 by RayXu14

ggml webgpu: fix workgroup dispatch limit for large batch sizes ggml

changes relating to the ggml tensor library for machine learning

#19965 opened Feb 28, 2026 by abhijitramesh

Use fp32 in cuBLAS V100 to avoid overflows, env variables to override cuBLAS compute type documentation

Improvements or additions to documentation

ggml

changes relating to the ggml tensor library for machine learning

Nvidia GPU

Issues specific to Nvidia GPUs

#19959 opened Feb 27, 2026 by wallentri88

tools : enable kvu in perplexity for hellaswag, winogrande, multiple-choice examples

#19954 opened Feb 27, 2026 by angt

scripts : improve get-wikitext-2.sh script

Script related

#19952 opened Feb 27, 2026 by angt

[New quant] Q3_PT examples ggml

changes relating to the ggml tensor library for machine learning

Nvidia GPU

Issues specific to Nvidia GPUs

python

python script changes

testing

Everything test related

#19941 opened Feb 26, 2026 by pwilkin • Draft
