ggml-org / llama.cpp Public

Fork 14.5k
Star 93k

Pull requests: ggml-org/llama.cpp

Labels 88 Milestones 0


645 Open 8,565 Closed

Pull requests list

convert_hf_to_gguf.py: refactor modify_tensors to call super
Labels: python (python script changes)
#18866 opened Jan 15, 2026 by am17an

sampling : update outdated comment about has_sampled [no ci]
#18863 opened Jan 15, 2026 by danbev

sampling : add support for saving/loading backend sampling state
Labels: testing (Everything test related)
#18862 opened Jan 15, 2026 by danbev • Draft

wasm, tests: fix ctests with emscripten
Labels: build (Compilation issues), ggml (changes relating to the ggml tensor library for machine learning), testing (Everything test related)
#18861 opened Jan 15, 2026 by aviallon • Draft

ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm)
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18860 opened Jan 15, 2026 by Alcpz

ggml-cpu: add RVV vec dot kernels for quantization types
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18859 opened Jan 15, 2026 by rehan-10xengineer

ggml-cpu: add q4_0 repack support for wasm
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18858 opened Jan 15, 2026 by aviallon • Draft

enforce response_format and json_schema for Kimi K2
Labels: testing (Everything test related)
#18851 opened Jan 15, 2026 by akoumjian

Deepseek v3.2 dense attention support from @fairydreaming
Labels: python (python script changes)
#18849 opened Jan 14, 2026 by createthis

kv-cache : optimize KQ mask construction
#18842 opened Jan 14, 2026 by ggerganov • Draft

Changing default values of mmap and direct_io to false in llama-bench
Labels: examples
#18841 opened Jan 14, 2026 by JTischbein

# [RFC] Integrate sparse-ternary-fma for TQ2_0 quantization
Labels: ggml (changes relating to the ggml tensor library for machine learning), testing (Everything test related)
#18836 opened Jan 14, 2026 by HyperFoldUK

vulkan: Revert forced full subgroup for FlashAttention
Labels: ggml (changes relating to the ggml tensor library for machine learning), Vulkan (Issues specific to the Vulkan backend)
#18831 opened Jan 14, 2026 by rillomas • Draft

model: Add PaddleOCR-VL model support
Labels: examples, model (Model specific), python (python script changes)
#18825 opened Jan 14, 2026 by megemini

ggml-blas: hide warnings from included BLAS headers
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18818 opened Jan 13, 2026 by DaAwesomeP

ggml-backend: Separate dynamic lib install and search paths, add relative search
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18817 opened Jan 13, 2026 by DaAwesomeP

HIP: tune mmq/rocblas switching for RDNA4
Labels: ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (Issues specific to Nvidia GPUs)
#18816 opened Jan 13, 2026 by jiachengjason

sampling : remove sampling branching in output_reserve
#18811 opened Jan 13, 2026 by danbev

llama: fix integer type consistency in split helpers
#18798 opened Jan 13, 2026 by MaheshJakkala

CANN: fix an issue where get_env was not fully renamed
Labels: Ascend NPU (issues specific to Ascend NPUs), devops (improvements to build systems and github actions), ggml (changes relating to the ggml tensor library for machine learning)
#18796 opened Jan 13, 2026 by noemotiovon

Unified delta net handling for Qwen3Next and Kimi Linear models
Labels: model (Model specific)
#18792 opened Jan 12, 2026 by pwilkin

server: fix memory reservations in populate_token_probs
Labels: examples, server
#18787 opened Jan 12, 2026 by l-austenfeld

ggml-cpu: add RVV vec dot kernels for quantization types
Labels: ggml (changes relating to the ggml tensor library for machine learning)
#18784 opened Jan 12, 2026 by taimur-10x • Draft

webui : send both backend_sampling == false/true
Labels: examples, server
#18781 opened Jan 12, 2026 by ggerganov

vocab: add tokenizer support for jina-embeddings-v2-base-zh
Labels: python (python script changes)
#18756 opened Jan 11, 2026 by o7si • Draft
