Recommended models
smollm2-360m-instruct-q8_0.gguf
HF repo: ngxson/SmolLM2-360M-Instruct-Q8_0-GGUF
Size: 368.5 MB
llama-3.2-1b-instruct-q4_k_m.gguf
HF repo: hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
Size: 770.3 MB
qwen2-1_5b-instruct-q4_k_m-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 940.4 MB
smollm2-1.7b-instruct-q4_k_m.gguf
HF repo: ngxson/SmolLM2-1.7B-Instruct-Q4_K_M-GGUF
Size: 1006.7 MB
gemma-2-2b-it-abliterated-Q4_K_M-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 1.6 GB
neuralreyna-mini-1.8b-v0.3.q4_k_m-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 1.1 GB
Phi-3.1-mini-128k-instruct-Q3_K_M-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 1.8 GB
meta-llama-3.1-8b-instruct-abliterated.Q2_K-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 3.0 GB
Large model; it may fail to load on devices with limited RAM.
Meta-Llama-3.1-8B-Instruct-Q2_K-(shards).gguf
HF repo: ngxson/wllama-split-models
Size: 3.0 GB
Large model; it may fail to load on devices with limited RAM.
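The repos above can be loaded directly in the browser. Below is a minimal sketch of doing so with wllama, the library this model list appears to come from. The package name, wasm config paths, and the exact `loadModelFromUrl`/`createCompletion` signatures are assumptions based on wllama's published API and may differ between versions; the model URL is built from the repo and filename listed above using the standard Hugging Face `resolve` URL pattern.

```ts
// Sketch: loading a recommended model in the browser with wllama.
// Package name, config paths, and option names are assumptions;
// check them against the wllama version you actually install.
import { Wllama } from '@wllama/wllama';

// Assumed layout of the wasm binaries shipped with the package.
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '/esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': '/esm/multi-thread/wllama.wasm',
};

async function main(): Promise<void> {
  const wllama = new Wllama(CONFIG_PATHS);

  // Download and load the smallest recommended model from its HF repo.
  await wllama.loadModelFromUrl(
    'https://huggingface.co/ngxson/SmolLM2-360M-Instruct-Q8_0-GGUF/resolve/main/smollm2-360m-instruct-q8_0.gguf'
  );

  // Run a short completion with basic sampling parameters.
  const output = await wllama.createCompletion('Q: What is GGUF?\nA:', {
    nPredict: 64,
    sampling: { temp: 0.7, top_k: 40, top_p: 0.9 },
  });
  console.log(output);
}

main();
```

The "(shards)" entries are GGUF files pre-split into chunks (e.g. with llama.cpp's gguf-split tool) so that each chunk stays under the browser's 2 GB ArrayBuffer limit. With wllama, pointing `loadModelFromUrl` at the first shard should fetch the remaining shards automatically; this is assumed behavior, so verify it against the version you use.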