§01·compatibility · /check
Gemma 4 26B MoE on RTX 3090
✓ runs · llm · active · 30 series · 24GB VRAM
§02·benchmarks
| Task | Quant | Speed | VRAM | Works | Confidence | Source | Verified |
|---|---|---|---|---|---|---|---|
| llm | Q4_K | 3625.6 tokens/s (prefill) | 24GB | ✓ | | hardware-corner.net · web | 2026-05-15 |
| llm | Q4_K | 119.4 tokens/s | 24GB | ✓ | | hardware-corner.net · web | 2026-05-15 |
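
To put the two throughput figures in context, the sketch below converts them into a rough per-request latency. It assumes the row without a label (119.4 tokens/s) is the generation rate, which the table does not state. The function name `estimate_seconds` and the example token counts are hypothetical, chosen only for illustration. Real latency also depends on context length, batch size, and sampling settings, so treat the result as a ballpark.

```python
# Rates copied from the benchmark table (RTX 3090, Q4_K).
PREFILL_TOKENS_PER_S = 3625.6
# Assumption: the row without a "prefill" label is the generation (decode) rate.
GENERATION_TOKENS_PER_S = 119.4


def estimate_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Rough wall-clock estimate: prompt processing time plus generation time."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    prefill_s = prompt_tokens / PREFILL_TOKENS_PER_S
    generation_s = output_tokens / GENERATION_TOKENS_PER_S
    return prefill_s + generation_s


if __name__ == "__main__":
    # Hypothetical request: a 4,000-token prompt and a 500-token reply.
    print(f"{estimate_seconds(4000, 500):.2f} s")
```

At these rates, generation dominates the total for typical chat-sized requests, because the prefill rate is roughly 30 times the generation rate.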
§03·related recipes
- llm · intermediate · 29GB+
Gemma 4 26B A4B-it on RTX 5090: Q8_0 Quality Tier via ggml-org GGUF + llama.cpp
- llm · intermediate · 18GB+
Gemma 4 26B A4B-it on RTX 3090: Local Multimodal Chat via Q4_K_M GGUF + llama.cpp
- llm · intermediate · 18GB+
Gemma 4 26B A4B-it on RTX 4090: Local Multimodal Chat via Q4_K_M GGUF + llama.cpp