§01·model

Phi-4

llm · active · MIT

Phi-4 is Microsoft's dense 14-billion-parameter model, released in December 2024 as the flagship of the Phi-4 family. It is text-only, has a 16K-token context window, is built on the phi3 architecture, and was trained heavily on curated and synthetic data to deliver strong STEM, math and reasoning performance at a small size. It is licensed under MIT, so commercial use is permitted. Microsoft reports MMLU 84.8, GPQA 56.1, MATH 80.4, HumanEval 82.6 and MGSM 80.6, which is competitive with much larger models on reasoning benchmarks. Microsoft publishes a first-party GGUF (microsoft/phi-4-gguf); community builds from unsloth and bartowski add the usual ladder of K-quants. The model loads on current llama.cpp out of the box. Q4_K_M at ~8.9 GB fits a 12 GB card with room left for context; an 8 GB card cannot hold the file entirely in VRAM and needs partial CPU offload. Q6_K and Q8_0 are for cards with 16 GB or more. Early-2025 GGUFs had an EOS/chat-template bug that has since been fixed, so use a current llama.cpp build and a recent GGUF.
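The ~8.9 GB figure for Q4_K_M can be roughly reproduced from the parameter count and an average bits-per-weight value. The sketch below is only an estimate: the bits-per-weight numbers are approximate assumptions (real files vary with the tensor mix), the 14.7e9 parameter count is an assumed more precise value for the "14-billion" stated above, and `estimate_gguf_gb` is a hypothetical helper, not part of llama.cpp.

```python
# Approximate average bits per weight for a few llama.cpp quant types.
# ASSUMPTION: rough figures for illustration; actual GGUF files differ slightly.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
}

# ASSUMPTION: 14.7e9 parameters (the description rounds this to 14 billion).
PHI4_PARAMS = 14.7e9


def estimate_gguf_gb(n_params: float, quant: str) -> float:
    """Estimate the GGUF file size in decimal GB (1 GB = 1e9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9


if __name__ == "__main__":
    for quant in APPROX_BITS_PER_WEIGHT:
        size = estimate_gguf_gb(PHI4_PARAMS, quant)
        print(f"{quant}: ~{size:.1f} GB")
```

Under these assumptions the Q4_K_M estimate lands close to the ~8.9 GB quoted above. The estimate covers weights only; the KV cache and runtime buffers need additional memory on top.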

Download · 5 variants

huggingface.co ↗

§02·GPUs that run this model

8 total

GPU	VRAM	Series	Works	Recipe
Apple M2 Pro	16GB	apple	~	recipe
Apple M3 Max	48GB	apple	~	recipe
RTX 3090	24GB	30	~	recipe
RTX 4070	12GB	40	~	recipe
RTX 4080	16GB	40	~	recipe
RTX 4090	24GB	40	~	recipe
RTX 5090	32GB	50	~	recipe
RX 7800 XT	16GB	amd	~	recipe

✓ benchmarked · ~ runs via recipe (not benchmarked) · — untested · ✕ doesn't fit
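As a rough cross-check of the table, the sketch below compares each listed GPU's memory against the ~8.9 GB Q4_K_M file size quoted in the model description. The 1.5 GB headroom for KV cache and runtime buffers is an assumption chosen for illustration, `fits` is a hypothetical helper, and the 8 GB card is a made-up comparison point that is not in the table. It is not a benchmark and does not replace the per-GPU recipes.

```python
# Memory per GPU in GB, copied from the table above.
GPUS_GB = {
    "Apple M2 Pro": 16,
    "Apple M3 Max": 48,
    "RTX 3090": 24,
    "RTX 4070": 12,
    "RTX 4080": 16,
    "RTX 4090": 24,
    "RTX 5090": 32,
    "RX 7800 XT": 16,
}

Q4_K_M_GB = 8.9    # file size quoted in the model description
HEADROOM_GB = 1.5  # ASSUMPTION: allowance for KV cache and runtime buffers


def fits(vram_gb: float, model_gb: float, headroom_gb: float = HEADROOM_GB) -> bool:
    """Return True if the model plus headroom fits entirely in GPU memory."""
    return model_gb + headroom_gb <= vram_gb


if __name__ == "__main__":
    for name, vram in GPUS_GB.items():
        verdict = "fits" if fits(vram, Q4_K_M_GB) else "needs offload"
        print(f"{name} ({vram} GB): {verdict}")
    # Hypothetical 8 GB card, not listed in the table, for comparison.
    print("8 GB card:", "fits" if fits(8, Q4_K_M_GB) else "needs offload")
```

On Apple silicon the listed memory is shared with the operating system, so the check is optimistic there. A longer context window also grows the KV cache, which may call for more headroom than the value assumed here.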