§01·model

Mistral Small 3.2 24B

llm · active · Apache-2.0

Mistral Small 3.2 24B (release 2506) is Mistral AI's dense, 24-billion-parameter, instruction-tuned model. It is the newest generalist in the Small line and supersedes 3.1 (2503). Built on the Mistral 3 architecture with a Pixtral vision tower, it accepts text and images and answers in text, with a 128K-token context window and the Tekken tokenizer. It is licensed under Apache-2.0, which permits commercial use. Mistral reports gains over 3.1 in instruction following and function calling, along with less repetition (Wildbench v2 65.3%, Arena Hard v2 43.1%), plus strong STEM and coding scores (MMLU Pro 69.1%, MATH 69.4%, HumanEval Plus 92.9%) and coverage of 23 languages. Mistral does not ship a first-party GGUF; the community GGUF builds from bartowski and unsloth run on current llama.cpp out of the box (Q4_K_M, at about 14.3 GB, fits a 16 GB card; Q6_K and Q8_0 suit cards with 24 GB or more). Vision input also requires the separate mmproj projector file.
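As a rough illustration of the last point, the sketch below assembles a llama.cpp `llama-server` command line that loads a GGUF file and, optionally, the mmproj projector for vision input. The file names, the context size, and the GPU layer count are placeholders chosen for the example, not values from this page; check the flags against the llama.cpp build you actually run.

```python
import shlex


def llama_server_command(model_path, mmproj_path=None, ctx_size=8192, gpu_layers=99):
    """Build an argv list for llama.cpp's llama-server.

    ctx_size and gpu_layers are illustrative defaults (assumptions),
    not recommendations taken from this page.
    """
    argv = [
        "llama-server",
        "--model", model_path,
        "--ctx-size", str(ctx_size),
        "--n-gpu-layers", str(gpu_layers),
    ]
    # Vision input needs the separate projector file alongside the main GGUF.
    if mmproj_path is not None:
        argv += ["--mmproj", mmproj_path]
    return argv


if __name__ == "__main__":
    # Hypothetical local file names; substitute the files you downloaded.
    cmd = llama_server_command("mistral-small-3.2-Q4_K_M.gguf", "mmproj.gguf")
    print(shlex.join(cmd))
```

Leaving `mmproj_path` unset gives a text-only command, which is enough when image input is not needed.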

§02·GPUs that run this model
8 total
GPU           VRAM  Series  Best speed  Min VRAM  Works  Benchmarks  Recipe
Apple M2 Max  64GB  apple   -           -         ~      0           recipe
Apple M3 Max  48GB  apple   -           -         ~      0           recipe
RTX 3090      24GB  30      -           -         ~      0           recipe
RTX 3090 Ti   24GB  30      -           -         ~      0           recipe
RTX 4080      16GB  40      -           -         ~      0           recipe
RTX 4090      24GB  40      -           -         ~      0           recipe
RTX 5090      32GB  50      -           -         ~      0           recipe
RX 7900 XTX   24GB  amd     -           -         ~      0           recipe

benchmarked · ~ runs via recipe (not benchmarked) · untested · doesn't fit
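
The "fits a 16 GB card" claim for Q4_K_M can be checked with simple arithmetic against the VRAM column of the GPU table. The sketch below is a minimal version of that check. The 1.5 GB headroom for KV cache and runtime buffers is an assumption made for the example, not a measured figure, and Apple unified memory is treated as VRAM exactly as the table lists it.

```python
# VRAM per card in GB, copied from the GPU table on this page.
GPU_VRAM_GB = {
    "Apple M2 Max": 64,
    "Apple M3 Max": 48,
    "RTX 3090": 24,
    "RTX 3090 Ti": 24,
    "RTX 4080": 16,
    "RTX 4090": 24,
    "RTX 5090": 32,
    "RX 7900 XTX": 24,
}

Q4_K_M_GB = 14.3  # approximate file size quoted in the model description


def fits(model_gb, vram_gb, headroom_gb=1.5):
    """Return True when weights plus the assumed headroom fit in VRAM.

    headroom_gb is an assumed allowance for KV cache and buffers;
    real usage grows with context length.
    """
    return model_gb + headroom_gb <= vram_gb


def cards_that_fit(model_gb, headroom_gb=1.5):
    """List the cards from the table that can hold a model of this size."""
    return sorted(
        name for name, vram in GPU_VRAM_GB.items()
        if fits(model_gb, vram, headroom_gb)
    )


if __name__ == "__main__":
    print(cards_that_fit(Q4_K_M_GB))
```

With these assumptions every listed card holds the Q4_K_M build, with the 16 GB RTX 4080 being the tightest fit. Raising `headroom_gb` to reflect a longer context is the quickest way to see which cards drop out.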