self-hosted/ai
§01·model · /models

Ornith 1.0 9B

llm · active · MIT

The small-rig member of DeepReinforce's open (MIT-licensed) Ornith 1.0 agentic-coding family: a dense model of roughly 9B parameters (Qwen3.5 + Gemma 4 lineage) with a 262K-token context window, <think> reasoning, and tool calling. It scores 69.4 on SWE-bench Verified. It runs locally from GGUF files via llama.cpp or Ollama; quantized files range from Q4_K_M at ~5.63 GB (fits an 8 GB card) up to Q8_0 at ~9.53 GB. It is the companion to Ornith 1.0 35B for cards below the 35B's 24 GB floor.
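As a rough illustration of the sizing claim above, the sketch below picks the largest quoted quantization whose file fits a given card. Only the two file sizes come from the description; the function name and the 1.5 GB headroom reserved for the KV cache and runtime buffers are assumptions for illustration, and real headroom grows with the context length you configure.

```python
# GGUF file sizes in GB, as quoted in the description (other variants not listed there).
QUANT_SIZES_GB = {"Q4_K_M": 5.63, "Q8_0": 9.53}


def largest_fitting_quant(vram_gb, headroom_gb=1.5):
    """Return the largest quoted quant whose file fits in VRAM, or None.

    headroom_gb is an assumed reserve for KV cache and runtime buffers,
    not a measured figure; raise it for long contexts.
    """
    budget = vram_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Pick the fitting quant with the largest file, i.e. the least aggressive quantization.
    return max(fitting, key=fitting.get)


# The three VRAM tiers that appear in the GPU table.
for vram in (8, 12, 16):
    print(f"{vram} GB card -> {largest_fitting_quant(vram)}")
```

Under these assumptions an 8 GB card lands on Q4_K_M, while 12 GB and 16 GB cards have room for Q8_0. Treat the result as a starting point and confirm against actual memory use at your chosen context size.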

Download · 5 variants
§02·GPUs that run this model
19 total
GPU               | VRAM | Series | Best speed | Min VRAM | Works | Benchmarks | Recipe
Apple M2 Pro      | 16GB | apple  | -          | -        | ~     | 0          | recipe
RTX 3060          | 12GB | 30     | -          | -        | ~     | 0          | recipe
RTX 3060 Ti       | 8GB  | 30     | -          | -        | ~     | 0          | recipe
RTX 3080 Ti       | 12GB | 30     | -          | -        | ~     | 0          | recipe
RTX 4060          | 8GB  | 40     | -          | -        | ~     | 0          | recipe
RTX 4060 Ti 16GB  | 16GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4060 Ti 8GB   | 8GB  | 40     | -          | -        | ~     | 0          | recipe
RTX 4070          | 12GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4070 Super    | 12GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4070 Ti       | 12GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4070 Ti Super | 16GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4080          | 16GB | 40     | -          | -        | ~     | 0          | recipe
RTX 4080 Super    | 16GB | 40     | -          | -        | ~     | 0          | recipe
RTX 5060          | 8GB  | 50     | -          | -        | ~     | 0          | recipe
RTX 5060 Ti       | 16GB | 50     | -          | -        | ~     | 0          | recipe
RTX 5070          | 12GB | 50     | -          | -        | ~     | 0          | recipe
RTX 5070 Ti       | 16GB | 50     | -          | -        | ~     | 0          | recipe
RTX 5080          | 16GB | 50     | -          | -        | ~     | 0          | recipe
RX 7800 XT        | 16GB | amd    | -          | -        | ~     | 0          | recipe

Legend: benchmarked · ~ runs via recipe (not benchmarked) · untested · doesn't fit