§01·model
Ornith 1.0 35B
llm · active · MIT
DeepReinforce's open agentic-coding model: a 35B-parameter Mixture-of-Experts post-trained on top of Gemma 4 + Qwen 3.5 with a self-scaffolding RL method, in which the model learns to generate its own task harnesses during training. It offers a 262K-token context window, <think> reasoning, and native tool-calling for agentic dev workflows. It scores 75.6 on SWE-bench Verified and 64.2 on Terminal-Bench 2.1, state-of-the-art among similarly sized open models. It runs locally via llama.cpp or Ollama from GGUF quants. Q4_K_M is ~21.2 GB, which fits a 24 GB card but leaves little room for context, since an MoE keeps all experts resident in VRAM. For 8-16 GB cards, use the 9B build.
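The "fits a 24 GB card but tight on context" point can be made concrete with a rough headroom estimate. The sketch below is illustrative only: the 21.2 GB weight size and the card sizes come from this page, while the overhead figure and the per-token KV-cache cost are placeholder assumptions, not measured values for this model. The function name `context_budget_tokens` is hypothetical.

```python
# Rough VRAM headroom estimate for a GGUF quant.
# From the page: Q4_K_M weights are ~21.2 GB; cards have 24-64 GB.
# ASSUMED (not from the page): runtime overhead and KV-cache bytes per token.

def context_budget_tokens(vram_gb, weights_gb, overhead_gb, kv_bytes_per_token):
    """Return roughly how many tokens of KV cache fit after weights and overhead."""
    free_gb = vram_gb - weights_gb - overhead_gb
    if free_gb <= 0:
        return 0  # weights plus overhead already exceed the card
    return int(free_gb * 1e9 // kv_bytes_per_token)

WEIGHTS_GB = 21.2            # Q4_K_M size quoted in the description
OVERHEAD_GB = 1.0            # assumption: compute buffers, driver, display
KV_BYTES_PER_TOKEN = 40_000  # assumption: placeholder, varies with architecture

for vram_gb in (24, 32, 48, 64):
    tokens = context_budget_tokens(vram_gb, WEIGHTS_GB, OVERHEAD_GB, KV_BYTES_PER_TOKEN)
    print(f"{vram_gb} GB card: room for roughly {tokens:,} tokens of KV cache")
```

Under these assumptions a 24 GB card has only a small fraction of the 262K window available, which is why the larger-memory entries in the table are the more comfortable choice for long contexts. To get a real figure, replace the placeholder per-token cost with the KV-cache size your runtime reports when it loads the model.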
Download · 5 variants
§02·GPUs that run this model
8 total

| GPU | VRAM | Series | Best speed | Min VRAM | Works | Benchmarks | Recipe | |
|---|---|---|---|---|---|---|---|---|
| Apple M2 Max | 64 GB | apple | | | ~ | 0 | recipe | check ↗ |
| Apple M3 Max | 48 GB | apple | | | ~ | 0 | recipe | check ↗ |
| Apple M4 Max | 48 GB | apple | | | ~ | 0 | recipe | check ↗ |
| RTX 3090 | 24 GB | 30 | | | ~ | 0 | recipe | check ↗ |
| RTX 3090 Ti | 24 GB | 30 | | | ~ | 0 | recipe | check ↗ |
| RTX 4090 | 24 GB | 40 | | | ~ | 0 | recipe | check ↗ |
| RTX 5090 | 32 GB | 50 | | | ~ | 0 | recipe | check ↗ |
| RX 7900 XTX | 24 GB | amd | | | ~ | 0 | recipe | check ↗ |
✓ benchmarked · ~ runs via recipe (not benchmarked) · — untested · ✕ doesn't fit