Laguna XS 2.1
Poolside's Laguna XS 2.1 is a 33B-parameter Mixture-of-Experts LLM specialized for agentic coding and long-horizon software-engineering work. It routes 8 of 256 experts (plus a shared expert) per token — roughly 3B active parameters — but keeps all experts resident, so its memory footprint equals the full quantized weight file (Q4_K_M is 18.9 GiB, the smallest published GGUF, making it a 24 GB-class card model). It supports a 262,144-token (256K) context via YaRN scaling for whole-repository tasks. Vendor evals report SWE-bench Verified 70.9%, SWE-bench Multilingual 63.1%, SWE-Bench Pro (public) 47.6%, and Terminal-Bench 2.0 37.5%. Released under the permissive OpenMDW-1.1 license, with first-party GGUF quants (Q4_K_M, BF16) published by the same poolside org. Runs as a standard GGUF coding LLM under llama.cpp or Ollama, driven by an agent client (OpenHands / Aider) over an OpenAI-compatible API.
| GPU | VRAM | Series | Best speed | Min VRAM | Works | Benchmarks | Recipe | |
|---|---|---|---|---|---|---|---|---|
| Apple M2 Max | 64GB | apple | ~ | 0 | recipe | check ↗ | ||
| Apple M3 Max | 48GB | apple | ~ | 0 | recipe | check ↗ | ||
| RTX 3090 | 24GB | 30 | ~ | 0 | recipe | check ↗ | ||
| RTX 3090 Ti | 24GB | 30 | ~ | 0 | recipe | check ↗ | ||
| RTX 4090 | 24GB | 40 | ~ | 0 | recipe | check ↗ | ||
| RTX 5090 | 32GB | 50 | ~ | 0 | recipe | check ↗ | ||
| RX 7900 XTX | 24GB | amd | ~ | 0 | recipe | check ↗ |
✓ benchmarked·~ runs via recipe (not benchmarked)·— untested·✕doesn't fit