§01·spec · /gpus
RTX 3090
nvidia30 series24GB VRAM
§02·models that run on this GPU
17 total| Model | Vertical | Best speed | Min VRAM | Works | Benchmarks | |
|---|---|---|---|---|---|---|
| Qwen3 30B-A3B | llm | 153.6tokens/s | 24GB | ✓ | 2 | check ↗ |
| gpt-oss 20B | llm | 147.5tokens/s | 24GB | ✓ | 2 | check ↗ |
| Gemma 4 26B MoE | llm | 119.4tokens/s | 24GB | ✓ | 2 | check ↗ |
| Qwen3-8B | llm | 115.3tokens/s | 24GB | ✓ | 2 | check ↗ |
| Qwen3.5 35B | llm | 111.2tokens/s | 24GB | ✓ | 2 | check ↗ |
| Llama 3.1 7B | llm | 95tok/s | 24GB | ✓ | 1 | check ↗ |
| Qwen3 14B | llm | 70tokens/s | 24GB | ✓ | 2 | check ↗ |
| Llama 3.1 13B | llm | 55tok/s | 24GB | ✓ | 1 | check ↗ |
| Flux.1 Dev | image | 40s | 24GB | ✓ | 3 | check ↗ |
| Qwen3 32B | llm | 35.1tokens/s | 24GB | ✓ | 2 | check ↗ |
| Gemma4 31B | llm | 34.7tokens/s | 24GB | ✓ | 2 | check ↗ |
| Qwen3.5 27B | llm | 33.5tokens/s | 24GB | ✓ | 2 | check ↗ |
| Llama 3.1 34B | llm | 28tok/s | 24GB | ✓ | 1 | check ↗ |
| Llama 3.1 70B | llm | 10tok/s | 24GB | ✓ | 1 | check ↗ |
| Wan 2.2 14B | video | 7 | 24GB | ✓ | 1 | check ↗ |
| Stable Diffusion XL | image | 5.6s | 24GB | ✓ | 2 | check ↗ |
| Kokoro TTS | tts | 96GB | ✓ | 1 | check ↗ |
§03·tested recipes
showing 6 of 20- multimodalbeginner20GB+recipe
Gemma 4 E4B on RTX 3090: Multimodal Inference via BF16 (with 16 GB of Headroom to Spend)
- 3dintermediate12GB+recipe
Waypoint 1.5 on RTX 3090: Real-Time Interactive World Model at 720p, 30 FPS
- imagebeginner13GB+recipe
Flux.2 Klein 4B on RTX 3090: BFL-Recommended ~13 GB CPU-Offload Path for 4-Step Text-to-Image
- imageintermediate13GB+recipe
Qwen-Image on RTX 3090: 20B Text-to-Image via ComfyUI GGUF (Ampere Path — No FP8)
- videointermediate24GB+recipe
HunyuanVideo-1.5 on RTX 3090: 480p Step-Distilled Image-to-Video on a Razor-Thin 24 GB Envelope
- videointermediate24GB+recipe
Wan 2.2 T2V-A14B on RTX 3090: 720p text-to-video in ComfyUI with FP8 weights (Ampere)