Recipes
716 community-tested setups for running open-weights AI models on real consumer GPUs.
Page 6 of 8
- image · intermediate · 12GB+
ERNIE-Image-Turbo on RTX 4080: 8-step text-to-image via GGUF in ComfyUI
- image · beginner · 7GB+
Anima 2B on RTX 4080: Native ComfyUI Anime Text-to-Image
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 4080: BFL-Recommended ~13 GB CPU-Offload Path for 4-Step Text-to-Image
- image · intermediate · 12GB+
SenseNova U1 (8B-MoT) on RTX 4080: VAE-Free Unified Image Gen + Understanding via Q4 GGUF
- image · intermediate · 16GB+
LongCat-Image (base T2I) on RTX 4080: Bilingual 6B Text-to-Image at 16 GB via ComfyUI GGUF
- image · intermediate · 13GB+
Qwen-Image on RTX 4080: 20B Text-to-Image via ComfyUI GGUF (Ada sm_89, 16 GB)
- image · intermediate · 16GB+
Chroma1-Base (V48) on RTX 4080: Uncensored 8.9B FLUX.1-Schnell De-Distillation via FP8-Scaled in ComfyUI
- image · intermediate · 10GB+
HiDream-O1-Image on RTX 4080: 2048×2048 Text-to-Image with FP8 in ComfyUI
- image · intermediate · 16GB+
Juggernaut Z on RTX 4080: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16 via Diffusers or ComfyUI
- image · intermediate · 16GB+
Z-Image Turbo on RTX 4080: 8-Step 1024x1024 Text-to-Image at BF16 with Diffusers or ComfyUI
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 4080: Local Chat via Ollama or llama.cpp + Unsloth UD-Q4_K_XL GGUF
- llm · beginner · 16GB+
gpt-oss 20B on RTX 4080: MXFP4 chat at 136 tok/s via Ollama or vLLM
- llm · beginner · 16GB+
Qwen3-14B on RTX 4080: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 6GB+
Qwen3-8B on RTX 4080: Q4_K_M GGUF via Ollama or llama.cpp
- specialized · intermediate · 3GB+
KiMoDo on RTX 5080: Text-to-3D-Motion Generation Guide
- specialized · beginner · 4GB+
SAM 3 on RTX 5080: Promptable Image and Video Segmentation
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 5080: Local Chat via Ollama or llama.cpp + Unsloth UD-Q4_K_XL GGUF
- multimodal · beginner · 6GB+
Gemma 4 E4B on RTX 5080: Multimodal Inference via Q4_K_M GGUF (llama.cpp or Ollama — BF16 will not fit comfortably)
- multimodal · intermediate · 4GB+
MiniMind-O on RTX 5080: 0.1B Omni Model with Headroom to Spare
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 5080: Image-to-3D Mesh Generation at the 16 GB Floor
- 3d · intermediate · 10GB+
Hunyuan3D-2.1 on RTX 5080: Image-to-Mesh 3D Generation (Shape-Only)
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 5080: Real-Time Interactive World Model at 720p
- tts · intermediate · 12GB+
ACE-Step 1.5 XL on RTX 5080: Text-to-Music Generation in ComfyUI
- tts · intermediate · 12GB+
MOSS-Audio 4B-Instruct on RTX 5080: local audio understanding in ~12 GB
- tts · intermediate · 8GB+
Foundation-1 on RTX 5080: Structured Music Sample Generation
- tts · intermediate · 10GB+
Voxtral Mini 3B on RTX 5080: local speech understanding in ~9.5 GB
- tts · intermediate · 8GB+
Qwen3-TTS 1.7B-Base on RTX 5080: Multilingual Voice Cloning in 10 Languages
- tts · intermediate · 4GB+
OmniVoice on RTX 5080: Zero-Shot Voice Cloning Across 646 Languages
- tts · beginner · 8GB+
VoxCPM2 on RTX 5080: 30-Language 48kHz Voice Cloning in ~8 GB VRAM
- tts · intermediate · 5GB+
OpenAudio S1 Mini on RTX 5080: 13-Language Distilled TTS in ~5 GB VRAM
- tts · beginner · 5GB+
VoxCPM-0.5B on RTX 5080: Zero-Shot Voice Cloning TTS in ~5 GB VRAM
- tts · beginner · 2GB+
Kokoro TTS on RTX 5080: 82M-Parameter Text-to-Speech, 54 Voices, 13 GB Free to Colocate a Second Model
- video · intermediate · 8GB+
Wan 2.2 TI2V-5B on RTX 5080: 720p Text/Image-to-Video in ComfyUI
- video · intermediate · 8GB+
LightX2V on RTX 5080: 4-Step Text-to-Video with Distilled Wan2.1-14B via Blackwell-Native FP8
- video · advanced · 16GB+
Sulphur 2 on RTX 5080: Uncensored LTX-2.3 Video via ComfyUI GGUF
- video · advanced · 16GB+
LTX-2.3 on RTX 5080: 22B Audio-Video at the 16 GB Floor via GGUF + CPU-Offloaded Gemma
- image · intermediate · 16GB+
LongCat-Image (base T2I) on RTX 5080: Bilingual 6B Text-to-Image at 16 GB via ComfyUI GGUF
- image · intermediate · 12GB+
SenseNova U1 (8B-MoT) on RTX 5080: VAE-Free Unified Image Gen + Understanding via Q4 GGUF
- image · intermediate · 13GB+
Qwen-Image on RTX 5080: 20B Text-to-Image via ComfyUI GGUF (Blackwell sm_120, 16 GB)
- image · intermediate · 12GB+
ERNIE-Image-Turbo on RTX 5080: 8-step text-to-image via GGUF in ComfyUI
- image · intermediate · 14GB+
Chroma1-Base (V48) on RTX 5080: Uncensored 8.9B FLUX.1-Schnell De-Distillation via Blackwell-Native FP8 in ComfyUI
- image · intermediate · 10GB+
HiDream-O1-Image on RTX 5080: 2048×2048 Text-to-Image with FP8 in ComfyUI
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 5080: Blackwell-Native FP8 4-Step Text-to-Image at ~8.4 GB
- image · intermediate · 16GB+
Juggernaut Z on RTX 5080: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16
- image · beginner · 7GB+
Anima 2B on RTX 5080: Native ComfyUI Anime Text-to-Image
- image · intermediate · 16GB+
Z-Image Turbo on RTX 5080: 8-Step 1024x1024 Text-to-Image at BF16 with Diffusers or ComfyUI
- llm · beginner · 16GB+
gpt-oss 20B on RTX 5080: MXFP4 Chat at 172 tok/s via Ollama or vLLM
- llm · beginner · 16GB+
Qwen3-14B on RTX 5080: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 6GB+
Qwen3-8B on RTX 5080: Q4_K_M GGUF via Ollama or llama.cpp
- image · intermediate · 20GB+
HiDream-O1-Image Full BF16 on RTX 3090 Ti: 2048×2048 Text-to-Image in ComfyUI
- image · intermediate · 18GB+
LongCat-Image (base T2I) on RTX 3090 Ti: Bilingual 6B Text-to-Image via diffusers BF16 + CPU Offload
- video · intermediate · 14GB+
LightX2V on RTX 3090 Ti: 4-Step Text-to-Video with Distilled Wan2.1-14B via INT8 / BF16 Offload
- video · intermediate · 14GB+
CogVideoX 1.5 5B on RTX 3090 Ti: 1360x768 Text-to-Video with Diffusers
- video · intermediate · 24GB+
Wan 2.2 TI2V-5B on RTX 3090 Ti: 720p Text/Image-to-Video in ComfyUI
- video · intermediate · 24GB+
Wan 2.2 T2V-A14B on RTX 3090 Ti: 720p text-to-video in ComfyUI with FP8 weights (Ampere)
- video · intermediate · 22GB+
Mochi 1 on RTX 3090 Ti: 85-frame 480p Text-to-Video with Diffusers
- video · intermediate · 24GB+
HunyuanVideo-1.5 on RTX 3090 Ti: 480p Step-Distilled Image-to-Video on the Same Razor-Thin 24 GB Envelope
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 3090 Ti: BFL-Recommended ~13 GB CPU-Offload Path for 4-Step Text-to-Image
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 3090 Ti: Real-Time Interactive World Model at 720p, ~32 FPS
- multimodal · beginner · 20GB+
Gemma 4 E4B on RTX 3090 Ti: Multimodal Inference via BF16 (with 16 GB of Headroom to Spend)
- llm · intermediate · 18GB+
Gemma 4 26B A4B-it on RTX 3090 Ti: Local Multimodal Chat via Q4_K_M GGUF + llama.cpp
- llm · beginner · 12GB+
DeepSeek-R1-Distill-Qwen-14B on RTX 3090 Ti via Ollama Q4_K_M GGUF
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 3090 Ti: Local Chat via llama.cpp + Unsloth UD-Q4_K_XL GGUF
- image · intermediate · 24GB+
Chroma1-Base (V48) on RTX 3090 Ti: Uncensored 8.9B FLUX.1-Schnell De-Distillation via Diffusers BF16
- image · intermediate · 13GB+
Qwen-Image on RTX 3090 Ti: 20B Text-to-Image via ComfyUI GGUF (Ampere Path — No FP8)
- image · intermediate · 16GB+
Juggernaut Z on RTX 3090 Ti: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16 via Diffusers or ComfyUI
- llm · intermediate · 22GB+
Qwen3-32B on RTX 3090 Ti: UD-Q4_K_XL GGUF via llama.cpp
- llm · beginner · 10GB+
Qwen3-14B on RTX 3090 Ti: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 16GB+
gpt-oss 20B on RTX 3090 Ti: MXFP4 Chat at 160 tok/s via Ollama or vLLM
- image · beginner · 16GB+
Z-Image Turbo on RTX 3090 Ti: 8-Step 1024x1024 Text-to-Image at BF16 in ~6.7s with Diffusers or ComfyUI
- llm · beginner · 6GB+
Qwen3-8B on RTX 3090 Ti: Q4_K_M GGUF with 18 GB of Headroom for Colocation or Long Context
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 5060 Ti: Local Chat via Ollama or llama.cpp + Unsloth UD-Q4_K_XL GGUF
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 5060 Ti: Image-to-3D Mesh Generation at the 16 GB Floor
- video · advanced · 31GB+
Sulphur 2 on RTX 5090: Uncensored LTX-2.3 Video at fp8mixed, Native
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 5090: Local Chat via llama.cpp + Unsloth UD-Q4_K_XL GGUF
- 3d · advanced · 24GB+
TRELLIS.2-4B on RTX 5090: First Consumer Card That Hits the 24 GB Floor for Image-to-3D
- 3d · intermediate · 29GB+
Hunyuan3D-2.1 on RTX 5090: Image-to-Mesh + PBR Texture in One Pass
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 5090: Image-to-3D Mesh Generation with the Blackwell Build Path
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 5090: Real-Time Interactive World Model at 720p, 72 FPS
- image · intermediate · 24GB+
ERNIE-Image-Turbo on RTX 5090: 8-Step Text-to-Image at BF16 in ComfyUI
- video · intermediate · 28GB+
LTX-2.3 on RTX 5090: First Card That Hits the 32 GB Floor
- tts · beginner · 3GB+
Kokoro TTS on RTX 5090: 82M-Parameter Text-to-Speech, 54 Voices, 30 GB Free for a Multi-Model Server
- multimodal · beginner · 20GB+
Gemma 4 E4B on RTX 5090: Multimodal Inference via BF16 (with 24 GB of Headroom to Spend)
- image · intermediate · 21GB+
Qwen-Image on RTX 5090: 20B Text-to-Image via ComfyUI FP8 (Blackwell Native Path)
- image · beginner · 9GB+
Flux.2 Klein 4B on RTX 5090: FP8 1.2-Second Generation, Blackwell-Native Speed Win
- image · intermediate · 18GB+
LongCat-Image (base T2I) on RTX 5090: Bilingual 6B Text-to-Image via diffusers BF16 with 14 GB Headroom
- image · intermediate · 24GB+
Chroma1-Base (V48) on RTX 5090: Uncensored 8.9B FLUX.1-Schnell De-Distillation via Diffusers BF16
- image · intermediate · 17GB+
HiDream-O1-Image on RTX 5090: 2048×2048 Text-to-Image with MXFP8 Blackwell-Native Acceleration in ComfyUI
- llm · intermediate · 29GB+
Qwen3-32B on RTX 5090: Q6_K_XL GGUF via llama.cpp (with AWQ-INT4 + 128K context alternative)
- llm · intermediate · 29GB+
Gemma 4 26B A4B-it on RTX 5090: Q8_0 Quality Tier via ggml-org GGUF + llama.cpp
- llm · beginner · 10GB+
DeepSeek-R1-Distill-Qwen-14B on RTX 5090: 128K Reasoning Context Unlocked
- llm · intermediate · 17GB+
Qwen3-14B on RTX 5090: FP8 via vLLM with Native Blackwell Acceleration
- llm · beginner · 6GB+
Qwen3-8B on RTX 5090: Q4_K_M GGUF with 26 GB of Headroom for Colocation, BF16, or Full 131K Context
- llm · beginner · 16GB+
gpt-oss 20B on RTX 5090: MXFP4 Chat at 298 tok/s via Ollama or vLLM
- video · intermediate · 14GB+
HunyuanVideo-1.5 on RTX 5090: 480p Step-Distilled I2V with an 8 GB Headroom Unlock
- video · intermediate · 22GB+
LightX2V on RTX 5090: 4-Step Text-to-Video with Distilled Wan2.1-14B (Blackwell-Native FP8 + Future NVFP4 Path)
- video · intermediate · 24GB+
Wan 2.2 T2V-A14B on RTX 5090: 720p text-to-video in ComfyUI with FP8 scaled weights (Blackwell)
- video · intermediate · 20GB+
CogVideoX 1.5 5B on RTX 5090: 1360x768 Text-to-Video, No-Sequential-Offload Path
- video · intermediate · 22GB+
Mochi 1 on RTX 5090: 85-frame 480p Text-to-Video with Diffusers
- video · intermediate · 8GB+
Wan 2.2 TI2V-5B on RTX 5090: 720p Text/Image-to-Video in ComfyUI