Recipes
716 community-tested setups for running open-weights AI models on real consumer GPUs.
- image · beginner · 7GB+
Anima 2B on RTX 5070 Ti: Native ComfyUI Anime Text-to-Image
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 5070 Ti: Blackwell-Native FP8 4-Step Text-to-Image at ~8.4 GB
- image · intermediate · 12GB+
SenseNova U1 (8B-MoT) on RTX 5070 Ti: VAE-Free Unified Image Gen + Understanding via Q4 GGUF
- image · intermediate · 16GB+
LongCat-Image (base T2I) on RTX 5070 Ti: Bilingual 6B Text-to-Image at 16 GB via ComfyUI GGUF
- image · intermediate · 13GB+
Qwen-Image on RTX 5070 Ti: 20B Text-to-Image via ComfyUI GGUF (Blackwell sm_120, 16 GB)
- image · intermediate · 14GB+
Chroma1-Base (V48) on RTX 5070 Ti: Uncensored 8.9B FLUX.1-Schnell De-Distillation via Blackwell-Native FP8 in ComfyUI
- image · intermediate · 10GB+
HiDream-O1-Image on RTX 5070 Ti: 2048×2048 Text-to-Image with FP8 in ComfyUI
- image · intermediate · 16GB+
Juggernaut Z on RTX 5070 Ti: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16
- image · intermediate · 16GB+
Z-Image Turbo on RTX 5070 Ti: 8-Step 1024x1024 Text-to-Image at BF16 with Diffusers or ComfyUI
- llm · beginner · 16GB+
gpt-oss 20B on RTX 5070 Ti: MXFP4 Chat at 156 tok/s via Ollama or vLLM
- llm · beginner · 16GB+
Qwen3-14B on RTX 5070 Ti: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 16GB+
Qwen3-8B on RTX 5070 Ti: Q4_K_M GGUF via Ollama or llama.cpp
- specialized · intermediate · 3GB+
KiMoDo on RTX 4080 SUPER: Text-to-3D-Motion Generation Guide
- specialized · beginner · 4GB+
SAM 3 on RTX 4080 SUPER: Promptable Image and Video Segmentation
- multimodal · intermediate · 4GB+
MiniMind-O on RTX 4080 SUPER: 0.1B Omni Model with Headroom to Spare
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 4080 SUPER: Image-to-3D Mesh Generation at the 16 GB Floor
- 3d · intermediate · 10GB+
Hunyuan3D-2.1 on RTX 4080 SUPER: Image-to-Mesh 3D Generation (Shape-Only)
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 4080 SUPER: Real-Time Interactive World Model at 720p
- tts · intermediate · 12GB+
ACE-Step 1.5 XL on RTX 4080 SUPER: Text-to-Music Generation in ComfyUI
- tts · intermediate · 8GB+
Qwen3-TTS 1.7B-Base on RTX 4080 SUPER: Multilingual Voice Cloning in 10 Languages with FlashAttention-2
- tts · intermediate · 12GB+
MOSS-Audio 4B-Instruct on RTX 4080 SUPER: local audio understanding in ~12 GB
- tts · intermediate · 8GB+
Foundation-1 on RTX 4080 SUPER: Structured Music Sample Generation
- tts · intermediate · 10GB+
Voxtral Mini 3B on RTX 4080 SUPER: local speech understanding in ~9.5 GB
- tts · intermediate · 4GB+
OmniVoice on RTX 4080 SUPER: Zero-Shot Voice Cloning Across 600+ Languages with Room to Spare
- tts · beginner · 8GB+
VoxCPM2 on RTX 4080 SUPER: 30-Language 48kHz Voice Cloning with 8 GB to Spare
- tts · intermediate · 5GB+
OpenAudio S1 Mini on RTX 4080 SUPER: 13-Language Distilled TTS in ~5 GB VRAM
- tts · beginner · 5GB+
VoxCPM-0.5B on RTX 4080 SUPER: Zero-Shot Voice Cloning TTS in ~5 GB VRAM
- tts · beginner · 1GB+
Kokoro TTS on RTX 4080 SUPER: Universal 82M Voice Synthesis with 15 GB to Spare
- video · intermediate · 8GB+
Wan 2.2 TI2V-5B on RTX 4080 SUPER: 720p Text/Image-to-Video in ComfyUI
- video · intermediate · 8GB+
LightX2V on RTX 4080 SUPER: 4-Step Text-to-Video with Distilled Wan2.1-T2V-14B via FP8/INT8 + Offload
- video · advanced · 16GB+
Sulphur 2 on RTX 4080 SUPER: Uncensored LTX-2.3 Video via ComfyUI GGUF
- video · advanced · 16GB+
LTX-2.3 on RTX 4080 SUPER: 22B Audio-Video at the 16 GB Floor via GGUF + CPU-Offloaded Gemma
- image · intermediate · 12GB+
SenseNova U1 (8B-MoT) on RTX 4080 SUPER: VAE-Free Unified Image Gen + Understanding via Q4 GGUF
- image · intermediate · 12GB+
ERNIE-Image-Turbo on RTX 4080 SUPER: 8-step text-to-image via GGUF in ComfyUI
- image · beginner · 7GB+
Anima 2B on RTX 4080 SUPER: Native ComfyUI Anime Text-to-Image
- image · intermediate · 16GB+
LongCat-Image (base T2I) on RTX 4080 SUPER: Bilingual 6B Text-to-Image at 16 GB via ComfyUI GGUF
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 4080 SUPER: BFL-Recommended ~13 GB CPU-Offload Path for 4-Step Text-to-Image
- image · intermediate · 13GB+
Qwen-Image on RTX 4080 SUPER: 20B Text-to-Image via ComfyUI GGUF (Ada sm_89, 16 GB)
- image · intermediate · 16GB+
Chroma1-Base (V48) on RTX 4080 SUPER: Uncensored 8.9B FLUX.1-Schnell De-Distillation via FP8-Scaled in ComfyUI
- image · intermediate · 10GB+
HiDream-O1-Image on RTX 4080 SUPER: 2048×2048 Text-to-Image with FP8 in ComfyUI
- image · intermediate · 16GB+
Juggernaut Z on RTX 4080 SUPER: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16 via Diffusers or ComfyUI
- image · intermediate · 16GB+
Z-Image Turbo on RTX 4080 SUPER: 8-Step 1024x1024 Text-to-Image at BF16 with Diffusers or ComfyUI
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 4080 SUPER: Local Chat via Ollama or llama.cpp + Unsloth UD-Q4_K_XL GGUF
- llm · beginner · 16GB+
gpt-oss 20B on RTX 4080 SUPER: MXFP4 chat at 139 tok/s via Ollama or vLLM
- llm · beginner · 16GB+
Qwen3-14B on RTX 4080 SUPER: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 6GB+
Qwen3-8B on RTX 4080 SUPER: Q4_K_M GGUF via Ollama or llama.cpp
- specialized · intermediate · 3GB+
KiMoDo on RTX 4070 Ti SUPER: Text-to-3D-Motion Generation Guide
- specialized · beginner · 4GB+
SAM 3 on RTX 4070 Ti SUPER: Promptable Image and Video Segmentation
- multimodal · intermediate · 4GB+
MiniMind-O on RTX 4070 Ti SUPER: 0.1B Omni Model with Headroom to Spare
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 4070 Ti SUPER: Image-to-3D Mesh Generation at the 16 GB Floor
- 3d · intermediate · 10GB+
Hunyuan3D-2.1 on RTX 4070 Ti SUPER: Image-to-Mesh 3D Generation (Shape-Only)
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 4070 Ti SUPER: Real-Time Interactive World Model at 720p
- tts · intermediate · 12GB+
ACE-Step 1.5 XL on RTX 4070 Ti SUPER: Text-to-Music Generation in ComfyUI
- tts · intermediate · 12GB+
MOSS-Audio 4B-Instruct on RTX 4070 Ti SUPER: local audio understanding in ~12 GB
- tts · intermediate · 8GB+
Foundation-1 on RTX 4070 Ti SUPER: Structured Music Sample Generation
- tts · intermediate · 10GB+
Voxtral Mini 3B on RTX 4070 Ti SUPER: local speech understanding in ~9.5 GB
- tts · intermediate · 8GB+
Qwen3-TTS 1.7B-Base on RTX 4070 Ti SUPER: Multilingual Voice Cloning in 10 Languages with FlashAttention-2
- tts · beginner · 5GB+
VoxCPM-0.5B on RTX 4070 Ti SUPER: Zero-Shot Voice Cloning TTS in ~5 GB VRAM
- tts · intermediate · 4GB+
OmniVoice on RTX 4070 Ti SUPER: Zero-Shot Voice Cloning Across 600+ Languages with Room to Spare
- tts · beginner · 8GB+
VoxCPM2 on RTX 4070 Ti SUPER: 30-Language 48kHz Voice Cloning with 8 GB to Spare
- tts · intermediate · 5GB+
OpenAudio S1 Mini on RTX 4070 Ti SUPER: 13-Language Distilled TTS in ~5 GB VRAM
- tts · beginner · 1GB+
Kokoro TTS on RTX 4070 Ti SUPER: Universal 82M Voice Synthesis with 15 GB to Spare
- video · intermediate · 8GB+
Wan 2.2 TI2V-5B on RTX 4070 Ti SUPER: 720p Text/Image-to-Video in ComfyUI
- video · intermediate · 8GB+
LightX2V on RTX 4070 Ti SUPER: 4-Step Text-to-Video with Distilled Wan2.1-T2V-14B via FP8/INT8 + Offload
- video · advanced · 16GB+
Sulphur 2 on RTX 4070 Ti SUPER: Uncensored LTX-2.3 Video via ComfyUI GGUF
- video · advanced · 16GB+
LTX-2.3 on RTX 4070 Ti SUPER: 22B Audio-Video at the 16 GB Floor via GGUF + CPU-Offloaded Gemma
- image · intermediate · 12GB+
SenseNova U1 (8B-MoT) on RTX 4070 Ti SUPER: VAE-Free Unified Image Gen + Understanding via Q4 GGUF
- image · intermediate · 12GB+
ERNIE-Image-Turbo on RTX 4070 Ti SUPER: 8-step text-to-image via GGUF in ComfyUI
- image · beginner · 7GB+
Anima 2B on RTX 4070 Ti SUPER: Native ComfyUI Anime Text-to-Image
- image · intermediate · 16GB+
LongCat-Image (base T2I) on RTX 4070 Ti SUPER: Bilingual 6B Text-to-Image at 16 GB via ComfyUI GGUF
- image · beginner · 13GB+
Flux.2 Klein 4B on RTX 4070 Ti SUPER: BFL-Recommended ~13 GB CPU-Offload Path for 4-Step Text-to-Image
- image · intermediate · 13GB+
Qwen-Image on RTX 4070 Ti SUPER: 20B Text-to-Image via ComfyUI GGUF (Ada sm_89, 16 GB)
- image · intermediate · 16GB+
Chroma1-Base (V48) on RTX 4070 Ti SUPER: Uncensored 8.9B FLUX.1-Schnell De-Distillation via FP8-Scaled in ComfyUI
- image · intermediate · 10GB+
HiDream-O1-Image on RTX 4070 Ti SUPER: 2048×2048 Text-to-Image with FP8 in ComfyUI
- image · intermediate · 16GB+
Juggernaut Z on RTX 4070 Ti SUPER: Cinematic Photoreal Fine-Tune of Z-Image Base at BF16 via Diffusers or ComfyUI
- image · intermediate · 16GB+
Z-Image Turbo on RTX 4070 Ti SUPER: 8-Step 1024x1024 Text-to-Image at BF16 with Diffusers or ComfyUI
- llm · beginner · 10GB+
Llama 3.1 8B on RTX 4070 Ti SUPER: Local Chat via Ollama or llama.cpp + Unsloth UD-Q4_K_XL GGUF
- llm · beginner · 16GB+
gpt-oss 20B on RTX 4070 Ti SUPER: MXFP4 chat at 129 tok/s via Ollama or vLLM
- llm · beginner · 16GB+
Qwen3-14B on RTX 4070 Ti SUPER: Q4_K_M GGUF via Ollama or llama.cpp
- llm · beginner · 6GB+
Qwen3-8B on RTX 4070 Ti SUPER: Q4_K_M GGUF via Ollama or llama.cpp
- specialized · intermediate · 3GB+
KiMoDo on RTX 4080: Text-to-3D-Motion Generation Guide
- specialized · beginner · 4GB+
SAM 3 on RTX 4080: Promptable Image and Video Segmentation
- multimodal · intermediate · 4GB+
MiniMind-O on RTX 4080: 0.1B Omni Model with Headroom to Spare
- 3d · advanced · 16GB+
TRELLIS image-large on RTX 4080: Image-to-3D Mesh Generation at the 16 GB Floor
- 3d · intermediate · 10GB+
Hunyuan3D-2.1 on RTX 4080: Image-to-Mesh 3D Generation (Shape-Only)
- 3d · intermediate · 12GB+
Waypoint 1.5 on RTX 4080: Real-Time Interactive World Model at 720p
- tts · intermediate · 12GB+
ACE-Step 1.5 XL on RTX 4080: Text-to-Music Generation in ComfyUI
- tts · intermediate · 8GB+
Qwen3-TTS 1.7B-Base on RTX 4080: Multilingual Voice Cloning in 10 Languages with FlashAttention-2
- tts · intermediate · 12GB+
MOSS-Audio 4B-Instruct on RTX 4080: local audio understanding in ~12 GB
- tts · intermediate · 8GB+
Foundation-1 on RTX 4080: Structured Music Sample Generation
- tts · intermediate · 10GB+
Voxtral Mini 3B on RTX 4080: local speech understanding in ~9.5 GB
- tts · beginner · 8GB+
VoxCPM2 on RTX 4080: 30-Language 48kHz Voice Cloning with 8 GB to Spare
- tts · beginner · 5GB+
VoxCPM-0.5B on RTX 4080: Zero-Shot Voice Cloning TTS in ~5 GB VRAM
- tts · intermediate · 5GB+
OpenAudio S1 Mini on RTX 4080: 13-Language Distilled TTS in ~5 GB VRAM
- tts · intermediate · 4GB+
OmniVoice on RTX 4080: Zero-Shot Voice Cloning Across 600+ Languages with Room to Spare
- tts · beginner · 1GB+
Kokoro TTS on RTX 4080: Universal 82M Voice Synthesis with 15 GB to Spare
- video · advanced · 16GB+
LTX-2.3 on RTX 4080: 22B Audio-Video at the 16 GB Floor via GGUF + CPU-Offloaded Gemma
- video · intermediate · 8GB+
LightX2V on RTX 4080: 4-Step Text-to-Video with Distilled Wan2.1-T2V-14B via FP8/INT8 + Offload
- video · advanced · 16GB+
Sulphur 2 on RTX 4080: Uncensored LTX-2.3 Video via ComfyUI GGUF
- video · intermediate · 8GB+
Wan 2.2 TI2V-5B on RTX 4080: 720p Text/Image-to-Video in ComfyUI