Wan 2.2 TI2V-5B on RTX 4080 SUPER: 720p Text/Image-to-Video in ComfyUI

What You'll Build

A local ComfyUI pipeline that turns a text prompt (or a starting image) into a 5-second 720p video using the Wan 2.2 TI2V-5B model, the only Wan 2.2 variant the official repo documents as runnable on a single consumer-grade GPU. The recipe uses the ComfyUI native workflow as the canonical path on the RTX 4080 SUPER, with a QuantStack Q8 GGUF alternative for tighter VRAM budgets or for colocating another model on the card.

Hardware data: RTX 4080 SUPER (16GB VRAM, Ada sm_89) · 720p (1280×704 / 704×1280) at 24 fps via ComfyUI native offloading · Benchmark data: /check/wan-2-2/rtx-4080-super

Why TI2V-5B and not the 14B variants? The Wan 2.2 family ships five variants: TI2V-5B (this recipe), T2V-A14B, I2V-A14B, S2V-14B, and Animate-14B. The official Wan-Video/Wan2.2 README documents the single-GPU commands for the four 14B-class variants with the note that they need at least 80 GB VRAM, far past a 16 GB consumer card at native precision. Only TI2V-5B is positioned as a single-consumer-GPU target; the Wan-AI HF card describes it as a 5B dense model released alongside the larger MoE models. Because TI2V-5B is dense (one fused checkpoint, no high-noise / low-noise expert split), the timestep-MoE plumbing that the A14B siblings need does not apply here. The 14B variants need a different recipe entirely.

Requirements

Component	Minimum	Tested
GPU	8 GB VRAM (the official ComfyUI tutorial documents the 5B model fitting on 8 GB with ComfyUI native offloading)	RTX 4080 SUPER (16GB, Ada sm_89)
RAM	16 GB	32 GB+ recommended (offloading is RAM-heavy)
Storage	~17 GB (TI2V-5B FP16 weights 9.31 GB + UMT5-XXL FP8 text encoder 6.27 GB + Wan2.2-VAE 1.31 GB)	—
Software	ComfyUI (recent build with Wan 2.2 templates), Python 3.10+, PyTorch ≥ 2.4 (default cu124 stable wheel)	—
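The storage figure in the table is the sum of the three file sizes. The sketch below reproduces that arithmetic and compares it with the free space on the current filesystem. The function names `sum_gb` and `free_gb` are illustrative and not part of ComfyUI, and the comparison assumes the listed sizes and the `df` output are close enough in unit for a rough check.

```shell
#!/bin/sh
# Sum a list of sizes (in GB) and print the total with two decimals.
sum_gb() {
    printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.2f\n", total }'
}

# Print the free space (in GiB, one decimal) on the filesystem holding a path.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%.1f\n", $4 / 1024 / 1024 }'
}

# Sizes from the requirements table: diffusion model, text encoder, VAE.
needed=$(sum_gb 9.31 6.27 1.31)
echo "Model files need about ${needed} GB; free here: $(free_gb .) GiB"
```

Run it from the drive where ComfyUI lives. If the free figure is close to the needed one, leave extra room for generated videos and the pip cache.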

The RTX 4080 SUPER is an Ada Lovelace card (AD103, sm_89, 10240 CUDA cores, ~736 GB/s GDDR6X memory bandwidth, 320 W). Unlike Blackwell GPUs, no special CUDA-wheel selection is required — the default stable PyTorch wheel (cu124) already ships sm_89 kernels, and prebuilt FlashAttention wheels cover sm_89.
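To confirm that the installed card reports the compute capability described above, a quick check along these lines can help. It is a sketch that assumes a driver recent enough for `nvidia-smi` to support the `compute_cap` query field. The helper name `cap_to_sm` is made up for illustration.

```shell
#!/bin/sh
# Convert a compute capability such as "8.9" into the matching arch tag "sm_89".
cap_to_sm() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Query the installed GPU only when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv,noheader || true
fi

# Arch tag for the compute capability this recipe targets.
cap_to_sm 8.9
```

On a 4080 SUPER the query line should report a compute capability of 8.9, which maps to the sm_89 tag used throughout this recipe.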

Installation

1. Install / update ComfyUI

Use a build new enough to expose the Wan 2.2 templates under Workflow → Browse Templates → Video → "Wan2.2 5B video generation". The official ComfyUI Wan 2.2 tutorial documents the 5B model fitting on 8 GB of VRAM with ComfyUI native offloading — on a 16 GB 4080 SUPER you have comfortable headroom on top of that 8 GB floor.

2. Download model files for the native workflow

Per the ComfyUI native workflow docs, download these three files from the Comfy-Org Wan 2.2 repackaged repo into the matching subfolders of ComfyUI/models/. The commands below assume you run them from the directory that contains ComfyUI/:

# diffusion model → ComfyUI/models/diffusion_models/
wget -P ComfyUI/models/diffusion_models \
  https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors

# text encoder → ComfyUI/models/text_encoders/
wget -P ComfyUI/models/text_encoders \
  https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors

# VAE → ComfyUI/models/vae/
wget -P ComfyUI/models/vae \
  https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan2.2_vae.safetensors

The resulting layout matches what the official template expects:

ComfyUI/models/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors
ComfyUI/models/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors
ComfyUI/models/vae/wan2.2_vae.safetensors
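Before opening the template, it can save a failed queue to confirm that all three files landed where the layout above says. The function below is a minimal sketch of such a check. Its name, `check_wan22_layout`, is hypothetical and it is not shipped with ComfyUI.

```shell
#!/bin/sh
# Check that the three files the template expects exist under a ComfyUI root.
# Prints each missing path and returns non-zero if any file is absent.
check_wan22_layout() {
    root="${1:-ComfyUI}"
    missing=0
    for rel in \
        models/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors \
        models/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors \
        models/vae/wan2.2_vae.safetensors
    do
        if [ ! -f "$root/$rel" ]; then
            echo "missing: $root/$rel"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it as `check_wan22_layout ComfyUI` from the directory that contains your ComfyUI checkout. No output and a zero exit status mean the layout matches.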

Open the Wan2.2 5B video generation template, set the positive prompt, and queue. The Wan22ImageToVideoLatent node exposes resolution (1280×704 or 704×1280) and frame count.

3. (Alternative) Install ComfyUI-GGUF and a Q8 quant

To lower peak VRAM, or to colocate another model on the card, use the community Q8 quant from QuantStack/Wan2.2-TI2V-5B-GGUF. The QuantStack repo lists Wan-AI/Wan2.2-TI2V-5B as its base_model, so it is a direct GGUF conversion of the canonical Wan-AI checkpoint.

Install city96/ComfyUI-GGUF:

git clone https://github.com/city96/ComfyUI-GGUF ComfyUI/custom_nodes/ComfyUI-GGUF
pip install --upgrade gguf

Download the Q8_0 file (5.40 GB) and place it in the unet folder:

wget -P ComfyUI/models/unet \
  https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF/resolve/main/Wan2.2-TI2V-5B-Q8_0.gguf

In the official template, swap the Load Diffusion Model node for Unet Loader (GGUF) and point it at the .gguf file. The text encoder and VAE remain unchanged. Per-tier sizes from the HF tree API: Q4_K_S 3.12 GB, Q5_K_M 3.81 GB, Q6_K 4.21 GB, Q8_0 5.40 GB. Of these, Q8_0 stays closest to FP16, and its quality loss is minimal.
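If you are choosing a tier against a fixed budget, the size ladder above can be turned into a small lookup. The sketch below picks the largest listed quant whose file size fits a budget in GB. The name `pick_quant` is illustrative, and file size is only a rough proxy for the VRAM the diffusion weights will occupy, not a measured peak.

```shell
#!/bin/sh
# Print the largest quant tier whose on-disk size fits within a budget (GB).
# Returns non-zero when even the smallest listed tier is too large.
pick_quant() {
    budget_gb="$1"
    # Tiers in ascending size order, as listed for the QuantStack repo.
    printf '%s\n' "Q4_K_S 3.12" "Q5_K_M 3.81" "Q6_K 4.21" "Q8_0 5.40" |
    awk -v budget="$budget_gb" '
        $2 + 0 <= budget + 0 { best = $1 }
        END { if (best == "") exit 1; print best }'
}

# Example: largest tier that fits a 4 GB file-size budget.
pick_quant 4.0
```

With a 4 GB budget this selects Q5_K_M, because Q6_K at 4.21 GB is just over the line.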

Running

With the Wan2.2 5B video generation template loaded, enter a prompt, set resolution to 1280×704 for landscape or 704×1280 for portrait, and set the frame count for the clip length you want. Wan expects frame counts of the form 4n+1, so a 5-second clip at 24 fps is 121 frames (5 × 24 + 1), not 120. Then queue. The first render is slower due to model load; subsequent renders reuse the cached weights.
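The frame count for a given clip length is simple arithmetic. The sketch below assumes the 4n+1 frame-count rule that the Wan `generate.py` help text states for its frame number option, under which a 5-second clip at 24 fps comes to 121 frames. The helper names `frames_for` and `is_valid_frame_count` are made up for this example.

```shell
#!/bin/sh
# Frame count for a clip of a given length in whole seconds at FPS (default 24).
# Rounds seconds * fps to the nearest multiple of 4, then adds the one extra
# frame, so the result always has the 4n+1 form.
frames_for() {
    seconds="$1"
    fps="${2:-24}"
    raw=$(( seconds * fps ))
    echo $(( 4 * ((raw + 2) / 4) + 1 ))
}

# Succeeds when a frame count already has the 4n+1 form.
is_valid_frame_count() {
    [ $(( ($1 - 1) % 4 )) -eq 0 ]
}

# 5 seconds at the default 24 fps.
frames_for 5
```

Enter the printed value in the frame count (length) field of the Wan22ImageToVideoLatent node. Shorter clips reduce render time, since there are fewer frames to denoise.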

For image-to-video, drop a starting image into the LoadImage node wired into the template's Wan22ImageToVideoLatent input — TI2V is a unified text-and-image-to-video model, so the same workflow file handles both modes.

If you prefer the command-line route, the official repo documents this exact invocation for TI2V-5B:

git clone https://github.com/Wan-Video/Wan2.2.git
cd Wan2.2
pip install -r requirements.txt
python generate.py --task ti2v-5B --size 1280*704 \
  --ckpt_dir ./Wan2.2-TI2V-5B \
  --offload_model True --convert_model_dtype --t5_cpu \
  --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"

The Wan-AI README annotates that command as targeting a GPU with at least 24 GB VRAM, and per Wan2.2 issue #90 it OOMs at default 720p settings even on a 24 GB card. On the 16 GB 4080 SUPER the ComfyUI native route (Installation step 2) is the reliable path: ComfyUI's runtime offloader is more aggressive than the CLI's static three-flag offload setup, and it is the path the documented 8 GB working floor refers to.
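The CLI command also assumes the original Wan-AI checkpoint is already present at ./Wan2.2-TI2V-5B. The official README fetches it with `huggingface-cli` (installed via `pip install "huggingface_hub[cli]"`). The wrapper below is a sketch around that command. The function name `fetch_ti2v_5b` and the `DRY_RUN` switch are additions for illustration only.

```shell
#!/bin/sh
# Download the Wan-AI TI2V-5B checkpoint into the directory --ckpt_dir points at.
# With DRY_RUN=1 the command is printed instead of executed.
fetch_ti2v_5b() {
    dest="${1:-./Wan2.2-TI2V-5B}"
    set -- huggingface-cli download Wan-AI/Wan2.2-TI2V-5B --local-dir "$dest"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Run `DRY_RUN=1 fetch_ti2v_5b` first to see the exact command, then `fetch_ti2v_5b` from inside the Wan2.2 checkout to start the download.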

Results

Speed: No first-party RTX 4080 SUPER measurement for TI2V-5B currently exists in the Wan-AI HF card, the official Wan2.2 README, or the backend benchmark data (/check/wan-2-2/rtx-4080-super returned no benchmark rows at write time). The HF card's only published timing is the model-wide claim that TI2V-5B can generate a 5-second 720P video in under 9 minutes on a single consumer-grade GPU, and the README names the RTX 4090 (a higher-bandwidth Ada card) as the GPU class its single-GPU command targets. With no published 4080 SUPER number to quote, we do not extrapolate one. Report your measured 4080 SUPER timing via /contribute to land a first-party benchmark row for this pair.
VRAM usage: ~8 GB working floor on the ComfyUI native path with the runtime offloader engaged (per the ComfyUI tutorial), leaving ~8 GB headroom on the 16 GB 4080 SUPER. The native FP16 diffusion file (9.31 GB) plus UMT5-XXL FP8 text encoder (6.27 GB) plus Wan2.2-VAE (1.31 GB) sum to ~17 GB on disk; runtime peak with ComfyUI's offloader is well below that because weights stream rather than all loading resident. Live data: /check/wan-2-2/rtx-4080-super.
Quality notes: TI2V-5B output is 720p (1280×704 or 704×1280) at 24 fps; the Wan-AI README documents 720P generation at 24 FPS for this model. Clip length is configurable via frame count. The dense single-checkpoint architecture means quality is consistent across the canonical FP16 path; there is no per-expert quality-vs-speed dial for this variant.
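To see how much of the 16 GB envelope a render uses on your own card, one option is to poll `nvidia-smi` while a job runs. The sketch below computes free VRAM from the used and total figures. The helper name `headroom_mib` is illustrative, and a single reading is a snapshot rather than the true peak.

```shell
#!/bin/sh
# Read "used, total" MiB pairs (one GPU per line) and print free MiB per GPU.
headroom_mib() {
    awk -F', *' '{ print $2 - $1 }'
}

# Take one reading when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=memory.used,memory.total \
        --format=csv,noheader,nounits | headroom_mib || true
fi
```

Repeating the reading every second or so during sampling and keeping the smallest value gives a usable estimate of worst-case headroom.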

For the full benchmark data, see /check/wan-2-2/rtx-4080-super.

Troubleshooting

CLI path OOMs at 720p (Wan2.2 issue #90)

The official generate.py CLI command is tuned for a 24 GB card and, per Wan2.2 issue #90, OOMs at default 720p settings even on 24 GB GPUs. Use the ComfyUI native workflow (Installation step 2) instead — it uses a different memory plan with a documented 8 GB working floor and is the reliable path on the 16 GB 4080 SUPER.

Out of memory at 720p

Make sure ComfyUI's native offloading is active (it is by default in recent builds — the official tutorial relies on it for the 8 GB minimum claim). If the FP16 path still presses against the 16 GB envelope while you have other models loaded, switch the diffusion model to a QuantStack Q8_0 GGUF (5.40 GB on disk) via the Unet Loader (GGUF) node — peak VRAM drops further and quality loss at Q8 is minimal.

No FP8 weight file for TI2V-5B

TI2V-5B is a dense single-checkpoint model, and Wan-AI does not publish an FP8 weight file for it (unlike the A14B siblings, whose FP8-scaled experts ship via the Comfy-Org repackager). The canonical path is the FP16 safetensors file in Installation step 2, and the VRAM escape hatch is the GGUF quant ladder above, not FP8. Do not look for a *_fp8_scaled.safetensors file for TI2V-5B; it does not exist.

Want the 14B variants?

Per the official README, the 14B / A14B single-GPU commands need at least 80 GB VRAM — out of scope for a 16 GB card at native precision. Community GGUF quants of the 14B Wan variants exist but need a separate workflow; file a request on /contribute if you want a 14B-quantized recipe added once a stable 16 GB workflow lands.