How much VRAM does AnimateDiff need?

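About 5.6 GB for the default 512×512, 16-frame SD1.5 configuration with cross-attention optimization enabled (see Results), which is why this recipe targets 8GB cards.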

How hard is this setup?

Intermediate. It assumes a working ComfyUI install; follow the Installation and Running steps below.

AnimateDiff on RTX 3060 Ti: Low-VRAM SD1.5 Animation on 8GB

What You'll Build

A short looping animation (16 frames, 512×512) generated from a text prompt by layering the AnimateDiff motion module on top of any Stable Diffusion 1.5 checkpoint inside ComfyUI. It is the lightweight video option that fits an 8GB card, where the much larger modern native video DiTs (LTX-2, Wan, Hunyuan) do not.

Hardware data: RTX 3060 Ti (8GB VRAM) · SD1.5 path at 512×512 / 16 frames · See benchmark data

ℹ️ SD1.5 path only on 8GB. AnimateDiff also ships an SDXL motion adapter (mm_sdxl_v10_beta), but the canonical AnimateDiff README states SDXL "Inference usually requires ~13GB VRAM" for 1024×1024×16 — that does not fit an 8GB card. This recipe is scoped entirely to the SD1.5 motion modules.

Requirements

Component	Minimum	Tested
GPU	8GB VRAM (cross-attention optimization enabled)	RTX 3060 Ti (8GB)
RAM	16GB+ (model weights offload to system RAM under low-VRAM mode)	—
Storage	~4–6GB (motion module ~1.7GB + an SD1.5 checkpoint ~2–4GB)	—
Software	ComfyUI, Python 3.10+	—

AnimateDiff is a plug-and-play motion module — it adds animation to an existing SD1.5 image checkpoint without retraining the base, so you reuse any SD1.5 model you already have.

Installation

1. Install the ComfyUI-AnimateDiff-Evolved custom node

The dominant ComfyUI runtime for AnimateDiff is ComfyUI-AnimateDiff-Evolved by Kosinkadink. Clone it into your custom_nodes folder:

cd ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved.git
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git

ComfyUI-VideoHelperSuite provides the nodes for combining the generated frames into a video/GIF. Restart ComfyUI after cloning.
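Before restarting, it can help to confirm both clones landed where ComfyUI looks for them. The following is a minimal sketch, assuming the default folder names produced by the two git clone commands above; the function name and the "ComfyUI" root path are illustrative, not part of either project.

```python
from pathlib import Path

# Folder names are the defaults created by the two `git clone` commands above.
REQUIRED_NODES = ("ComfyUI-AnimateDiff-Evolved", "ComfyUI-VideoHelperSuite")


def missing_custom_nodes(comfy_root):
    """Return the required custom-node folders not found under custom_nodes."""
    nodes_dir = Path(comfy_root) / "custom_nodes"
    return [name for name in REQUIRED_NODES if not (nodes_dir / name).is_dir()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; pass your own path instead.
    missing = missing_custom_nodes("ComfyUI")
    print("All custom nodes present" if not missing else f"Missing: {missing}")
```

An empty result only means the folders exist; ComfyUI's own startup log remains the place to confirm the nodes actually loaded.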

2. Download an SD1.5 motion module

Per the ComfyUI-AnimateDiff-Evolved README, the original SD1.5 motion modules — mm_sd_v14, mm_sd_v15, mm_sd_v15_v2, v3_sd15_mm — come from the canonical guoyww/animatediff HuggingFace repo. mm_sd_v15_v2.ckpt (≈1.7GB) is the recommended general-purpose SD1.5 module:

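# Run from the directory that contains ComfyUI/ (not from custom_nodes)
mkdir -p ComfyUI/models/animatediff_models
cd ComfyUI/models/animatediff_models
wget https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt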

Place motion modules in ComfyUI/models/animatediff_models (the README also accepts ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models).

3. Have an SD1.5 base checkpoint ready

AnimateDiff layers on top of a regular SD1.5 checkpoint (the model that draws each frame). Drop any SD1.5 .safetensors checkpoint into ComfyUI/models/checkpoints/ if you don't already have one.

Running

Build a standard SD1.5 ComfyUI text-to-image graph, then insert the AnimateDiff Loader node (from AnimateDiff-Evolved) between the model loader and the sampler, pointing it at mm_sd_v15_v2.ckpt. Set the latent batch size to 16 (the frame count) at 512×512, then route the decoded frames into a VideoHelperSuite Video Combine node to export a GIF/MP4.
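To see why the batch size doubles as the frame count, it helps to look at the latent tensor the sampler receives. The sketch below assumes the standard SD1.5 VAE layout (4 latent channels, each spatial side downsampled by 8); the helper name is illustrative.

```python
# SD1.5's VAE downsamples each spatial side by 8 and uses 4 latent channels.
VAE_DOWNSCALE = 8
LATENT_CHANNELS = 4


def latent_shape(frames, width, height):
    """Shape (batch, channels, height, width) of the empty latent for one clip."""
    if width % VAE_DOWNSCALE or height % VAE_DOWNSCALE:
        raise ValueError("width and height must be multiples of 8")
    return (frames, LATENT_CHANNELS, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE)


print(latent_shape(16, 512, 512))  # (16, 4, 64, 64)
```

Each of the 16 batch entries is one frame, so raising the frame count or the resolution grows this tensor and the attention work done on it.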

To stay inside the 8GB envelope, launch ComfyUI with low-VRAM offloading so model weights spill to system RAM when idle:

python main.py --lowvram

Output frames are combined by VideoHelperSuite and land in ComfyUI/output/.
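The length of the exported clip follows directly from the frame count and the frame rate you choose at export. A small sketch, using the 8 fps figure quoted under Results as an example value rather than a recommended setting:

```python
def clip_seconds(frames, fps):
    """Playback length in seconds of a clip with the given frame count and rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


print(clip_seconds(16, 8))  # 2.0
```

So a 16-frame pass yields a loop of roughly two seconds at that rate.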

Results

VRAM usage: The CivitAI Education Beginner's Guide to AnimateDiff reports, for the default 512×512, 16 frames config (Torch 2.0, measured on an RTX 4090), 5.6 GB with Xformers/SDP cross-attention optimization, versus 10.39 GB sub-quadratic and 12.13 GB with no optimization. The 5.6 GB optimized figure is what makes the SD1.5 / 512×512 / 16-frame path fit an 8GB card — keep cross-attention optimization on and avoid upscalers or higher frame counts in the same pass. See /check/animatediff/rtx-3060-ti.
Speed: No first-party RTX 3060 Ti benchmark exists yet (/check verdict is unknown). For a sense of scale on a sibling Ampere card, a firsthand RTX 3060 (12GB) owner in this CivitAI guide reports "16 frames and 8 fps takes about an hour if the image is 512x768, checkpoint, VAE and upscaler" — note that run is a larger 512×768 frame with an added upscaler stage, so a plain 512×512 pass is lighter. If you measure the 512×512 path on a 3060 Ti, please contribute it.
Quality notes: AnimateDiff is a 2023 technique that predates native video DiTs; expect short (16-frame) clips with subtle, looping motion rather than long high-fidelity video. mm_sd_v15_v2 gives the most stable motion of the SD1.5 modules.
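The VRAM figures above can be turned into a quick headroom estimate. This is a rough sketch only: the numbers are the reported 512×512, 16-frame measurements quoted above (taken on a different card), and the dictionary keys and function name are illustrative.

```python
# Reported VRAM use in GB for 512x512, 16 frames, as quoted in this section.
REPORTED_VRAM_GB = {
    "xformers_or_sdp": 5.6,
    "sub_quadratic": 10.39,
    "none": 12.13,
}


def headroom_gb(usable_gb, optimization):
    """Usable VRAM minus the reported figure; negative means it will not fit."""
    return round(usable_gb - REPORTED_VRAM_GB[optimization], 2)


for name in REPORTED_VRAM_GB:
    room = headroom_gb(8.0, name)
    verdict = "fits" if room >= 0 else "does not fit"
    print(f"{name}: {verdict} ({room:+.2f} GB)")
```

Passing 7.0 instead of 8.0 models the roughly 7GB left after the desktop compositor, which is mentioned under Troubleshooting; the optimized path still leaves some margin in that case.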

For the full benchmark data, see /check/animatediff/rtx-3060-ti.

Troubleshooting

Out of memory on an 8GB card

The same firsthand RTX 3060 account above measured "7.7 GB of dedicated GPU memory" for a 512×768 16-frame run with an upscaler — that configuration exceeds the 8GB RTX 3060 Ti's effective budget (after the desktop compositor, ~7GB is usable). Stay at 512×512, keep the batch at 16 frames, run with --lowvram, ensure cross-attention optimization is active (which the CivitAI Education table credits for the 5.6 GB figure), and drop any upscaler out of the generation pass — upscale as a separate step afterward.
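To find out how much VRAM is actually free on your machine before launching, you can query the driver. The sketch below assumes an NVIDIA driver with nvidia-smi on the PATH and reads only the first GPU; the function names are illustrative.

```python
import subprocess

# Asks nvidia-smi for used and total memory as bare MiB numbers, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_free_mib(line):
    """Parse one 'used, total' line (values in MiB) and return the free MiB."""
    used, total = (int(field.strip()) for field in line.split(","))
    return total - used


def free_vram_mib():
    """Run nvidia-smi and return free MiB on the first GPU it lists."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_free_mib(out.splitlines()[0])
```

Call free_vram_mib() with ComfyUI closed: if the result is well below the card's total, another application is holding memory that the generation pass will need.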

SDXL motion adapter won't fit

Do not use mm_sdxl_v10_beta on this card. The canonical AnimateDiff README quotes ~13GB VRAM for the SDXL path at 1024×1024×16 — it overflows 8GB. Stick to the SD1.5 modules.

Motion module fails to load

If a .ckpt motion module throws an unpickling error, re-download it from the canonical guoyww/animatediff repo — truncated or mirror copies are a common cause. Verify the file is ~1.7GB for mm_sd_v15_v2.ckpt.
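The size check can be scripted. This is a minimal sketch: the 1.7 figure is the approximate size quoted above, while the 0.2 tolerance is an assumption chosen to absorb GB versus GiB rounding, not a published value.

```python
from pathlib import Path

EXPECTED_GB = 1.7   # approximate size quoted above for mm_sd_v15_v2.ckpt
TOLERANCE_GB = 0.2  # assumed slack for rounding; not a published figure


def looks_complete(path, expected_gb=EXPECTED_GB, tolerance_gb=TOLERANCE_GB):
    """True if the file exists and its size is within tolerance of expected_gb."""
    p = Path(path)
    if not p.is_file():
        return False
    size_gb = p.stat().st_size / 1e9
    return abs(size_gb - expected_gb) <= tolerance_gb
```

For example, looks_complete("ComfyUI/models/animatediff_models/mm_sd_v15_v2.ckpt") returning False points to a missing or truncated download. A matching size does not prove the file is intact, so re-download if the unpickling error persists.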