How much VRAM does LTX Video 2.3 need?

About 34 GB — the minimum this recipe targets.

How hard is this setup?

Advanced. Follow the steps below.

LTX Video 2.3 on Apple M2 Max: experimental 22B audio-video in unified memory via Draw Things (Metal) or MLX

What You'll Build

Generate short text-to-video and image-to-video clips — with joint audio — locally on an Apple M2 Max using LTX Video 2.3, the 22B-parameter audio-video DiT from Lightricks. This is the one open video model with a real, cited Apple-Silicon path, but it sits at the experimental edge of what a Mac can do: expect preview-quality output, long generation times, and a tight unified-memory budget. The lead path is Draw Things, a native Metal app whose own engine sidesteps the CUDA-only machinery the model normally relies on; an MLX community port is the second option for terminal users.

Hardware data: Apple M2 Max (64GB unified memory, ~400 GB/s, ~48GB GPU-addressable) · LTX-2.3 22B distilled · See benchmark data

⚠️ Experimental / preview-quality only. Video is the weakest local-AI tier on Apple Silicon. LTX-2.3 on a Mac is for experimentation, not production: the CUDA-only IC-LoRAs and torch.compile acceleration the NVIDIA workflow depends on are unavailable on Metal/MPS, the stock PyTorch path has a known noise bug on recent torch builds, and no first-party Apple benchmark exists for the M2 Max yet. Treat every clip as a preview. Generation is minutes-per-clip, not seconds.

ℹ️ "M2 Max 64GB" is the floor, not headroom. Apple's unified memory is shared between CPU and GPU, and by default Metal only lets the GPU wire down ~75% of it (~48GB on a 64GB Mac). The distilled int8 MLX path peaks in the mid-30s of GB on a larger Mac (34.2GB in the run cited under Results; that figure is from a different chip), so a 64GB M2 Max is the minimum viable config, and you should raise the wired limit (see Troubleshooting). Macs with less than 64GB unified memory are not recommended for this model.
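The budget arithmetic in the note above can be sketched in a few lines of Python. This is a rough illustration only: the ~75% fraction and the 34.2GB peak are the figures quoted in this guide (the peak comes from a different chip), and the function names are made up for this example.

```python
# Sketch: estimate the default GPU-addressable share of unified memory.
# ASSUMPTION: the ~75% fraction and the 34.2GB peak are the figures quoted in
# this guide; neither was measured on an M2 Max.

DEFAULT_GPU_FRACTION = 0.75  # approximate Metal default cited in this guide
CITED_PEAK_GB = 34.2         # peak from the M4 Max run cited under Results


def gpu_addressable_gb(unified_gb: float, fraction: float = DEFAULT_GPU_FRACTION) -> float:
    """Approximate memory the GPU can wire down by default."""
    return unified_gb * fraction


def headroom_gb(unified_gb: float, peak_gb: float = CITED_PEAK_GB) -> float:
    """Positive: the cited peak fits under the default limit. Negative: it does not."""
    return round(gpu_addressable_gb(unified_gb) - peak_gb, 1)


if __name__ == "__main__":
    for size in (32, 64, 96):
        print(f"{size}GB Mac: ~{gpu_addressable_gb(size):.0f}GB for the GPU, "
              f"headroom {headroom_gb(size):+.1f}GB")
```

Under these assumptions a 32GB Mac comes out negative and a 64GB Mac leaves a little over a dozen GB, which is why 64GB is treated as the floor rather than a comfortable margin.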

Requirements

Component	Minimum	Tested
GPU / chip	Apple Silicon, 64GB unified memory	Apple M2 Max (64GB unified memory)
Unified memory	64GB (raise the GPU wired limit; see Troubleshooting)	64GB
Storage	~30GB (MLX distilled weights) or in-app (Draw Things)	~30GB
Software	macOS Sonoma 14+ · Draw Things or Python 3.12+ with MLX	macOS Sequoia 15

There is no CUDA, no FlashAttention, and no torch.compile on this platform — skip every pip install flash-attn step and every --use-cuda / cu128 wheel index you'll see in the NVIDIA LTX guides. The two Apple-native paths below replace that entire stack.

Installation

Path A (recommended) — Draw Things, native Metal

Draw Things is a free, native macOS app with its own Metal inference engine, including its own Metal attention implementation, so it does not need CUDA FlashAttention. Its release notes list "Support LTX-2.3 22B series models", and the Draw Things engineering team reports shipping Metal compute shaders that "improved LTX-2.3 video VAE decoding speed by about 2.4x on M1 through M4, and 4.7x on M5" (Draw Things engineering blog).

1. Install Draw Things from the Mac App Store or drawthings.ai/downloads.
2. In the app, switch to a video pipeline and import an LTX-2.3 22B model from the built-in model manager (the app ships built-in metadata for the LTX-2.3 series; the [distilled] variant is the fastest). See the Draw Things LTX-2 wiki page and Video Generation Basics for the current import list and settings.
3. Choose the distilled checkpoint for fewer steps, set a short frame count and a modest resolution (LTX-2.3 officially supports up to 1080×1920 9:16, but start small on a Mac), and generate.

No terminal, no Python environment, no quantization flags — Draw Things handles the Metal mapping and memory plan internally. This is why it is the recommended Apple path.

Path B — MLX (terminal), community port

A community MLX port runs the same 22B distilled checkpoint natively on Apple Silicon. The most-downloaded runtime is dgrauet/ltx-2-mlx:

# Clone and set up the MLX port (community, not official Lightricks)
git clone https://github.com/dgrauet/ltx-2-mlx.git
cd ltx-2-mlx
uv sync --all-extras

The repo's README states it targets LTX-2.3 (22B) and runs on "macOS with Apple Silicon (M1/M2/M3/M4)", with memory guidance "32GB+ RAM recommended (int8) or 16GB+ with --low-ram. 16GB minimum (int4 without streaming)" (dgrauet/ltx-2-mlx). On a 64GB M2 Max, prefer the int4 distilled weights, whose quantized transformer is 11.32GB on disk plus a 6.34GB connector and ~0.8GB VAE (dgrauet/ltx-2.3-mlx-q4 file tree).
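As a quick sanity check on the int4 figures just quoted, the component sizes can be added up. This is only a sketch: the dictionary keys are labels invented for this example, and on-disk size is at best a rough floor for resident memory because activations and any other components are not counted.

```python
# Sketch: add up the int4 component sizes quoted in this guide.
# ASSUMPTION: on-disk size is only a rough floor for resident memory;
# activations and any components not listed here are not counted.

INT4_COMPONENTS_GB = {
    "transformer (int4)": 11.32,
    "connector": 6.34,
    "VAE (approx.)": 0.8,
}


def total_gb(components: dict[str, float]) -> float:
    """Sum component sizes in GB, rounded to two decimals."""
    return round(sum(components.values()), 2)


if __name__ == "__main__":
    print(f"int4 weights on disk: ~{total_gb(INT4_COMPONENTS_GB)}GB")
```

The listed parts come to well under 20GB, which is why the int4 weights are the safer choice against a ~48GB default GPU budget.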

# Generate (two-stage = base + upscaler). Community port — expect beta-flagged subcommands.
ltx-2-mlx generate --prompt "A sunset over the ocean" --two-stage -o sunset.mp4

An alternative MLX runtime, mlx-video-with-audio (authored by Prince Canuma, the developer behind MLX-VLM and MLX-Audio), describes itself as: "Generate videos with synchronized audio on Apple Silicon using MLX. Supports … LTX-2." Either runtime works; the dgrauet port has the most community downloads for the 2.3 weights.

Running

In Draw Things (Path A): pick the LTX-2.3 [distilled] model, enter a prompt (text-to-video) or supply an init image (image-to-video), keep the clip short, and press generate. Audio is produced jointly with video for the audio-capable LTX-2.3 variants.

In MLX (Path B): the generate command above writes an .mp4. Use --two-stage for the base-plus-upscaler pipeline and the int4 distilled weights for the lowest memory footprint on a 64GB Mac. Subcommands flagged [beta] or [experimental] in the port may have known quality limitations — this is preview tooling.

The official stock-PyTorch path also exists: the Lightricks LTX-2.3 model card ships a Diffusers example that instructs you to "switch to 'mps' for apple devices". It works, but it is the slowest and most fragile option on a Mac (no torch.compile, op-by-op MPS fallbacks) — use Draw Things or MLX instead unless you specifically need the reference Diffusers pipeline.
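If you do try the reference PyTorch path, the device switch the model card mentions can be made explicit. The sketch below is not the Lightricks example. It only shows one way to choose a device string, using `torch.backends.mps.is_available()` and `torch.cuda.is_available()` when PyTorch is installed; `pick_device` and `detect_device` are hypothetical helper names.

```python
# Sketch: choose a PyTorch device string, preferring Apple's MPS backend.
# ASSUMPTION: PyTorch may not be installed; in that case we just report "cpu".


def pick_device(mps_available: bool, cuda_available: bool = False) -> str:
    """Pure decision logic, kept separate so it can be checked without torch."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends, falling back to CPU without it."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return pick_device(torch.backends.mps.is_available(), torch.cuda.is_available())


if __name__ == "__main__":
    print(f"Selected device: {detect_device()}")
```

On an M2 Max with a Metal-enabled PyTorch build this should report `mps`; the resulting string is what you would hand to the pipeline's `.to(...)` call in a Diffusers-style script.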

Results

Speed: Omitted for the M2 Max — no first-party benchmark exists. Our /check data has no Apple measurement for this pair (/check/ltx-video-2-3/m2-max returns unknown). The only public chip-named timings are for other chips and must not be read as M2 Max throughput: a community MLX run on an M4 Max 128GB reports 152.5s denoising and 34.2GB peak memory for a 576×1024, 121-frame (5s) video-only clip (gajesh/LTX-2.3-mlx-fp16 model card). The M2 Max's ~400 GB/s bandwidth is well below the M4 Max's ~546 GB/s, so it will be slower — but by how much is unmeasured here. If you run this, please contribute your numbers so the next reader gets a real M2 Max figure.
Unified-memory usage: The int4 distilled MLX path keeps the resident set near the distilled transformer (11.32GB) plus connector (6.34GB) and VAE/vocoder; the int8 path is heavier (the cited M4 Max run peaked at 34.2GB). On a 64GB M2 Max (~48GB GPU-addressable) this fits, but tightly — raise the wired limit (Troubleshooting) and keep clips short.
Quality notes: Preview-grade. Distilled checkpoints trade fidelity for fewer steps; the community thread for the LTX family reports frame-count limits before noise appears (one user noted a practical ceiling around 4 seconds on a Mac, Lightricks/LTX-Video discussion #26). The CUDA-only IC-LoRAs that sharpen the NVIDIA output are unavailable here.

For the full benchmark data, see /check/ltx-video-2-3/m2-max.

Troubleshooting

Out of memory / the GPU can't allocate enough

A 64GB Mac only exposes ~48GB to the GPU by default (Metal's recommendedMaxWorkingSetSize is ~75% of unified memory). LTX-2.3 video sits near that ceiling. Raise the GPU wired limit before generating (macOS Sonoma 14 / Sequoia 15+):

# Allow the GPU to wire down ~56GB on a 64GB Mac. Leave 8–16GB for macOS.
sudo sysctl iogpu.wired_limit_mb=57344
# Reset to default afterwards:
sudo sysctl iogpu.wired_limit_mb=0

This is temporary and resets on reboot. Watch Activity Monitor's Memory-Pressure gauge — pushing toward 100% causes swapping and instability. On older macOS (Monterey/Ventura) the key is debug.iogpu.wired_limit in bytes instead.
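The value 57344 used above is simply 56GB expressed in MB. A small sketch of that conversion, for readers with a different memory size or who want to leave a larger reserve (the helper names are hypothetical, and the 8GB default reserve is the low end of the 8–16GB range suggested above):

```python
# Sketch: derive the sysctl value from total unified memory and an OS reserve.
# ASSUMPTION: helper names are illustrative; the keys are the ones named in this guide.


def wired_limit_mb(unified_gb: int, reserve_gb: int = 8) -> int:
    """MB value for iogpu.wired_limit_mb, leaving reserve_gb for macOS."""
    if not 0 < reserve_gb < unified_gb:
        raise ValueError("reserve_gb must be positive and smaller than unified_gb")
    return (unified_gb - reserve_gb) * 1024


def wired_limit_bytes(unified_gb: int, reserve_gb: int = 8) -> int:
    """Same limit in bytes, the unit this guide gives for the older key."""
    return wired_limit_mb(unified_gb, reserve_gb) * 1024 * 1024


if __name__ == "__main__":
    print(f"sudo sysctl iogpu.wired_limit_mb={wired_limit_mb(64)}")
    print(f"older macOS value in bytes: {wired_limit_bytes(64)}")
```

With a 16GB reserve on a 64GB Mac the same function gives 49152, which is barely above the default ceiling, so the reserve you pick matters more than the command itself.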

Noisy / garbage video on the stock PyTorch (MPS) path

If you use the reference Diffusers/MPS path rather than Draw Things or MLX, recent PyTorch builds can produce noise on Apple Silicon. A community user reported "torch 2.5 or above would give me noise…Backed down to 2.4.1 solved the problem" (Lightricks/LTX-Video discussion #26). The simplest fix is to avoid the stock-PyTorch path entirely and use Draw Things (Path A) or the MLX port (Path B), whose engines don't hit this bug.
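If you stay on the stock path anyway, a version guard can at least flag the configuration the report describes. This is a minimal sketch based only on that single community report (noise on torch 2.5 or above, fine on 2.4.1); the threshold and the function names are assumptions, not a confirmed bug boundary.

```python
# Sketch: flag PyTorch versions at or above the one named in the community report.
# ASSUMPTION: the (2, 5) threshold comes from a single user report, not from
# an official PyTorch or Lightricks advisory.

REPORTED_NOISY_FROM = (2, 5)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings like '2.4.1' or '2.6.0+cpu'."""
    core = version.split("+", 1)[0]  # drop local build tags such as "+cpu"
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def may_hit_mps_noise(version: str) -> bool:
    """True if the version falls in the range the report describes as noisy."""
    return parse_major_minor(version) >= REPORTED_NOISY_FROM


if __name__ == "__main__":
    for v in ("2.4.1", "2.5.0", "2.6.0+cpu"):
        print(v, "-> check output for noise" if may_hit_mps_noise(v) else "-> reported OK")
```

In a real script you would pass `torch.__version__` to `may_hit_mps_noise` and print a warning before starting a long generation.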

fp8 weights fail on Metal

fp8-quantized checkpoints (common in the NVIDIA workflow) do not run on Apple's Metal backend — a community user in the LTX-2 Mac thread notes you must "change the model to b16 or a gguf" (Lightricks/LTX-2 discussion #43). Use the bf16 or MLX-quantized (int4/int8) weights linked above, never an fp8 file. There is no FP8 or NVFP4 tensor-core hardware on Apple Silicon to accelerate those formats anyway.
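Before loading a downloaded checkpoint, you can check whether it contains fp8 tensors by reading the safetensors header, which is a length-prefixed JSON block at the start of the file. The sketch below assumes the file is in safetensors format and that fp8 tensors carry the dtype tags `F8_E4M3` or `F8_E5M2`; the helper names and the demo file are made up for illustration, and other fp8 tags may exist.

```python
# Sketch: list fp8 tensors in a .safetensors file by reading only its header.
# ASSUMPTION: fp8 tensors use the dtype tags below; other tags may exist.
import json
import os
import struct
import tempfile

FP8_DTYPES = {"F8_E4M3", "F8_E5M2"}


def read_safetensors_header(path: str) -> dict:
    """Read the JSON header: 8-byte little-endian length, then UTF-8 JSON."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def fp8_tensor_names(path: str) -> list[str]:
    """Names of tensors whose declared dtype is one of the fp8 tags."""
    header = read_safetensors_header(path)
    return sorted(
        name for name, info in header.items()
        if name != "__metadata__" and info.get("dtype") in FP8_DTYPES
    )


def write_demo_file(path: str) -> None:
    """Write a tiny stand-in file (hypothetical tensors) for the demo below."""
    header = {
        "__metadata__": {"format": "pt"},
        "layer.a": {"dtype": "F8_E4M3", "shape": [2], "data_offsets": [0, 2]},
        "layer.b": {"dtype": "BF16", "shape": [1], "data_offsets": [2, 4]},
    }
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)))
        f.write(blob)
        f.write(bytes(4))  # placeholder tensor data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.safetensors")
        write_demo_file(demo)
        print("fp8 tensors:", fp8_tensor_names(demo))
```

A non-empty result means the file is an fp8 checkpoint and, per the thread quoted above, should be swapped for a bf16 or MLX-quantized one.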

It's slow

Expected. Video generation on Apple Silicon is minutes-per-clip and has no torch.compile acceleration. Use the distilled int4 weights, keep resolution and frame count low, and treat the output as a preview. Report your timings via the submission form so we can seed a real M2 Max benchmark.
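To make a timing report reproducible, you can wrap the generation command in a small wall-clock timer. This is a generic sketch, not part of either tool: `time_command` is a hypothetical helper, and the placeholder command should be replaced with whatever invocation you actually run. It measures the whole process, including model load, so it is not directly comparable to a denoising-only figure.

```python
# Sketch: time an arbitrary command end to end with a wall-clock timer.
# ASSUMPTION: the command below is a placeholder so the sketch runs anywhere;
# substitute your own generation command as a list of arguments.
import subprocess
import sys
import time


def time_command(argv: list[str]) -> tuple[float, int]:
    """Run argv and return (elapsed seconds, process exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    return time.perf_counter() - start, completed.returncode


if __name__ == "__main__":
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, wall-clock {elapsed:.1f}s")
```

When reporting, include the chip, memory size, resolution, frame count, and weight format alongside the elapsed time so the number can be compared with others.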