How much VRAM does LTX Video 2.3 need?

About 34 GB at peak on the int8 path (34.2GB measured); the int4 path needs less.

How hard is this setup?

Advanced. Follow the steps below.

LTX Video 2.3 on Apple M4 Max: experimental 22B audio-video in unified memory via Draw Things (Metal) or MLX

What You'll Build

Generate short text-to-video and image-to-video clips — with joint audio — locally on an Apple M4 Max using LTX Video 2.3, the 22B-parameter audio-video DiT from Lightricks. This is the one open video model with a real, cited Apple-Silicon path, but it sits at the experimental edge of what a Mac can do: expect preview-quality output, long generation times, and a tight unified-memory budget. The lead path is Draw Things, a native Metal app whose own engine sidesteps the CUDA-only machinery the model normally relies on; an MLX community port is the second option for terminal users.

Hardware data: Apple M4 Max (48 GB unified memory, ~546 GB/s, ~32GB safely GPU-addressable) · LTX-2.3 22B distilled · See benchmark data

⚠️ Experimental / preview-quality only. Video is the weakest local-AI tier on Apple Silicon. LTX-2.3 on a Mac is for experimentation, not production: the CUDA-only IC-LoRAs and torch.compile acceleration the NVIDIA workflow depends on are unavailable on Metal/MPS, the stock PyTorch path has a known noise bug on recent torch builds, and no first-party Apple benchmark is seeded for the M4 Max yet. Treat every clip as a preview. Generation is minutes-per-clip, not seconds.

ℹ️ "M4 Max 48GB" is the floor, not headroom. Apple's unified memory is shared between CPU and GPU, and by default Metal only lets the GPU wire down ~66–75% of it (~32GB safe / ~36GB optimistic on a 48GB Mac). The distilled int8 MLX path peaked at 34.2GB in a community run on an M4 Max (see Results), which is above the ~32GB safe line, so the int8 path needs a wired-limit raise on a 48GB machine. The lighter int4 path (an 11.32GB transformer plus a 6.34GB connector and ~0.8GB VAE, roughly 18.5GB of weights in total) fits comfortably with no raise. Macs with less than 32GB unified memory are not recommended for this model.
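The budget arithmetic in this note can be sketched in a few lines of Python. This is a rough estimate only: the 0.66 and 0.75 fractions are the approximate range quoted above, not values read from Metal, and the function names are made up for illustration.

```python
# Rough GPU-addressable budget for a unified-memory Mac.
# The fractions are the approximate 66-75% range quoted in this guide,
# not values queried from Metal at runtime.

def gpu_budget_gb(unified_gb: float,
                  safe_frac: float = 0.66,
                  optimistic_frac: float = 0.75) -> tuple[float, float]:
    """Return (safe, optimistic) GPU-addressable estimates in GB."""
    return unified_gb * safe_frac, unified_gb * optimistic_frac


def needs_wired_raise(peak_gb: float, unified_gb: float) -> bool:
    """True if the reported peak exceeds the conservative (safe) estimate."""
    safe, _ = gpu_budget_gb(unified_gb)
    return peak_gb > safe


safe, optimistic = gpu_budget_gb(48)
print(f"48GB Mac: safe ~{safe:.1f}GB, optimistic ~{optimistic:.1f}GB")
print("int8 peak of 34.2GB needs a raise:", needs_wired_raise(34.2, 48))
```

On a 48GB machine the 34.2GB int8 peak falls between the safe and optimistic estimates, which is why this guide treats the raise as required rather than optional.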

Requirements

Component	Minimum	Tested
GPU / chip	Apple Silicon, 32GB+ unified memory	Apple M4 Max (48 GB unified memory)
Unified memory	32GB (raise the GPU wired limit for the int8 path; see Troubleshooting)	48GB
Storage	~30GB (MLX distilled weights) or in-app (Draw Things)	~30GB
Software	macOS Sonoma 14+ · Draw Things or Python 3.12+ with MLX	macOS Sequoia 15
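Before downloading the MLX weights, a quick free-space check can save a failed download. The sketch below uses only the Python standard library; the 30GB threshold is the storage figure from the table, and the helper name is hypothetical.

```python
import shutil

# Hypothetical helper: check that the target volume has room for the
# ~30GB of MLX distilled weights listed in the Requirements table.

def has_free_space(path: str = ".", need_gb: float = 30.0) -> bool:
    """Return True if the volume holding `path` has at least `need_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal GB
    return free_gb >= need_gb


if not has_free_space(".", 30.0):
    print("Less than ~30GB free here; pick another volume for the weights.")
```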

There is no CUDA, no FlashAttention, and no torch.compile on this platform — skip every pip install flash-attn step and every --use-cuda / cu128 wheel index you'll see in the NVIDIA LTX guides. The two Apple-native paths below replace that entire stack.

Installation

Path A (recommended) — Draw Things, native Metal

Draw Things is a free, native macOS app with its own Metal inference engine, including its own Metal attention, so it does not need CUDA FlashAttention. Its release notes list "Support LTX-2.3 22B series models", and the Draw Things engineering team reports shipping Metal compute shaders that "improved LTX-2.3 video VAE decoding speed by about 2.4x on M1 through M4, and 4.7x on M5" (Draw Things engineering blog). Note that the 2.4x is a family-level figure spanning M1 through M4, not an M4 Max-specific elapsed-time measurement.

Install Draw Things from the Mac App Store or drawthings.ai/downloads.
In the app, switch to a video pipeline and import an LTX-2.3 22B model from the built-in model manager (the app ships built-in metadata for the LTX-2.3 series; the [distilled] variant is the fastest). See the Draw Things LTX-2 wiki page and Video Generation Basics for the current import list and settings.
Choose the distilled checkpoint for fewer steps, set a short frame count and a modest resolution (LTX-2.3 officially supports up to 1080×1920 9:16, but start small on a Mac), and generate.

No terminal, no Python environment, no quantization flags — Draw Things handles the Metal mapping and memory plan internally. This is why it is the recommended Apple path.

Path B — MLX (terminal), community port

A community MLX port runs the same 22B distilled checkpoint natively on Apple Silicon. The most-downloaded runtime is dgrauet/ltx-2-mlx:

# Clone and set up the MLX port (community, not official Lightricks)
git clone https://github.com/dgrauet/ltx-2-mlx.git
cd ltx-2-mlx
uv sync --all-extras

The repo's README states it targets LTX-2.3 (22B) and runs on "macOS with Apple Silicon (M1/M2/M3/M4)", with memory guidance "32GB+ RAM recommended (int8) or 16GB+ with --low-ram. 16GB minimum (int4 without streaming)" (dgrauet/ltx-2-mlx). On a 48 GB M4 Max, prefer the int4 distilled weights, whose quantized transformer is 11.32GB on disk plus a 6.34GB connector and ~0.8GB VAE (dgrauet/ltx-2.3-mlx-q4 file tree).
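To see how the int4 file sizes quoted above add up, here is a small sum in Python. The sizes are the on-disk figures from the file tree cited above; runtime memory will be higher once activations and the vocoder are loaded, so treat the total as a lower bound.

```python
# On-disk sizes (GB) of the int4 distilled components quoted in this guide.
# Runtime peak memory is not measured here and will be higher than this sum.
INT4_COMPONENTS_GB = {
    "transformer": 11.32,
    "connector": 6.34,
    "vae": 0.8,  # approximate
}


def total_weights_gb(components: dict[str, float]) -> float:
    """Sum component sizes and round to two decimals."""
    return round(sum(components.values()), 2)


print(f"int4 weights on disk: ~{total_weights_gb(INT4_COMPONENTS_GB)}GB")
```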

# Generate (two-stage = base + upscaler). Community port — expect beta-flagged subcommands.
ltx-2-mlx generate --prompt "A sunset over the ocean" --two-stage -o sunset.mp4

An alternative MLX runtime, mlx-video-with-audio (authored by Prince Canuma, the developer behind MLX-VLM and MLX-Audio), describes itself this way: "Generate videos with synchronized audio on Apple Silicon using MLX. Supports … LTX-2." Either runtime works; the dgrauet port has the most community downloads for the 2.3 weights.

Running

In Draw Things (Path A): pick the LTX-2.3 [distilled] model, enter a prompt (text-to-video) or supply an init image (image-to-video), keep the clip short, and press generate. Audio is produced jointly with video for the audio-capable LTX-2.3 variants.

In MLX (Path B): the generate command above writes an .mp4. Use --two-stage for the base-plus-upscaler pipeline and the int4 distilled weights for the lowest memory footprint on a 48 GB Mac. Subcommands flagged [beta] or [experimental] in the port may have known quality limitations — this is preview tooling.
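If you want to script several prompts, the Path B command can be assembled from Python. This sketch only builds the argument list using the flags shown in this guide (--prompt, --two-stage, -o); it assumes the ltx-2-mlx executable is on your PATH inside the project environment, and the helper name is invented for illustration.

```python
import shlex

# Hypothetical wrapper around the CLI invocation shown in this guide.
# Only flags that appear in the guide are used; nothing else is assumed
# about the community port's interface.

def build_generate_cmd(prompt: str, output: str, two_stage: bool = True) -> list[str]:
    """Build the argv list for one generation run."""
    cmd = ["ltx-2-mlx", "generate", "--prompt", prompt]
    if two_stage:
        cmd.append("--two-stage")
    cmd += ["-o", output]
    return cmd


cmd = build_generate_cmd("A sunset over the ocean", "sunset.mp4")
print(shlex.join(cmd))
# To actually run it (from the cloned repo's environment):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run instead of a shell string avoids quoting problems when prompts contain apostrophes or quotes.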

The official stock-PyTorch path also exists: the Lightricks LTX-2.3 model card ships a Diffusers example that instructs you to "switch to 'mps' for apple devices". It works, but it is the slowest and most fragile option on a Mac (no torch.compile, op-by-op MPS fallbacks) — use Draw Things or MLX instead unless you specifically need the reference Diffusers pipeline.
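For readers who do try the reference Diffusers path, the "switch to 'mps'" step amounts to choosing a device string. A minimal sketch, assuming PyTorch may or may not be installed; it does not load the LTX pipeline itself, and the function name is hypothetical.

```python
# Pick the device string to hand to a PyTorch/Diffusers pipeline.
# Falls back to "cpu" when PyTorch is missing or MPS is unavailable.

def pick_device() -> str:
    try:
        import torch
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on very old builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print("device:", pick_device())
```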

Results

Speed: Omitted for the M4 Max — no first-party benchmark is seeded. Our /check data has no Apple measurement for this pair (/check/ltx-video-2-3/m4-max returns unknown). One community MLX run on an M4 Max 128GB reports 152.5s denoising for a 576×1024, 121-frame (5s) video-only clip (gajesh/LTX-2.3-mlx-fp16 model card, section "Benchmarks (M4 Max 128GB)") — chip-exact for the M4 Max but a single community source, and on a 128GB machine with far more memory headroom than this 48GB config, so we don't publish it as our throughput. The Draw Things VAE-decode speedup is reported only as a family-level 2.4x multiplier spanning M1 through M4, not an M4 Max elapsed time. If you run this, please contribute your numbers so the next reader gets a real M4 Max figure for our config.
Unified-memory usage: The int4 distilled MLX path keeps the resident set near the distilled transformer (11.32GB) plus connector (6.34GB) and VAE/vocoder; the int8 path is heavier — the same M4 Max community run peaked at 34.2GB (gajesh/LTX-2.3-mlx-fp16 model card, measured on an M4 Max). On a 48 GB M4 Max (~32GB safe / ~36GB optimistic GPU-addressable) the int4 path fits comfortably, but the 34.2GB int8 peak is above the ~32GB safe line — raise the wired limit (Troubleshooting) for the int8 path and keep clips short. The int4 path needs no raise.
Quality notes: Preview-grade. Distilled checkpoints trade fidelity for fewer steps; the community thread for the LTX family reports frame-count limits before noise appears (one user noted a practical ceiling around 4 seconds on a Mac, Lightricks/LTX-Video discussion #26). The CUDA-only IC-LoRAs that sharpen the NVIDIA output are unavailable here.
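To put the single community timing in perspective, the per-frame cost and the slowdown relative to real time can be worked out as below. The 24 fps value is an assumption inferred from the "121-frame (5s)" description; the timing covers denoising only, so end-to-end time is longer.

```python
# Throughput arithmetic for the community M4 Max 128GB run cited above.
# fps=24 is an assumption inferred from "121 frames (5s)".

def seconds_per_frame(denoise_s: float, frames: int) -> float:
    """Denoising seconds spent per output frame."""
    return denoise_s / frames


def realtime_factor(denoise_s: float, frames: int, fps: float) -> float:
    """How many times slower than real-time playback the denoising was."""
    clip_seconds = frames / fps
    return denoise_s / clip_seconds


spf = seconds_per_frame(152.5, 121)
factor = realtime_factor(152.5, 121, fps=24)
print(f"{spf:.2f} s/frame, about {factor:.0f}x slower than real time")
```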

For the full benchmark data, see /check/ltx-video-2-3/m4-max.

Troubleshooting

Out of memory / the GPU can't allocate enough (int8 path)

A 48GB Mac only exposes ~32GB safe / ~36GB optimistic to the GPU by default (Metal's recommendedMaxWorkingSetSize is ~66–75% of unified memory). The int8 LTX-2.3 path peaks at 34.2GB — above that default ceiling. Raise the GPU wired limit before generating the int8 path (macOS Sonoma 14 / Sequoia 15+):

# Allow the GPU to wire down ~40GB on a 48GB Mac. Leave 8GB for macOS.
sudo sysctl iogpu.wired_limit_mb=40960
# Reset to default afterwards:
sudo sysctl iogpu.wired_limit_mb=0

This is temporary and resets on reboot. Watch Activity Monitor's Memory Pressure gauge: pushing toward 100% causes swapping and instability. On older macOS (Monterey/Ventura) the key is debug.iogpu.wired_limit, set in bytes rather than megabytes. The int4 path (an 11.32GB transformer, roughly 18.5GB of weights with the connector and VAE) fits the default ~32GB safe pool with no raise needed; prefer it if you'd rather not touch the wired limit.
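The 40960 value in the command above is total memory minus a reserve for macOS, expressed in MiB. A small sketch of that arithmetic for other memory sizes follows; the 8GB reserve mirrors the comment in the command and is a rule of thumb, not an Apple recommendation. The script only prints the command and does not run it.

```python
# Compute a wired-limit value for a given Mac, leaving a reserve for macOS.
# The 8GB default reserve follows this guide's example, not an Apple figure.

def wired_limit_mb(total_gb: int, reserve_gb: int = 8) -> int:
    """Value for iogpu.wired_limit_mb (macOS Sonoma 14 and later)."""
    if reserve_gb <= 0 or reserve_gb >= total_gb:
        raise ValueError("reserve must be positive and smaller than total memory")
    return (total_gb - reserve_gb) * 1024


def wired_limit_bytes(total_gb: int, reserve_gb: int = 8) -> int:
    """Same limit in bytes, for the older debug.iogpu.wired_limit key."""
    return wired_limit_mb(total_gb, reserve_gb) * 1024 * 1024


print(f"sudo sysctl iogpu.wired_limit_mb={wired_limit_mb(48)}")
```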

Noisy / garbage video on the stock PyTorch (MPS) path

If you use the reference Diffusers/MPS path rather than Draw Things or MLX, recent PyTorch builds can produce noise on Apple Silicon. A community user reported "torch 2.5 or above would give me noise…Backed down to 2.4.1 solved the problem" (Lightricks/LTX-Video discussion #26). The simplest fix is to avoid the stock-PyTorch path entirely and use Draw Things (Path A) or the MLX port (Path B), whose engines don't hit this bug.
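If you stay on the stock path anyway, a version guard can flag the builds named in that report. The sketch below parses a version string and compares it against 2.5; the threshold comes from that single community report, not from a confirmed PyTorch bug range, and the function names are made up for illustration.

```python
import re

# Flag PyTorch versions at or above 2.5, the range one community user
# reported as producing noise on Apple Silicon. Threshold is anecdotal.

def parse_version(version: str) -> tuple[int, ...]:
    """Parse 'major.minor[.patch]' and ignore suffixes such as '+cpu'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def is_suspect_for_mps(version: str) -> bool:
    return parse_version(version) >= (2, 5, 0)


for v in ("2.4.1", "2.5.0", "2.6.0+cpu"):
    print(v, "suspect" if is_suspect_for_mps(v) else "reported OK")
```

In a real environment you would pass torch.__version__ to is_suspect_for_mps.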

fp8 weights fail on Metal

fp8-quantized checkpoints (common in the NVIDIA workflow) do not run on Apple's Metal backend — a community user in the LTX-2 Mac thread notes you must "change the model to b16 or a gguf" (Lightricks/LTX-2 discussion #43). Use the bf16 or MLX-quantized (int4/int8) weights linked above, never an fp8 file. There is no FP8 or NVFP4 tensor-core hardware on Apple Silicon to accelerate those formats anyway.
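One way to catch an fp8 file before loading it is to read the header of a .safetensors checkpoint, which is an 8-byte little-endian length followed by a JSON table of tensor dtypes. The sketch below assumes the checkpoint is in safetensors format (GGUF and other containers are not covered); the function names are hypothetical and the demo uses an in-memory header rather than a real model file.

```python
import io
import json
import struct

# dtype tags the safetensors format uses for 8-bit floats
FP8_DTYPES = {"F8_E4M3", "F8_E5M2"}


def safetensors_dtypes(fh) -> set[str]:
    """Read only the header of an open binary safetensors file."""
    (header_len,) = struct.unpack("<Q", fh.read(8))
    header = json.loads(fh.read(header_len))
    return {info["dtype"] for name, info in header.items()
            if name != "__metadata__"}


def has_fp8(fh) -> bool:
    return bool(safetensors_dtypes(fh) & FP8_DTYPES)


def demo_file(dtype: str) -> io.BytesIO:
    """Build a tiny in-memory header for demonstration purposes only."""
    body = json.dumps(
        {"w": {"dtype": dtype, "shape": [1], "data_offsets": [0, 1]}}
    ).encode("utf-8")
    return io.BytesIO(struct.pack("<Q", len(body)) + body)


print("fp8 file flagged:", has_fp8(demo_file("F8_E4M3")))
print("bf16 file flagged:", has_fp8(demo_file("BF16")))
```

For a real checkpoint, open it with open(path, "rb") and pass the file object; only the header is read, so the check is fast even on multi-gigabyte files.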

It's slow

Expected. Video generation on Apple Silicon is minutes-per-clip and has no torch.compile acceleration. Use the distilled int4 weights, keep resolution and frame count low, and treat the output as a preview. Report your timings via the submission form so we can seed a real M4 Max benchmark.