What You'll Build
Generate short text-to-video and image-to-video clips — with joint audio — locally on an Apple M2 Max using LTX Video 2.3, the 22B-parameter audio-video DiT from Lightricks. This is the one open video model with a real, cited Apple-Silicon path, but it sits at the experimental edge of what a Mac can do: expect preview-quality output, long generation times, and a tight unified-memory budget. The lead path is Draw Things, a native Metal app whose own engine sidesteps the CUDA-only machinery the model normally relies on; an MLX community port is the second option for terminal users.
Hardware data: Apple M2 Max (64GB unified memory, ~400 GB/s, ~48GB GPU-addressable) · LTX-2.3 22B distilled · See benchmark data
⚠️ Experimental / preview-quality only. Video is the weakest local-AI tier on Apple Silicon. LTX-2.3 on a Mac is for experimentation, not production: the CUDA-only IC-LoRAs and torch.compile acceleration the NVIDIA workflow depends on are unavailable on Metal/MPS, the stock PyTorch path has a known noise bug on recent torch builds, and no first-party Apple benchmark exists for the M2 Max yet. Treat every clip as a preview. Generation is minutes-per-clip, not seconds.
ℹ️ "M2 Max 64GB" is the floor, not headroom. Apple's unified memory is shared between CPU and GPU, and by default Metal only lets the GPU wire down ~75% of it (~48GB on a 64GB Mac). The distilled int8 MLX path peaks in the mid-30s of GB on a larger Mac (see Results; the figure is from a different chip), so a 64GB M2 Max is the minimum viable config and you must raise the wired limit. Macs with less than 64GB unified memory are not recommended for this model.
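The budget arithmetic in this note can be sketched in a few lines of Python. This is a minimal illustration, assuming the ~75% default GPU share quoted in this guide; `gpu_budget_gb` and `headroom_gb` are hypothetical helper names, and the 34.2GB peak is the M4 Max figure cited in Results, not an M2 Max measurement.

```python
# Sketch of the unified-memory budget described in this guide.
# Assumption: the GPU may wire down ~75% of unified memory by default.
DEFAULT_GPU_FRACTION = 0.75

def gpu_budget_gb(unified_gb: float, fraction: float = DEFAULT_GPU_FRACTION) -> float:
    """Approximate GPU-addressable memory for a given unified-memory size."""
    return unified_gb * fraction

def headroom_gb(unified_gb: float, peak_gb: float) -> float:
    """Budget minus the expected peak; a negative value means it will not fit."""
    return gpu_budget_gb(unified_gb) - peak_gb

# 34.2GB is the peak reported for the M4 Max run cited in Results.
CITED_PEAK_GB = 34.2

for total in (32, 48, 64):
    print(f"{total}GB Mac: budget {gpu_budget_gb(total):.1f}GB, "
          f"headroom {headroom_gb(total, CITED_PEAK_GB):+.1f}GB")
```

A negative headroom for the 32GB and 48GB configurations is the reason this guide treats 64GB as the floor.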
Requirements
| Component | Minimum | Tested |
|---|---|---|
| GPU / chip | Apple Silicon, 64GB unified memory | Apple M2 Max (64GB unified memory) |
| Unified memory | 64GB (raise the GPU wired limit; see Troubleshooting) | 64GB |
| Storage | ~30GB (MLX distilled weights) or in-app (Draw Things) | ~30GB |
| Software | macOS Sonoma 14+ · Draw Things or Python 3.12+ with MLX | macOS Sequoia 15 |
There is no CUDA, no FlashAttention, and no torch.compile on this platform — skip every pip install flash-attn step and every --use-cuda / cu128 wheel index you'll see in the NVIDIA LTX guides. The two Apple-native paths below replace that entire stack.
Installation
Path A (recommended) — Draw Things, native Metal
Draw Things is a free, native macOS app with its own Metal inference engine (including its own Metal attention — it does not need CUDA FlashAttention). Its release notes explicitly add "Support LTX-2.3 22B series models" and the Draw Things engineering team reports shipping Metal compute shaders that "improved LTX-2.3 video VAE decoding speed by about 2.4x on M1 through M4, and 4.7x on M5" (Draw Things engineering blog).
- Install Draw Things from the Mac App Store or drawthings.ai/downloads.
- In the app, switch to a video pipeline and import an LTX-2.3 22B model from the built-in model manager (the app ships built-in metadata for the LTX-2.3 series; the [distilled] variant is the fastest). See the Draw Things LTX-2 wiki page and Video Generation Basics for the current import list and settings.
- Choose the distilled checkpoint for fewer steps, set a short frame count and a modest resolution (LTX-2.3 officially supports up to 1080×1920 9:16, but start small on a Mac), and generate.
No terminal, no Python environment, no quantization flags — Draw Things handles the Metal mapping and memory plan internally. This is why it is the recommended Apple path.
Path B — MLX (terminal), community port
A community MLX port runs the same 22B distilled checkpoint natively on Apple Silicon. The most-downloaded runtime is dgrauet/ltx-2-mlx:
# Clone and set up the MLX port (community, not official Lightricks)
git clone https://github.com/dgrauet/ltx-2-mlx.git
cd ltx-2-mlx
uv sync --all-extras
The repo's README states it targets LTX-2.3 (22B) and runs on "macOS with Apple Silicon (M1/M2/M3/M4)", with memory guidance "32GB+ RAM recommended (int8) or 16GB+ with --low-ram. 16GB minimum (int4 without streaming)" (dgrauet/ltx-2-mlx). On a 64GB M2 Max, prefer the int4 distilled weights, whose quantized transformer is 11.32GB on disk plus a 6.34GB connector and ~0.8GB VAE (dgrauet/ltx-2.3-mlx-q4 file tree).
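To see how the int4 component sizes add up, here is a small sketch. It uses only the three file sizes quoted above; the dictionary and function names are hypothetical, the VAE size is approximate, and other files in the repository are not counted.

```python
# Sum the int4 component sizes quoted from the dgrauet/ltx-2.3-mlx-q4 file tree.
# Assumption: only these three components are counted; the VAE size is approximate.
INT4_COMPONENTS_GB = {
    "transformer (int4)": 11.32,
    "connector": 6.34,
    "vae (approx.)": 0.8,
}

def total_gb(components: dict[str, float]) -> float:
    """Total on-disk size of the listed components, rounded to 2 decimals."""
    return round(sum(components.values()), 2)

total = total_gb(INT4_COMPONENTS_GB)
print(f"int4 components: {total}GB of a ~48GB default GPU budget")
```

The sum covers weights on disk only. Activations and decode buffers come on top at generation time, so treat it as a lower bound on memory use.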
# Generate (two-stage = base + upscaler). Community port — expect beta-flagged subcommands.
ltx-2-mlx generate --prompt "A sunset over the ocean" --two-stage -o sunset.mp4
An alternative MLX runtime, mlx-video-with-audio (authored by Prince Canuma, the developer behind MLX-VLM and MLX-Audio), describes itself as: "Generate videos with synchronized audio on Apple Silicon using MLX. Supports … LTX-2." Either runtime works; the dgrauet port has the most community downloads for the 2.3 weights.
Running
In Draw Things (Path A): pick the LTX-2.3 [distilled] model, enter a prompt (text-to-video) or supply an init image (image-to-video), keep the clip short, and press generate. Audio is produced jointly with video for the audio-capable LTX-2.3 variants.
In MLX (Path B): the generate command above writes an .mp4. Use --two-stage for the base-plus-upscaler pipeline and the int4 distilled weights for the lowest memory footprint on a 64GB Mac. Subcommands flagged [beta] or [experimental] in the port may have known quality limitations — this is preview tooling.
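If you want to queue several prompts, a thin Python wrapper around the CLI is one option. The sketch below is an illustration, not part of the port: it uses only the flags shown in this guide (`--prompt`, `--two-stage`, `-o`), the helper names and output filenames are hypothetical, and it defaults to a dry run that prints commands instead of executing them.

```python
import shlex
import subprocess

def build_generate_cmd(prompt: str, output: str, two_stage: bool = True) -> list[str]:
    """Build the argument list for the generate command shown in this guide."""
    cmd = ["ltx-2-mlx", "generate", "--prompt", prompt]
    if two_stage:
        cmd.append("--two-stage")
    cmd += ["-o", output]
    return cmd

def run_batch(prompts: list[str], dry_run: bool = True) -> list[str]:
    """Run (or just print) one generation per prompt, strictly one at a time."""
    issued = []
    for i, prompt in enumerate(prompts):
        cmd = build_generate_cmd(prompt, f"clip_{i:02d}.mp4")  # hypothetical filenames
        issued.append(shlex.join(cmd))
        if dry_run:
            print(issued[-1])
        else:
            # Sequential on purpose: two concurrent runs would not fit in memory.
            subprocess.run(cmd, check=True)
    return issued

run_batch(["A sunset over the ocean", "Fog rolling over a pine forest"])
```

Running the clips sequentially keeps only one copy of the model resident, which matters on a 64GB machine.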
The official stock-PyTorch path also exists: the Lightricks LTX-2.3 model card ships a Diffusers example that instructs you to "switch to 'mps' for apple devices". It works, but it is the slowest and most fragile option on a Mac (no torch.compile, op-by-op MPS fallbacks) — use Draw Things or MLX instead unless you specifically need the reference Diffusers pipeline.
Results
- Speed: Omitted for the M2 Max; no first-party benchmark exists. Our /check data has no Apple measurement for this pair (/check/ltx-video-2-3/m2-max returns unknown). The only public chip-named timings are for other chips and must not be read as M2 Max throughput: a community MLX run on an M4 Max 128GB reports 152.5s denoising and 34.2GB peak memory for a 576×1024, 121-frame (5s) video-only clip (gajesh/LTX-2.3-mlx-fp16 model card). The M2 Max's ~400 GB/s bandwidth is well below the M4 Max's ~546 GB/s, so it will be slower, but by how much is unmeasured here. If you run this, please contribute your numbers so the next reader gets a real M2 Max figure.
- Unified-memory usage: The int4 distilled MLX path keeps the resident set near the distilled transformer (11.32GB) plus connector (6.34GB) and VAE/vocoder; the int8 path is heavier (the cited M4 Max run peaked at 34.2GB). On a 64GB M2 Max (~48GB GPU-addressable) this fits, but tightly: raise the wired limit (Troubleshooting) and keep clips short.
- Quality notes: Preview-grade. Distilled checkpoints trade fidelity for fewer steps; the community thread for the LTX family reports frame-count limits before noise appears (one user noted a practical ceiling around 4 seconds on a Mac, Lightricks/LTX-Video discussion #26). The CUDA-only IC-LoRAs that sharpen the NVIDIA output are unavailable here.
For the full benchmark data, see /check/ltx-video-2-3/m2-max.
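If you do time a run, normalizing it makes it easier to compare against the one public reference point. The sketch below only restates the cited M4 Max numbers in per-frame terms; `summarize_run` is a hypothetical helper, and nothing here estimates M2 Max performance.

```python
def summarize_run(denoise_s: float, frames: int, clip_s: float) -> dict[str, float]:
    """Normalize a timing so runs with different clip lengths can be compared."""
    return {
        "seconds_per_frame": denoise_s / frames,
        "compute_seconds_per_clip_second": denoise_s / clip_s,
    }

# Reference: the cited M4 Max 128GB run (152.5s denoising, 121 frames, 5s clip).
# This is NOT an M2 Max figure.
m4_max = summarize_run(denoise_s=152.5, frames=121, clip_s=5.0)
for key, value in m4_max.items():
    print(f"{key}: {value:.2f}")
```

Pass your own denoising time, frame count, and clip length to get figures in the same units before submitting them.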
Troubleshooting
Out of memory / the GPU can't allocate enough
A 64GB Mac only exposes ~48GB to the GPU by default (Metal's recommendedMaxWorkingSetSize is ~75% of unified memory). LTX-2.3 video sits near that ceiling. Raise the GPU wired limit before generating (macOS Sonoma 14 / Sequoia 15+):
# Allow the GPU to wire down ~56GB on a 64GB Mac. Leave 8–16GB for macOS.
sudo sysctl iogpu.wired_limit_mb=57344
# Reset to default afterwards:
sudo sysctl iogpu.wired_limit_mb=0
This is temporary and resets on reboot. Watch Activity Monitor's Memory-Pressure gauge — pushing toward 100% causes swapping and instability. On older macOS (Monterey/Ventura) the key is debug.iogpu.wired_limit in bytes instead.
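The 57344 value is simply 56GB expressed in MB. A short sketch of that arithmetic follows, assuming you reserve at least 8GB for macOS as suggested above; the function names are hypothetical, and the script only prints the command rather than running it.

```python
MIN_RESERVE_GB = 8  # lower end of the 8-16GB reserve suggested in this guide

def wired_limit_mb(unified_gb: int, reserve_gb: int = MIN_RESERVE_GB) -> int:
    """Value for iogpu.wired_limit_mb that leaves reserve_gb for macOS."""
    if reserve_gb < MIN_RESERVE_GB:
        raise ValueError("leave at least 8GB for macOS")
    if reserve_gb >= unified_gb:
        raise ValueError("reserve must be smaller than total memory")
    return (unified_gb - reserve_gb) * 1024

def wired_limit_bytes(unified_gb: int, reserve_gb: int = MIN_RESERVE_GB) -> int:
    """Same limit in bytes, the unit the older debug.iogpu.wired_limit key uses."""
    return wired_limit_mb(unified_gb, reserve_gb) * 1024 * 1024

print(f"sudo sysctl iogpu.wired_limit_mb={wired_limit_mb(64)}")  # 57344
```

Passing `reserve_gb=16` gives the more conservative end of the range (49152 MB on a 64GB Mac).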
Noisy / garbage video on the stock PyTorch (MPS) path
If you use the reference Diffusers/MPS path rather than Draw Things or MLX, recent PyTorch builds can produce noise on Apple Silicon. A community user reported "torch 2.5 or above would give me noise…Backed down to 2.4.1 solved the problem" (Lightricks/LTX-Video discussion #26). The simplest fix is to avoid the stock-PyTorch path entirely and use Draw Things (Path A) or the MLX port (Path B), whose engines don't hit this bug.
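If you stay on the stock path anyway, a quick version check can flag the builds that user report points at. This is a sketch based on that single community report, not a confirmed bug boundary; `is_suspect_torch` is a hypothetical helper that takes a version string so it can be checked without importing PyTorch.

```python
import re

# Assumption: per one community report, torch 2.5+ produced noise and 2.4.1 did not.
FIRST_REPORTED_BAD = (2, 5)

def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings like '2.4.1' or '2.6.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))

def is_suspect_torch(version: str) -> bool:
    """True if the version falls in the range the community report flagged."""
    return parse_major_minor(version) >= FIRST_REPORTED_BAD

for v in ("2.4.1", "2.5.0", "2.6.0+cpu"):
    print(v, "suspect" if is_suspect_torch(v) else "reported OK")
```

In a real environment you would pass `torch.__version__` to `is_suspect_torch`.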
fp8 weights fail on Metal
fp8-quantized checkpoints (common in the NVIDIA workflow) do not run on Apple's Metal backend — a community user in the LTX-2 Mac thread notes you must "change the model to b16 or a gguf" (Lightricks/LTX-2 discussion #43). Use the bf16 or MLX-quantized (int4/int8) weights linked above, never an fp8 file. There is no FP8 or NVFP4 tensor-core hardware on Apple Silicon to accelerate those formats anyway.
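You can check whether a downloaded .safetensors file contains fp8 tensors before loading it. The sketch below reads only the file's JSON header using the standard library (an 8-byte little-endian length followed by JSON, per the safetensors format); the function names are hypothetical, and it does not apply to GGUF files.

```python
import json
import struct
from collections import Counter

def safetensors_dtypes(path: str) -> Counter:
    """Count tensors per dtype by reading only the safetensors header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len))
    return Counter(
        info["dtype"]
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry
    )

def has_fp8(path: str) -> bool:
    """True if any tensor uses an fp8 dtype such as F8_E4M3 or F8_E5M2."""
    return any(dtype.startswith("F8_") for dtype in safetensors_dtypes(path))
```

Calling `has_fp8` on a checkpoint path returns `True` for fp8 files, in which case pick the bf16 or MLX-quantized weights instead.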
It's slow
Expected. Video generation on Apple Silicon is minutes-per-clip and has no torch.compile acceleration. Use the distilled int4 weights, keep resolution and frame count low, and treat the output as a preview. Report your timings via the submission form so we can seed a real M2 Max benchmark.