What You'll Build
Generate uncensored text-to-video and image-to-video clips locally with Sulphur 2 — an LTX-2.3 fine-tune from SulphurAI — on a 16GB consumer GPU. The upstream sulphur_dev_fp8mixed.safetensors is 29.2 GB and won't fit on 16GB VRAM; this recipe uses the community Q4_K_S GGUF (13.2 GB) from vantagewithai/Sulphur-2-Base-GGUF together with the same quantized Gemma 3 12B text encoder used by the parent LTX-2.3 recipe.
Hardware data: RTX 5060 Ti (16GB VRAM) · Q4_K_S GGUF + Gemma 3 12B QAT-Q4 encoder · See benchmark data
⚠️ Known issue: The upstream
sulphur_dev_bf16.safetensors(46.1 GB) andsulphur_dev_fp8mixed.safetensors(29.2 GB) shipped on SulphurAI/Sulphur-2-base are too large for 16GB VRAM. Use the GGUF path below.
Requirements
| Component | Minimum | Tested |
|---|---|---|
| GPU | 16GB VRAM (Ampere or newer) | RTX 5060 Ti (16GB) |
| RAM | 32GB | 32GB |
| Storage | ~25GB | Q4_K_S 13.2 GB + Gemma Q4 encoder + VAE |
| Software | ComfyUI + ComfyUI-LTXVideo + ComfyUI-GGUF + KJNodes | Python 3.10+, CUDA 12.7+ |
Sulphur 2 inherits the LTX-2.3 architecture (architecture: ltxv per the vantagewithai GGUF card) and the same Gemma 3 12B text-encoder requirement. On 16GB cards, the quantized GGUF path is the only one that fits — the same constraint that drives the parent LTX-2.3 recipe.
Installation
1. Install ComfyUI and the LTX-Video custom nodes
If you already followed the LTX-2.3 recipe, you already have these — skip to step 2. Otherwise:
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd custom_nodes
git clone https://github.com/Lightricks/ComfyUI-LTXVideo.git
pip install -r ComfyUI-LTXVideo/requirements.txt
git clone https://github.com/city96/ComfyUI-GGUF.git
pip install -r ComfyUI-GGUF/requirements.txt
git clone https://github.com/kijai/ComfyUI-KJNodes.git
pip install -r ComfyUI-KJNodes/requirements.txt
The Sulphur-2 workflow shipped on the canonical repo uses LTXV-prefixed nodes (LTXVConcatAVLatent, LTXVCropGuides, LTXVPreprocess, SamplerCustomAdvanced) — all provided by ComfyUI-LTXVideo, confirmed by inspecting workflows/ltx23_t2v distilled.json on the upstream card.
2. Download the Q4_K_S Sulphur-2 GGUF
# Q4_K_S — 13.2 GB, the sweet spot for 16GB VRAM
huggingface-cli download vantagewithai/Sulphur-2-Base-GGUF \
sulphur_dev-Q4_K_S.gguf \
--local-dir ComfyUI/models/unet/
Quant-tier file-size reference (from vantagewithai/Sulphur-2-Base-GGUF, 21B params, architecture: ltxv):
| Quant | File size | Fits 16GB GPU? |
|---|---|---|
| Q3_K_S | 10.3 GB | yes (headroom for encoder) |
| Q3_K_M | 11.1 GB | yes |
| Q4_K_S | 13.2 GB | yes — recommended |
| Q4_K_M | 14.3 GB | tight |
| Q5_K_S | 15.0 GB | no (no room for encoder + activations) |
| Q5_K_M | 16.1 GB | no |
| Q6_K | 17.8 GB | no |
| Q8_0 | 22.8 GB | no |
The 16GB ceiling assumption is anchored by the parent LTX-2.3 recipe's cited consumer-GPU datapoint: a 16GB ComfyUI user running the architecturally-identical LTX-2 distilled stack reported a peak of 14926 MiB during sampling (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weight file is comparable in size to the LTX-2.3 distilled Q4_K_S that produced that measurement.
3. Download the quantized Gemma 3 12B text encoder
Sulphur 2 inherits LTX-2.3's Gemma 3 12B text encoder. The full unquantized Gemma 3 12B will OOM on 16GB cards when loaded alongside the Sulphur 2 weights — the closest published consumer-GPU OOM datapoint in the LTX family is 29068 MiB peak on RTX 5080 16GB with the LTX-2 19B-dev-fp8 stack (Lightricks/ComfyUI-LTXVideo#303); Sulphur-2's 21B distilled weights at bf16 are heavier still. Use the QAT-Q4 GGUF instead:
huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
gemma-3-12b-it-qat-UD-Q4_K_XL.gguf \
--local-dir ComfyUI/models/text_encoders/
huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
mmproj-BF16.gguf \
--local-dir ComfyUI/models/text_encoders/
Both files are loaded by ComfyUI-GGUF's Gemma encoder node.
4. Download the LTX video VAE (Kijai community mirror)
Sulphur 2 reuses the upstream LTX video VAE — neither SulphurAI/Sulphur-2-base nor Lightricks/LTX-2.3 exposes the VAE as a standalone file (LTX-2.3 bundles it inside the 22B .safetensors). The simplest path for the GGUF-only flow is the community mirror by Kijai, which exposes a standalone VAE in bf16 — architecture: ltxv is shared across the LTX family:
huggingface-cli download Kijai/LTXV2_comfy \
VAE/LTX2_video_vae_bf16.safetensors \
--local-dir ComfyUI/models/vae/
File listing confirmed at Kijai/LTXV2_comfy.
5. Download the canonical Sulphur-2 workflow JSON
The canonical Sulphur 2 ComfyUI workflow lives on the upstream SulphurAI repo:
huggingface-cli download SulphurAI/Sulphur-2-base \
"workflows/ltx23_t2v distilled.json" \
--local-dir ComfyUI/user/default/workflows/
Optionally also pull the distill LoRA — per the upstream SulphurAI README, this is the recommended quality path when running the dev (non-distilled) weights:
huggingface-cli download SulphurAI/Sulphur-2-base \
sulphur_lora_rank_768.safetensors \
--local-dir ComfyUI/models/loras/
The upstream README explicitly notes: "I'm aware the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." If you load the GGUF in step 2 instead of the bf16/fp8mixed weights, you do not need the LoRA — the distill is already baked in.
Running
Launch ComfyUI:
python main.py --listen
Open the browser UI, then load the workflow downloaded in step 5:
ComfyUI/user/default/workflows/ltx23_t2v distilled.json
In the loaded graph, swap the default UNet loader for the Unet Loader (GGUF) node from ComfyUI-GGUF (point it at sulphur_dev-Q4_K_S.gguf), and point the text encoder at the GGUF Gemma 3 loader from the same custom node pack. Defaults from the canonical workflow file:
| Parameter | Value | Source |
|---|---|---|
| Frame count | 18 | LTXVPreprocess widget in ltx23_t2v distilled.json |
| Resolution (longer edge) | 1536 px | ResizeImagesByLongerEdge widget in the same file |
Start small — drop the longer edge to 832 px and frames to 65 max on a 16GB card while you verify the workflow loads cleanly, then scale up only if peak VRAM stays comfortably below 16GB.
Optional: prompt enhancer
The upstream Sulphur 2 ships a Q8_0 prompt enhancer (sulphur_prompt_enhancer_model-q8_0.gguf + mmproj-BF16.gguf) intended to be used via LM Studio. Per the SulphurAI README: create Sulphur/promptenhancer/ inside your LM Studio model folder, drop both files in, and load the model from LM Studio's UI. There is no system prompt — send the raw text (and optionally an image) you want enhanced.
Results
- Speed: Omitted — no published Sulphur-2 benchmark on a 16GB-class consumer GPU at the time of writing. Empirical 5060 Ti data will appear at /check/sulphur-2/rtx-5060-ti once a benchmark report lands. See the parent LTX-2.3 recipe for order-of-magnitude wall-time data points on related hardware; Sulphur-2 has not been measured separately.
- VRAM usage: Sulphur-2 inherits the LTX-2.3 architecture. The closest cited consumer-GPU peak from the parent stack is
14926 MiBduring sampling on a 16GB ComfyUI user running LTX-2 distilled (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weights weigh in at 13.2 GB per vantagewithai/Sulphur-2-Base-GGUF, comparable to the LTX-2.3 distilled Q4_K_S that produced that measurement. - Quality notes: The Sulphur 2 GGUF is a quantization of the distilled checkpoint — expect the same 8-step / CFG=1 short-step sampling profile as LTX-2.3 distilled, and a similar quality regression at the Q3 tier and below.
For up-to-date benchmark data on this pair, see /check/sulphur-2/rtx-5060-ti.
Troubleshooting
"Can I run this on 8GB VRAM?" — No, not realistically
A community walk-through specifically about Sulphur-2 deployment (knightli.com) addresses this directly: "If you only have 8GB VRAM, try reducing pressure... but it is not realistic to expect high-resolution, long-video, complex workflows on 8GB." The Q3_K_S GGUF (10.3 GB weight) is below the encoder + activation budget on an 8GB card, and aggressive offloading destroys throughput. 16GB is the practical floor.
OOM when loading the text encoder
Same root cause as the parent LTX-2.3 recipe — the default unquantized Gemma 3 12B encoder will OOM on 16GB cards when loaded alongside the Sulphur 2 weights (Lightricks/ComfyUI-LTXVideo#303 reports peak 29068 MiB on RTX 5080 16GB with the LTX-2 19B-dev-fp8 pipeline, and Sulphur-2's 21B distilled weights are heavier). Replace with gemma-3-12b-it-qat-UD-Q4_K_XL.gguf from Unsloth (step 3 above).
"sulphur_final" referenced in the workflow but missing locally
The upstream workflow JSON contains a sulphur_final checkpoint reference that does not exist as a published file. Per the SulphurAI README: "the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." If you used the GGUF in step 2, point the loader at sulphur_dev-Q4_K_S.gguf instead and delete or bypass the LoRA node — the distill is already baked into the GGUF weights.
Gemma GGUF loader fails or outputs gibberish
The Gemma 3 GGUF loader in ComfyUI-GGUF required PRs #399 and #402 to be merged at the time the parent LTX-2.3 recipe was authored (Kijai/LTXV2_comfy discussion #7). Pull the latest city96/ComfyUI-GGUF main — both PRs are now merged.
Slow generation
Keep the Gemma encoder offloaded with the KJNodes model-offload nodes; VRAM thrashing on a 16GB card kills wall time. Empirical 5060 Ti numbers will appear at /check/sulphur-2/rtx-5060-ti when a community benchmark lands.