How much VRAM does Sulphur 2 need?

About 16 GB — the minimum this recipe targets.

How hard is this setup?

Advanced — follow the steps above.

Sulphur 2 on RTX 5060 Ti: Uncensored LTX-2.3 Video via GGUF in ComfyUI

What You'll Build

Generate uncensored text-to-video and image-to-video clips locally with Sulphur 2 — an LTX-2.3 fine-tune from SulphurAI — on a 16GB consumer GPU. The upstream sulphur_dev_fp8mixed.safetensors is 29.2 GB and won't fit on 16GB VRAM; this recipe uses the community Q4_K_S GGUF (13.2 GB) from vantagewithai/Sulphur-2-Base-GGUF together with the same quantized Gemma 3 12B text encoder used by the parent LTX-2.3 recipe.

Hardware data: RTX 5060 Ti (16GB VRAM) · Q4_K_S GGUF + Gemma 3 12B QAT-Q4 encoder · See benchmark data

⚠️ Known issue: The upstream sulphur_dev_bf16.safetensors (46.1 GB) and sulphur_dev_fp8mixed.safetensors (29.2 GB) shipped on SulphurAI/Sulphur-2-base are too large for 16GB VRAM. Use the GGUF path below.

Requirements

Component	Minimum	Tested
GPU	16GB VRAM (Ampere or newer)	RTX 5060 Ti (16GB)
RAM	32GB	32GB
Storage	~25GB	Q4_K_S 13.2 GB + distill LoRA 662 MB + Gemma Q4 encoder + VAE
Software	ComfyUI + ComfyUI-LTXVideo + ComfyUI-GGUF + KJNodes	Python 3.10+, CUDA 12.7+

Sulphur 2 inherits the LTX-2.3 architecture (architecture: ltxv per the vantagewithai GGUF card) and the same Gemma 3 12B text-encoder requirement. On 16GB cards, the quantized GGUF path is the only one that fits — the same constraint that drives the parent LTX-2.3 recipe.

Installation

1. Install ComfyUI and the LTX-Video custom nodes

If you already followed the LTX-2.3 recipe, you already have these — skip to step 2. Otherwise:

git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

cd custom_nodes
git clone https://github.com/Lightricks/ComfyUI-LTXVideo.git
pip install -r ComfyUI-LTXVideo/requirements.txt

git clone https://github.com/city96/ComfyUI-GGUF.git
pip install -r ComfyUI-GGUF/requirements.txt

git clone https://github.com/kijai/ComfyUI-KJNodes.git
pip install -r ComfyUI-KJNodes/requirements.txt

The Sulphur-2 workflow shipped on the canonical repo uses LTXV-prefixed nodes (LTXVConcatAVLatent, LTXVCropGuides, LTXVPreprocess, SamplerCustomAdvanced) — all provided by ComfyUI-LTXVideo, confirmed by inspecting workflows/ltx23_t2v distilled.json on the upstream card.

2. Download the Q4_K_S Sulphur-2 GGUF

# Q4_K_S — 13.2 GB, the sweet spot for 16GB VRAM
huggingface-cli download vantagewithai/Sulphur-2-Base-GGUF \
  sulphur_dev-Q4_K_S.gguf \
  --local-dir ComfyUI/models/unet/

Quant-tier file-size reference (from vantagewithai/Sulphur-2-Base-GGUF, 21B params, architecture: ltxv):

Quant	File size	Fits 16GB GPU?
Q3_K_S	10.3 GB	yes (headroom for encoder)
Q3_K_M	11.1 GB	yes
Q4_K_S	13.2 GB	yes — recommended
Q4_K_M	14.3 GB	tight
Q5_K_S	15.0 GB	no (no room for encoder + activations)
Q5_K_M	16.1 GB	no
Q6_K	17.8 GB	no
Q8_0	22.8 GB	no

The 16GB ceiling assumption is anchored by the parent LTX-2.3 recipe's cited consumer-GPU datapoint: a 16GB ComfyUI user running the architecturally-identical LTX-2 distilled stack reported a peak of 14926 MiB during sampling (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weight file is comparable in size to the LTX-2.3 distilled Q4_K_S that produced that measurement.

3. Download the quantized Gemma 3 12B text encoder

Sulphur 2 inherits LTX-2.3's Gemma 3 12B text encoder. The full unquantized Gemma 3 12B will OOM on 16GB cards when loaded alongside the Sulphur 2 weights — the closest published consumer-GPU OOM datapoint in the LTX family is 29068 MiB peak on RTX 5080 16GB with the LTX-2 19B-dev-fp8 stack (Lightricks/ComfyUI-LTXVideo#303); Sulphur-2's 21B dev weights at bf16 are heavier still. Use the QAT-Q4 GGUF instead:

huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
  gemma-3-12b-it-qat-UD-Q4_K_XL.gguf \
  --local-dir ComfyUI/models/text_encoders/

huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
  mmproj-BF16.gguf \
  --local-dir ComfyUI/models/text_encoders/

Both files are loaded by ComfyUI-GGUF's Gemma encoder node.

4. Download the LTX video VAE (Kijai community mirror)

Sulphur 2 reuses the upstream LTX video VAE — neither SulphurAI/Sulphur-2-base nor Lightricks/LTX-2.3 exposes the VAE as a standalone file (LTX-2.3 bundles it inside the 22B .safetensors). The simplest path for the GGUF-only flow is the community mirror by Kijai, which exposes a standalone VAE in bf16 — architecture: ltxv is shared across the LTX family:

huggingface-cli download Kijai/LTXV2_comfy \
  VAE/LTX2_video_vae_bf16.safetensors \
  --local-dir ComfyUI/models/vae/

File listing confirmed at Kijai/LTXV2_comfy.

5. Download the canonical Sulphur-2 workflow JSON

The canonical Sulphur 2 ComfyUI workflow lives on the upstream SulphurAI repo:

huggingface-cli download SulphurAI/Sulphur-2-base \
  "workflows/ltx23_t2v distilled.json" \
  --local-dir ComfyUI/user/default/workflows/

Also pull the distill LoRA — it is required on the GGUF path. Every tier published by vantagewithai is named sulphur_dev-*.gguf: the GGUF is a quantization of the non-distilled dev weights, not of the distilled checkpoint. The dev model's base output is corrupted without a distill LoRA (Discussion #14); the canonical ltx23_t2v distilled.json wires in the 662 MB distill_loras/ distill LoRA — download it:

huggingface-cli download SulphurAI/Sulphur-2-base \
  distill_loras/ltx-2.3-22b-distilled-lora-1.1_fro90_ceil72_condsafe.safetensors \
  --local-dir ComfyUI/models/loras/

The upstream README explicitly notes: "I'm aware the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." That means EITHER (dev weights + this distill LoRA — the GGUF path here) OR a full distilled checkpoint (sulphur_distil_bf16.safetensors), never both. Keep the workflow's distill-LoRA nodes wired (next section); without them the 8-step / CFG=1 schedule runs un-distilled and produces degraded, under-denoised output. (The repo's heavier sulphur_lora_rank_768.safetensors, 9.56 GiB, is a 24 GB-class alternative — skip it on 16 GB.)

Running

Launch ComfyUI:

python main.py --listen

Open the browser UI, then load the workflow downloaded in step 5:

ComfyUI/user/default/workflows/ltx23_t2v distilled.json

In the loaded graph, swap the default UNet loader for the Unet Loader (GGUF) node from ComfyUI-GGUF (point it at sulphur_dev-Q4_K_S.gguf), and point the text encoder at the GGUF Gemma 3 loader from the same custom node pack. Then keep the workflow's LoraLoaderModelOnly distill-LoRA node wired and point it at ltx-2.3-22b-distilled-lora-1.1_fro90_ceil72_condsafe.safetensors (the file you downloaded) — do not delete it; the dev GGUF needs the distill LoRA for the 8-step / CFG=1 short-step schedule. Defaults from the canonical workflow file:

Parameter	Value	Source
Frame count	18	`LTXVPreprocess` widget in `ltx23_t2v distilled.json`
Resolution (longer edge)	1536 px	`ResizeImagesByLongerEdge` widget in the same file

Start small — drop the longer edge to 832 px and frames to 65 max on a 16GB card while you verify the workflow loads cleanly, then scale up only if peak VRAM stays comfortably below 16GB.

Optional: prompt enhancer

The upstream Sulphur 2 ships a Q8_0 prompt enhancer (sulphur_prompt_enhancer_model-q8_0.gguf + mmproj-BF16.gguf) intended to be used via LM Studio. Per the SulphurAI README: create Sulphur/promptenhancer/ inside your LM Studio model folder, drop both files in, and load the model from LM Studio's UI. There is no system prompt — send the raw text (and optionally an image) you want enhanced.

Results

Speed: Omitted — no published Sulphur-2 benchmark on a 16GB-class consumer GPU at the time of writing. Empirical 5060 Ti data will appear at /check/sulphur-2/rtx-5060-ti once a benchmark report lands. See the parent LTX-2.3 recipe for order-of-magnitude wall-time data points on related hardware; Sulphur-2 has not been measured separately.
VRAM usage: Sulphur-2 inherits the LTX-2.3 architecture. The closest cited consumer-GPU peak from the parent stack is 14926 MiB during sampling on a 16GB ComfyUI user running LTX-2 distilled (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weights weigh in at 13.2 GB per vantagewithai/Sulphur-2-Base-GGUF, comparable to the LTX-2.3 distilled Q4_K_S that produced that measurement.
Quality notes: The Sulphur 2 GGUF is a quantization of the non-distilled dev weights; the distill LoRA wired into the workflow supplies the 8-step / CFG=1 short-step sampling profile (matching LTX-2.3 distilled). Run it without the LoRA and the output under-denoises. Expect a further quality regression at the Q3 tier and below.

For up-to-date benchmark data on this pair, see /check/sulphur-2/rtx-5060-ti.

Troubleshooting

"Can I run this on 8GB VRAM?" — No, not realistically

A community walk-through specifically about Sulphur-2 deployment (knightli.com) addresses this directly: "If you only have 8GB VRAM, try reducing pressure... but it is not realistic to expect high-resolution, long-video, complex workflows on 8GB." The Q3_K_S GGUF (10.3 GB weight) is below the encoder + activation budget on an 8GB card, and aggressive offloading destroys throughput. 16GB is the practical floor.

OOM when loading the text encoder

Same root cause as the parent LTX-2.3 recipe — the default unquantized Gemma 3 12B encoder will OOM on 16GB cards when loaded alongside the Sulphur 2 weights (Lightricks/ComfyUI-LTXVideo#303 reports peak 29068 MiB on RTX 5080 16GB with the LTX-2 19B-dev-fp8 pipeline, and Sulphur-2's 21B dev weights are heavier). Replace with gemma-3-12b-it-qat-UD-Q4_K_XL.gguf from Unsloth (step 3 above).

"sulphur_final" referenced in the workflow but missing locally

The upstream workflow JSON contains a sulphur_final checkpoint reference that does not exist as a published file. Per the SulphurAI README: "the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." On the GGUF path, point the model loader at sulphur_dev-Q4_K_S.gguf and keep the workflow's LoraLoaderModelOnly distill-LoRA node pointed at ltx-2.3-22b-distilled-lora-1.1_fro90_ceil72_condsafe.safetensors (from step 5) — do not delete it. The dev GGUF is not distilled; without the LoRA the 8-step / CFG=1 schedule runs un-distilled and degrades output.

Gemma GGUF loader fails or outputs gibberish

The Gemma 3 GGUF loader in ComfyUI-GGUF required PRs #399 and #402 to be merged at the time the parent LTX-2.3 recipe was authored (Kijai/LTXV2_comfy discussion #7). Pull the latest city96/ComfyUI-GGUF main — both PRs are now merged.

Slow generation

Keep the Gemma encoder offloaded with the KJNodes model-offload nodes; VRAM thrashing on a 16GB card kills wall time. Empirical 5060 Ti numbers will appear at /check/sulphur-2/rtx-5060-ti when a community benchmark lands.