self-hosted/ai
§01·recipe · video

Sulphur 2 on RTX 5060 Ti: Uncensored LTX-2.3 Video via GGUF in ComfyUI

videoadvanced16GB+ VRAMMay 19, 2026
models
tools
prerequisites
  • NVIDIA RTX 5060 Ti (16GB VRAM) or any 16GB consumer GPU
  • 32GB+ system RAM (for CPU offload of the Gemma 3 text encoder)
  • Python 3.10+ and CUDA 12.7+
  • ComfyUI installed (latest version) with ComfyUI-LTXVideo, ComfyUI-GGUF, ComfyUI-KJNodes custom nodes
  • ~25GB free disk space for Q4-tier GGUF + Gemma 3 12B QAT encoder + VAE

What You'll Build

Generate uncensored text-to-video and image-to-video clips locally with Sulphur 2 — an LTX-2.3 fine-tune from SulphurAI — on a 16GB consumer GPU. The upstream sulphur_dev_fp8mixed.safetensors is 29.2 GB and won't fit on 16GB VRAM; this recipe uses the community Q4_K_S GGUF (13.2 GB) from vantagewithai/Sulphur-2-Base-GGUF together with the same quantized Gemma 3 12B text encoder used by the parent LTX-2.3 recipe.

Hardware data: RTX 5060 Ti (16GB VRAM) · Q4_K_S GGUF + Gemma 3 12B QAT-Q4 encoder · See benchmark data

⚠️ Known issue: The upstream sulphur_dev_bf16.safetensors (46.1 GB) and sulphur_dev_fp8mixed.safetensors (29.2 GB) shipped on SulphurAI/Sulphur-2-base are too large for 16GB VRAM. Use the GGUF path below.

Requirements

ComponentMinimumTested
GPU16GB VRAM (Ampere or newer)RTX 5060 Ti (16GB)
RAM32GB32GB
Storage~25GBQ4_K_S 13.2 GB + Gemma Q4 encoder + VAE
SoftwareComfyUI + ComfyUI-LTXVideo + ComfyUI-GGUF + KJNodesPython 3.10+, CUDA 12.7+

Sulphur 2 inherits the LTX-2.3 architecture (architecture: ltxv per the vantagewithai GGUF card) and the same Gemma 3 12B text-encoder requirement. On 16GB cards, the quantized GGUF path is the only one that fits — the same constraint that drives the parent LTX-2.3 recipe.

Installation

1. Install ComfyUI and the LTX-Video custom nodes

If you already followed the LTX-2.3 recipe, you already have these — skip to step 2. Otherwise:

git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

cd custom_nodes
git clone https://github.com/Lightricks/ComfyUI-LTXVideo.git
pip install -r ComfyUI-LTXVideo/requirements.txt

git clone https://github.com/city96/ComfyUI-GGUF.git
pip install -r ComfyUI-GGUF/requirements.txt

git clone https://github.com/kijai/ComfyUI-KJNodes.git
pip install -r ComfyUI-KJNodes/requirements.txt

The Sulphur-2 workflow shipped on the canonical repo uses LTXV-prefixed nodes (LTXVConcatAVLatent, LTXVCropGuides, LTXVPreprocess, SamplerCustomAdvanced) — all provided by ComfyUI-LTXVideo, confirmed by inspecting workflows/ltx23_t2v distilled.json on the upstream card.

2. Download the Q4_K_S Sulphur-2 GGUF

# Q4_K_S — 13.2 GB, the sweet spot for 16GB VRAM
huggingface-cli download vantagewithai/Sulphur-2-Base-GGUF \
  sulphur_dev-Q4_K_S.gguf \
  --local-dir ComfyUI/models/unet/

Quant-tier file-size reference (from vantagewithai/Sulphur-2-Base-GGUF, 21B params, architecture: ltxv):

QuantFile sizeFits 16GB GPU?
Q3_K_S10.3 GByes (headroom for encoder)
Q3_K_M11.1 GByes
Q4_K_S13.2 GByes — recommended
Q4_K_M14.3 GBtight
Q5_K_S15.0 GBno (no room for encoder + activations)
Q5_K_M16.1 GBno
Q6_K17.8 GBno
Q8_022.8 GBno

The 16GB ceiling assumption is anchored by the parent LTX-2.3 recipe's cited consumer-GPU datapoint: a 16GB ComfyUI user running the architecturally-identical LTX-2 distilled stack reported a peak of 14926 MiB during sampling (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weight file is comparable in size to the LTX-2.3 distilled Q4_K_S that produced that measurement.

3. Download the quantized Gemma 3 12B text encoder

Sulphur 2 inherits LTX-2.3's Gemma 3 12B text encoder. The full unquantized Gemma 3 12B will OOM on 16GB cards when loaded alongside the Sulphur 2 weights — the closest published consumer-GPU OOM datapoint in the LTX family is 29068 MiB peak on RTX 5080 16GB with the LTX-2 19B-dev-fp8 stack (Lightricks/ComfyUI-LTXVideo#303); Sulphur-2's 21B distilled weights at bf16 are heavier still. Use the QAT-Q4 GGUF instead:

huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
  gemma-3-12b-it-qat-UD-Q4_K_XL.gguf \
  --local-dir ComfyUI/models/text_encoders/

huggingface-cli download unsloth/gemma-3-12b-it-qat-GGUF \
  mmproj-BF16.gguf \
  --local-dir ComfyUI/models/text_encoders/

Both files are loaded by ComfyUI-GGUF's Gemma encoder node.

4. Download the LTX video VAE (Kijai community mirror)

Sulphur 2 reuses the upstream LTX video VAE — neither SulphurAI/Sulphur-2-base nor Lightricks/LTX-2.3 exposes the VAE as a standalone file (LTX-2.3 bundles it inside the 22B .safetensors). The simplest path for the GGUF-only flow is the community mirror by Kijai, which exposes a standalone VAE in bf16 — architecture: ltxv is shared across the LTX family:

huggingface-cli download Kijai/LTXV2_comfy \
  VAE/LTX2_video_vae_bf16.safetensors \
  --local-dir ComfyUI/models/vae/

File listing confirmed at Kijai/LTXV2_comfy.

5. Download the canonical Sulphur-2 workflow JSON

The canonical Sulphur 2 ComfyUI workflow lives on the upstream SulphurAI repo:

huggingface-cli download SulphurAI/Sulphur-2-base \
  "workflows/ltx23_t2v distilled.json" \
  --local-dir ComfyUI/user/default/workflows/

Optionally also pull the distill LoRA — per the upstream SulphurAI README, this is the recommended quality path when running the dev (non-distilled) weights:

huggingface-cli download SulphurAI/Sulphur-2-base \
  sulphur_lora_rank_768.safetensors \
  --local-dir ComfyUI/models/loras/

The upstream README explicitly notes: "I'm aware the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." If you load the GGUF in step 2 instead of the bf16/fp8mixed weights, you do not need the LoRA — the distill is already baked in.

Running

Launch ComfyUI:

python main.py --listen

Open the browser UI, then load the workflow downloaded in step 5:

ComfyUI/user/default/workflows/ltx23_t2v distilled.json

In the loaded graph, swap the default UNet loader for the Unet Loader (GGUF) node from ComfyUI-GGUF (point it at sulphur_dev-Q4_K_S.gguf), and point the text encoder at the GGUF Gemma 3 loader from the same custom node pack. Defaults from the canonical workflow file:

ParameterValueSource
Frame count18LTXVPreprocess widget in ltx23_t2v distilled.json
Resolution (longer edge)1536 pxResizeImagesByLongerEdge widget in the same file

Start small — drop the longer edge to 832 px and frames to 65 max on a 16GB card while you verify the workflow loads cleanly, then scale up only if peak VRAM stays comfortably below 16GB.

Optional: prompt enhancer

The upstream Sulphur 2 ships a Q8_0 prompt enhancer (sulphur_prompt_enhancer_model-q8_0.gguf + mmproj-BF16.gguf) intended to be used via LM Studio. Per the SulphurAI README: create Sulphur/promptenhancer/ inside your LM Studio model folder, drop both files in, and load the model from LM Studio's UI. There is no system prompt — send the raw text (and optionally an image) you want enhanced.

Results

  • Speed: Omitted — no published Sulphur-2 benchmark on a 16GB-class consumer GPU at the time of writing. Empirical 5060 Ti data will appear at /check/sulphur-2/rtx-5060-ti once a benchmark report lands. See the parent LTX-2.3 recipe for order-of-magnitude wall-time data points on related hardware; Sulphur-2 has not been measured separately.
  • VRAM usage: Sulphur-2 inherits the LTX-2.3 architecture. The closest cited consumer-GPU peak from the parent stack is 14926 MiB during sampling on a 16GB ComfyUI user running LTX-2 distilled (Comfy-Org/ComfyUI#11726). The Q4_K_S Sulphur-2 weights weigh in at 13.2 GB per vantagewithai/Sulphur-2-Base-GGUF, comparable to the LTX-2.3 distilled Q4_K_S that produced that measurement.
  • Quality notes: The Sulphur 2 GGUF is a quantization of the distilled checkpoint — expect the same 8-step / CFG=1 short-step sampling profile as LTX-2.3 distilled, and a similar quality regression at the Q3 tier and below.

For up-to-date benchmark data on this pair, see /check/sulphur-2/rtx-5060-ti.

Troubleshooting

"Can I run this on 8GB VRAM?" — No, not realistically

A community walk-through specifically about Sulphur-2 deployment (knightli.com) addresses this directly: "If you only have 8GB VRAM, try reducing pressure... but it is not realistic to expect high-resolution, long-video, complex workflows on 8GB." The Q3_K_S GGUF (10.3 GB weight) is below the encoder + activation budget on an 8GB card, and aggressive offloading destroys throughput. 16GB is the practical floor.

OOM when loading the text encoder

Same root cause as the parent LTX-2.3 recipe — the default unquantized Gemma 3 12B encoder will OOM on 16GB cards when loaded alongside the Sulphur 2 weights (Lightricks/ComfyUI-LTXVideo#303 reports peak 29068 MiB on RTX 5080 16GB with the LTX-2 19B-dev-fp8 pipeline, and Sulphur-2's 21B distilled weights are heavier). Replace with gemma-3-12b-it-qat-UD-Q4_K_XL.gguf from Unsloth (step 3 above).

"sulphur_final" referenced in the workflow but missing locally

The upstream workflow JSON contains a sulphur_final checkpoint reference that does not exist as a published file. Per the SulphurAI README: "the workflows contain sulphur_final right now, just use the lora or use the full models, don't use both at the same time." If you used the GGUF in step 2, point the loader at sulphur_dev-Q4_K_S.gguf instead and delete or bypass the LoRA node — the distill is already baked into the GGUF weights.

Gemma GGUF loader fails or outputs gibberish

The Gemma 3 GGUF loader in ComfyUI-GGUF required PRs #399 and #402 to be merged at the time the parent LTX-2.3 recipe was authored (Kijai/LTXV2_comfy discussion #7). Pull the latest city96/ComfyUI-GGUF main — both PRs are now merged.

Slow generation

Keep the Gemma encoder offloaded with the KJNodes model-offload nodes; VRAM thrashing on a 16GB card kills wall time. Empirical 5060 Ti numbers will appear at /check/sulphur-2/rtx-5060-ti when a community benchmark lands.