What You'll Build
A working ComfyUI setup that runs Chroma V48 (the 8.9B-parameter, Apache-2.0, uncensored re-derivation of Flux.1-Schnell published by Lodestone Rock) on an RTX 5060 Ti (16 GB). The standalone lodestones/Chroma repo is marked deprecated and points downstream users to lodestones/Chroma1-HD ("Chroma1-HD is not the old Chroma-v.50 it has been retrained from v.48"), so this recipe uses the Chroma1-HD GGUF redistribution for current installs.
Hardware data: RTX 5060 Ti (16 GB VRAM) · Chroma1 family officially targets a 16 GB minimum on consumer hardware · See benchmark data
⚠️ Headroom is tight. The Chroma1 family is explicitly resource-intensive: in the lodestones Chroma1-Radiance ComfyUI thread a 12 GB RTX 5070 user reports the model "barely completes generation at 98–99% VRAM usage" with quality degradation. 16 GB is the comfortable floor for the V48-lineage at standard resolutions; expect to use the Q8_0 GGUF (9.74 GB on disk) or smaller, plus the fp8 T5 XXL text encoder.
Requirements
| Component | Minimum | Tested |
|---|---|---|
| GPU | 16 GB VRAM (per the Chroma1-Radiance ComfyUI thread) | RTX 5060 Ti (16 GB) |
| RAM | 16 GB system | — |
| Storage | ~15 GB (Q8_0 weights 9.74 GB + T5 XXL fp8 ~4.9 GB + FLUX VAE ~0.3 GB) | — |
| Software | ComfyUI + ComfyUI-GGUF custom node by city96 | — |
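The storage row can be sanity-checked with a few lines of arithmetic. The sketch below is only illustrative: the Q8_0 size comes from the quantization table in this guide, while the T5 XXL fp8 and FLUX VAE sizes are approximate assumptions, and the function names and the 2 GB margin are hypothetical.

```python
# Approximate on-disk sizes in GB. Only the Q8_0 figure comes from the
# quantization table in this guide; the other two are assumptions.
COMPONENT_SIZES_GB = {
    "chroma1_hd_q8_0_gguf": 9.74,  # from the model card table
    "t5xxl_fp8_e4m3fn": 4.89,      # assumed approximate size
    "flux_vae_ae": 0.34,           # assumed approximate size
}


def total_storage_gb(sizes):
    """Sum the component sizes, rounded to two decimals."""
    return round(sum(sizes.values()), 2)


def fits_on_disk(sizes, free_gb, margin_gb=2.0):
    """True if all components plus a safety margin fit in free_gb."""
    return total_storage_gb(sizes) + margin_gb <= free_gb


if __name__ == "__main__":
    print("Total download size (GB):", total_storage_gb(COMPONENT_SIZES_GB))
    print("Fits in 20 GB free:", fits_on_disk(COMPONENT_SIZES_GB, 20.0))
```

Swapping in a smaller quantization changes only the first entry, which makes it easy to see how much disk a Q4 or Q5 install would save.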
Installation
1. Update ComfyUI
A recent ComfyUI release is required. Native Chroma1-Radiance support landed in ComfyUI v0.3.60, and the Chroma1-HD GGUF flow in this recipe uses the same release plus the ComfyUI-GGUF loader installed in step 2. See the Chroma1-Radiance ComfyUI support thread for the version note.
2. Install the GGUF custom node
From ComfyUI/custom_nodes:
git clone https://github.com/city96/ComfyUI-GGUF
cd ComfyUI-GGUF
pip install -r requirements.txt
Run the pip command from the same Python environment that ComfyUI itself uses, then restart ComfyUI. ComfyUI-GGUF is the loader that consumes the .gguf Chroma1-HD weights.
3. Download the Chroma V48 (Chroma1-HD) GGUF weights
Pick one quantization from the silveroxides/Chroma1-HD-GGUF repository. File sizes per quantization (verbatim from the model card):
| Quant | Size |
|---|---|
| Q4_K_S | 5.43 GB |
| Q4_0 | 5.43 GB |
| Q4_K_M | 5.57 GB |
| Q4_1 | 5.97 GB |
| Q5_K_S | 6.51 GB |
| Q5_K_M | 6.65 GB |
| Q6_K | 7.65 GB |
| Q8_0 | 9.74 GB |
Recommendation for the 5060 Ti: Q8_0 (9.74 GB) for the highest in-family quality that still leaves headroom for the text encoder, VAE, and intermediate activations on a 16 GB card. Drop to Q4_K_M (5.57 GB) if you want to stack acceleration LoRAs or push past 1024×1024.
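The recommendation above amounts to "take the largest file that fits the memory you are willing to spend on weights". A minimal sketch of that rule follows, using the sizes from the table. The function name and the budget parameter are hypothetical, and file size is only a rough proxy for VRAM use, since activations, the text encoder, and the VAE also need room.

```python
# File sizes in GB, copied from the quantization table in this guide.
QUANT_SIZES_GB = {
    "Q4_K_S": 5.43, "Q4_0": 5.43, "Q4_K_M": 5.57, "Q4_1": 5.97,
    "Q5_K_S": 6.51, "Q5_K_M": 6.65, "Q6_K": 7.65, "Q8_0": 9.74,
}


def pick_quant(weight_budget_gb):
    """Return the largest quantization whose file fits the budget, or None."""
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if size <= weight_budget_gb
    ]
    if not fitting:
        return None
    # max() compares by size first; equal sizes fall back to the name.
    return max(fitting)[1]


if __name__ == "__main__":
    for budget in (10.0, 7.0, 5.0):
        print(budget, "->", pick_quant(budget))
```

A 10 GB weight budget selects Q8_0, while a tighter budget falls back to the K-quant or legacy quant that still fits.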
Drop the downloaded .gguf into ComfyUI/models/diffusion_models/.
4. Download the T5 XXL text encoder and FLUX VAE
The official lodestones/Chroma README pins the same text-encoder and VAE files the FLUX ecosystem uses:
# T5 XXL (fp8 — use this on 16 GB; the fp16 variant doubles the footprint)
wget -P ComfyUI/models/clip/ \
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors
# FLUX VAE (ae.safetensors from the FLUX.1 schnell release)
# Place into ComfyUI/models/vae/
URLs verbatim from the lodestones/Chroma README.
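Before launching ComfyUI, it can help to confirm that the three files from steps 3 and 4 sit in the folders named above. The check below is a sketch under assumptions: the function name is hypothetical, and the .gguf file name is whatever you downloaded (the name in the usage line is a placeholder, not the repository's actual file name).

```python
from pathlib import Path


def missing_model_files(comfy_root, gguf_name,
                        t5_name="t5xxl_fp8_e4m3fn.safetensors",
                        vae_name="ae.safetensors"):
    """Return the expected model files (relative paths) that are absent."""
    root = Path(comfy_root)
    expected = [
        root / "models" / "diffusion_models" / gguf_name,  # step 3
        root / "models" / "clip" / t5_name,                # step 4, text encoder
        root / "models" / "vae" / vae_name,                # step 4, VAE
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]


if __name__ == "__main__":
    # "your-download.gguf" is a placeholder for the file you picked in step 3.
    for path in missing_model_files("ComfyUI", "your-download.gguf"):
        print("missing:", path)
```

An empty result means all three files are where this guide expects them.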
5. Load the Chroma workflow
The official ComfyUI workflow JSON ships in both the Chroma1-HD repo and the deprecated lodestones/Chroma repo (ComfyUI_Chroma1-HD_T2I-workflow.json). Download it, drag it onto the ComfyUI canvas, and swap the default Load Diffusion Model node for the Unet Loader (GGUF) node from ComfyUI-GGUF, pointing it at your downloaded .gguf.
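After saving the edited workflow, a quick script can confirm that the loader swap took effect. This sketch assumes the saved file is a UI-format workflow with a top-level "nodes" list whose entries carry a "type" field, and it assumes the class names UNETLoader (for Load Diffusion Model) and UnetLoaderGGUF (for Unet Loader (GGUF)); verify both against your own saved JSON.

```python
import json

STOCK_LOADER = "UNETLoader"     # assumed class name of "Load Diffusion Model"
GGUF_LOADER = "UnetLoaderGGUF"  # assumed class name of "Unet Loader (GGUF)"


def loader_status(workflow):
    """Report which diffusion-model loaders appear in a UI-format workflow."""
    types = {node.get("type") for node in workflow.get("nodes", [])}
    return {"stock": STOCK_LOADER in types, "gguf": GGUF_LOADER in types}


def load_workflow(path):
    """Read a workflow JSON file saved from the ComfyUI canvas."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Inline example standing in for a saved workflow file.
    example = {"nodes": [{"id": 1, "type": GGUF_LOADER}, {"id": 2, "type": "VAELoader"}]}
    print(loader_status(example))
```

A result with "gguf" true and "stock" false indicates that the workflow loads the .gguf file rather than looking for a safetensors checkpoint.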
Running
Set the workflow's text-encoder and VAE nodes to the files placed in step 4. Use a 1024×1024 latent for the first run. The Chroma1-Radiance ComfyUI thread notes that the standard ComfyUI template defaults to ~30 inference steps for the family; start there and adjust to taste.
Trigger: Queue Prompt
Output: PNG saved to ComfyUI/output/
The first generation pays a cold-load cost (weights → VRAM, text encoder → VRAM). Subsequent generations with the same model reuse the loaded weights.
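Queueing can also be scripted against ComfyUI's HTTP endpoint instead of clicking Queue Prompt. The sketch below only builds the request. It assumes a local server on the default 127.0.0.1:8188 and a workflow exported in API format, which differs from the UI-format JSON dragged onto the canvas. The function name is hypothetical.

```python
import json
import urllib.request


def build_queue_request(api_workflow, host="127.0.0.1", port=8188):
    """Build a POST request that queues an API-format workflow."""
    payload = json.dumps({"prompt": api_workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder graph; replace with the contents of your API-format export.
    request = build_queue_request({})
    print(request.get_method(), request.full_url)
    # To actually queue the job while ComfyUI is running:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before anything reaches the server.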
Results
- Speed: Omitted. The only first-party generation-time data point in the Chroma1-HD speed thread comes from an RTX 5090 (32 GB), which is too different from the 5060 Ti to quote without misleading readers. Once community benchmarks land, the /check/ endpoint will surface them.
- VRAM usage: Plan for ≥ 16 GB. The Chroma1 family's own ComfyUI discussions treat 16 GB as the comfortable minimum and report quality degradation when squeezed onto 12 GB cards.
- Quality notes: Chroma V48 is a Flux.1-Schnell de-distillation: it restores the multi-step diffusion behavior that Schnell distilled away, so it runs more like a Flux.1-Dev-class model than a 4-step turbo. Don't expect Schnell-tier speed.
For the full benchmark data, see /check/chroma-v48/rtx-5060-ti.
Troubleshooting
Noise artifacts when using --fp8_e5m2-unet
Per the Chroma1-Radiance ComfyUI thread, the --fp8_e5m2-unet ComfyUI flag produces noise artifacts on the Chroma1 family. Stick to the default loader (or --fp8_e4m3fn-unet if you need an fp8 path that isn't GGUF).
Quality regressions from FP8 weights or acceleration LoRAs
Same thread: FP8 weight precision and standard acceleration LoRAs visibly hurt prompt adherence and produce undercooked hands and faces on the Chroma1 family. The GGUF Q8_0 path documented above sidesteps both: Q8 GGUF is generally reported as close to bf16 for FLUX-family models, and it doesn't require the model-weight casts that the fp8 path does.
"v48", "Chroma1-HD", "Chroma1-Base", "Chroma1-Radiance" — which one is V48?
Per lodestones/Chroma1-Base's README, "Chroma1-Base is Chroma-v.48". Chroma1-HD is explicitly "retrained from v.48" as a finetune-ready base. The deprecated lodestones/Chroma repo's chroma-unlocked-v48-detail-calibrated.safetensors is the original V48 weight file. Chroma1-Radiance is a separate output-head variant (no FLUX VAE, different decoder) — close cousin, not the same architecture, so its discussion threads are referenced here as adjacent evidence rather than ground truth for V48 specifically.