xinsir/controlnet-union-sdxl-1.0 - AI Image Models Tool
Overview
ControlNet-Union (xinsir/controlnet-union-sdxl-1.0) is an open-source "ControlNet++" model built for Stable Diffusion XL (SDXL). It unifies many ControlNet-style conditionings into one network so you can run pose-, edge-, depth- or line-guided text-to-image generation and multi-condition editing from a single checkpoint. The project emphasizes multi-condition fusion (learned during training), NovelAI-style bucket training for arbitrary aspect ratios, and advanced tile-based editing tools (tile deblur, tile variation, tile super-resolution) designed for very high-resolution outputs. ([huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
The repository and model card describe support for more than ten control types (examples: OpenPose, Depth, Canny, Lineart, AnimeLineart, MLSD, Scribble, HED, PIDI/Softedge, TEED, Segment, Normal) and a ProMax variant that advertises 12 control types plus five advanced editing modes (tile deblur, variation, super-resolution, inpainting, outpainting). The author provides inference scripts and integration notes (Diffusers/ComfyUI guidance), and the model is released under Apache-2.0. Community threads show active user adoption and feature requests (for example, requests for an SD1.5 union variant and some troubleshooting discussion). ([huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
Model Statistics
- Downloads: 111,693
- Likes: 1,646
- Pipeline: text-to-image
- License: apache-2.0
Model Details
Architecture and capabilities: ControlNet-Union extends the original ControlNet design with two new modules, the Control Encoder and the Condition Transformer. A single condition encoder is shared across all control types, which avoids growing the parameter count with each added condition, and a transformer layer fuses the features of multiple conditions into a condition bias that is added to the source image features. This enables single- and multi-condition inference without manually tuning per-condition hyperparameters. The project author reports training with multi-resolution strategies, re-captioned prompts (CogVLM-style), and bucket training on a dataset the README describes as "over 10,000,000 images" to improve robustness and prompt-following. The repository includes inference scripts for single- and multi-condition setups and practical tips (for example, a recommended draw_bodypose replacement for best OpenPose results). ([github.com](https://github.com/xinsir6/ControlNetPlus?utm_source=openai))
Compatibility and practical notes: The checkpoint is intended for SDXL pipelines and works with Hugging Face Diffusers' SDXL ControlNet pipelines (StableDiffusionXLControlNetPipeline / ControlNetModel). The model card and GitHub repository emphasize compatibility with other SDXL base models and LoRA adapters. The parameter count is not published in the model card. The checkpoint and code are licensed under Apache-2.0. Users should expect to run this on a GPU; many community spaces (ComfyUI and Hugging Face Spaces) already use this model. ([huggingface.co](https://huggingface.co/docs/diffusers/api/pipelines/controlnet_sdxl?utm_source=openai))
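As a rough illustration of the fusion idea described above (one encoder shared by all control types, and a transformer layer that turns the condition features into a bias added to the image features), the following toy NumPy sketch may help. It is not the author's implementation: the feature width, the random weights, the single-head attention, and the names `encode_condition` and `fuse` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8        # feature width (illustrative assumption)
NUM_TYPES = 6  # number of control types (illustrative assumption)

# One set of encoder weights shared by every control type.
W_shared = rng.normal(scale=0.1, size=(DIM, DIM))
# A per-type embedding tells the shared encoder which condition it is seeing.
type_embed = rng.normal(scale=0.1, size=(NUM_TYPES, DIM))
# Projections for a single-head self-attention layer (stand-in for a transformer layer).
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(3))


def encode_condition(cond_feat, type_id):
    """Encode one condition with the shared weights plus its type embedding."""
    return np.tanh(cond_feat @ W_shared + type_embed[type_id])


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse(image_feat, conditions):
    """Add a bias, computed by attention over all tokens, to the image features.

    conditions: list of (feature_vector, type_id) pairs; one or many.
    """
    cond_tokens = [encode_condition(f, t) for f, t in conditions]
    tokens = np.stack([image_feat, *cond_tokens])   # shape (1 + n, DIM)
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    attn = softmax(q @ k.T / np.sqrt(DIM))          # shape (1 + n, 1 + n)
    mixed = attn @ v                                # attended tokens
    bias = mixed[1:].sum(axis=0)                    # combine the condition outputs
    return image_feat + bias


image_feat = np.ones(DIM)
pose_feat = rng.normal(size=DIM)
depth_feat = rng.normal(size=DIM)
single = fuse(image_feat, [(pose_feat, 0)])
multi = fuse(image_feat, [(pose_feat, 0), (depth_feat, 1)])
```

Because every condition passes through the same weights and the attention layer accepts any number of tokens, the same function handles one condition or several, with no per-condition weight to tune by hand.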
Key Features
- Unified ControlNet supporting 10+ control types in one checkpoint (including pose, depth, canny, lineart, segment, and normal).
- Multi-condition fusion learned during training — no manual fusion hyperparameters required.
- Advanced tile editing: tile deblur, tile-based variation, and tile super-resolution for ultra-high-res images.
- Inpainting and outpainting workflows integrated into the ProMax variant for content-aware image editing.
- NovelAI-style bucket training for arbitrary aspect ratios and high-resolution outputs.
- Compatibility with Hugging Face Diffusers SDXL ControlNet pipelines and common SDXL base models.
- Open-source Apache-2.0 license and inference scripts provided in the GitHub repository.
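The bucket training mentioned in the list above groups images by aspect ratio so that each batch shares one resolution. A minimal sketch of that general technique follows; the pixel budget, the 64-pixel step, the side limits, and the function names `make_buckets` and `assign_bucket` are assumptions for illustration, not values taken from the author's training code.

```python
def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose sides are multiples of `step`."""
    buckets = set()
    for width in range(min_side, max_side + 1, step):
        # Largest height (a multiple of step) that keeps the pixel count within budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
    return sorted(buckets)


def assign_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


buckets = make_buckets()
square = assign_bucket(1000, 1000, buckets)
widescreen = assign_bucket(1920, 1080, buckets)
```

Each training image would then be resized and cropped to its assigned bucket, so portrait, square, and landscape images are all seen at roughly the same pixel count instead of being forced into a square.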
Example Usage
Example (python):
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel, AutoencoderKL
from diffusers.utils import load_image
from PIL import Image
import numpy as np
import torch
import cv2

# Note: this example follows Diffusers' SDXL ControlNet guidance.
# Replace model IDs or device as needed.
device = "cuda" if torch.cuda.is_available() else "cpu"
# Use half precision on GPU, full precision on CPU
dtype = torch.float16 if device == "cuda" else torch.float32

# Load ControlNet-Union checkpoint (SDXL-compatible controlnet)
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-union-sdxl-1.0",
    torch_dtype=dtype,
)

# Load a VAE (here the fp16-fix SDXL VAE), then an SDXL base model (example: stabilityai base)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=dtype)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=dtype,
)

# Enable memory optimizations if available
if device == "cuda":
    pipe.enable_model_cpu_offload()

# Prepare a single control image (example: Canny edge map)
img = load_image("./example_input.jpg").convert("RGB")
img = img.resize((1024, 1024))
arr = np.array(img)
edges = cv2.Canny(arr, 100, 200)
# Canny returns a single-channel map; repeat it to get a 3-channel image
edges = edges[:, :, None]
edges = np.concatenate([edges, edges, edges], axis=2)
control_image = Image.fromarray(edges)

prompt = "A cinematic portrait of a warrior wearing ornate armor, dramatic rim lighting"

# Generate
result = pipe(
    prompt,
    image=control_image,
    controlnet_conditioning_scale=1.0,  # per-condition strength
    num_inference_steps=30,
).images[0]
result.save("./controlnet_union_result.png")

# For multi-condition use: pass a list of control images and set the appropriate control_type in the repo scripts.
# See the model's inference scripts and the Diffusers ControlNet SDXL docs for multi-control examples and pre-processing tips.
# References: Diffusers SDXL ControlNet docs and xinsir ControlNetPlus repo.
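For the multi-condition case mentioned in the closing comments of the example, the control images and a matching control_type vector have to be lined up slot by slot. The sketch below shows one way to build both, assuming a six-slot layout in which related edge-style conditions share a slot. The slot mapping and the helper name `build_control_inputs` are assumptions for illustration and are not stated on this page; check the repository's inference scripts for the actual layout and for how these values are passed to the pipeline.

```python
# Assumed slot layout (hypothetical; verify against the repository scripts).
CONTROL_SLOTS = {
    "openpose": 0,
    "depth": 1,
    "hed": 2, "pidi": 2, "scribble": 2, "teed": 2,
    "canny": 3, "lineart": 3, "anime_lineart": 3, "mlsd": 3,
    "normal": 4,
    "segment": 5,
}
NUM_SLOTS = 6


def build_control_inputs(conditions):
    """Map {condition_name: image} to (image_list, control_type).

    Unused slots hold 0 in both lists; each used slot holds its image
    and a 1.0 flag at the same index.
    """
    image_list = [0] * NUM_SLOTS
    control_type = [0.0] * NUM_SLOTS
    for name, image in conditions.items():
        if name not in CONTROL_SLOTS:
            raise KeyError(f"unknown condition name: {name!r}")
        slot = CONTROL_SLOTS[name]
        if control_type[slot]:
            raise ValueError(f"slot {slot} is already in use; {name!r} conflicts")
        image_list[slot] = image
        control_type[slot] = 1.0
    return image_list, control_type


# Placeholder strings stand in for PIL images here.
image_list, control_type = build_control_inputs(
    {"openpose": "pose.png", "depth": "depth.png"}
)
```

Keeping the images and the flags index-aligned in one helper avoids the easy mistake of enabling a slot whose image was never supplied.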
Benchmarks
- Hugging Face downloads (last month): 111,693 (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
- Hugging Face likes / stars: ~1.65k (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
- Supported control types (documented): 12 (OpenPose, Depth, Canny, Lineart, AnimeLineart, MLSD, Scribble, HED, PIDI/Softedge, TEED, Segment, Normal) (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
- Advanced editing modes (ProMax): 5 (Tile Deblur, Tile Variation, Tile Super Resolution, Inpainting, Outpainting) (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))
- Release / announcement date: Initial ControlNet++ release July 6, 2024 (ProMax around July 13, 2024) (Source: [github.com](https://github.com/xinsir6/ControlNetPlus?utm_source=openai))
Key Information
- Category: Image Models
- Type: AI Image Models Tool