xinsir/controlnet-union-sdxl-1.0 - AI Image Models Tool

Overview

ControlNet-Union (xinsir/controlnet-union-sdxl-1.0) is an open-source "ControlNet++" model built for Stable Diffusion XL (SDXL). It unifies many ControlNet-style conditionings into one network so you can run pose-, edge-, depth-, or line-guided text-to-image generation and multi-condition editing from a single checkpoint. The project emphasizes multi-condition fusion (learned during training), NovelAI-style bucket training for arbitrary aspect ratios, and advanced tile-based editing tools (tile deblur, tile variation, tile super-resolution) designed for very high-resolution outputs. ([huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

The repository and model card describe support for more than ten control types (examples: OpenPose, Depth, Canny, Lineart, AnimeLineart, MLSD, Scribble, HED, PIDI/Softedge, TEED, Segment, Normal) and a ProMax variant that advertises 12 control types plus five advanced editing modes (tile deblur, variation, super-resolution, inpainting, outpainting). The author provides inference scripts and integration notes (Diffusers/ComfyUI guidance), and the model is released under Apache-2.0. Community threads show active user adoption and feature requests (for example, requests for an SD1.5 union variant and some troubleshooting discussion). ([huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

Model Statistics

  • Downloads: 111,693
  • Likes: 1,646
  • Pipeline: text-to-image

License: apache-2.0

Model Details

Architecture and capabilities: ControlNet-Union extends the original ControlNet idea with two new modules, the Control Encoder and the Condition Transformer. The design shares a single condition encoder across all control types (which limits parameter growth as control types are added) and uses a transformer layer to fuse multiple condition features into a condition bias that is added to the source image features. This enables single- and multi-condition inference without manually tuning per-condition hyperparameters. The project author reports training with multi-resolution strategies, re-captioned prompts (CogVLM-style), and bucket training on a dataset the README describes as "over 10,000,000 images" to improve robustness and prompt-following. The repository includes inference scripts for single- and multi-condition setups and practical tips (for example, a recommended draw_bodypose replacement for best OpenPose results). ([github.com](https://github.com/xinsir6/ControlNetPlus?utm_source=openai))

Compatibility and practical notes: The checkpoint is intended for SDXL pipelines and works with Hugging Face Diffusers' SDXL ControlNet pipelines (StableDiffusionXLControlNetPipeline / ControlNetModel). The model card and GitHub repository emphasize compatibility with other SDXL base models and LoRA adapters. The model card does not publish a parameter count. The checkpoint and code are licensed under Apache-2.0. Expect to need a GPU for practical inference; community tooling such as ComfyUI workflows and Hugging Face Spaces already uses this model. ([huggingface.co](https://huggingface.co/docs/diffusers/api/pipelines/controlnet_sdxl?utm_source=openai))
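
The fusion idea described under "Architecture and capabilities" can be illustrated with a toy sketch. This is not the author's Condition Transformer: it swaps in a single-query dot-product attention over plain Python lists, and the function names (softmax, dot, fuse_conditions) are hypothetical. It only shows the shape of the computation: several condition feature vectors are mixed with learned-style weights into one bias, and that bias is added to the base features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_conditions(base, conditions):
    """Toy stand-in for condition fusion (an assumption, not the repo's code).

    Attends from `base` over each condition vector, builds an
    attention-weighted mix of the conditions, and adds that mix to `base`
    as a bias. With no conditions, `base` is returned unchanged.
    """
    if not conditions:
        return list(base)
    scale = 1.0 / math.sqrt(len(base))
    weights = softmax([dot(base, c) * scale for c in conditions])
    bias = [
        sum(w * c[i] for w, c in zip(weights, conditions))
        for i in range(len(base))
    ]
    return [b + d for b, d in zip(base, bias)]
```

Because the mixing weights come from the features themselves, the caller does not pass a per-condition strength, which mirrors the "no manual fusion hyperparameters" claim in spirit only.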

Key Features

  • Unified ControlNet supporting 10+ control types in one checkpoint (including pose, depth, canny, lineart, segment, and normal).
  • Multi-condition fusion learned during training; no manual fusion hyperparameters required.
  • Advanced tile editing: tile deblur, tile-based variation, and tile super-resolution for ultra-high-res images.
  • Inpainting and outpainting workflows integrated into the ProMax variant for content-aware image editing.
  • NovelAI-style bucket training strategy for arbitrary aspect ratios and high-resolution outputs.
  • Compatibility with Hugging Face Diffusers SDXL ControlNet pipelines and common SDXL base models.
  • Open-source Apache-2.0 license and inference scripts provided in the GitHub repository.
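
Aspect-ratio bucketing, mentioned in the list above, is easiest to see in code. The sketch below is a generic version of the technique, not the repository's training code: the target area (1024 x 1024), the 64-pixel step, the side limits, and the function names make_buckets and nearest_bucket are all assumptions chosen for illustration.

```python
import math


def make_buckets(target_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose area stays at or below
    `target_area`, with both sides multiples of `step`."""
    buckets = set()
    for width in range(min_side, max_side + 1, step):
        # Largest height (a multiple of `step`) with width * height <= target_area.
        height = (target_area // width) // step * step
        if min_side <= height <= max_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored portrait/landscape bucket
    return sorted(buckets)


def nearest_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's,
    compared in log space so 2:1 and 1:2 are treated symmetrically."""
    ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ratio))


if __name__ == "__main__":
    buckets = make_buckets()
    # A 16:9 frame lands in a wide bucket instead of being cropped square.
    print(nearest_bucket(1920, 1080, buckets))
```

During training, images assigned to the same bucket are resized to that bucket's resolution and batched together, so every batch has a uniform shape without forcing a square crop.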

Example Usage

Example (python):

from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel, AutoencoderKL
from diffusers.utils import load_image
from PIL import Image
import numpy as np
import torch
import cv2

# Note: this example follows Diffusers' SDXL ControlNet guidance.
# Replace model IDs or device as needed.

device = "cuda" if torch.cuda.is_available() else "cpu"
# Use half precision on GPU; fall back to float32 on CPU
dtype = torch.float16 if device == "cuda" else torch.float32

# Load ControlNet-Union checkpoint (SDXL-compatible controlnet)
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-union-sdxl-1.0",
    torch_dtype=dtype,
)

# Load an SDXL base model (example: stabilityai base); supply an appropriate VAE
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=dtype)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=dtype,
)

# Enable memory optimizations if available
if device == "cuda":
    pipe.enable_model_cpu_offload()

# Prepare a single control image (example: Canny edge map)
img = load_image("./example_input.jpg").convert("RGB")
img = img.resize((1024, 1024))
arr = np.array(img)
edges = cv2.Canny(arr, 100, 200)
edges = edges[:, :, None]
edges = np.concatenate([edges, edges, edges], axis=2)
control_image = Image.fromarray(edges)

prompt = "A cinematic portrait of a warrior wearing ornate armor, dramatic rim lighting"

# Generate
result = pipe(
    prompt,
    image=control_image,
    controlnet_conditioning_scale=1.0,  # per-condition strength
    num_inference_steps=30,
).images[0]

result.save("./controlnet_union_result.png")

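# For multi-condition use: pass a list of control images and tell the model which control type
# each image represents (the repo scripts do this through their control_type setting).
# Note: the plain ControlNetModel class has no control-type input and may not load the
# union-specific layers; recent Diffusers releases also provide ControlNetUnionModel and
# StableDiffusionXLControlNetUnionPipeline for this checkpoint.
# References: Diffusers SDXL ControlNet docs and the xinsir ControlNetPlus repo
# (multi-control examples and pre-processing tips).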
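
For multi-condition generation, a sketch along the following lines may be closer to how the union checkpoint is meant to be driven. It assumes a Diffusers release that ships ControlNetUnionModel and StableDiffusionXLControlNetUnionPipeline, and it assumes the control-mode indices (0 openpose, 1 depth, 2 thick line, 3 thin line, 4 normal, 5 segment) follow the convention used in the ControlNetPlus scripts; verify both against your installed version and the repository README. The names CONTROL_MODES, modes_for, and generate_with_union are hypothetical helpers.

```python
# Assumed control-mode indices for the union checkpoint (verify against the README).
CONTROL_MODES = {
    "openpose": 0,
    "depth": 1,
    "scribble": 2, "hed": 2, "softedge": 2,                    # thick-line types
    "canny": 3, "lineart": 3, "anime_lineart": 3, "mlsd": 3,   # thin-line types
    "normal": 4,
    "segment": 5,
}


def modes_for(control_types):
    """Translate control-type names into the integer modes the pipeline expects."""
    return [CONTROL_MODES[name] for name in control_types]


def generate_with_union(prompt, control_images, control_types, steps=30):
    """Run one multi-condition generation (requires a CUDA GPU and model downloads)."""
    # Imported lazily so the helpers above work without torch/diffusers installed.
    import torch
    from diffusers import ControlNetUnionModel, StableDiffusionXLControlNetUnionPipeline

    controlnet = ControlNetUnionModel.from_pretrained(
        "xinsir/controlnet-union-sdxl-1.0", torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLControlNetUnionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    )
    pipe.enable_model_cpu_offload()
    return pipe(
        prompt,
        control_image=list(control_images),      # one image per condition
        control_mode=modes_for(control_types),   # matching mode index per image
        num_inference_steps=steps,
    ).images[0]
```

A call such as generate_with_union(prompt, [pose_image, edge_image], ["openpose", "canny"]) would pair each pre-processed control image with its mode index; the images themselves still need the same pre-processing shown in the single-condition example.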

Benchmarks

Hugging Face downloads (last month): 111,693 (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

Hugging Face likes: ~1.65k (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

Supported control types (documented): 12 (OpenPose, Depth, Canny, Lineart, AnimeLineart, MLSD, Scribble, HED, PIDI/Softedge, TEED, Segment, Normal) (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

Advanced editing modes (ProMax): 5 (Tile Deblur, Tile Variation, Tile Super Resolution, Inpainting, Outpainting) (Source: [huggingface.co](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0))

Release / announcement date: Initial ControlNet++ release July 6, 2024 (ProMax around July 13, 2024) (Source: [github.com](https://github.com/xinsir6/ControlNetPlus?utm_source=openai))

Last Refreshed: 2026-01-09

Key Information

  • Category: Image Models
  • Type: AI Image Models Tool