UniRig - AI Vision Models Tool

Overview

UniRig is an open-source, unified framework that automates 3D model rigging by predicting skeleton hierarchies and per-vertex skinning weights with an autoregressive, GPT-like transformer. The system splits rigging into two stages: (1) an autoregressive skeleton predictor that uses a novel Skeleton Tree Tokenization to produce topologically valid, hierarchical skeletons, and (2) a Bone-Point Cross Attention skinning module that estimates per-vertex linear-blend-skinning weights and bone attributes.

The project code, datasets, and initial checkpoints are publicly available under the MIT license on the official GitHub repository. ([github.com](https://github.com/VAST-AI-Research/UniRig)) UniRig was introduced in a SIGGRAPH/TOG submission and an arXiv preprint that report strong improvements over prior academic and commercial auto-rigging methods: the paper claims a 215% improvement in rigging accuracy and a 194% improvement in motion accuracy on the authors' benchmarks. The authors also publish Rig-XL (≈14,000+ rigged models) and a VRoid subset used for training and evaluation; an initial skeleton-and-skinning checkpoint (trained on Articulation-XL2.0) is available on Hugging Face, while checkpoints trained on the full Rig-XL/VRoid data are being released progressively.

System requirements and practical notes (Python 3.11, PyTorch ≥2.3.1, CUDA GPU with >8GB VRAM) and example inference scripts are provided in the repository README. ([ar5iv.org](https://ar5iv.org/pdf/2504.12451))
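
The skinning stage outputs standard linear-blend-skinning (LBS) weights, so any LBS-compatible pipeline can consume them. As a reminder of how such weights deform a mesh, here is a minimal NumPy sketch; it illustrates the LBS formula only and is not UniRig's implementation:

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform rest-pose vertices with per-vertex bone weights.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) per-vertex weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous bone transforms
    """
    V = vertices.shape[0]
    homo = np.hstack([vertices, np.ones((V, 1))])            # (V, 4)
    # Transform every vertex by every bone: (B, V, 4)
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, homo)
    # Blend the per-bone results by the skinning weights: (V, 4)
    blended = np.einsum('vb,bvi->vi', weights, per_bone)
    return blended[:, :3]

# Identity transforms leave the mesh unchanged.
verts = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[0.7, 0.3], [0.5, 0.5]])
T = np.stack([np.eye(4), np.eye(4)])
assert np.allclose(linear_blend_skinning(verts, w, T), verts)
```

Moving one bone then drags each vertex proportionally to its weight on that bone, which is exactly the behavior the predicted weights control.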

GitHub Statistics

  • Stars: 1,296
  • Forks: 108
  • Contributors: 4
  • License: MIT
  • Primary Language: Python
  • Last Updated: 2026-01-06T16:17:38Z

Key Features

  • Autoregressive skeleton prediction using Skeleton Tree Tokenization for topologically valid rigs.
  • Bone-Point Cross Attention predicts per-vertex skinning weights conditioned on the predicted skeleton.
  • Single unified model designed to handle humans, animals, fictional characters, and inorganic shapes.
  • Dataset release (Rig-XL ~14,000 models) and VRoid subset for anime-style characters.
  • CLI / scripts for skeleton generation, skin prediction, and merge; Blender addon support for VRM export.

Example Usage

Example (python):

import subprocess
import shlex

# Generate a skeleton for a single input file with the provided inference script
cmd = "bash launch/inference/generate_skeleton.sh --input examples/giraffe.glb --output results/giraffe_skeleton.fbx"
subprocess.run(shlex.split(cmd), check=True)

# Example: then predict skinning weights (use the predicted skeleton as input)
cmd2 = "bash launch/inference/generate_skin.sh --input examples/skeleton/giraffe.fbx --output results/giraffe_skin.fbx"
subprocess.run(shlex.split(cmd2), check=True)

# Merge skeleton/skin with original mesh
cmd3 = "bash launch/inference/merge.sh --source results/giraffe_skin.fbx --target examples/giraffe.glb --output results/giraffe_rigged.glb"
subprocess.run(shlex.split(cmd3), check=True)

# Notes: follow the repository README for environment setup (Python 3.11, PyTorch >= 2.3.1).
# See GitHub for full instructions and configs. ([github.com](https://github.com/VAST-AI-Research/UniRig))
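
Before launching the scripts, the README's stated requirements (Python 3.11, PyTorch ≥2.3.1, a CUDA GPU with >8GB VRAM) can be sanity-checked up front. The sketch below uses only standard PyTorch APIs; the thresholds are simply the README's numbers, and the function itself is not part of UniRig:

```python
import sys

def check_environment(min_py=(3, 11), min_torch=(2, 3, 1), min_vram_gb=8):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_py:
        problems.append(f"Python {sys.version_info[0]}.{sys.version_info[1]} < {min_py[0]}.{min_py[1]}")
    try:
        import torch
        # Strip any local build suffix (e.g. "2.3.1+cu121") before comparing.
        ver = tuple(int(p) for p in torch.__version__.split("+")[0].split(".")[:3])
        if ver < min_torch:
            problems.append(f"PyTorch {torch.__version__} < 2.3.1")
        if not torch.cuda.is_available():
            problems.append("no CUDA device visible")
        else:
            vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
            if vram_gb <= min_vram_gb:
                problems.append(f"GPU has {vram_gb:.1f} GB VRAM (>{min_vram_gb} GB recommended)")
    except ImportError:
        problems.append("torch is not installed")
    return problems

problems = check_environment()
print("environment OK" if not problems else "\n".join(problems))
```

Running this once before the three-step skeleton/skin/merge pipeline gives a clearer error than a mid-run CUDA failure.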

Benchmarks

  • Rigging accuracy improvement (reported): 215% vs prior methods (paper claim). (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
  • Motion accuracy improvement (reported): 194% vs prior methods (paper claim). (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
  • Skeleton tokenization reduction (VRoid): 27.47% fewer tokens vs a naive representation. (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
  • Skeleton tokenization reduction (Rig-XL): 29.72% fewer tokens vs a naive representation. (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
  • Typical training time (skeleton model): best results at ~120 epochs (~18 hours) on 4 × RTX 4090 (authors' note). (Source: [github.com](https://github.com/VAST-AI-Research/UniRig))
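
The token-reduction figures are straightforward percentage reductions of the tokenized sequence length relative to a naive per-bone representation. For illustration only (the counts below are made up; only the formula is real):

```python
def token_reduction(naive_tokens, compact_tokens):
    """Percentage reduction of the compact token count vs the naive one."""
    return 100.0 * (naive_tokens - compact_tokens) / naive_tokens

# Hypothetical counts giving a ~27.5% reduction, close to the VRoid figure
print(round(token_reduction(1000, 725), 2))  # 27.5
```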

Last Refreshed: 2026-01-09

Key Information

  • Category: Vision Models
  • Type: AI Vision Models Tool