UniRig - AI Vision Models Tool
Overview
UniRig is an open-source, unified framework that automates 3D model rigging by predicting skeleton hierarchies and per-vertex skinning weights with an autoregressive, GPT-like transformer. The system splits rigging into two stages: (1) an autoregressive skeleton predictor that uses a novel Skeleton Tree Tokenization to produce topologically valid, hierarchical skeletons, and (2) a Bone-Point Cross Attention skinning module that estimates per-vertex linear-blend-skinning weights and bone attributes. The project code, datasets, and initial checkpoints are publicly available under an MIT license on the official GitHub repository. ([github.com](https://github.com/VAST-AI-Research/UniRig?utm_source=openai))
UniRig was introduced in a SIGGRAPH/TOG submission and an arXiv preprint that report strong improvements over prior academic and commercial auto-rigging methods: the paper claims a 215% improvement in rigging accuracy and a 194% improvement in motion accuracy on the authors' benchmarks.
The authors also publish Rig-XL (more than 14,000 rigged models) and VRoid subsets used for training and evaluation. An initial skeleton+skinning checkpoint (trained on Articulation-XL2.0) is available on Hugging Face, while full Rig-XL/VRoid-trained checkpoints are being released progressively. System requirements and practical notes (Python 3.11, PyTorch ≥2.3.1, CUDA GPU with >8GB VRAM) and example inference scripts are provided in the repository README. ([ar5iv.org](https://ar5iv.org/pdf/2504.12451))
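The stated requirements can be checked on a machine before running inference. The sketch below is not part of the UniRig repository: the helper names `parse_version` and `check_requirements` are hypothetical, the thresholds are simply the ones quoted above, and treating Python 3.11 as a minimum (rather than an exact version) is an assumption.

```python
import re
import sys


def parse_version(text):
    """Turn '2.3.1+cu121' into (2, 3, 1); build suffixes are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def check_requirements(python_version, torch_version, vram_gb):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if tuple(python_version[:2]) < (3, 11):
        problems.append("Python 3.11 is required")
    if parse_version(torch_version) < (2, 3, 1):
        problems.append("PyTorch >= 2.3.1 is required")
    if vram_gb <= 8:
        problems.append("a CUDA GPU with more than 8 GB of VRAM is required")
    return problems


if __name__ == "__main__":
    try:
        import torch  # only needed to query the local machine
    except ImportError:
        print("PyTorch is not installed")
    else:
        vram = 0.0
        if torch.cuda.is_available():
            # total_memory is reported in bytes; convert to GiB.
            vram = torch.cuda.get_device_properties(0).total_memory / 1024**3
        for problem in check_requirements(sys.version_info, torch.__version__, vram):
            print("Problem:", problem)
```

Keeping the comparison logic separate from the hardware query means the thresholds can be checked without a GPU present.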
GitHub Statistics
- Stars: 1,296
- Forks: 108
- Contributors: 4
- License: MIT
- Primary Language: Python
- Last Updated: 2026-01-06T16:17:38Z
Key Features
- Autoregressive skeleton prediction using Skeleton Tree Tokenization for topologically valid rigs.
- Bone-Point Cross Attention predicts per-vertex skinning weights conditioned on predicted skeleton.
- Single unified model designed to handle humans, animals, fictional characters, and inorganic shapes.
- Dataset release (Rig-XL ~14,000 models) and VRoid subset for anime-style characters.
- CLI / scripts for skeleton generation, skin prediction, and merge; Blender addon support for VRM export.
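Per-vertex skinning weights are consumed by linear blend skinning, which moves each vertex by a weighted sum of its bones' transforms: v' = Σ_i w_i · T_i · v. The sketch below is a generic, dependency-free illustration of that standard formula, not UniRig code; the function names and the two example bones are made up for illustration.

```python
def transform_point(matrix, point):
    """Apply a 4x4 row-major affine matrix to a 3D point."""
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in matrix[:3]
    )


def blend_vertex(vertex, weights, bone_matrices):
    """Linear blend skinning for one vertex.

    weights[i] is the influence of bone i on this vertex; the weights are
    expected to sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("skinning weights must sum to 1")
    blended = [0.0, 0.0, 0.0]
    for weight, matrix in zip(weights, bone_matrices):
        moved = transform_point(matrix, vertex)
        for axis in range(3):
            blended[axis] += weight * moved[axis]
    return tuple(blended)


def translation(dx, dy, dz):
    """Build a 4x4 matrix that only translates."""
    return [
        [1.0, 0.0, 0.0, dx],
        [0.0, 1.0, 0.0, dy],
        [0.0, 0.0, 1.0, dz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Hypothetical two-bone example: bone 0 stays put, bone 1 moves up by 2 units.
bones = [translation(0.0, 0.0, 0.0), translation(0.0, 2.0, 0.0)]
skinned = blend_vertex((1.0, 0.0, 0.0), [0.25, 0.75], bones)
print(skinned)  # (1.0, 1.5, 0.0)
```

A vertex weighted 75% to the moving bone follows it three quarters of the way, which is why smooth weights near joints produce smooth deformation.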
Example Usage
Example (python):
import subprocess
import shlex
# Example: run the provided inference shell to generate a skeleton for a single file
cmd = "bash launch/inference/generate_skeleton.sh --input examples/giraffe.glb --output results/giraffe_skeleton.fbx"
subprocess.run(shlex.split(cmd), check=True)
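# Example: then predict skinning weights from a skeleton file.
# This call uses the sample skeleton at examples/skeleton/giraffe.fbx; to chain from the step above, pass results/giraffe_skeleton.fbx instead.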
cmd2 = "bash launch/inference/generate_skin.sh --input examples/skeleton/giraffe.fbx --output results/giraffe_skin.fbx"
subprocess.run(shlex.split(cmd2), check=True)
# Merge skeleton/skin with original mesh
cmd3 = "bash launch/inference/merge.sh --source results/giraffe_skin.fbx --target examples/giraffe.glb --output results/giraffe_rigged.glb"
subprocess.run(shlex.split(cmd3), check=True)
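To rig a file other than the giraffe sample, the three calls above can be generated from a single input path. The following is a minimal sketch, assuming the scripts accept the same `--input`, `--output`, `--source`, and `--target` flags shown above; `build_rig_commands`, `run_rig_pipeline`, and the output naming scheme are hypothetical, and the skinning step is assumed to accept the skeleton written by the first step.

```python
import subprocess
from pathlib import Path


def build_rig_commands(mesh_path, results_dir="results"):
    """Return the skeleton, skin, and merge commands as argument lists."""
    mesh = Path(mesh_path)
    out = Path(results_dir)
    skeleton = (out / f"{mesh.stem}_skeleton.fbx").as_posix()
    skin = (out / f"{mesh.stem}_skin.fbx").as_posix()
    rigged = (out / f"{mesh.stem}_rigged{mesh.suffix}").as_posix()
    source = mesh.as_posix()
    return [
        ["bash", "launch/inference/generate_skeleton.sh",
         "--input", source, "--output", skeleton],
        # Assumption: the skinning step reads the skeleton written above.
        ["bash", "launch/inference/generate_skin.sh",
         "--input", skeleton, "--output", skin],
        ["bash", "launch/inference/merge.sh",
         "--source", skin, "--target", source, "--output", rigged],
    ]


def run_rig_pipeline(mesh_path, results_dir="results"):
    """Run the three stages in order, stopping at the first failure."""
    for command in build_rig_commands(mesh_path, results_dir):
        subprocess.run(command, check=True)


# Print the commands instead of running them, so they can be inspected first.
commands = build_rig_commands("examples/giraffe.glb")
for command in commands:
    print(" ".join(command))
```

Passing argument lists to `subprocess.run` avoids the `shlex.split` round trip and keeps paths containing spaces intact.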
Note: follow the repository README for environment setup (Python 3.11, PyTorch>=2.3.1). See GitHub for full instructions and configs. ([github.com](https://github.com/VAST-AI-Research/UniRig?utm_source=openai))
Benchmarks
- Rigging accuracy improvement (reported): 215% improvement vs prior methods (paper claim) (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
- Motion accuracy improvement (reported): 194% improvement vs prior methods (paper claim) (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
- Skeleton tokenization reduction (VRoid): 27.47% token reduction vs naive representation (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
- Skeleton tokenization reduction (Rig-XL): 29.72% token reduction vs naive representation (Source: [ar5iv.org](https://ar5iv.org/pdf/2504.12451))
- Typical training time (skeleton model): Best results ~120 epochs (~18 hours) on 4 × RTX 4090 (authors' note) (Source: [github.com](https://github.com/VAST-AI-Research/UniRig?utm_source=openai))
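For a sense of scale, the token-reduction and training-time figures can be turned into per-sequence and per-epoch numbers. The arithmetic below uses only the reported values; the 1,000-token naive sequence is a made-up input for illustration, and the per-epoch and GPU-hour figures assume the ~18 hours is wall-clock time for all 120 epochs.

```python
def tokens_after_reduction(naive_tokens, reduction_percent):
    """Sequence length left after removing reduction_percent of the tokens."""
    return naive_tokens * (1 - reduction_percent / 100)


# Hypothetical naive sequence of 1,000 tokens.
vroid_tokens = tokens_after_reduction(1000, 27.47)
rig_xl_tokens = tokens_after_reduction(1000, 29.72)

# Reported training setup: ~120 epochs in ~18 hours on 4 GPUs.
epochs = 120
wall_clock_hours = 18
gpus = 4
minutes_per_epoch = wall_clock_hours * 60 / epochs
gpu_hours = wall_clock_hours * gpus

print(f"VRoid:  {vroid_tokens:.1f} tokens")   # 725.3
print(f"Rig-XL: {rig_xl_tokens:.1f} tokens")  # 702.8
print(f"{minutes_per_epoch:.0f} minutes per epoch, {gpu_hours} GPU-hours in total")
```

Under those assumptions, one epoch takes about 9 minutes and the full run costs roughly 72 GPU-hours.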
Key Information
- Category: Vision Models
- Type: AI Vision Models Tool