GPT-SoVITS - AI Audio Tools Tool

Overview

GPT-SoVITS is an open-source WebUI for few-shot and zero-shot text-to-speech (TTS) that can train a usable voice from as little as one minute of recorded audio. It pairs neural TTS models with tooling for preparing training data (voice separation, dataset segmentation, and integrated ASR), which shortens iteration on custom voices. The WebUI exposes training, inference, and dataset tools in a single interface, making voice cloning accessible to researchers and hobbyists without deep engineering work. Two workflows are supported: few-shot, which fine-tunes a model on minutes of data, and zero-shot, which synthesizes a target speaker without per-speaker retraining. Cross-lingual inference lets a model speak languages other than those in its training data. The repository is MIT-licensed and widely adopted: GitHub lists more than 53,000 stars and nearly 6,000 forks, and community tools and forks extend its deployment and model-export workflows.

GitHub Statistics

Stars: 53,913
Forks: 5,907
Contributors: 90
License: MIT
Primary Language: Python
Last Updated: 2025-12-30T08:00:21Z
Latest Release: 20250606v2pro

According to the GitHub repository, GPT-SoVITS has 53,913 stars, 5,907 forks, and 90 contributors, and is released under the MIT license. The repository is actively maintained; the last recorded commit is dated 2025-12-30. The star and fork counts indicate strong community adoption, and the size of the contributor base suggests ongoing development and third-party integrations. Together, frequent updates and many forks point to an active ecosystem of plugins, model recipes, and deployment examples.

Installation

Install via Docker (adjust the port mapping and volume path to match the repository's current Dockerfile and README):

git clone https://github.com/RVC-Boss/GPT-SoVITS.git

cd GPT-SoVITS

docker build -t gpt-sovits .

docker run --rm -it -p 7860:7860 -v "$(pwd)/models:/app/models" gpt-sovits
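After the container starts, it can take a while before the WebUI accepts connections. The sketch below polls the published host port until a TCP connection succeeds. It is a generic check, not part of GPT-SoVITS; the helper name wait_for_port is hypothetical, and the port 7860 in the usage comment is simply the host port from the docker run command above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove the WebUI has finished loading.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (assumes the container publishes host port 7860):
#   wait_for_port("127.0.0.1", 7860, timeout=120.0)
```

If the check times out, the container may be listening on a different internal port than the one mapped, so compare the -p mapping with the port printed in the container logs.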

Key Features

Train TTS models with as little as one minute of recorded speech
Zero-shot TTS: synthesize new speakers without per-speaker retraining
Few-shot TTS: fine-tune voice models from small datasets
Cross-lingual inference: speak in languages different from training data
Integrated tools: voice separation, dataset segmentation, and ASR pipelines
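Since the one-minute figure is the stated lower bound for training data, it can help to measure how much audio a folder of clips actually contains before starting. The following is a minimal sketch using only the Python standard library; the function names and the flat folder of .wav files are assumptions for illustration and do not reflect GPT-SoVITS's own dataset layout or tooling.

```python
import wave
from pathlib import Path


def total_wav_seconds(folder: str) -> float:
    """Sum the duration, in seconds, of every .wav file directly inside folder."""
    total = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration = number of frames / frames per second.
            total += wav.getnframes() / wav.getframerate()
    return total


def has_enough_audio(folder: str, minimum_seconds: float = 60.0) -> bool:
    """Check the folder against a minimum duration (default: one minute)."""
    return total_wav_seconds(folder) >= minimum_seconds
```

The standard wave module reads uncompressed PCM files only, so other encodings would need converting first. The total also counts silence, so the amount of usable speech may be lower than the number reported.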

Community

GPT-SoVITS has a large, active community: the repository reports 53,913 stars, 5,907 forks, and 90 contributors. That scale has produced community model releases and deployment examples alongside the many forks, and the frequent commits and broad contributor involvement indicate that the project is actively maintained.

Last Refreshed: 2026-01-09

Key Information

Category: Audio Tools
Type: AI Audio Tools Tool

Related Tools

WhisperX

Retrieval-based Voice Conversion WebUI

Replica

ClearerVoice-Studio

VCClient Real-time Voice Changer

Whisper French Demo