ClearerVoice-Studio - AI Audio Tools Tool

Overview

ClearerVoice-Studio is an open-source, AI-driven speech processing toolkit that bundles state-of-the-art pretrained models, training scripts, and evaluation tools for real-world audio workflows. The project focuses on four closely related tasks: speech enhancement (denoising), speech separation, speech super-resolution (bandwidth extension), and target speaker extraction (conditioned on a reference recording or on audio‑visual cues). Covering all four in one repository lets teams build end-to-end audio pipelines without assembling disparate research code. The repository includes production-ready pretrained weights for FRCRN and the MossFormer family, and provides CLI and Python inference helpers plus demo scripts. ([github.com](https://github.com/modelscope/ClearerVoice-Studio))

It also ships SpeechScore, a bundled speech-quality toolkit (PESQ, STOI, DNSMOS, SI-SDR, etc.) for reproducible evaluation and benchmarking, and supports automatic model downloads from HuggingFace/ModelScope for quick prototyping. Recent additions include a NumPy-in/NumPy-out inference interface (demo_Numpy2Numpy.py), expanded audio-format support, and a PyPI package (clearvoice) to simplify installation. The project targets researchers and engineers who need robust pretrained models and reproducible training/evaluation pipelines for production or research use. ([github.com](https://github.com/modelscope/ClearerVoice-Studio))

GitHub Statistics

  • Stars: 3,946
  • Forks: 323
  • Contributors: 7
  • License: Apache-2.0
  • Primary Language: Python
  • Last Updated: 2025-08-14T08:26:30Z

Repository activity and community indicators show healthy adoption with a compact core team. According to the GitHub project page and aggregated listings, ClearerVoice-Studio has roughly 3.8–3.9k stars and ~310–323 forks (the exact figures depend on when each listing was captured), is licensed under Apache‑2.0, and has about seven contributors. The README and PyPI metadata indicate active maintenance through 2025, with recent feature commits such as pip packaging, the NumPy interface, and added super-resolution training scripts. The open issue count (in the dozens) and the active demos on ModelScope/HuggingFace suggest real usage, along with user-reported edge cases still to address. Overall, interest is strong and there is useful surface area for contributors, but the small core team suggests that most community involvement comes through usage and issue reports rather than code contributions. ([github.com](https://github.com/modelscope/ClearerVoice-Studio))

Installation

Install via pip:

pip install clearvoice

Or install from source:

git clone https://github.com/modelscope/ClearerVoice-Studio.git
cd ClearerVoice-Studio/clearvoice
pip install --editable .

Install ffmpeg if you need non-WAV formats:

sudo apt update && sudo apt install ffmpeg   # Ubuntu/Debian
brew install ffmpeg                          # macOS (Homebrew)

On Windows: download a static build from ffmpeg.org and add its bin directory to PATH.
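
Once the package is installed, a single-file enhancement run might look like the sketch below. It is a minimal illustration under stated assumptions, not the project's documented workflow: the ClearVoice class, its task and model_names arguments, the online_write flag, the write method, and the model name MossFormer2_SE_48K follow the upstream README as recalled here and should be checked against the installed version. The helper names needs_ffmpeg, ffmpeg_available, and enhance_file are hypothetical.

```python
import shutil
from pathlib import Path


def needs_ffmpeg(path: str) -> bool:
    """Return True for any file that is not a .wav (non-WAV input relies on ffmpeg)."""
    return Path(path).suffix.lower() != ".wav"


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def enhance_file(input_path: str, output_path: str,
                 model_name: str = "MossFormer2_SE_48K") -> None:
    """Denoise one audio file and write the result to output_path.

    Assumption: the ClearVoice interface used here matches the upstream
    README; verify it against your installed clearvoice version.
    """
    if needs_ffmpeg(input_path) and not ffmpeg_available():
        raise RuntimeError("ffmpeg not found on PATH; it is needed for non-WAV input")

    # Imported lazily so the helpers above work without clearvoice installed.
    from clearvoice import ClearVoice

    model = ClearVoice(task="speech_enhancement", model_names=[model_name])
    enhanced = model(input_path=input_path, online_write=False)
    model.write(enhanced, output_path=output_path)
```

Calling enhance_file("noisy.wav", "clean.wav") with hypothetical file names would download the model on first use and write the enhanced audio. The ffmpeg check runs first so that a missing dependency fails fast instead of after the model has loaded.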

Key Features

  • Pretrained speech enhancement models (FRCRN, MossFormer2) for 16 kHz and 48 kHz inference.
  • Speech separation (MossFormer family) for multi‑speaker isolation at 8/16 kHz sample rates.
  • Speech super-resolution (bandwidth extension) to upsample ≥16 kHz audio to 48 kHz.
  • Audio‑visual target speaker extraction (face/lip conditioning) and audio‑only reference extraction.
  • SpeechScore evaluation toolkit: PESQ, DNSMOS, STOI, SI‑SDR and additional non-intrusive metrics.
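
Of the metrics listed for SpeechScore, SI-SDR is simple enough to write out by hand. The sketch below uses the standard textbook definition in plain Python: remove the mean from both signals, project the estimate onto the reference, and compare the energy of the projection with the energy of the residual. It illustrates the metric only and is not SpeechScore's own implementation; the function name si_sdr is hypothetical.

```python
import math
from typing import Sequence


def si_sdr(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")

    n = len(reference)
    ref_mean = sum(reference) / n
    est_mean = sum(estimate) / n
    ref = [r - ref_mean for r in reference]   # zero-mean reference
    est = [e - est_mean for e in estimate]    # zero-mean estimate

    ref_energy = sum(r * r for r in ref)
    if ref_energy == 0.0:
        raise ValueError("reference is constant; SI-SDR is undefined")

    # Optimal scaling of the reference that best explains the estimate.
    alpha = sum(r * e for r, e in zip(ref, est)) / ref_energy
    target = [alpha * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    if residual_energy == 0.0:
        return math.inf       # estimate is a scaled copy of the reference
    if target_energy == 0.0:
        return -math.inf      # estimate is orthogonal to the reference
    return 10.0 * math.log10(target_energy / residual_energy)
```

Because the reference is rescaled by alpha before the comparison, multiplying the estimate by any non-zero gain leaves the score unchanged. This makes the metric suitable for comparing enhancement or separation outputs whose loudness differs from the clean signal.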

Community

Adoption and community feedback are strong: the repo has ~3.8–3.9k GitHub stars and ~310–323 forks, and the authors report heavy use of the core models on ModelScope (FRCRN about 3 million uses, MossFormer about 2.5 million). The project publishes demos on HuggingFace and ModelScope and maintains a PyPI package for easier installs. Community channels include GitHub Issues, repository PRs, and comments on the platform demos. Some users have reported task-specific failures, for example speaker separation not working in some cases. Overall, the feedback points to active real‑world usage and continued maintenance, with a compact core maintainer team and mostly issue-driven community contributions. ([github.com](https://github.com/modelscope/ClearerVoice-Studio))

Last Refreshed: 2026-03-03

Key Information

  • Category: Audio Tools
  • Type: AI Audio Tools Tool