spaCy Models - AI Language Models Tool

Overview

spaCy Models is the official Explosion-maintained GitHub release repository that publishes pre-trained pipeline packages for the spaCy NLP library, distributed as .whl and .tar.gz release artifacts. It provides ready-to-install pipelines for core NLP tasks (tokenization, tagging, dependency parsing, lemmatization, sentence segmentation, and named-entity recognition), in size variants (sm/md/lg) for each language and as transformer-backed pipelines (trf) that integrate Hugging Face transformer weights. Model packages include metadata (meta.json, config.cfg, accuracy.json) and follow a naming and versioning convention that encodes the model variant and the compatible spaCy version. ([github.com](https://github.com/explosion/spacy-models))

Releases are organised so that the best-matching model for your installed spaCy version can be installed with spaCy's CLI (python -m spacy download <model>) or directly with pip (pip install <release .whl or .tar.gz>). The repository also publishes per-release pages showing model size, pipeline components, evaluation metrics (token/tag/dep/NER scores), and license information.

Compatibility between spaCy and model package versions is driven by a compatibility.json file used by spaCy's download and validate commands; Explosion has recently adjusted packaging flags to make models more tolerant of spaCy minor-version changes. Together, these conventions make it straightforward to trade off speed, footprint, and accuracy (for example, en_core_web_sm for CPU efficiency versus en_core_web_trf for higher NER accuracy). ([github.com](https://github.com/explosion/spacy-models))
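
As a rough illustration of how a model version can encode spaCy compatibility, the sketch below assumes the simplified rule that the first two components of a model version (major.minor) mirror the spaCy release line it targets. The helper names major_minor and versions_match are hypothetical, and spaCy's own download and validate commands rely on compatibility.json and package metadata rather than this check.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return the (major, minor) pair of a dotted version string such as '3.8.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(model_version: str, spacy_version: str) -> bool:
    """Simplified check (an assumption): model and spaCy share the same major.minor."""
    return major_minor(model_version) == major_minor(spacy_version)


print(versions_match("3.8.0", "3.8.4"))  # True
print(versions_match("3.2.0", "3.8.4"))  # False
```

In practice, python -m spacy validate is the more reliable way to check installed models against the installed spaCy version.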

GitHub Statistics

  • Stars: 1,835
  • Forks: 312
  • Contributors: 12
  • Primary Language: Python
  • Last Updated: 2025-05-27T09:24:39Z
  • Latest Release: ca_core_news_lg-3.8.0

Key Features

  • Pre-packaged pipelines as .whl and .tar.gz releases for easy pip installation.
  • Model naming convention encodes language, genre, and size (sm, md, lg, trf).
  • Transformer-backed pipelines (trf) that integrate Hugging Face transformer weights.
  • Per-release accuracy.json and meta.json with token/tag/dep/NER evaluation metrics.
  • Compatibility index (compatibility.json) and spaCy CLI integration for version matching.
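
The naming convention listed above can be sketched as a small parser. This is a minimal illustration that assumes the common four-part form lang_type_genre_size with an optional -version suffix, as in the release tag ca_core_news_lg-3.8.0. The names ModelName and parse_model_name are hypothetical and are not part of spaCy.

```python
import re
from typing import NamedTuple, Optional


class ModelName(NamedTuple):
    lang: str               # language code, e.g. "en" or "ca"
    kind: str               # pipeline type, e.g. "core"
    genre: str              # training-text genre, e.g. "web" or "news"
    size: str               # "sm", "md", "lg" or "trf"
    version: Optional[str]  # e.g. "3.8.0"; None when no suffix is given


# Assumed pattern: lang_type_genre_size, optionally followed by -a.b.c
_NAME_RE = re.compile(
    r"^(?P<lang>[a-z]{2,3})_(?P<kind>[a-z]+)_(?P<genre>[a-z]+)_(?P<size>sm|md|lg|trf)"
    r"(?:-(?P<version>\d+\.\d+\.\d+))?$"
)


def parse_model_name(name: str) -> ModelName:
    """Split a model or release name into its components, or raise ValueError."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised model name: {name!r}")
    return ModelName(**match.groupdict())


print(parse_model_name("ca_core_news_lg-3.8.0"))
# ModelName(lang='ca', kind='core', genre='news', size='lg', version='3.8.0')
```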

Example Usage

Example (python):

# Install spaCy (if needed)
# pip install -U spacy

# 1) Recommended: use spaCy's downloader for compatibility (run in a shell)
#    This downloads and installs the best-matching model version for your spaCy install
# python -m spacy download en_core_web_trf

# 2) Or install a specific release wheel directly (example URL from releases)
# pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_trf-3.2.0/en_core_web_trf-3.2.0-py3-none-any.whl

# Load and use the model
import spacy
nlp = spacy.load("en_core_web_trf")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.start_char, ent.end_char)
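
A small helper can make the direct-install route less error-prone. The sketch below builds a release wheel URL following the pattern of the example URL above and checks whether a model package is already importable. The function names release_wheel_url and is_model_installed are hypothetical, and the wheel filename pattern is assumed to hold for other releases, which may not be true for every release.

```python
import importlib.util

# Base URL taken from the example release URL above.
RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def release_wheel_url(name: str, version: str) -> str:
    """Build a wheel URL, assuming the <name>-<version>-py3-none-any.whl pattern."""
    tag = f"{name}-{version}"
    return f"{RELEASES}/{tag}/{tag}-py3-none-any.whl"


def is_model_installed(name: str) -> bool:
    """Model packages install as regular Python packages, so look for an import spec."""
    return importlib.util.find_spec(name) is not None


print(release_wheel_url("en_core_web_trf", "3.2.0"))
if not is_model_installed("en_core_web_trf"):
    print("en_core_web_trf is not installed; run: python -m spacy download en_core_web_trf")
```

Checking before calling spacy.load avoids an OSError when the package is missing, and the generated URL can be passed to pip install.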

Pricing

Free: the repository is published under the MIT license, and each model package states its own license in its metadata and on its release page. Check the license of the specific model before redistributing it.

Benchmarks

en_core_web_trf 3.2.0, NER F1: 89.90 (ENTS_F from the release's accuracy.json) (Source: https://newreleases.io/project/github/explosion/spacy-models/release/en_core_web_trf-3.2.0)

en_core_web_trf 3.2.0, model size: 438 MB (packaged release) (Source: https://newreleases.io/project/github/explosion/spacy-models/release/en_core_web_trf-3.2.0)

Repository popularity: ~1.8k stars on GitHub (explosion/spacy-models) (Source: https://github.com/explosion/spacy-models)

spaCy language coverage (library-level): spaCy supports tokenization/training for 70+ languages (models available for many languages) (Source: https://github.com/explosion/spaCy)

Last Refreshed: 2026-01-17

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool