Text Embeddings Inference

Text Embeddings Inference is an open-source, high-performance toolkit developed by Hugging Face for deploying and serving text embedding and sequence classification models. It features dynamic batching, optimized transformers code (via Flash Attention and cuBLASLt), support for multiple model types, and lightweight Docker images for fast inference.
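
As a rough illustration of what serving embeddings looks like from the client side, the sketch below posts a batch of texts to a running Text Embeddings Inference server and compares two of the returned vectors. It assumes a server is already reachable at http://localhost:8080 (a placeholder address) and that it exposes a POST /embed route accepting a JSON body of the form {"inputs": [...]} and returning one list of floats per input. The helper names build_embed_request, embed, and cosine_similarity are illustrative and not part of the toolkit.

```python
import json
import math
import urllib.request

# Assumption: a server is listening here; host and port are placeholders.
TEI_URL = "http://localhost:8080"


def build_embed_request(texts, base_url=TEI_URL):
    """Build a POST request for the /embed route with body {"inputs": [...]}."""
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=TEI_URL, timeout=30):
    """Send texts to the server and return the decoded JSON response.

    Assumed response shape: one embedding (list of floats) per input text.
    """
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

With a server running, a call such as embed(["What is deep learning?", "Explain neural networks."]) would be expected to return two vectors, which can then be passed to cosine_similarity. Sending several texts in one request is a client-side convenience; the dynamic batching mentioned above happens inside the server.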

Key Information

  • Category: Developer Tools
  • Source: GitHub
  • Last updated: January 09, 2026

Links

Canonical source: https://github.com/huggingface/text-embeddings-inference