ModernBERT Embed - AI Embedding Models Tool
Overview
ModernBERT Embed is a sentence embedding model released by Nomic and published on Hugging Face. It is derived from ModernBERT-base and tuned to produce dense sentence embeddings for tasks such as semantic search, nearest-neighbor retrieval, clustering, and sentence similarity. The model outputs full 768-dimensional embeddings and also supports a truncated 256-dimensional representation when lower memory use or faster indexing is required. The Hugging Face model card and examples show integration with multiple client libraries, including SentenceTransformers, the Transformers library, and Transformers.js, which makes the model easy to drop into Python and JavaScript pipelines for both server-side and browser inference. It is distributed under the Apache-2.0 license, has seen notable community adoption on Hugging Face, and is primarily intended as an unsupervised embedding model for downstream semantic tasks (see the model page for usage examples and framework-specific guidance). According to the Hugging Face listing, the model can be used immediately via the sentence-similarity pipeline and the provided example code.
Model Statistics
- Downloads: 78,816
- Likes: 223
- Pipeline: sentence-similarity
- Parameters: 149.0M
- License: apache-2.0
Model Details
Architecture and origins: ModernBERT Embed is a BERT-style encoder derived from ModernBERT-base and its associated unsupervised embedding checkpoint (e.g., nomic-ai/modernbert-embed-unsupervised). It follows the common encoder-plus-pooling approach to produce fixed-size sentence vectors. The model has approximately 149 million parameters and returns 768-dimensional embeddings by default; the project also documents a truncated 256-dimensional output option for lower-dimensional use cases.
Capabilities: The model targets semantic tasks such as sentence similarity scoring, semantic search, clustering, and retrieval. It can be used directly with the SentenceTransformers API for batched embedding generation, with Hugging Face Transformers when custom pooling logic is needed, or with Transformers.js for browser-side inference. The model card lists the sentence-similarity pipeline, and the Apache-2.0 license permits both commercial and research use under standard terms.
Integration notes: For many applications you can load the model with SentenceTransformer("nomic-ai/modernbert-embed-base"). If you require lower-dimensional vectors for indexing or storage, you can either use the model's documented truncated output or apply standard dimensionality reduction (PCA or TruncatedSVD) after encoding, as sketched below. For JavaScript projects, the model page includes Transformers.js usage examples.
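A minimal sketch of the lower-dimensional path, assuming the model exposes Matryoshka-style truncation through the truncate_dim argument available in recent sentence-transformers releases (the argument name and the 256-dimensional figure should be checked against the model card):
Example (python):
from sentence_transformers import SentenceTransformer
# Assumption: truncate_dim keeps only the first 256 dimensions of each embedding
# (Matryoshka-style truncation); verify against the model card before relying on it.
model_256 = SentenceTransformer("nomic-ai/modernbert-embed-base", truncate_dim=256)
vectors = model_256.encode(["ModernBERT Embed produces dense sentence vectors."], convert_to_numpy=True)
print(vectors.shape)  # expected (1, 256) if truncation behaves as assumed
The model card may also recommend task-specific prefixes (for example, different prefixes for queries and documents); consult it before building an index.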
Key Features
- Produces 768-dimensional sentence embeddings for semantic similarity and search
- Supports a truncated 256-dimensional workflow for lower-memory or faster indexing
- Designed for the sentence-similarity pipeline and unsupervised embedding use cases
- Compatible with SentenceTransformers, Hugging Face Transformers, and Transformers.js
- Distributed under Apache-2.0 license for research and commercial use
Example Usage
Example (python):
from sentence_transformers import SentenceTransformer
import numpy as np
from sklearn.decomposition import TruncatedSVD
# Load the model (works with SentenceTransformers if model is SentenceTransformer-compatible)
model = SentenceTransformer("nomic-ai/modernbert-embed-base")
sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast brown fox leaps over a sleepy dog.",
]
# Produce 768-d embeddings
embeddings_768 = model.encode(sentences, convert_to_numpy=True, show_progress_bar=False)
print("Embeddings shape (768-d):", embeddings_768.shape)
# Option A: If you need a 256-d variant and the model provides one, load that variant instead.
# Option B: Reduce to 256-d using TruncatedSVD (lossy but fast for indexing).
# Note: TruncatedSVD can only recover up to min(n_samples, n_features) components, so in
# practice fit it on a corpus of at least 256 embeddings; with only the two sentences
# above the reduction is skipped.
svd = TruncatedSVD(n_components=256, random_state=42)
if embeddings_768.shape[0] >= 256:
    embeddings_256 = svd.fit_transform(embeddings_768)
    print("Embeddings shape (256-d reduced):", embeddings_256.shape)
# Example: compute cosine similarity between the two sentences
def cosine_sim(a, b):
    a_norm = a / np.linalg.norm(a)
    b_norm = b / np.linalg.norm(b)
    return float(np.dot(a_norm, b_norm))
sim_score = cosine_sim(embeddings_768[0], embeddings_768[1])
print("Cosine similarity (768-d):", sim_score)
# Transformers raw approach (for custom pooling):
from transformers import AutoTokenizer, AutoModel
import torch
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/modernbert-embed-base")
model_tf = AutoModel.from_pretrained("nomic-ai/modernbert-embed-base")
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model_tf(**inputs)
# Mean pooling over non-padding tokens is the usual choice for sentence embeddings;
# check the model card for the pooling strategy it actually recommends.
mask = inputs["attention_mask"].unsqueeze(-1).float()
pooled = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
mean_embeddings = pooled.cpu().numpy()
print("Mean-pooled embeddings shape:", mean_embeddings.shape)
Benchmarks
Hugging Face downloads: 78,816 (Source: https://huggingface.co/nomic-ai/modernbert-embed-base)
Hugging Face likes: 223 (Source: https://huggingface.co/nomic-ai/modernbert-embed-base)
Parameters: 149.0M (Source: https://huggingface.co/nomic-ai/modernbert-embed-base)
Primary pipeline: sentence-similarity (Source: https://huggingface.co/nomic-ai/modernbert-embed-base)
Key Information
- Category: Embedding Models
- Type: AI Embedding Models Tool