Dolphin 3.0 R1 Mistral 24B - AI Language Models Tool

Overview

Dolphin 3.0 R1 Mistral 24B is an instruct‑tuned, locally deployable text‑generation model built on the Mistral Small 24B base. It was fine‑tuned for reasoning on ~800,000 reasoning traces from the Dolphin‑R1 dataset (trained for 3 epochs) and is positioned for coding, math, chain‑of‑thought reasoning, function calling, and agentic workflows. The project emphasizes steerable alignment: deployers set the system prompt and alignment rules themselves, retaining full control over prompts, data, and behavior rather than relying on a remote provider. ([huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))

Because it inherits the Mistral Small 24B architecture, Dolphin 3.0 R1 supports large contexts (the Mistral Small family uses a 32,768‑token context window), making it suitable for long‑form documents, RAG pipelines, and extended multi‑turn sessions. Quantized GGUF builds and Ollama recipes enable lighter local deployments; several community quant variants exist. Community reports indicate strong reasoning‑style outputs in many cases, but also note variability in coding quality and hallucination behavior depending on quantization and runtime configuration, so test thoroughly for your use case. ([huggingface.co](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501?utm_source=openai))

Key Features

  • Steerable system prompt: owner-controlled alignment and persona configuration.
  • Trained on 800k reasoning traces for explicit chain-of-thought style outputs.
  • Function-calling and agentic task support via function-call datasets used in fine-tuning.
  • Large context support (≈32K tokens) for long documents, RAG and multi-step workflows.
  • Multiple quantized builds (GGUF) and Ollama/LMStudio compatibility for local deployment.
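The steerable system prompt uses the ChatML turn format. A minimal sketch of assembling such a prompt by hand (the helper function and persona text are illustrative, not taken from the model card):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt with an owner-controlled system message."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    system="You are Dolphin, a concise technical assistant.",
    user="Summarize the trade-offs of 4-bit quantization.",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` opens the assistant turn so generation continues as the model's reply rather than as more user text.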

Example Usage

Example (python):

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Replace with the Hugging Face repo id for Dolphin 3.0 R1
MODEL_ID = "dphn/Dolphin3.0-R1-Mistral-24B"

# Load tokenizer + model (device_map='auto' attempts to place layers on available devices)
# For large local deployments consider using quantized GGUF builds or a runtime like vLLM/ollama.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "<|im_start|>system\nYou are Dolphin, a helpful coding assistant.<|im_end|>\n<|im_start|>user\nWrite a short Python function that returns the nth Fibonacci number.<|im_end|>\n<|im_start|>assistant\n"

# Recommended low temperature for Mistral-based Dolphin models.
outputs = pipe(prompt, max_new_tokens=256, temperature=0.06, do_sample=True)
print(outputs[0]["generated_text"])

# Notes:
# - For memory-constrained systems, use quantized GGUF variants (community-provided) or Ollama.
# - The model card recommends low temperatures (0.05–0.1) for most tasks.
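Function calling typically works by having the model emit a structured call (often JSON) that the host application parses and executes. A minimal host-side sketch under that assumption (the tool registry and sample output below are illustrative; consult the model card for the exact call format Dolphin emits):

```python
import json

# Hypothetical tool registry: tool name -> callable
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by the model and run the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Example of what a model's function-call turn might look like:
sample = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(dispatch(sample))  # -> Sunny in Berlin
```

In a real agentic loop, the tool's return value would be fed back to the model as a new turn so it can compose a final answer.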

Benchmarks

  • Training data: 800,000 reasoning traces, 3 epochs (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
  • Base model / parameter count: Mistral Small (≈24B parameters) (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
  • Context window: 32,768 tokens (32K) (Source: [huggingface.co](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501?utm_source=openai))
  • Recommended sampling temperature: 0.05–0.10 (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
  • Selected benchmark snapshots (community‑reported aggregator scores): MMLU Pro 22.28; BBH 33.76; MATH Lvl 5 31.19 (Source: [llm-explorer.com](https://llm-explorer.com/model/cognitivecomputations%2FDolphin3.0-R1-Mistral-24B%2C5XnWr41voF5RORyi3Iwot2?utm_source=openai))
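With a fixed 32,768-token context, long-document and RAG pipelines need to budget prompt tokens against the desired generation length. A rough sketch of that arithmetic (the token counts stand in for a real tokenizer's output; the reserve size is an assumption):

```python
CONTEXT_WINDOW = 32_768  # Mistral Small 24B context size

def max_new_tokens(prompt_tokens: int, reserve: int = 512) -> int:
    """Tokens left for generation after the prompt, keeping a safety reserve."""
    remaining = CONTEXT_WINDOW - prompt_tokens - reserve
    return max(remaining, 0)

# A 30,000-token prompt leaves 2,256 tokens for the reply (with a 512-token reserve).
print(max_new_tokens(30_000))
```

Counting prompt tokens with the model's own tokenizer (e.g. `len(tokenizer(prompt).input_ids)`) before calling the pipeline avoids silent truncation on long RAG contexts.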

Last Refreshed: 2026-03-03

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool