Dolphin 3.0 R1 Mistral 24B - AI Language Models Tool
Overview
Dolphin 3.0 R1 Mistral 24B is an instruct‑tuned, locally deployable text‑generation model built on the Mistral Small 24B base. The R1 variant was fine‑tuned for reasoning on ~800k reasoning traces over three epochs of specialized SFT data to improve math, coding, chain‑of‑thought, and agentic/function‑calling behavior. The project emphasizes steerable alignment (you control the system prompt, and therefore the alignment) and local control for businesses that prefer self‑hosted models over closed, hosted APIs. ([huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B)) ([huggingface.co](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501?utm_source=openai))

Technically, Dolphin3.0‑R1 is a 24B‑parameter BF16 model, and the Mistral Small base provides a ~32k token context window. The Hugging Face model card recommends low sampling temperatures (0.05–0.1) for stable outputs, provides ChatML examples and an Ollama quickstart, and links quantized GGUF builds for efficient local inference.

Community discussion (including LocalLLaMA/Reddit threads) shows rapid adoption among local‑inference users, with early feedback noting improved reasoning traces while encouraging users to evaluate behavior on their own tasks. ([huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
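The quantized GGUF builds mentioned above make it practical to run the model locally through llama.cpp bindings rather than a full transformers stack. Below is a minimal sketch using llama-cpp-python; the GGUF filename, quantization level, and chat messages are illustrative assumptions, not values taken from the model card.

from llama_cpp import Llama  # pip install llama-cpp-python

# Path to a downloaded GGUF quantization of the model; the filename is
# illustrative -- use whichever quant you fetched from the linked GGUF repo.
llm = Llama(
    model_path="./Dolphin3.0-R1-Mistral-24B-Q4_K_M.gguf",
    n_ctx=32768,       # the Mistral Small base supports a ~32k-token context
    n_gpu_layers=-1,   # offload all layers to GPU if available; 0 for CPU-only
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are Dolphin, a helpful assistant."},
        {"role": "user", "content": "Summarize what a hash map is in two sentences."},
    ],
    temperature=0.1,   # the model card recommends low temperatures (0.05-0.1)
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])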
Key Features
- Instruct‑tuned for coding, math, reasoning, and agentic tasks.
- Steerable alignment via explicit system prompt control for application‑specific behavior (see the chat‑template sketch after this list).
- Function‑calling and agent‑friendly outputs suitable for tool integrations.
- Fine‑tuned with ~800k reasoning traces (R1) to improve chain‑of‑thought and math.
- Available quantized builds (GGUF / Ollama) for compact local inference.
- Based on Mistral Small 24B base with a ~32k token context window.
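In practice, steerable alignment means the system prompt you supply defines the model's behavior for your deployment. The sketch below builds a ChatML-formatted prompt with the tokenizer's chat template; it assumes the shipped tokenizer includes a ChatML-style template, as the model card's examples suggest, and the persona text is only an example.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dphn/Dolphin3.0-R1-Mistral-24B")

# The system message is the alignment: it sets persona and policy for this app.
messages = [
    {"role": "system", "content": "You are Dolphin, a terse assistant for an internal support tool."},
    {"role": "user", "content": "Explain what a race condition is."},
]

# Render the messages with the model's own chat template (ChatML-style) and
# append the assistant header so generation continues from there.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)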
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Model ID (Hugging Face)
model_id = "dphn/Dolphin3.0-R1-Mistral-24B"

# Load tokenizer and model (adjust device_map / dtype to your hardware)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# The model is already dispatched via device_map, so the pipeline itself
# does not need a device argument.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# ChatML-style prompt: the system message steers behavior (the model card's
# examples use ChatML).
prompt = (
    "<|im_start|>system\n"
    "You are Dolphin, a helpful coding assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Write a short, well-commented Python function that computes factorial iteratively.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Low temperature (0.05-0.1) is recommended for stable reasoning/coding output;
# do_sample=True is required for the temperature setting to take effect.
out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.07)
print(out[0]["generated_text"])
Benchmarks
- Parameters: 24B (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
- Reasoning traces (R1 fine‑tune): ~800k traces, trained for 3 epochs (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
- Context window: ≈32,768 tokens (Mistral Small base) (Source: [huggingface.co](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501?utm_source=openai))
- Tensor dtype: BF16 (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B))
- Recommended sampling temperature: 0.05–0.1 for stable reasoning/coding outputs (Source: [huggingface.co](https://huggingface.co/cognitivecomputations/Dolphin3.0-R1-Mistral-24B)); see the configuration sketch below
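The context-window and sampling figures above can be checked and applied at load time. A minimal sketch, assuming the standard Mistral-style config field max_position_embeddings reflects the ~32k figure:

from transformers import AutoConfig, GenerationConfig

model_id = "dphn/Dolphin3.0-R1-Mistral-24B"

# Read the advertised context window from the model configuration.
config = AutoConfig.from_pretrained(model_id)
print("context window:", config.max_position_embeddings)

# Generation defaults matching the recommended low-temperature sampling.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.07,   # within the recommended 0.05-0.1 band
    max_new_tokens=512,
)
# e.g. model.generate(**inputs, generation_config=gen_config)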
Key Information
- Category: Language Models
- Type: AI Language Models Tool