Shuttle-3 - AI Language Models Tool
Quick Take
Shuttle-3 is a niche 72B open-weights model for developers who specifically want Claude-3-style prose quality in a self-hostable form factor. It is best approached as a specialized tool for role-play and conversational prose, not a general-purpose LLM contender.
Shuttle-3 is an open-weight language model with clear differentiation: a 72B Qwen-2.5 fine-tune specifically targeting Claude-3-style prose quality for role-play and conversational tasks. It is worth weighing against alternatives such as base Qwen-2.5, Llama-3.1, or Mistral for prose-focused applications. Its Hugging Face presence, with 39 likes against only 44 downloads, suggests engaged interest from the community. Community quantizations (GGUF, GPTQ, Q8) and API access options provide practical deployment pathways. While not a mainstream model, it fills a specific niche for users seeking Claude-like prose quality in an open-weights model.
- Best for: ML practitioners and developers seeking an open-weights alternative to Claude-3 for high-quality prose, role-play scenarios, and multilingual chat. Particularly valuable for self-hosting use cases, researchers experimenting with prose-style fine-tuning, and hobbyists with consumer hardware wanting 70B-scale capabilities via quantized builds.
- Skip if: General consumers looking for hosted AI chat services, teams requiring enterprise SLAs, or users unwilling to manage large model deployments. Those seeking benchmark-leading performance should look elsewhere as this model prioritizes prose style over raw benchmark scores.
Why Choose It
- A Qwen-2.5-72B fine-tune optimized specifically for Claude-like prose style rather than general-purpose use
- A focused ≈130M-token prose fine-tune that may justify choosing it over base Qwen-2.5-72B or other 70B-class models for writing-heavy work
- Practical deployment on consumer hardware via community GGUF/GPTQ quantizations
- A deliberately niche offering for role-play and prose scenarios versus mainstream alternatives
- Dual access paths: self-hosted Hugging Face download or hosted inference via the ShuttleAI API
Consider Instead
- Qwen-2.5-72B-Instruct
- Llama-3.1-70B-Instruct
- Mistral-Large-2
- Claude-3-Sonnet (for prose style comparison)
- DeepSeek-V3
Overview
Shuttle-3 is a 72–73B-parameter, instruction-tuned language model from ShuttleAI, fine-tuned from Qwen-2.5-72B-Instruct to prioritize high-quality prose, multi-turn chat, role-play scenarios, and multilingual reasoning. The public Hugging Face model card and accompanying maintainer notes describe Shuttle-3 as explicitly optimized to emulate Claude‑3–style writing while retaining Qwen's multilingual and code-oriented pretraining characteristics. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Shuttle-3 is presented both as a downloadable model on Hugging Face (BF16 safetensors, ChatML prompt format) and as an engine accessible via ShuttleAI's inference offerings. The model card lists a concise fine-tuning run (≈130M tokens, ~12 hours on 4×A100 PCIe) and indicates the model is intended for complex chat, agent tasks, and multilingual instructions. Community resources and quantized ports (GGUF, Q8) exist for lower-resource local inference and downstream experimentation. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
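To gauge hardware requirements, a back-of-envelope memory estimate can be derived from the 72.7B parameter count. This is a rough sketch: real checkpoints add overhead for the KV cache, activations, and quantization metadata, so treat these figures as lower bounds.

```python
# Rough weight-storage footprint for a 72.7B-parameter model.
# Actual files and runtime memory will be somewhat larger.
PARAMS = 72.7e9

def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return params * bytes_per_param / 1e9

bf16 = weight_footprint_gb(PARAMS, 2.0)  # official BF16 safetensors
q8 = weight_footprint_gb(PARAMS, 1.0)    # ~8-bit quantization
q4 = weight_footprint_gb(PARAMS, 0.5)    # ~4-bit quantization

print(f"BF16: ~{bf16:.0f} GB, Q8: ~{q8:.0f} GB, Q4: ~{q4:.0f} GB")
# → BF16: ~145 GB, Q8: ~73 GB, Q4: ~36 GB
```

This is why the quantized community builds matter: the full BF16 checkpoint exceeds any single consumer GPU, while 4-bit builds come within reach of high-memory workstations.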
Model Statistics
- Downloads: 44
- Likes: 39
- Pipeline: text-generation
- Parameters: 72.7B
- License: other
Model Details
Architecture and base: Shuttle-3 is a causal LLM (≈72.7B parameters) fine-tuned from Qwen‑2.5‑72B‑Instruct. The published model artifacts on Hugging Face use BF16 tensors and the ChatML role-based prompt format for multi-turn dialogue. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Training and fine-tuning: The maintainers report a focused post-training pass of roughly 130 million tokens over a short fine-tuning window (a reported 12 hours on four A100 PCIe GPUs). The fine-tuning dataset emphasized role-play and high-quality conversational prose to emulate a Claude‑3–like response style while preserving Qwen's multilingual and code capabilities. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Deployment and compatibility: Official artifacts appear as safetensors on Hugging Face; community quantizations (GGUF, GPTQ, AWQ, Q8) exist for running on consumer hardware via llama.cpp/GGUF workflows. Shuttle-3 is also exposed via ShuttleAI's API products, which publish separate platform pricing and rate-limit tiers. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
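The ChatML layout mentioned above can be sketched as a small formatter. This is a minimal illustration of the convention; in practice the model's own chat template (via `tokenizer.apply_chat_template`) is authoritative for exact token spellings.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role/content messages in the common ChatML convention
    that Shuttle-3 inherits from Qwen-2.5. Illustrative only; defer to
    the official tokenizer's chat template for exact formatting."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn to cue the model to respond.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

The structured role tokens are what let the model track multi-turn dialogue state, which matters for the role-play scenarios Shuttle-3 targets.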
Key Features
- 72–73B parameter LLM fine-tuned from Qwen‑2.5‑72B‑Instruct.
- Fine-tuned on ≈130M tokens to improve role-play and conversational prose.
- ChatML-format prompting for structured multi‑turn dialogues.
- Official BF16 safetensors artifacts for memory-efficient inference.
- Community quantizations (GGUF/GPTQ/Q8) for lower‑resource local deployment.
- Optimized for multilingual chat, agent flows, and reasoning tasks.
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Example loads the Hugging Face Shuttle-3 artifact. Large models require appropriate hardware.
# See the Shuttle-3 model card for BF16/safetensors details and ChatML prompting. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3?utm_source=openai))
model_name = "shuttleai/shuttle-3"
# Tokenizer and model (this will attempt to download ~70+GB unless using quantized/local builds)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="bfloat16",
device_map="auto"
)
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)
# ChatML-style example prompt (ChatML role tokens shown in model doc)
chatml = (
"<|im_start|>system\nYou are a helpful assistant.\n<|im_end|>"
"<|im_start|>user\nWrite a friendly, 3-sentence summary of why unit tests help teams.\n<|im_end|>"
)
resp = gen(chatml, max_new_tokens=200, do_sample=False)
print(resp[0]["generated_text"]) Pricing
Shuttle-3 is published on Hugging Face as a downloadable, Qwen-licensed artifact with no model-specific price on HF. ShuttleAI also operates an API platform offering access to hosted models; the company publishes tiered plans (Free, Basic, Premium, Scale) with sample pricing (e.g., free, $10/month, $25/month, and $75/month tiers shown on the ShuttleAI site) and associated rate limits and context windows. If you need commercially hosted inference for Shuttle-3 specifically, consult ShuttleAI's pricing and docs for plan details and any enterprise agreements. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
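For hosted access, a request might look like the following sketch, assuming ShuttleAI exposes an OpenAI-style chat-completions endpoint. The payload schema shown here is an assumption, not confirmed against ShuttleAI's documentation; check their API docs for the actual endpoint URL, model identifier, and field names.

```python
import json

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat-completion payload.
    Schema is assumed; verify against ShuttleAI's API documentation."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("shuttle-3", "Summarize why unit tests help teams.")
print(json.dumps(payload, indent=2))

# To send: POST this JSON with an "Authorization: Bearer <key>" header to the
# chat-completions URL given in ShuttleAI's documentation (not shown here).
```

The hosted route avoids the large local download entirely, at the cost of plan rate limits.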
Benchmarks
- Parameter count: 72.7B (reported) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Fine-tune training tokens: 130 million (Source: https://huggingface.co/shuttleai/shuttle-3)
- Fine-tune compute: 12 hours on 4 × A100 PCIe (reported) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Tensor dtype / recommended precision: BF16 safetensors (official artifact) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Downloads last month: 56 (as shown on the HF model page) (Source: https://huggingface.co/shuttleai/shuttle-3)
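The reported fine-tune figures imply a modest training throughput; a quick sanity check on the 130M-token, 12-hour, 4-GPU numbers:

```python
TOKENS = 130e6   # reported fine-tune token count
HOURS = 12       # reported wall-clock fine-tuning time
GPUS = 4         # reported A100 PCIe count

total_tok_per_s = TOKENS / (HOURS * 3600)
per_gpu = total_tok_per_s / GPUS
print(f"~{total_tok_per_s:.0f} tokens/s total, ~{per_gpu:.0f} tokens/s per GPU")
# → ~3009 tokens/s total, ~752 tokens/s per GPU
```

These are plausible magnitudes for a 72B model with parallelism across four A100s, consistent with the card's description of a short, focused post-training pass rather than a full pretraining run.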
Key Information
- Category: Language Models
- Type: AI Language Models Tool