Shuttle-3 - AI Language Models Tool
Quick Take
Shuttle-3 is a niche 72B open-weights model for developers who specifically want Claude-3-style prose quality in a self-hostable form factor. It is best approached as a specialized tool for role-play and conversational prose, not a general-purpose LLM contender.
Shuttle-3 is an open-weight language model with clear differentiation: a 72B Qwen-2.5 fine-tune specifically targeting Claude-3-style prose quality for role-play and conversational tasks. It is worth weighing against alternatives such as base Qwen-2.5, Llama-3.1, or Mistral for prose-focused applications. Its Hugging Face presence, with 39 likes against only 44 downloads, suggests engaged interest from the community. Community quantizations (GGUF, GPTQ, Q8) and API access options provide practical deployment pathways. While not a mainstream model, it fills a specific niche for users seeking Claude-like prose quality in an open-weights model.
- Best for: ML practitioners and developers seeking an open-weights alternative to Claude-3 for high-quality prose, role-play scenarios, and multilingual chat. Particularly valuable for self-hosting use cases, researchers experimenting with prose-style fine-tuning, and hobbyists with consumer hardware wanting 70B-scale capabilities via quantized builds.
- Skip if: General consumers looking for hosted AI chat services, teams requiring enterprise SLAs, or users unwilling to manage large model deployments. Those seeking benchmark-leading performance should look elsewhere as this model prioritizes prose style over raw benchmark scores.
Why Choose It
- A Qwen-2.5-72B fine-tune optimized specifically for Claude-like prose style rather than general-purpose use
- A focused ≈130M-token prose fine-tune that may justify choosing it over base Qwen-2.5-72B or other 70B-class models for writing-heavy work
- Practical deployment on consumer hardware via community GGUF/GPTQ quantizations
- A deliberately niche offering for role-play and prose scenarios versus mainstream alternatives
- Dual access paths: self-hosted Hugging Face download or hosted inference via the ShuttleAI API
Consider Instead
- Qwen-2.5-72B-Instruct
- Llama-3.1-70B-Instruct
- Mistral-Large-2
- Claude-3-Sonnet (for prose style comparison)
- DeepSeek-V3
Overview
Shuttle-3 is a 72–73B-parameter, instruction-tuned language model from ShuttleAI, fine-tuned from Qwen-2.5-72B-Instruct to prioritize high-quality prose, multi-turn chat, role-play scenarios, and multilingual reasoning. The public Hugging Face model card and accompanying maintainer notes describe Shuttle-3 as explicitly optimized to emulate Claude‑3–style writing while retaining Qwen's multilingual and code-oriented pretraining characteristics. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Shuttle-3 is presented both as a downloadable model on Hugging Face (BF16 safetensors, ChatML prompt format) and as an engine accessible via ShuttleAI's inference offerings. The model card lists a concise fine-tuning run (≈130M tokens, ~12 hours on 4×A100 PCIe) and indicates the model is intended for complex chat, agent tasks, and multilingual instructions. Community resources and quantized ports (GGUF, Q8) exist for lower-resource local inference and downstream experimentation. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
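To gauge hardware requirements, a back-of-envelope memory estimate can be derived from the 72.7B parameter count. This is a rough sketch: real checkpoints add overhead for the KV cache, activations, and quantization metadata, so treat these figures as lower bounds.

```python
# Rough weight-storage footprint for a 72.7B-parameter model.
# Actual files and runtime memory will be somewhat larger.
PARAMS = 72.7e9

def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return params * bytes_per_param / 1e9

bf16 = weight_footprint_gb(PARAMS, 2.0)  # official BF16 safetensors
q8 = weight_footprint_gb(PARAMS, 1.0)    # ~8-bit quantization
q4 = weight_footprint_gb(PARAMS, 0.5)    # ~4-bit quantization

print(f"BF16: ~{bf16:.0f} GB, Q8: ~{q8:.0f} GB, Q4: ~{q4:.0f} GB")
# → BF16: ~145 GB, Q8: ~73 GB, Q4: ~36 GB
```

This is why the quantized community builds matter: the full BF16 checkpoint exceeds any single consumer GPU, while 4-bit builds come within reach of high-memory workstations.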
Model Statistics
- Downloads: 44
- Likes: 39
- Pipeline: text-generation
- Parameters: 72.7B
- License: other
Model Details
Architecture and base: Shuttle-3 is a causal LLM (≈72.7B parameters) fine-tuned from Qwen‑2.5‑72B‑Instruct. The published model artifacts on Hugging Face use BF16 tensors and the ChatML role-based prompt format for multi-turn dialogue. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Training and fine-tuning: The maintainers report a focused post-training pass of roughly 130 million tokens over a short fine-tuning window (a reported 12 hours on four A100 PCIe GPUs). The fine-tuning dataset emphasized role-play and high-quality conversational prose to emulate a Claude‑3–like response style while preserving Qwen's multilingual and code capabilities. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))

Deployment and compatibility: Official artifacts appear as safetensors on Hugging Face; community quantizations (GGUF, GPTQ, AWQ, Q8) exist for running on consumer hardware via llama.cpp/GGUF workflows. Shuttle-3 is also exposed via ShuttleAI's API products, which publish separate platform pricing and rate-limit tiers. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
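The ChatML layout mentioned above can be sketched as a small formatter. This is a minimal illustration of the convention; in practice the model's own chat template (via `tokenizer.apply_chat_template`) is authoritative for exact token spellings.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role/content messages in the common ChatML convention
    that Shuttle-3 inherits from Qwen-2.5. Illustrative only; defer to
    the official tokenizer's chat template for exact formatting."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn to cue the model to respond.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

The structured role tokens are what let the model track multi-turn dialogue state, which matters for the role-play scenarios Shuttle-3 targets.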
Key Features
- 72–73B parameter LLM fine-tuned from Qwen‑2.5‑72B‑Instruct.
- Fine-tuned on ≈130M tokens to improve role-play and conversational prose.
- ChatML-format prompting for structured multi‑turn dialogues.
- Official BF16 safetensors artifacts for memory-efficient inference.
- Community quantizations (GGUF/GPTQ/Q8) for lower‑resource local deployment.
- Optimized for multilingual chat, agent flows, and reasoning tasks.
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Example loads the Hugging Face Shuttle-3 artifact. Large models require appropriate hardware.
# See the Shuttle-3 model card for BF16/safetensors details and ChatML prompting. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3?utm_source=openai))
model_name = "shuttleai/shuttle-3"
# Tokenizer and model (this will attempt to download ~70+GB unless using quantized/local builds)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="bfloat16",
device_map="auto"
)
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)
# ChatML-style example prompt (ChatML role tokens shown in model doc)
chatml = (
"<|im_start|>system\nYou are a helpful assistant.\n<|im_end|>"
"<|im_start|>user\nWrite a friendly, 3-sentence summary of why unit tests help teams.\n<|im_end|>"
)
resp = gen(chatml, max_new_tokens=200, do_sample=False)
print(resp[0]["generated_text"]) Pricing
Shuttle-3 is published on Hugging Face as a downloadable, Qwen-licensed artifact with no model-specific price on HF. ShuttleAI also operates an API platform offering access to hosted models; the company publishes tiered plans (Free, Basic, Premium, Scale) with sample pricing (e.g., free, $10/month, $25/month, and $75/month tiers shown on the ShuttleAI site) and associated rate limits and context windows. If you need commercially hosted inference for Shuttle-3 specifically, consult ShuttleAI's pricing and docs for plan details and any enterprise agreements. ([huggingface.co](https://huggingface.co/shuttleai/shuttle-3))
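For hosted access, a request might look like the following sketch, assuming ShuttleAI exposes an OpenAI-style chat-completions endpoint. The payload schema shown here is an assumption, not confirmed against ShuttleAI's documentation; check their API docs for the actual endpoint URL, model identifier, and field names.

```python
import json

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat-completion payload.
    Schema is assumed; verify against ShuttleAI's API documentation."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("shuttle-3", "Summarize why unit tests help teams.")
print(json.dumps(payload, indent=2))

# To send: POST this JSON with an "Authorization: Bearer <key>" header to the
# chat-completions URL given in ShuttleAI's documentation (not shown here).
```

The hosted route avoids the large local download entirely, at the cost of plan rate limits.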
Benchmarks
- Parameter count: 72.7B (reported) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Fine-tune training tokens: 130 million (Source: https://huggingface.co/shuttleai/shuttle-3)
- Fine-tune compute: 12 hours on 4 × A100 PCIe (reported) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Tensor dtype / recommended precision: BF16 safetensors (official artifact) (Source: https://huggingface.co/shuttleai/shuttle-3)
- Downloads last month: 56 (as shown on the HF model page) (Source: https://huggingface.co/shuttleai/shuttle-3)
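The reported fine-tune figures imply a modest training throughput; a quick sanity check on the 130M-token, 12-hour, 4-GPU numbers:

```python
TOKENS = 130e6   # reported fine-tune token count
HOURS = 12       # reported wall-clock fine-tuning time
GPUS = 4         # reported A100 PCIe count

total_tok_per_s = TOKENS / (HOURS * 3600)
per_gpu = total_tok_per_s / GPUS
print(f"~{total_tok_per_s:.0f} tokens/s total, ~{per_gpu:.0f} tokens/s per GPU")
# → ~3009 tokens/s total, ~752 tokens/s per GPU
```

These are plausible magnitudes for a 72B model with parallelism across four A100s, consistent with the card's description of a short, focused post-training pass rather than a full pretraining run.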
Key Information
- Category: Language Models
- Type: AI Language Models Tool