watt-tool-70B - AI Language Models Tool
Overview
watt-tool-70B is a 70B-class instruction-following LLM fine-tuned from the LLaMa-3.3-70B-Instruct family specifically for reliable tool/function calling and multi-turn agent workflows. The maintainers describe it as optimized for selecting and invoking external tools in multi-step conversations, trained with supervised fine-tuning and multi-turn preference optimization. The model card states this focus explicitly and notes that watt-tool-70B achieved top performance on the Berkeley Function-Calling Leaderboard (BFCL) at the time of its release. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-70B))

Practically, watt-tool-70B is positioned for workflow builders and agent platforms (examples cited by the authors include Lupan and Coze). The Hugging Face model card lists concrete deployment artifacts (BF16 weights, quantized community variants) and links to the training paper that introduces Direct Multi-Turn Preference Optimization (DMPO), the loss used to improve multi-turn agent behavior. This combination makes the model useful for researchers and engineers building orchestrators that must maintain turn-level context and choose or chain function calls across dialogue turns. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-70B))
Model Statistics
- Downloads: 90
- Likes: 118
- Parameters: 70.6B
- License: apache-2.0
Model Details
Architecture and base: watt-tool-70B is a 70B-parameter causal LLM derived from the LLaMa-3.3-70B-Instruct lineage and distributed on Hugging Face (the model page lists the size as 71B parameters). The weights are provided in BF16, and multiple community quantized builds (GGUF / GPTQ / AWQ variants) are available for lower-memory inference. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-70B))

Training and objectives: the fine-tuning pipeline reported by the maintainers combines supervised fine-tuning (SFT) on curated tool-call / multi-turn datasets with Direct Multi-Turn Preference Optimization (DMPO), a multi-turn extension of direct preference optimization intended to reduce compounding errors across turns. The DMPO paper and code are public (arXiv:2406.14868). These choices prioritize correct function selection, parameter extraction, and turn-aware decision-making over generic single-turn instruction tuning. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-70B))

Capabilities and caveats: watt-tool-70B is optimized for robust function-call syntax generation, multi-step tool orchestration across turns, and automatic tool selection when multiple candidate functions are available. It is not listed as deployed to any major inference provider on the official model page; users commonly run local or community-quantized variants. Real-world performance on BFCL may change as the leaderboard updates and new models are submitted. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-70B))
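To see why the quantized builds matter, a back-of-envelope estimate of the weight footprint at each precision is useful. The sketch below uses the 70.6B parameter figure from the model page and standard bytes-per-parameter sizes; it ignores activations, KV cache, and quantization scale overhead, so treat the numbers as rough lower bounds.

```python
# Back-of-envelope weight-memory estimate for watt-tool-70B at several
# precisions. Overhead (activations, KV cache, quant scales) is ignored.

PARAMS = 70.6e9  # parameter count shown on the Hugging Face page

BYTES_PER_PARAM = {
    "bf16": 2.0,   # official weights
    "int8": 1.0,   # 8-bit quantized community builds
    "q4":   0.5,   # 4-bit GGUF/GPTQ/AWQ-style builds
}

def weight_gib(precision: str) -> float:
    """Approximate weight footprint in GiB for the given precision."""
    return PARAMS * BYTES_PER_PARAM[precision] / 2**30

for p in BYTES_PER_PARAM:
    print(f"{p}: ~{weight_gib(p):.0f} GiB")
```

BF16 comes out to roughly 130 GiB of weights alone, which is why single-GPU use generally requires a 4-bit community build.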
Key Features
- Fine-tuned for multi-turn function/tool selection and orchestration.
- Trained with DMPO to reduce compounding errors across dialogue turns.
- Generates parsable function-call outputs suitable for automated tool invocation.
- Distributed in BF16 with community quantized GGUF/GPTQ/AWQ variants for lower-memory inference.
- Designed for integration into workflow builders (examples: Lupan, Coze) and agent pipelines.
- Model card and training paper are public, enabling reproducibility and inspection.
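The "parsable function-call outputs" feature implies an orchestrator that extracts a function name and arguments from the model's text. A minimal sketch of such a parser, assuming a bracketed Python-call-style output like `[get_weather(city="New York", days=3)]` (an illustrative format; check the model card for the exact output format watt-tool-70B emits):

```python
import ast

def parse_tool_calls(text: str):
    """Parse a bracketed call list such as '[get_weather(city="NYC", days=3)]'
    into a list of (function_name, kwargs) pairs.

    The format is an illustrative assumption, not the model card's exact spec.
    """
    tree = ast.parse(text.strip(), mode="eval")
    calls = tree.body.elts if isinstance(tree.body, ast.List) else [tree.body]
    parsed = []
    for call in calls:
        if not isinstance(call, ast.Call):
            raise ValueError("expected a function call expression")
        name = call.func.id
        # literal_eval only accepts constants, so malformed or unsafe
        # argument values raise instead of executing anything.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        parsed.append((name, kwargs))
    return parsed
```

For example, `parse_tool_calls('[get_weather(city="New York", days=3)]')` returns `[("get_weather", {"city": "New York", "days": 3})]`. Using `ast` rather than `eval` means the model's output is never executed, only inspected.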
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "watt-ai/watt-tool-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
# Example: single-turn function-calling prompt (adapt for your tool spec)
system = "You are an assistant that must choose and call the correct function when appropriate. Return only the function call if you invoke a tool."
user = "Find the weather in New York City for the next 3 days."
messages = [{"role":"system","content":system}, {"role":"user","content":user}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
# See the model card for the tool-list format and recommended function-call output format; the Hugging Face model page contains a fuller how-to.
Benchmarks
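Multi-turn orchestration, the scenario this model is tuned for, typically wraps generation in a loop: generate, detect a tool call, execute the tool, append the result, and generate again until the model answers in plain text. A minimal sketch of that loop follows; the `generate` stub, the `get_weather` tool, the naive argument parsing, and the message format are all illustrative assumptions standing in for a real watt-tool-70B call.

```python
# Minimal agent loop: the model proposes tool calls, the orchestrator
# executes them and feeds results back until a plain-text answer appears.

def get_weather(city: str, days: int) -> str:   # hypothetical tool
    return f"{days}-day forecast for {city}: mild"

TOOLS = {"get_weather": get_weather}

def generate(messages):
    """Stub for the model; a real version would call model.generate()."""
    if messages[-1]["role"] == "user":
        return '[get_weather(city="New York", days=3)]'
    return "Here is the forecast: " + messages[-1]["content"]

def run_agent(user_msg: str, max_turns: int = 4) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        reply = generate(messages)
        if reply.startswith("["):               # model chose a tool
            name, args = reply.strip("[]").split("(", 1)
            # Naive keyword-argument parsing, sufficient for this sketch;
            # use a real parser in production.
            kwargs = dict(kv.split("=") for kv in args.rstrip(")").split(", "))
            kwargs = {k: v.strip('"') for k, v in kwargs.items()}
            kwargs = {k: int(v) if v.isdigit() else v for k, v in kwargs.items()}
            result = TOOLS[name](**kwargs)
            messages.append({"role": "tool", "content": result})
        else:
            return reply
    return "max turns reached"
```

The `max_turns` bound is the usual guard against a model that keeps emitting tool calls without converging on an answer.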
- Parameter count: 71B (model card). (Source: https://huggingface.co/watt-ai/watt-tool-70B)
- License: Apache-2.0 (model card). (Source: https://huggingface.co/watt-ai/watt-tool-70B)
- Hugging Face likes: 118 (page). (Source: https://huggingface.co/watt-ai/watt-tool-70B)
- Downloads (last month): listed on the model page (example snapshot: 93). (Source: https://huggingface.co/watt-ai/watt-tool-70B)
- Function-calling benchmark: reported state-of-the-art on the Berkeley Function-Calling Leaderboard (model card claim; the leaderboard is live and updates over time). (Sources: https://huggingface.co/watt-ai/watt-tool-70B, https://gorilla.cs.berkeley.edu/leaderboard)
Key Information
- Category: Language Models
- Type: AI Language Models Tool