Watt Tool 8B - AI Language Models Tool
Overview
Watt Tool 8B is an 8-billion-parameter language model fine-tuned from meta-llama/Llama-3.1-8B-Instruct for robust function/tool calling and stateful multi-turn dialogues. It was trained with supervised fine-tuning (SFT) augmented by Direct Multi-Turn Preference Optimization (DMPO) and chain-of-thought (CoT) style data synthesis to improve multi-step decision making and precise tool selection in conversational workflows. The model is distributed on Hugging Face as safetensors (BF16) and includes a chat template and example prompts for function-calling-style outputs. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))

Watt Tool 8B targets integrations where language models must choose, compose, and invoke external tools or APIs across multiple turns, for example AI workflow builders such as Lupan and agent platforms like Coze. The authors highlight performance on the Berkeley Function-Calling Leaderboard (BFCL) as evidence of its function-calling capabilities, and the community has produced multiple quantized/converted builds (GGUF/llama.cpp) for edge or local inference. Hugging Face hosts the model files, tokenizer, and README with usage examples. ([gorilla.cs.berkeley.edu](https://gorilla.cs.berkeley.edu/leaderboard?utm_source=openai))
Model Statistics
- Downloads: 71,231
- Likes: 117
- License: apache-2.0
Model Details
Base architecture and size: Watt Tool 8B is an 8B-parameter causal language model derived from meta-llama/Llama-3.1-8B-Instruct and released on Hugging Face in safetensors format (BF16). The model card lists model weight shards, tokenizer files, and a chat template; inference-ready instructions show usage via the Hugging Face Transformers AutoModelForCausalLM/AutoTokenizer APIs. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B/tree/e966e34a21f2cb17a6c3e8bc2209a2faed9268fb?utm_source=openai))

Training and optimization: The maintainers report a supervised fine-tuning pipeline combined with Direct Multi-Turn Preference Optimization (DMPO) and chain-of-thought style synthetic data to improve multi-turn tool use and preference alignment. The model was explicitly tuned on datasets and synthesis pipelines to teach correct function selection, parameter extraction, and multi-step orchestration. The model card cites the paper "Direct Multi-Turn Preference Optimization for Language Agents" as inspiration for the training recipe. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))

Deployment and variants: Watt Tool 8B is distributed as BF16 safetensors, and many community members have created quantized GGUF/llama.cpp ports (Q4/Q6/Q8, IQ quantizations) for CPU or low-memory GPU inference. Some community deployments (Ollama listings, GGUF repos) provide trimmed/quantized files and instructions for llama.cpp and other local runtimes. The model card notes it is not currently deployed by any inference provider on Hugging Face. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
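As a rough guide to the quantized variants mentioned above, the weight-only footprint of an 8B-parameter model can be estimated from bits per weight. The bits-per-weight figures below are approximations for common GGUF formats, and real files run somewhat larger because of per-block scale factors and metadata:

```python
# Rough weight-only memory estimates for an 8B-parameter model at common
# precisions. Bits-per-weight values are approximate; actual GGUF files
# are somewhat larger due to per-block scales and metadata.
PARAMS = 8e9

def approx_size_gb(bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_weight / 8 / 1e9

for name, bits in [("BF16", 16), ("Q8_0", 8.5), ("Q6_K", 6.6), ("Q4_K_M", 4.8)]:
    print(f"{name}: ~{approx_size_gb(bits):.1f} GB")
```

This is why a Q4-class quantization of an 8B model can fit on consumer hardware where the full BF16 release cannot.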
Key Features
- Fine‑tuned for precise function/tool selection in agent-style workflows.
- Optimized for multi‑turn context maintenance and multi-step task orchestration.
- Training uses SFT + Direct Multi‑Turn Preference Optimization (DMPO).
- Distributed as BF16 safetensors with a provided chat template and tokenizer.
- Community quantizations (GGUF/Q4–Q8, IQ series) for local/edge inference.
- Example function‑calling templates and tool schemas included in README.
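To illustrate the tool schemas mentioned above: function-calling models are typically given a JSON list of available functions inside the system prompt. The schema below is a hypothetical example in the common JSON-Schema style; the exact format Watt Tool 8B expects is documented in the model card's README:

```python
import json

# Hypothetical tool schema (illustrative only); consult the model card
# for the exact schema format the model was trained on.
tools = [
    {
        "name": "get_sales_growth",
        "description": "Fetch the year-over-year sales growth rate for a company.",
        "parameters": {
            "type": "object",
            "properties": {
                "company": {"type": "string", "description": "Company name or ticker."},
                "years": {"type": "integer", "description": "Number of trailing years."},
            },
            "required": ["company", "years"],
        },
    }
]

# Embed the schema in the system prompt so the model can select the
# function and fill its parameters during generation.
system_prompt = (
    "You are an expert in composing functions. Here is a list of functions "
    "in JSON format that you can invoke:\n" + json.dumps(tools, indent=2)
)
print(system_prompt)
```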
Example Usage
Example (python):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "watt-ai/watt-tool-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Example multi-turn, function-calling style chat messages
system_prompt = "You are an expert in composing functions. You will respond only with function call(s) in the required format if applicable."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Find me the sales growth rate for company XYZ for the last 3 years and the interest coverage ratio for the same period."},
]

# Apply the model's chat template and generate deterministically
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
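A common post-processing step (not shown in the README example) is parsing the model's textual function call into a name and arguments before dispatching it to a real API. A minimal sketch using Python's `ast` module, assuming the model emits calls like `func(arg=value)` with literal keyword arguments:

```python
import ast

def parse_call(text: str):
    """Parse a single 'name(kw=value, ...)' style call into (name, kwargs).

    Assumes keyword arguments with literal values; raises ValueError for
    anything that is not a simple function call.
    """
    node = ast.parse(text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"not a simple function call: {text!r}")
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs

name, args = parse_call('get_sales_growth(company="XYZ", years=3)')
print(name, args)  # get_sales_growth {'company': 'XYZ', 'years': 3}
```

Using `ast.literal_eval` rather than `eval` keeps this safe: only literal values are accepted, so model output cannot execute arbitrary code.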
Note: example adapted from the model README; follow the model card for up-to-date usage details. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
Benchmarks
- Berkeley Function-Calling Leaderboard (BFCL): the model card reports state-of-the-art performance on BFCL; no numeric score is given. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
- Model size: 8B parameters (safetensors/BF16 release). ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B/tree/e966e34a21f2cb17a6c3e8bc2209a2faed9268fb?utm_source=openai))
- Hugging Face activity: 4,196 downloads last month (Hugging Face listing). ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
- Community quantizations: multiple GGUF (Q4–Q8) and IQ quantized builds available from community forks. ([huggingface.co](https://huggingface.co/Mungert/watt-tool-8B-GGUF?utm_source=openai))
Key Information
- Category: Language Models
- Type: AI Language Models Tool