Watt Tool 8B - AI Language Models Tool
Overview
Watt Tool 8B is an 8-billion-parameter language model fine-tuned from meta-llama/Llama-3.1-8B-Instruct to specialize in precise tool selection and multi-turn dialogue for workflow automation. It is intended for agent-style applications that must choose, format, and invoke external tools (APIs/functions) across conversational turns, and the Hugging Face model card explicitly highlights its optimization for tool usage and function-calling tasks. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))

The authors report using supervised fine-tuning, Direct Multi-Turn Preference Optimization (DMPO), and Chain-of-Thought (CoT) data synthesis to improve multi-turn decision-making, and they cite the Berkeley Function-Calling Leaderboard (BFCL) as a target evaluation on which watt-tool-8B achieves top function-calling accuracy. This combination makes the model suitable for embedding into workflow builders such as Lupan and similar agent platforms, and the community has produced quantized GGUF conversions for local inference. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B/commit/6cef7891d87f392a343086108fd4ce4c28a08c52?utm_source=openai))
Model Statistics
- Downloads: 649
- Likes: 116
- Parameters: 8.0B
- License: apache-2.0
Model Details
Base architecture and license: watt-tool-8B is a fine-tuned derivative of meta-llama/Llama-3.1-8B-Instruct (8.0B parameters), and the model card / commit history lists the license as Apache-2.0. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))

Training & optimization: the maintainers describe a training pipeline composed of supervised fine-tuning (SFT), Chain-of-Thought (CoT) data synthesis for multi-step dialogues, and Direct Multi-Turn Preference Optimization (DMPO) to encourage robust multi-turn tool-selection behavior. These techniques aim to improve function-calling precision, relevance detection (recognizing when no tool is appropriate), and context retention across multiple turns. ([promptlayer.com](https://www.promptlayer.com/models/watt-tool-8b?utm_source=openai))

Precision & deployment options: the primary upload uses BF16 tensors for the full-precision checkpoint, and the community has released numerous quantized variants (GGUF at Q4/Q5/Q6, etc.) and llama.cpp-compatible packages for low-memory local inference. Perplexity and quantized-size measurements for several quantizations are available in community conversion repos. The Hugging Face page shows the model as not currently deployed by a hosted inference provider. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
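For local inference with one of the community GGUF conversions, the llama-cpp-python bindings are a common option. The sketch below is illustrative only: the GGUF file name and generation settings are assumptions, not official artifacts from the watt-ai repo.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="watt-tool-8B.Q4_K_M.gguf",  # hypothetical quantized file name
    n_ctx=4096,        # context window; multi-turn tool dialogues need headroom
    n_gpu_layers=-1,   # offload all layers to GPU if available; use 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a function-calling assistant."},
        {"role": "user", "content": "What is the weather in Paris today?"},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```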
Key Features
- Fine-tuned from Llama-3.1-8B-Instruct for tool and function-calling accuracy.
- Optimized for multi-turn dialogues with context retention across several agent steps.
- Trained with SFT and DMPO, using CoT data synthesis for complex decision chains.
- Available community quantizations (GGUF/Q4–Q8) for low-memory local inference.
- Designed to produce structured function/tool calls suited for workflow automation (a minimal prompt sketch follows this list).
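To illustrate the structured-call pattern, here is a hedged sketch of how a tool schema might be supplied in the system prompt and what a bracketed function call could look like in the completion. The schema fields, prompt wording, and expected output format are illustrative assumptions; consult the model card for the exact recommended template.

```python
import json

# Hypothetical tool schema in a common JSON-schema style; field names are
# illustrative, not the model card's exact format.
tools = [{
    "name": "get_stock_price",
    "description": "Get the latest stock price for a ticker symbol.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {"type": "string", "description": "Ticker symbol, e.g. 'XYZ'"},
        },
        "required": ["ticker"],
    },
}]

system_prompt = (
    "You are an expert in composing function calls. You are given a question "
    "and a set of possible functions. If you decide to invoke one or more of "
    "them, return the calls in the form [func_name(param=value, ...)]; if no "
    "function applies, say so instead of guessing.\n"
    f"Available functions:\n{json.dumps(tools, indent=2)}"
)

# An expected completion for "What is XYZ trading at?" might look like:
#   [get_stock_price(ticker='XYZ')]
```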
Example Usage
Example (Python):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "watt-ai/watt-tool-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Example chat-style usage (apply your own function/tool schema when integrating)
messages = [
    {"role": "system", "content": "You are a function-invoking assistant. Return only structured tool calls when needed."},
    {"role": "user", "content": "Find sales growth for company XYZ over the last 3 years and its interest coverage ratios."},
]

# apply_chat_template with return_tensors="pt" returns a tensor of token IDs,
# not a dict, so pass it to generate() directly.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
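Because the model emits tool calls as text, an integration typically post-processes the decoded output. The following is an illustrative sketch; the bracketed-call format, regex, and example output are assumptions, not the model's documented output contract.

```python
import re

# Illustrative post-processing: pull (name, raw_args) pairs out of a decoded
# completion such as "[get_stock_price(ticker='XYZ')]".
def extract_tool_calls(text):
    return re.findall(r"(\w+)\(([^)]*)\)", text)

decoded = "[get_stock_price(ticker='XYZ')]"  # example model output
for name, raw_args in extract_tool_calls(decoded):
    # Dispatch to your own handlers here; parse raw_args safely (e.g. with
    # ast.literal_eval on individual values) rather than eval-ing model output.
    print(f"call: {name}  args: {raw_args}")
```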
See the watt-tool-8B model card for the recommended chat template and tool-invocation examples. ([huggingface.co](https://huggingface.co/watt-ai/watt-tool-8B?utm_source=openai))
Benchmarks
- Parameters: 8.0B (Source: https://huggingface.co/watt-ai/watt-tool-8B)
- Hugging Face downloads (last month): 680, as reported on the model page (Source: https://huggingface.co/watt-ai/watt-tool-8B)
- Tensor dtype: BF16 (primary checkpoint) (Source: https://huggingface.co/watt-ai/watt-tool-8B)
- Reported BFCL performance: described in the model card as state-of-the-art on the Berkeley Function-Calling Leaderboard (Source: https://gorilla.cs.berkeley.edu/leaderboard)
- Perplexity (quantized IQ3_M variant): μPPL ≈ 7.84, reported in a community GGUF conversion (Source: https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/blob/main/README.md)
Key Information
- Category: Language Models
- Type: AI Language Models Tool