WizardLM - AI Language Models Tool
Overview
WizardLM is an open-source family of instruction-tuned large language models and training methods developed by the WizardLM team (affiliated with Microsoft AI). WizardLM-2, released April 15, 2024, expands the line with three main variants, WizardLM-2 7B, 70B, and the MoE-style 8x22B, and targets complex chat, multilingual understanding, reasoning, coding, and agent-style tasks.
The project pairs model releases with a research contribution called Auto Evol-Instruct: a fully AI-driven pipeline that automatically designs and optimizes instruction-evolution methods so that instruction datasets become progressively more diverse and more challenging. According to the Auto Evol-Instruct paper and the project release notes, this automated pipeline replaces manual heuristics and scales Evol-Instruct across many domains. It is complemented by Arena Learning, in which multiple models answer the same prompts in a simulated arena and a judge model selects the stronger responses as high-difficulty training data. WizardLM's engineering approach combines staged training (progressive learning), co-teaching/self-teaching among multiple models (the team calls this "AI Align AI"), and reinforcement-style preference tuning (Stage-DPO and RLEIF).
The team reports strong internal results on MT-Bench, AlpacaEval, GSM8K, and code-generation benchmarks, and publishes model weights and code on Hugging Face and GitHub. Core strengths cited by the authors and community include automated instruction evolution, strong multi-domain instruction following, and specialized variants (e.g., WizardCoder) tuned for code generation. For details, see the Auto Evol-Instruct paper (arXiv) and the WizardLM-2 release blog.
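The core Evol-Instruct idea, iteratively rewriting instructions so they become harder and more varied, is easiest to see as code. The sketch below is a minimal illustration only, not the team's pipeline: the `llm` callable and the rewrite prompt are assumptions, and Auto Evol-Instruct additionally optimizes the evolving prompt itself instead of fixing it by hand.

from typing import Callable, List

# Hypothetical rewrite prompt; in Auto Evol-Instruct this evolving
# method is designed and refined automatically by an optimizer LLM.
EVOLVE_PROMPT = (
    "Rewrite the following instruction so it is more complex and more "
    "specific, but still answerable and unambiguous:\n\n{instruction}"
)

def evolve_instructions(
    seeds: List[str],
    llm: Callable[[str], str],  # stand-in for any chat-completion call
    rounds: int = 3,
) -> List[str]:
    """Iteratively rewrite each instruction to raise its difficulty."""
    evolved = list(seeds)
    for _ in range(rounds):
        evolved = [llm(EVOLVE_PROMPT.format(instruction=i)) for i in evolved]
    return evolved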
Key Features
- Auto Evol-Instruct: fully AI-driven pipeline to design and optimize instruction evolution methods.
- Arena Learning: simulated model-vs-model battles, judged by an LLM, that harvest winning responses as harder training data (see the sketch after this list).
- Model family: WizardLM-2 variants (7B, 70B, MoE 8x22B) targeting speed, reasoning, or top-tier capability.
- Code-specialist variants (WizardCoder) fine-tuned with evolved code instructions for strong code generation.
- Open-source releases: weights and code published on Hugging Face and GitHub (Apache 2.0 / community licenses).
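To make the Arena Learning bullet concrete, here is a hypothetical sketch of a simulated-battle data flywheel: several models answer the same instruction, a judge model picks the stronger response pairwise, and the winner's answer is kept as training data. The `models` and `judge` callables are illustrative stand-ins, not the team's actual components.

from typing import Callable, Dict

def simulated_battle(
    instruction: str,
    models: Dict[str, Callable[[str], str]],  # name -> generate fn
    judge: Callable[[str, str, str], str],    # (instr, a, b) -> 'A' or 'B'
) -> Dict[str, str]:
    """Run pairwise battles; keep the winning response as training data."""
    names = list(models)
    best_name = names[0]
    best_answer = models[best_name](instruction)
    for name in names[1:]:
        challenger = models[name](instruction)
        if judge(instruction, best_answer, challenger) == 'B':
            best_name, best_answer = name, challenger
    return {'instruction': instruction, 'response': best_answer, 'winner': best_name}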
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM

def generate_prompt_response(model_id, prompt, max_new_tokens=256):
    # Many WizardLM models release code/weights on Hugging Face; large/MoE models
    # may require specialized runtimes (vLLM, text-generation-inference) or
    # trust_remote_code=True for repo-specific model code.
    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map='auto',       # requires the accelerate package
        torch_dtype='auto',
        trust_remote_code=True,  # required for some community repos
    )
    # Place inputs on the same device the model was loaded onto
    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == '__main__':
    # Example: replace with a small WizardLM model id if available locally
    model_id = 'KnutJaegersberg/WizardLM-2-8x22B'
    # WizardLM-2 models use a Vicuna-style multi-turn prompt format
    prompt = (
        'A chat between a curious user and an artificial intelligence assistant. '
        'The assistant gives helpful, detailed, and polite answers.\n'
        'USER: Explain Auto Evol-Instruct in plain language.\nASSISTANT:'
    )
    print(generate_prompt_response(model_id, prompt))
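For the 8x22B MoE variant mentioned in the comments above, a dedicated serving runtime such as vLLM is usually more practical than plain transformers. The snippet below is a minimal sketch, assuming the same community repo id loads under vLLM and that sufficient GPU memory is available; set tensor_parallel_size to match your hardware.

from vllm import LLM, SamplingParams

# Sketch only: the 8x22B MoE checkpoint needs several large GPUs.
llm = LLM(model='KnutJaegersberg/WizardLM-2-8x22B', tensor_parallel_size=8)
params = SamplingParams(temperature=0.7, max_tokens=256)
prompt = 'USER: Explain Auto Evol-Instruct in plain language.\nASSISTANT:'
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)  # the generated completion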
Benchmarks
- MT-Bench (reported): 8.09 (Mixtral-8x7B fine-tuned on 10K evolved ShareGPT). Source: https://arxiv.org/pdf/2406.00770
- AlpacaEval (reported): 91.4 (Mixtral-8x7B fine-tuned on 10K evolved ShareGPT). Source: https://arxiv.org/pdf/2406.00770
- GSM8K (reported): 82.49 (Mixtral-8x7B fine-tuned on 7K evolved GSM8K). Source: https://arxiv.org/pdf/2406.00770
- HumanEval (WizardCoder): 57.3 pass@1 (WizardCoder-15B-V1.0, reported). Source: https://huggingface.co/WizardLMTeam/WizardCoder-15B-V1.0
- Model size (WizardLM-2 8x22B, MoE): 141B total parameters in a Mixtral-style 8x22B Mixture-of-Experts, with roughly 39B active per token. Source: https://huggingface.co/KnutJaegersberg/WizardLM-2-8x22B
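The HumanEval score above is a pass@1 figure, the standard unbiased estimator from the HumanEval paper (Chen et al., 2021): with n sampled completions per problem of which c pass the tests, pass@k = 1 - C(n-c, k)/C(n, k), averaged over problems. A small self-contained sketch:

from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples with c correct."""
    if n - c < k:
        return 1.0  # every size-k draw contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# A reported 57.3 pass@1 means a single sampled completion solves
# roughly 57% of HumanEval tasks.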
Key Information
- Category: Language Models
- Type: AI Language Models Tool