OpenAI GPT-4o - AI Language Models Tool
Overview
OpenAI GPT-4o ("o" for "omni") is a flagship multimodal model family designed for high-throughput, low-latency applications that combine text, vision, and (in select realtime deployments) audio. GPT-4o provides a 128k-token context window, structured outputs, function calling, fine-tuning support, and Realtime endpoints for low-latency voice/text experiences. ([platform.openai.com](https://platform.openai.com/docs/models/gpt-4o))

The family includes the full GPT-4o variant and the cost-optimized GPT-4o-mini. OpenAI positions GPT-4o as substantially faster and cheaper than previous flagship models (OpenAI reports roughly 2x the speed at about half the cost of GPT-4 Turbo in API comparisons), while Azure's GPT-4o-mini releases emphasize even lower per-token costs and vision support for production workloads.

Availability and supported modalities differ by endpoint and provider: text+image capabilities are broadly available in the API, while realtime audio/voice capabilities are exposed via dedicated realtime previews (including Azure's realtime preview). Note that OpenAI retired GPT-4o in ChatGPT on February 13, 2026; the model family remains available through API endpoints and via Azure OpenAI deployments. ([openai.com](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/))
Key Features
- 128k-token context window for long-document and multi-step tasks.
- Multimodal: text and image inputs broadly supported; realtime audio/voice in preview endpoints.
- Realtime API with low-latency streaming for conversational voice agents.
- Function calling, structured outputs, and fine-tuning for production integrations.
- Variants for cost/latency tradeoffs (gpt-4o vs gpt-4o-mini) to optimize price/performance.
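Function calling works by declaring JSON-schema tool definitions that the model can choose to invoke with structured arguments. A minimal sketch of building one such definition is below; the `get_weather` tool, its description, and its parameters are illustrative examples, not from OpenAI's documentation, and the exact envelope differs between the Responses and Chat Completions endpoints (the flat, Responses-style shape is shown here), so verify against current docs:

```python
import json

# Hypothetical tool definition in the JSON-schema shape used for function calling.
# The tool name, description, and parameters below are illustrative examples.
get_weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# The tools list would be passed alongside a request, e.g.:
# client.responses.create(model="gpt-4o", input=..., tools=[get_weather_tool])
print(json.dumps(get_weather_tool, indent=2))
```

When the model decides to call the tool, the response contains the tool name and a JSON arguments payload; your code executes the function and sends the result back in a follow-up request.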
Example Usage
Example (python):

```python
# Requires the openai-python library (https://github.com/openai/openai-python)
# Set the OPENAI_API_KEY environment variable before running
from openai import OpenAI

client = OpenAI()
response = client.responses.create(
    model="gpt-4o",
    input="You are a helpful assistant. Summarize the benefits of multimodal AI in two sentences.",
)
print(response.output_text)
```
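For multimodal requests, image inputs are supplied as structured content parts rather than a raw string. A minimal sketch of composing a text+image input follows; the image URL is a placeholder, and the `input_text`/`input_image` part types reflect the Responses API's documented shape but should be checked against current OpenAI docs:

```python
# Sketch: building a multimodal (text + image) input for the Responses API.
# The image URL is a placeholder; part-type names should be verified against
# current OpenAI documentation.
multimodal_input = [
    {
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Describe this chart in one sentence."},
            {"type": "input_image", "image_url": "https://example.com/chart.png"},
        ],
    }
]

# This list would be passed as the `input` argument:
# response = client.responses.create(model="gpt-4o", input=multimodal_input)
part_types = [part["type"] for part in multimodal_input[0]["content"]]
print(part_types)
```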
Pricing
Pricing and modality availability vary by provider, deployment type (standard API vs Realtime), region, and whether you use OpenAI's direct API or Azure OpenAI Service; consult provider pages for current billing. Representative published rates:

- OpenAI API, gpt-4o: $2.50 per 1M input tokens / $10.00 per 1M output tokens (OpenAI model docs).
- OpenAI API, gpt-4o-mini: positioned as a lower-cost variant, with significantly lower listed rates; Azure and Azure OpenAI publish regional/global GPT-4o-mini pricing.
- Azure example (TechCommunity/Azure), per 1,000 tokens: GPT-4o $0.005 input / $0.015 output; GPT-4o-mini (global) $0.00015 input / $0.0006 output.

([platform.openai.com](https://platform.openai.com/docs/models/gpt-4o))
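Per-token rates translate directly into per-request costs. A small worked example using OpenAI's listed gpt-4o rates ($2.50/1M input, $10.00/1M output); the token counts are hypothetical:

```python
# Cost estimate from per-1M-token rates (gpt-4o: $2.50 input, $10.00 output).
INPUT_RATE_PER_M = 2.50
OUTPUT_RATE_PER_M = 10.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the gpt-4o API rates above."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M + (
        output_tokens / 1_000_000
    ) * OUTPUT_RATE_PER_M

# Example: a 10k-token prompt with a 2k-token completion.
cost = request_cost(10_000, 2_000)
print(f"${cost:.4f}")  # → $0.0450
```

At gpt-4o-mini rates the same request would cost a small fraction of this, which is the price/performance tradeoff the variant targets.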
Benchmarks
MMLU (GPT-4o-mini, reported): 82.0% (Source: https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/)
Audio response latency (OpenAI system card): as low as 232 ms (average ~320 ms) for audio inputs (Source: https://openai.com/index/gpt-4o-system-card/)
Context window (max): 128,000 tokens (Source: https://developers.openai.com/api/docs/models/gpt-4o)
API token pricing (OpenAI, per 1M tokens): Input $2.50; Output $10.00 (gpt-4o) (Source: https://developers.openai.com/api/docs/models/gpt-4o)
Key Information
- Category: Language Models
- Type: AI Language Models Tool