OpenAI GPT-5 - AI Language Models Tool

Overview

OpenAI GPT-5 is a multi-variant language model family (gpt-5, gpt-5-mini, gpt-5-nano, plus GPT‑5 pro) built for advanced reasoning, code generation, instruction following, and multimodal input. OpenAI describes GPT-5 as a unified system that routes queries to either a fast responder or a dedicated reasoning engine (GPT‑5 thinking) and falls back to mini variants when usage limits are reached; at rollout, GPT‑5 became the default model in ChatGPT, with a pro variant added for extended reasoning. ([openai.com](https://openai.com/index/introducing-gpt-5/))

Technically, GPT‑5 supports very large context windows and explicit controls for trading off speed, cost, and depth: the public docs list a context window of up to 400,000 tokens and a large max-output token cap, plus configurable reasoning effort (minimal/low/medium/high) and verbosity (low/medium/high). These controls let developers cap internal “thinking” tokens or steer answer length without repeatedly rewriting prompts. Pricing and API access details are published on OpenAI’s pricing pages. ([platform.openai.com](https://platform.openai.com/docs/models/gpt-5))

Early community and press reaction was mixed: many developers praised the step-change in coding and multimodal reasoning, while some users asked for adjustments to personality and conversational warmth, which OpenAI iterated on in follow-up releases. Replicate and other developer playgrounds provide practical parameter guidance and usage examples for the gpt-5 series. ([replicate.com](https://replicate.com/openai/gpt-5))

Key Features

  • Configurable reasoning_effort (minimal/low/medium/high) to trade speed for deeper multi-step thinking.
  • Verbosity control (low/medium/high) to steer answer length independently of token caps.
  • Huge context window (up to 400k tokens) for long documents, books, or large codebases.
  • Multimodal input: accepts images for visual reasoning and diagram interpretation.
  • Model family variants: gpt-5, gpt-5-mini, gpt-5-nano, plus GPT‑5 pro for extended reasoning.
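
The multimodal bullet above can be illustrated by constructing a Responses-API-style request that pairs text with an image. This is a sketch that builds the payload without sending it; the `input_text`/`input_image` part types follow OpenAI's documented request shape, and the image URL is a placeholder.

```python
# Build (but do not send) a multimodal Responses API payload: one user turn
# containing a text part and an image part. The part types ("input_text",
# "input_image") follow OpenAI's documented request shape; the URL is a
# placeholder for illustration.
def build_image_request(question: str, image_url: str) -> dict:
    return {
        "model": "gpt-5",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": question},
                    {"type": "input_image", "image_url": image_url},
                ],
            }
        ],
    }

payload = build_image_request(
    "What does this architecture diagram show?",
    "https://example.com/diagram.png",
)
```

Sending this payload would be `client.responses.create(**payload)` with a configured API key; check the platform docs for the current image-input schema before relying on it.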

Example Usage

Example (python):

from openai import OpenAI

# Example: call the Responses API with GPT-5, steering reasoning effort and verbosity
client = OpenAI()

resp = client.responses.create(
    model="gpt-5",
    input=[{"role": "user", "content": "Summarize the key classes in this repo and suggest a testing plan."}],
    reasoning={"effort": "high"},    # minimal|low|medium|high (trade speed for depth)
    text={"verbosity": "medium"},    # low|medium|high (steer output length)
    max_output_tokens=1500,          # total output budget, including reasoning tokens
)

# `output_text` aggregates the text parts of the response into one string
print(resp.output_text)

# Notes: update to the latest OpenAI SDK and consult platform docs for exact response shapes and streaming.
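
Because the response shape can vary across SDK versions, a small defensive helper can normalize text extraction. Both the `output_text` convenience attribute and the nested `output[0].content[0].text` layout assumed here should be verified against the SDK version in use.

```python
from types import SimpleNamespace  # used below only to demo with a mock object

def extract_text(resp) -> str:
    """Best-effort extraction of the main text from a Responses-style object.

    Tries the `output_text` convenience attribute first (present on recent
    OpenAI SDK response objects), then falls back to the nested
    output[0].content[0].text layout; returns "" if neither is present.
    """
    text = getattr(resp, "output_text", None)
    if text:
        return text
    try:
        return resp.output[0].content[0].text
    except (AttributeError, IndexError, TypeError):
        return ""

# Demo against a mock with only the nested layout:
mock = SimpleNamespace(
    output=[SimpleNamespace(content=[SimpleNamespace(text="hello")])]
)
print(extract_text(mock))  # hello
```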

Pricing

OpenAI lists per-1M-token pricing for the GPT-5 family on its pricing page: example public rates for standard GPT-5 API usage are Input $1.25 / 1M tokens, Cached input $0.125 / 1M tokens, and Output $10.00 / 1M tokens. Mini and nano variants are cheaper (e.g., GPT-5 mini: Input $0.25 / 1M, Output $2.00 / 1M). Other flagship variants (e.g., GPT-5.2 and GPT‑5.2 pro) have different rates; consult OpenAI’s pricing page for the latest tiered rates and enterprise terms. ([openai.com](https://openai.com/api/pricing/))
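
The per-1M-token rates above translate into per-request cost with simple arithmetic. A minimal sketch using the listed gpt-5 and gpt-5-mini rates (which should be re-checked against the live pricing page before use):

```python
# USD per 1M tokens, copied from the rates listed above; verify against
# OpenAI's pricing page before relying on these numbers.
RATES = {
    "gpt-5":      {"input": 1.25, "cached_input": 0.125, "output": 10.00},
    "gpt-5-mini": {"input": 0.25, "output": 2.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Estimated USD cost of one request at the published per-1M-token rates."""
    r = RATES[model]
    cost = (input_tokens * r["input"]
            + output_tokens * r["output"]
            + cached_input_tokens * r.get("cached_input", r["input"])) / 1_000_000
    return round(cost, 6)

# e.g. a 10k-token prompt with a 1.5k-token answer on gpt-5:
print(request_cost("gpt-5", 10_000, 1_500))  # 0.0275
```

At these rates, output tokens dominate cost for gpt-5 (8x the input rate), so capping output budget matters more than trimming prompts.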

Benchmarks

GPQA (GPT‑5 pro, without tools): 88.4% (Source: https://openai.com/index/introducing-gpt-5/)

SWE-bench Verified (real-world coding): 74.9% (Source: https://openai.com/index/introducing-gpt-5/)

AIME 2025 (math, without tools): 94.6% (Source: https://openai.com/index/introducing-gpt-5/)

MMMU (multimodal understanding): 84.2% (Source: https://openai.com/index/introducing-gpt-5/)

Context window (API docs): 400,000 tokens (context), 128,000 max output tokens (Source: https://platform.openai.com/docs/models/gpt-5)
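
The 400,000-token context figure can be turned into a quick feasibility check for long documents using the rough rule of thumb of ~4 characters per token for English text. Both the heuristic and the reserved-output headroom below are assumptions, not exact tokenizer counts.

```python
CONTEXT_WINDOW = 400_000   # tokens, per the API docs cited above
MAX_OUTPUT = 128_000       # max output tokens, per the same docs

def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 chars/token for English prose); use a real
    tokenizer such as tiktoken when accurate counts matter."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, reserved_output: int = 8_000) -> bool:
    """True if the estimated prompt plus a reserved output budget fits."""
    return estimate_tokens(text) + reserved_output <= CONTEXT_WINDOW

doc = "word " * 50_000   # ~250k characters of sample text
print(fits_in_context(doc))  # True
```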

Last Refreshed: 2026-02-24

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool