gpt-5-pro
Model Overview
A version of GPT-5 that produces smarter and more precise responses.
How to Make a Call
API Schema
Note: This model can ONLY be called via the /responses endpoint!
Responses Endpoint
This endpoint is currently used only with OpenAI models. Some models support both the /chat/completions and /responses endpoints, while others support only one of them.
Text, image, or file inputs to the model, used to generate a response.
A text input to the model, equivalent to a text input with the user role.
An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
The unique ID of the previous response to the model. Use this to create multi-turn conversations.
Whether to store the generated model response for later retrieval via API. Default: false.
If set to true, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
The truncation strategy to use for the model response.
- auto: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation.
- disabled (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error.
How the model should select which tool (or tools) to use when generating a response.
Controls which (if any) tool is called by the model.
- none: The model will not call any tool and instead generates a message.
- auto: The model can pick between generating a message or calling one or more tools.
- required: The model must call one or more tools.
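The parameters described above can be assembled into a request body roughly as follows. This is a minimal sketch, not the provider's reference implementation: the field names (input, max_output_tokens, previous_response_id, store, stream, truncation, tool_choice) are assumptions borrowed from OpenAI's public Responses API, since the names are not preserved on this page, and build_responses_payload is a hypothetical helper. The defaults and allowed values mirror the descriptions above.

```python
from typing import Any, Dict, Optional

# Allowed values as listed in the parameter descriptions above.
TRUNCATION_VALUES = {"auto", "disabled"}
TOOL_CHOICE_VALUES = {"none", "auto", "required"}


def build_responses_payload(
    text: str,
    *,
    model: str = "gpt-5-pro",
    max_output_tokens: Optional[int] = None,
    previous_response_id: Optional[str] = None,
    store: bool = False,
    stream: bool = False,
    truncation: str = "disabled",
    tool_choice: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a /responses request body (hypothetical helper, assumed field names)."""
    if truncation not in TRUNCATION_VALUES:
        raise ValueError(f"truncation must be one of {sorted(TRUNCATION_VALUES)}")
    if tool_choice is not None and tool_choice not in TOOL_CHOICE_VALUES:
        raise ValueError(f"tool_choice must be one of {sorted(TOOL_CHOICE_VALUES)}")
    if max_output_tokens is not None and max_output_tokens <= 0:
        raise ValueError("max_output_tokens must be a positive integer")

    payload: Dict[str, Any] = {
        "model": model,
        "input": text,  # plain text input, treated as a user-role message
        "store": store,
        "stream": stream,
        "truncation": truncation,
    }
    # Optional fields are sent only when the caller sets them.
    if max_output_tokens is not None:
        # Upper bound covering visible output tokens and reasoning tokens.
        payload["max_output_tokens"] = max_output_tokens
    if previous_response_id is not None:
        # Links this request to an earlier response for multi-turn use.
        payload["previous_response_id"] = previous_response_id
    if tool_choice is not None:
        payload["tool_choice"] = tool_choice
    return payload
```

Validating the enumerated values on the client side surfaces a typo before the request is sent, instead of waiting for the server to reject it.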
Code Example: Using /responses Endpoint
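The following is a minimal sketch of a call to the /responses endpoint using only the Python standard library. Several details are assumptions, not taken from this page: the base URL (https://api.example.com/v1 is a placeholder), the environment variable names RESPONSES_BASE_URL and RESPONSES_API_KEY, Bearer-token authentication, the request field names (model, input, previous_response_id, as in OpenAI's public Responses API), and the presence of an id field in the response body. Replace them with the values from your provider's documentation.

```python
import json
import os
import urllib.request
from typing import Any, Dict, Optional

# Placeholder values: substitute your provider's real base URL and API key.
BASE_URL = os.environ.get("RESPONSES_BASE_URL", "https://api.example.com/v1")
API_KEY = os.environ.get("RESPONSES_API_KEY", "YOUR_API_KEY")


def make_request(
    prompt: str, previous_response_id: Optional[str] = None
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the /responses endpoint."""
    body: Dict[str, Any] = {"model": "gpt-5-pro", "input": prompt}
    if previous_response_id is not None:
        # Assumed field name for continuing a multi-turn conversation.
        body["previous_response_id"] = previous_response_id
    return urllib.request.Request(
        url=f"{BASE_URL}/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def main() -> None:
    # First turn.
    first = send(make_request("Explain the CAP theorem in two sentences."))
    print(json.dumps(first, indent=2))
    # Follow-up turn, assuming the response JSON carries an "id" field.
    follow_up = send(
        make_request("Now give a concrete example.", previous_response_id=first["id"])
    )
    print(json.dumps(follow_up, indent=2))
```

Call main() to run both turns. Building the request separately from sending it lets you inspect the URL, headers, and body without making a network call.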