gpt-4o-transcribe
Model Overview
Set up your API Key
API Schemas
Creating and sending a speech-to-text conversion task to the server
post
Authorizations
Authorization · string · Required
Bearer key
Body
model · enum · Required · Possible values:
language · string · Optional
The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available.
prompt · string · Optional
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
temperature · number · max: 1 · Optional · Default: 0
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
url · string · uri · Required
URL of the input audio file.
Responses
201 · Success
application/json
POST /v1/stt/create
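As a minimal sketch, the request described above could be assembled in Python with only the standard library. The base URL and the model identifier are assumptions (the page does not list the host or the allowed enum values), and `build_create_request` and `create_task` are hypothetical helper names used for illustration only.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with your provider's actual base URL.
BASE_URL = "https://api.example.com"
# Assumption: the exact enum value for `model` is not listed in this section.
MODEL_ID = "gpt-4o-transcribe"


def build_create_request(api_key, audio_url, language=None, prompt=None, temperature=0.0):
    """Build (but do not send) the POST /v1/stt/create request."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    body = {"model": MODEL_ID, "url": audio_url, "temperature": temperature}
    # Optional fields are included only when the caller provides them.
    if language is not None:
        body["language"] = language
    if prompt is not None:
        body["prompt"] = prompt
    return urllib.request.Request(
        f"{BASE_URL}/v1/stt/create",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(api_key, audio_url, **options):
    """Send the request and return the decoded JSON response."""
    request = build_create_request(api_key, audio_url, **options)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any network call is made.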
Requesting the result of the task from the server using the generation_id
get
Authorizations
Authorization · string · Required
Bearer key
Path parameters
generation_id · string · Required
Responses
201 · Success
application/json
GET /v1/stt/{generation_id}
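A single result lookup might look like the sketch below. The base URL is an assumed placeholder, and `build_result_request` and `fetch_result` are hypothetical helper names. The shape of the JSON response is not documented in this section, so the sketch returns it undecoded beyond JSON parsing.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with your provider's actual base URL.
BASE_URL = "https://api.example.com"


def build_result_request(api_key, generation_id):
    """Build (but do not send) the GET /v1/stt/{generation_id} request."""
    # Percent-encode the ID so it is always treated as one path segment.
    safe_id = urllib.parse.quote(generation_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/stt/{safe_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_result(api_key, generation_id):
    """Send the request and return the decoded JSON response."""
    request = build_result_request(api_key, generation_id)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```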
Code Example: Processing a Speech Audio File via URL
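The following is a hedged end-to-end sketch of the two-step flow: create a task, then poll for its result. Several details are assumptions not confirmed by this page: the base URL, the model identifier, the `generation_id` field in the create response, and the `status` field with pending values `waiting` and `active`. The helper names `call_api` and `transcribe` are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"      # assumption: placeholder host
MODEL_ID = "gpt-4o-transcribe"            # assumption: enum value not listed on this page
PENDING_STATUSES = {"waiting", "active"}  # assumption: status values meaning "not finished"


def call_api(method, path, api_key, body=None, opener=urllib.request.urlopen):
    """Send one authorized request and return the decoded JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
    with opener(request) as response:
        return json.load(response)


def transcribe(api_key, audio_url, poll_interval=5.0, max_polls=60, opener=urllib.request.urlopen):
    """Create a speech-to-text task for audio_url and poll until it leaves a pending status."""
    task = call_api("POST", "/v1/stt/create", api_key,
                    {"model": MODEL_ID, "url": audio_url}, opener)
    generation_id = task["generation_id"]  # assumed name of the ID field in the response
    for _ in range(max_polls):
        result = call_api("GET", f"/v1/stt/{generation_id}", api_key, opener=opener)
        if result.get("status") not in PENDING_STATUSES:
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"task {generation_id} still pending after {max_polls} polls")


if __name__ == "__main__":
    # Replace the key and the audio URL with your own values before running.
    print(transcribe("<YOUR_API_KEY>", "https://example.com/audio.mp3"))
```

The `opener` parameter exists only so the flow can be exercised without network access; in normal use the default `urllib.request.urlopen` is enough.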