gpt-4o-transcribe
Model Overview
Set up your API Key
API Schemas
Creating and sending a speech-to-text conversion task to the server
POST
Body
model · string · enum · Required · Possible values:
url · string · uri · Optional
URL of the input audio file.
language · string · Optional
The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available.
prompt · string · Optional
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
temperature · number · max: 1 · Optional · Default: 0
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Responses
200 Success
application/json
generation_id · string · Required
POST /v1/stt/create · 200 Success
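The request body described above can be assembled as in the following minimal sketch. It only builds the request and does not send it. The base URL, the Bearer authorization header, and the model identifier are assumptions, since none of them is stated in this reference; `build_create_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumptions (not given in the reference above): the base URL, the Bearer
# authorization scheme, and the exact model identifier.
BASE_URL = "https://api.example.com"  # hypothetical placeholder
MODEL = "gpt-4o-transcribe"  # assumed from the page title; check the model enum


def build_create_request(api_key, audio_url, language=None, prompt=None,
                         temperature=None):
    """Build (but do not send) the POST /v1/stt/create request."""
    # The schema caps temperature at 1 and describes the range as 0 to 1.
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    payload = {"model": MODEL, "url": audio_url}
    # Only include optional fields that were explicitly set.
    optional = {"language": language, "prompt": prompt,
                "temperature": temperature}
    payload.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        BASE_URL + "/v1/stt/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the task; per the response schema above, the JSON reply carries the `generation_id` needed for the follow-up request.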
Requesting the result of the task from the server using the generation_id
GET
Path parameters
generation_id · string · Required
Responses
200 Success
application/json
id · string · Required
status · string · enum · Required · Possible values:
output · any of three alternative schemas · Required
GET /v1/stt/{generation_id} · 200 Success
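Because the result is requested separately from task creation, a client typically polls this endpoint until the task is no longer in progress. The sketch below shows one way to do that. The `status` enum values are not listed in this reference, so the in-progress names used here are placeholders, and `wait_for_result` and its `fetch` argument are hypothetical names.

```python
import time

# Assumption: the values of the `status` enum are not listed above, so these
# in-progress names are placeholders to replace with the documented ones.
PENDING_STATUSES = frozenset({"queued", "generating"})  # hypothetical


def wait_for_result(fetch, generation_id, interval=2.0, timeout=120.0,
                    sleep=time.sleep):
    """Call fetch(generation_id) until the task leaves a pending status.

    `fetch` is any callable that returns the decoded JSON body of
    GET /v1/stt/{generation_id} as a dict with `id`, `status`, and `output`.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(generation_id)
        if task["status"] not in PENDING_STATUSES:
            return task  # finished, successfully or not; caller inspects it
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {generation_id} still {task['status']!r} after {timeout}s"
            )
        sleep(interval)
```

Keeping the HTTP call behind `fetch` lets the loop be tested without a network connection and leaves the choice of HTTP client open.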
Code Example: Processing a Speech Audio File via URL
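The following is a minimal end-to-end sketch of the two calls documented on this page: create a task from an audio URL, then request the result by `generation_id`. It is an illustration under stated assumptions, not the provider's official sample. The base URL, the Bearer header, the model identifier, and the in-progress status names are all assumed, and `call_api` and `transcribe_url` are hypothetical helper names.

```python
import json
import time
import urllib.request

# Everything in this block of constants is an assumption; none of it is
# stated in the reference above.
BASE_URL = "https://api.example.com"  # hypothetical; use your provider's URL
MODEL = "gpt-4o-transcribe"  # assumed from the page title
PENDING_STATUSES = {"queued", "generating"}  # hypothetical enum values


def call_api(method, path, api_key, payload=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def transcribe_url(audio_url, api_key, interval=2.0, max_polls=60):
    """Create a speech-to-text task for audio_url and wait for its result."""
    # Step 1: POST /v1/stt/create returns a generation_id.
    created = call_api("POST", "/v1/stt/create", api_key,
                       {"model": MODEL, "url": audio_url})
    generation_id = created["generation_id"]

    # Step 2: GET /v1/stt/{generation_id} until the task stops being pending.
    for _ in range(max_polls):
        task = call_api("GET", "/v1/stt/" + generation_id, api_key)
        if task["status"] not in PENDING_STATUSES:
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {generation_id} did not finish in time")
```

Calling `transcribe_url` with the URL of an audio file and your API key returns the final task object; its `output` field can take one of several shapes, so inspect it before reading the transcript.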