gpt-4o-mini-transcribe

This documentation is valid for the following models:

  • openai/gpt-4o-mini-transcribe

Model Overview

A speech-to-text model based on GPT-4o mini for audio transcription. It delivers a lower word error rate and more accurate language recognition than the original Whisper models. Recommended for use cases that require higher transcription accuracy.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

API Schemas

Creating and sending a speech-to-text conversion task to the server

POST /v1/stt/create

Authorizations

Authorization · string · Required

Bearer key

Body

model · enum · Required

Possible values: openai/gpt-4o-mini-transcribe

language · string · Optional

The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available.

prompt · string · Optional

Optional text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.

temperature · number · max: 1 · Optional

The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Default: 0

url · string · uri · Required

URL of the input audio file.

Responses

201 — Success
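A minimal Python sketch of this request, using only the standard library. The base URL is an assumption (check the Quickstart guide for your account's endpoint), and only the fields defined in the schema above are sent:

```python
import json
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # assumed base URL


def build_payload(audio_url, language=None, prompt=None, temperature=None):
    """Assemble the request body; optional fields are omitted when unset."""
    payload = {"model": "openai/gpt-4o-mini-transcribe", "url": audio_url}
    if language is not None:
        payload["language"] = language        # BCP-47 tag, e.g. "en-US"
    if prompt is not None:
        payload["prompt"] = prompt            # should match the audio language
    if temperature is not None:
        payload["temperature"] = temperature  # 0..1, defaults to 0 server-side
    return payload


def create_task(api_key, audio_url, **options):
    """POST /v1/stt/create; the 201 response carries the generation_id."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/stt/create",
        data=json.dumps(build_payload(audio_url, **options)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Omitting unset optional fields keeps the request minimal and lets the server apply its own defaults, such as `temperature: 0`.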

Requesting the result of the task from the server using the generation_id

GET /v1/stt/{generation_id}

Authorizations

Authorization · string · Required

Bearer key

Path parameters

generation_id · string · Required

Responses

201 — Success
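The retrieval step above is typically wrapped in a polling loop. A sketch, again with only the standard library; the base URL and the `status` field and its in-progress values are assumptions about the response shape:

```python
import json
import time
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # assumed base URL


def poll_result(api_key, generation_id, interval=5, timeout=300):
    """GET /v1/stt/{generation_id} until the task leaves an in-progress
    state; the 'status' field and its values are assumptions."""
    url = f"{BASE_URL}/v1/stt/{generation_id}"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"}
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        if body.get("status") not in ("queued", "generating"):
            return body  # finished (or failed) — hand the body to the caller
        time.sleep(interval)
    raise TimeoutError(f"task {generation_id} did not finish in {timeout}s")
```

Raising on timeout, rather than returning `None`, makes it harder to silently drop a long-running transcription job.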

Example Code: Processing a Speech Audio File via URL

Let's use the openai/gpt-4o-mini-transcribe model to transcribe an audio fragment available at a public URL:
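A self-contained sketch of the full flow, using only the standard library. The base URL, the `generation_id` and `status` field names, and the audio URL are assumptions — substitute a link to your own publicly accessible audio file:

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # assumed base URL
API_KEY = os.environ.get("AIMLAPI_KEY", "<YOUR_API_KEY>")
# Placeholder — replace with a link to your own publicly accessible audio file.
AUDIO_URL = "https://example.com/sample-speech.mp3"


def build_body(audio_url):
    """Request body for /v1/stt/create, per the schema above."""
    return {"model": "openai/gpt-4o-mini-transcribe", "url": audio_url}


def api_request(path, method="GET", body=None):
    """Send an authorized JSON request and decode the JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(body).encode() if body is not None else None,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def transcribe(audio_url):
    # 1. Create the transcription task (201 on success).
    task = api_request("/v1/stt/create", method="POST", body=build_body(audio_url))
    generation_id = task["generation_id"]

    # 2. Poll for the result; the in-progress status values are assumptions.
    while True:
        result = api_request(f"/v1/stt/{generation_id}")
        if result.get("status") not in ("queued", "generating"):
            return result
        time.sleep(5)


# transcript = transcribe(AUDio_URL := AUDIO_URL)  # uncomment to run
# print(json.dumps(transcript, indent=2))
```

Keeping the two endpoint calls behind one `api_request` helper means the Bearer header and JSON handling live in a single place.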
