gpt-4o-mini-tts
A text-to-speech model based on GPT-4o mini, supporting up to 2,000 input tokens.
Set up your API key
If you don’t have an AI/ML API key yet, follow our Quickstart guide to get one.
Code Example
import requests

# Insert your AI/ML API key instead of <YOUR_AIMLAPI_KEY>:
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v1"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Request body: the model ID, the text to speak, and the voice to use.
data = {
    "model": "openai/gpt-4o-mini-tts",
    "text": "GPT-4o-mini-tts is a small and fast model. Use it to convert text to natural sounding spoken text.",
    "voice": "coral",
}

# Send the request and raise an exception on any HTTP error status.
response = requests.post(f"{base_url}/tts", headers=headers, json=data)
response.raise_for_status()

# The JSON response contains a URL pointing to the generated audio.
result = response.json()
print("Audio URL:", result["audio"]["url"])
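The example above only prints the audio URL. A minimal sketch of saving the generated audio to disk might look like the following. The helper name `save_audio`, the 60-second timeout, and the assumption that the returned URL can be fetched with a plain GET request without extra headers are illustrative and not confirmed by this page.

```python
import shutil
import urllib.request
from pathlib import Path


def save_audio(audio_url: str, destination: str) -> Path:
    """Stream the file at audio_url to destination and return the local path.

    Hypothetical helper: assumes the URL is directly downloadable.
    """
    target = Path(destination)
    # Copy the response body to disk in chunks instead of loading it all in memory.
    with urllib.request.urlopen(audio_url, timeout=60) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

With the `result` object from the code example, this could be called as `save_audio(result["audio"]["url"], "speech.mp3")`. The file extension should match the `response_format` sent in the request.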
API Schema
text: The text content to be converted to speech.
voice: Name of the voice to be used.
style: Determines the style exaggeration of the voice. This setting attempts to amplify the style of the original speaker. It does consume additional computational resources and might increase latency if set to anything other than 0.
response_format: Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response. Default: mp3.
speed: Adjusts the speed of the voice. A value of 1.0 is the default speed, while values less than 1.0 slow down the speech, and values greater than 1.0 speed it up. Default: 1.
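As a rough illustration of how these parameters fit together, the sketch below assembles a request body and leaves out `style` when it is not set. The function name `build_tts_payload` is hypothetical, and the two checks (non-empty text, positive speed) are assumptions added for illustration, not limits documented here. The `style` value is passed through unchanged because its expected type is not stated in the schema.

```python
from typing import Any, Dict, Optional


def build_tts_payload(
    text: str,
    voice: str,
    *,
    style: Optional[Any] = None,
    response_format: str = "mp3",
    speed: float = 1.0,
    model: str = "openai/gpt-4o-mini-tts",
) -> Dict[str, Any]:
    """Build a JSON-serializable body for a POST /v1/tts request."""
    # Assumed sanity checks; the API may enforce different or additional rules.
    if not text.strip():
        raise ValueError("text must not be empty")
    if speed <= 0:
        raise ValueError("speed must be positive")

    payload: Dict[str, Any] = {
        "model": model,
        "text": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    # Only send style when the caller provided a value.
    if style is not None:
        payload["style"] = style
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, in place of the hand-written `data` dictionary in the code example.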
POST /v1/tts HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 113
{
  "model": "openai/gpt-4o-mini-tts",
  "text": "text",
  "voice": "alloy",
  "style": "text",
  "response_format": "mp3",
  "speed": 1
}
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-10-20T12:12:52.407Z",
    "duration": 1,
    "channels": 1,
    "models": [
      "text"
    ],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {
        "name": "text",
        "version": "text",
        "arch": "text"
      }
    }
  }
}
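The `metadata` object in the response above can be read defensively, as in the sketch below. The helper name `summarize_metadata` is hypothetical, and treating every field as optional is an assumption made for safety rather than something this schema states.

```python
from typing import Any, Dict


def summarize_metadata(result: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few fields out of the response metadata, tolerating missing keys."""
    # Fall back to an empty dict if "metadata" is absent or null.
    metadata = result.get("metadata") or {}
    return {
        "request_id": metadata.get("request_id"),
        "created": metadata.get("created"),
        "duration": metadata.get("duration"),
        "channels": metadata.get("channels"),
        "models": list(metadata.get("models") or []),
    }
```

Calling `summarize_metadata(response.json())` returns a flat dictionary that is convenient for logging. Missing fields come back as `None`, and `models` comes back as an empty list.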