Speech 2.5 Turbo Preview
A high-definition text-to-speech model with enhanced multilingual expressiveness, more precise voice replication, and expanded support for 40 languages.
Set Up Your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
Quick Code Example
The example below converts the string passed in the text parameter to speech and saves the resulting audio to a file.
import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "minimax/speech-2.5-turbo-preview",
        "text": "Hi! What are you doing today?",
        "voice_setting": {
            "voice_id": "Wise_Woman"
        }
    }

    response = requests.post(url, headers=headers, json=payload, stream=True)
    # Stop on an HTTP error instead of writing the error body to the audio file
    response.raise_for_status()

    dist = os.path.abspath("your_file_name.wav")
    with open(dist, "wb") as write_stream:
        # Write the response body to disk in 8 KB chunks
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                write_stream.write(chunk)

    print("Audio saved to:", dist)


if __name__ == "__main__":
    main()

import fs from "fs";
import path from "path";

async function main() {
  const url = "https://api.aimlapi.com/v1/tts";
  const payload = {
    model: "minimax/speech-2.5-turbo-preview",
    text: "Hi! What are you doing today?",
    voice_setting: {
      voice_id: "Wise_Woman"
    }
  };

  const response = await fetch(url, {
    method: "POST",
    headers: {
      // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
      "Authorization": `Bearer <YOUR_AIMLAPI_KEY>`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify(payload)
  });

  // Stop on an HTTP error instead of saving the error body as audio
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Read response as ArrayBuffer and convert to Buffer
  const arrayBuffer = await response.arrayBuffer();
  const buffer = Buffer.from(arrayBuffer);

  // Save audio to file in the current working directory
  const dist = path.join(process.cwd(), "your_file_name.wav");
  fs.writeFileSync(dist, buffer);
  console.log("Audio saved to:", dist);
}

main();

API Schema
Bearer key
The text content to be converted to speech.
Enable streaming mode for real-time audio generation. When enabled, audio is generated and delivered in chunks as it's processed.
Default: false
Language recognition enhancement option.
Enable subtitle generation service. Only available for non-streaming requests. Generates timing information for the synthesized speech.
Default: false
Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.
Default: hex
Possible values:

Example response:
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-11-13T00:54:49.456Z",
    "duration": 1,
    "channels": 1,
    "models": [
      "text"
    ],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {
        "name": "text",
        "version": "text",
        "arch": "text"
      }
    }
  }
}
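The response example above can be read with a few lines of Python. This is a minimal sketch, not an official client: `summarize_metadata` is a hypothetical helper name, `SAMPLE_RESPONSE` simply repeats the placeholder values from the example, and the fields a real response contains may differ.

```python
import json
from datetime import datetime

# Placeholder body copied from the response example (not real API output).
SAMPLE_RESPONSE = """
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-11-13T00:54:49.456Z",
    "duration": 1,
    "channels": 1,
    "models": ["text"],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {"name": "text", "version": "text", "arch": "text"}
    }
  }
}
"""


def summarize_metadata(body: dict) -> dict:
    """Hypothetical helper: pull a few fields out of the `metadata` object."""
    meta = body.get("metadata", {})
    created_raw = meta.get("created")
    # Replace the trailing "Z" so fromisoformat() also works before Python 3.11.
    created = (
        datetime.fromisoformat(created_raw.replace("Z", "+00:00"))
        if created_raw
        else None
    )
    return {
        "request_id": meta.get("request_id"),
        "created": created,
        "duration": meta.get("duration"),
        "channels": meta.get("channels"),
        # model_info is keyed by arbitrary property names, so iterate its values.
        "model_names": [info.get("name") for info in meta.get("model_info", {}).values()],
    }


summary = summarize_metadata(json.loads(SAMPLE_RESPONSE))
print(summary["request_id"], summary["duration"], summary["channels"])
```

Using `.get()` throughout keeps the helper from raising if a field is absent, which seems prudent for a preview model whose response shape may change.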
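The API schema lists hex as the default output format for non-streaming requests, which suggests the audio can arrive as a hexadecimal string rather than raw bytes. The page does not show which response field carries that string, so the sketch below assumes you have already extracted it. `hex_to_audio_bytes` and `save_hex_audio` are hypothetical helper names, and `demo_hex` is a stand-in value, not real API output.

```python
from pathlib import Path


def hex_to_audio_bytes(hex_audio: str) -> bytes:
    """Decode a hex-encoded audio string into raw bytes."""
    try:
        return bytes.fromhex(hex_audio.strip())
    except ValueError as exc:
        raise ValueError("audio payload is not valid hexadecimal") from exc


def save_hex_audio(hex_audio: str, filename: str) -> Path:
    """Decode the hex string and write the bytes to `filename`."""
    path = Path(filename).resolve()
    path.write_bytes(hex_to_audio_bytes(hex_audio))
    return path


# Stand-in payload: the four ASCII bytes "RIFF", which open a WAV file header.
demo_hex = "52494646"
print(hex_to_audio_bytes(demo_hex))  # b'RIFF'
```

With a real response, `save_hex_audio(hex_string, "your_file_name.wav")` would write the decoded audio to disk and return the absolute path.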