Speech 2.5 Turbo Preview

This documentation is valid for the following model: minimax/speech-2.5-turbo-preview

A high-definition text-to-speech model with enhanced multilingual expressiveness, more precise voice replication, and expanded support for 40 languages.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
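
Once you have a key, one way to keep it out of your source code is to read it from an environment variable. The following is a minimal sketch under that assumption: the variable name AIMLAPI_KEY and the helper build_auth_headers are illustrative choices, not names defined by the API. The Bearer scheme matches the Authorization header used in the quick code example on this page.

```python
import os


def build_auth_headers(env_var: str = "AIMLAPI_KEY") -> dict:
    """Return request headers with a Bearer token read from the environment.

    The default variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of requests.post in place of a hard-coded key.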

Quick Code Example

Here is an example of generating an audio response to the user input provided in the text parameter.

import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "minimax/speech-2.5-turbo-preview",
        "text": "Hi! What are you doing today?",
        "voice_setting": {
            "voice_id": "Wise_Woman"
        }
    }

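    response = requests.post(url, headers=headers, json=payload, stream=True)
    # Stop early on HTTP errors so an error body is not saved as audio.
    response.raise_for_status()

    # Resolve the output path relative to the current working directory.
    dist = os.path.abspath("your_file_name.wav")

    # Write the audio to disk chunk by chunk.
    with open(dist, "wb") as write_stream:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                write_stream.write(chunk)

    print("Audio saved to:", dist)


if __name__ == "__main__":
    main()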
Response
Audio saved to: c:\Users\user\Documents\Python Scripts\TTSes\your_file_name.wav

API Schema

POST https://api.aimlapi.com/v1/tts
Authorizations
Body
model · enum · Required
Possible values:
text · string · min: 1 · max: 5000 · Required

The text content to be converted to speech.

stream · boolean · Optional

Enable streaming mode for real-time audio generation. When enabled, audio is generated and delivered in chunks as it's processed.

Default: false
language_boost · string · enum · Optional

Language recognition enhancement option.

Possible values:
subtitle_enable · boolean · Optional

Enable subtitle generation service. Only available for non-streaming requests. Generates timing information for the synthesized speech.

Default: false
output_format · string · enum · Optional

Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.

Default: hex
Possible values:
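
The constraints in the body parameters above can be checked on the client before a request is sent. The sketch below is one possible way to do that, not part of the API: build_tts_payload and decode_hex_audio are hypothetical helper names, the length limit is assumed to be counted in characters, and the voice_setting field is taken from the quick code example. Because this section does not show which response field carries the hex-encoded audio, decode_hex_audio simply takes the hex string as an argument.

```python
MAX_TEXT_CHARS = 5000  # documented upper bound for "text" (assumed to mean characters)


def build_tts_payload(text, voice_id="Wise_Woman", stream=False,
                      subtitle_enable=False, output_format="hex"):
    """Validate the documented constraints and return a JSON-ready request body."""
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError(
            f"text must be 1 to {MAX_TEXT_CHARS} characters, got {len(text)}"
        )
    if stream and subtitle_enable:
        # Subtitles are documented as available only for non-streaming requests.
        raise ValueError("subtitle_enable requires stream=False")

    payload = {
        "model": "minimax/speech-2.5-turbo-preview",
        "text": text,
        "stream": stream,
        "voice_setting": {"voice_id": voice_id},
    }
    if not stream:
        # Both options are described for non-streaming requests only.
        payload["subtitle_enable"] = subtitle_enable
        payload["output_format"] = output_format
    return payload


def decode_hex_audio(hex_audio: str) -> bytes:
    """Turn a hex-encoded audio string into raw bytes ready to write to a file."""
    return bytes.fromhex(hex_audio)
```

The dictionary returned by build_tts_payload can be passed as the json argument of requests.post. Rejecting invalid input locally avoids a round trip for requests the schema would not accept.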
Responses
201 · Success
application/json
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-10-02T07:41:30.126Z",
    "duration": 1,
    "channels": 1,
    "models": [
      "text"
    ],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {
        "name": "text",
        "version": "text",
        "arch": "text"
      }
    }
  }
}
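
To work with the metadata object shown above, the JSON body can be parsed into a small typed structure. This is a minimal sketch that assumes the response has exactly the shape of the example: TtsMetadata and parse_metadata are hypothetical names introduced here, and only a subset of the fields is extracted. The unit of duration is not stated in this section, so the value is kept as returned.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TtsMetadata:
    request_id: str
    created: datetime
    duration: float
    channels: int
    models: list


def parse_metadata(body: str) -> TtsMetadata:
    """Extract selected fields from the "metadata" object of a JSON response body."""
    meta = json.loads(body)["metadata"]
    # Replace the trailing "Z" so fromisoformat also works on Python versions before 3.11.
    created = datetime.fromisoformat(meta["created"].replace("Z", "+00:00"))
    return TtsMetadata(
        request_id=meta["request_id"],
        created=created,
        duration=meta["duration"],
        channels=meta["channels"],
        models=list(meta.get("models", [])),
    )
```

With the requests library, the body string would typically come from response.text of a non-streaming call.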
