Speech 2.6 HD

This documentation is valid for the following model:

  • minimax/speech-2.6-hd

The model converts text into speech in a choice of voices and is optimized for high-fidelity, natural-sounding output.

Set up your API Key

If you don’t have an AI/ML API key yet, follow our Quickstart guide to get one.

Code Example

import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        # Replace <YOUR_AIMLAPI_KEY> with your AIML API key:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "minimax/speech-2.6-hd",
        "text": "Hi! What are you doing today?",
        "voice_setting": {
            "voice_id": "Wise_Woman",
        },
    }

    # Stream the response body so the audio is written to disk in chunks.
    response = requests.post(url, headers=headers, json=payload, stream=True)
    # Stop on an HTTP error instead of saving the error body as audio.
    response.raise_for_status()

    dest = os.path.abspath("your_file_name.wav")
    with open(dest, "wb") as write_stream:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                write_stream.write(chunk)

    print("Audio saved to:", dest)


if __name__ == "__main__":
    main()

Response

Generation time: ~5.8 s.

API Schema

POST /v1/tts
Authorizations
Authorization · string · Required

Bearer key
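
As a minimal sketch of the authorization scheme above, the helper below builds the request headers from a bearer key. The function name `build_auth_headers` and the environment variable `AIMLAPI_KEY` are assumptions made for illustration; they are not defined by the API.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the bearer key.

    Falls back to the AIMLAPI_KEY environment variable, which is a
    hypothetical name chosen for this sketch.
    """
    key = api_key or os.environ.get("AIMLAPI_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source control; pass it explicitly when that is more convenient.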

Body
model · enum · Required
Possible values: minimax/speech-2.6-hd

text · string · min: 1 · max: 5000 · Required

The text content to be converted to speech.
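
Because `text` is limited to 5000 per request, longer input has to be split before it is sent. The sketch below shows one possible way to do that, assuming the limit counts characters; `split_text` and `MAX_TEXT_LENGTH` are hypothetical names, not part of the API.

```python
MAX_TEXT_LENGTH = 5000  # upper bound on `text` from the schema (assumed to be characters)


def split_text(text, limit=MAX_TEXT_LENGTH):
    """Split text into chunks of at most `limit` characters.

    Breaks at the last space inside the limit when there is one,
    otherwise falls back to a hard cut.
    """
    text = text.strip()
    if not text:
        raise ValueError("text must contain at least 1 character")

    chunks = []
    while len(text) > limit:
        cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: hard split
        chunks.append(text[:cut].rstrip())
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio files joined afterwards.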

stream · boolean · Optional

Enable streaming mode for real-time audio generation. When enabled, audio is generated and delivered in chunks as it's processed.

Default: false

language_boost · string · enum · Optional

Language recognition enhancement option.

Possible values:

subtitle_enable · boolean · Optional

Enable subtitle generation service. Only available for non-streaming requests. Generates timing information for the synthesized speech.

Default: false
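
Since subtitles are only available for non-streaming requests, it can help to catch the conflicting combination before the request is sent. The following is a sketch of such a client-side check; `build_tts_payload` is a hypothetical helper, and how the server itself reacts when both flags are set is not documented here.

```python
def build_tts_payload(text, voice_id, stream=False, subtitle_enable=False):
    """Build a request body, rejecting stream + subtitle_enable together."""
    if stream and subtitle_enable:
        raise ValueError(
            "subtitle_enable is only available for non-streaming requests"
        )

    payload = {
        "model": "minimax/speech-2.6-hd",
        "text": text,
        "voice_setting": {"voice_id": voice_id},
    }
    # Both options default to false, so only send them when enabled.
    if stream:
        payload["stream"] = True
    if subtitle_enable:
        payload["subtitle_enable"] = True
    return payload
```

For example, `build_tts_payload("Hi!", "Wise_Woman", subtitle_enable=True)` yields a non-streaming request body with subtitles turned on.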
output_format · string · enum · Optional

Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.

Default: hex
Possible values:
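
With the default `hex` format, the audio arrives as a hexadecimal string that must be decoded back to bytes before it can be played or saved. The sketch below shows only that decoding step. Where the hex string sits in the response body is not specified on this page, so the functions take the string directly; both function names are hypothetical.

```python
from pathlib import Path


def decode_hex_audio(hex_string):
    """Convert a hex-encoded audio string into raw bytes."""
    return bytes.fromhex(hex_string)


def save_hex_audio(hex_string, path):
    """Decode a hex-encoded audio string and write it to `path`."""
    data = decode_hex_audio(hex_string)
    Path(path).write_bytes(data)
    return len(data)  # number of bytes written
```

`bytes.fromhex` raises `ValueError` on input that is not valid hexadecimal, which is a convenient early signal that the response was not hex-encoded audio.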
Responses
POST /v1/tts
201 · Success
