MAI-Voice-2

This documentation applies to the following model:

  • microsoft/mai-voice-2

Model Overview

MAI-Voice-2 is a text-to-speech model from Microsoft. It generates natural, expressive speech and supports multiple languages, voices, and audio formats.

Set Up Your API Key

If you don't have an AI/ML API key yet, follow our Quickstart guide to create one.

API Schema

POST
Body
model · string · enum · Required

text · string · min: 1 · max: 4096 · Required

The text content to be converted to speech.

Example: Hello! This is a demo of Microsoft MAI-Voice-2 text to speech.
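
Because text is capped at 4096, longer input has to be split across several requests. The following is a minimal sketch of one way to do that, assuming the limit counts characters (the schema does not state the unit). The helper name split_text is hypothetical and not part of the API.

```python
MAX_TEXT_LENGTH = 4096  # "max: 4096" from the schema; assumed to count characters


def split_text(text: str, limit: int = MAX_TEXT_LENGTH) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at a space; falls back to a hard cut when a single
    run of non-space characters is longer than `limit`.
    """
    text = text.strip()
    if not text:
        # The schema requires at least 1 character ("min: 1").
        raise ValueError("text must contain at least 1 character")

    chunks = []
    while len(text) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard cut
        chunks.append(text[:cut].rstrip())
        text = text[cut:].lstrip()
    chunks.append(text)
    return chunks
```

Each returned chunk can then be sent as the text field of a separate request, and the resulting audio files concatenated by the caller.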
voice · string · enum · Optional

Name of the voice to be used.

Default: en-US-AvaNeural
Example: en-US-AvaNeural

response_format · string · enum · Optional

Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.

Default: mp3
Example: mp3

max_tokens · integer · min: 1 · Optional

The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

Example: 4096
temperature · number · max: 2 · Optional

What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

Default: 1
Example: 1

top_p · number · max: 1 · Optional

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

Default: 1
Example: 1

max_completion_tokens · integer · min: 1 · Optional

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

Example: 4096
stream · boolean · enum · Optional

Default: false

Responses
200 Success
application/json
audio · string · uri · Required

The URL of the generated audio file.

POST /v1/tts

Code Example
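
The following is a minimal sketch of a request built from the schema above, using only the Python standard library. Several details are assumptions not stated in this section: the base URL https://api.aimlapi.com, Bearer-token authentication, and the environment variable name AIMLAPI_API_KEY. The function names are illustrative only.

```python
import json
import urllib.request

# Assumption: only the path /v1/tts appears in the schema; the host is assumed.
BASE_URL = "https://api.aimlapi.com"


def build_tts_request(text, api_key, voice="en-US-AvaNeural", response_format="mp3"):
    """Build (but do not send) a POST /v1/tts request."""
    payload = {
        "model": "microsoft/mai-voice-2",
        "text": text,                        # required, 1 to 4096 in length
        "voice": voice,                      # optional, schema default
        "response_format": response_format,  # optional, schema default
    }
    return urllib.request.Request(
        BASE_URL + "/v1/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, api_key):
    """Send the request and return the URL of the generated audio file."""
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["audio"]


# Usage (requires a valid key; the variable name is hypothetical):
#   import os
#   url = synthesize("Hello! This is a demo of Microsoft MAI-Voice-2 text to speech.",
#                    os.environ["AIMLAPI_API_KEY"])
#   print(url)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.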

Response
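
According to the schema above, a successful response is a JSON object with a required audio field holding the URL of the generated file. The sketch below shows one way to read that field defensively. The helper name extract_audio_url is hypothetical, and the URL in the usage comment is a placeholder, not real API output.

```python
import json


def extract_audio_url(response_body: str) -> str:
    """Return the 'audio' URL from a 200 response body, or raise ValueError."""
    data = json.loads(response_body)
    audio = data.get("audio") if isinstance(data, dict) else None
    if not isinstance(audio, str) or not audio:
        raise ValueError("response is missing the required 'audio' field")
    return audio


# Usage with a placeholder body (not real API output):
#   extract_audio_url('{"audio": "https://example.com/speech.mp3"}')
```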
