MAI-Voice-2
Model Overview
Set up your API Key
API Schema
The text content to be converted to speech.
Hello! This is a demo of Microsoft MAI-Voice-2 text to speech.

Name of the voice to be used.
en-US-AvaNeural
Example: en-US-AvaNeural
Possible values:

Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.
mp3
Example: mp3
Possible values:

The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
4096

What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
1
Example: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
1
Example: 1

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
4096

false
Possible values:

The URL of the generated audio file.
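
To make the temperature and top_p descriptions above more concrete, here is a minimal Python sketch of how temperature scaling and nucleus (top_p) filtering are commonly defined. It is a generic illustration and not the model's actual sampler; the function names `apply_temperature` and `nucleus_filter` and the toy logits are assumptions made for this example.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (assumes temperature > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Return indices of the most likely tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept


# Toy logits (hypothetical values): a lower temperature concentrates
# probability on the most likely token, a higher one spreads it out.
logits = [2.0, 1.0, 0.0]
focused = apply_temperature(logits, 0.2)
varied = apply_temperature(logits, 0.8)

# With top_p = 0.1, only the single most likely token survives the filter.
top_token_only = nucleus_filter([0.5, 0.3, 0.15, 0.05], 0.1)
```

Both knobs narrow or widen the same distribution, which is one way to read the recommendation to alter temperature or top_p but not both.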
Code Example
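
The following is a hedged Python sketch of assembling a request body from the values described in the schema. The field names `text`, `voice`, and `format` and the helper `build_request` are hypothetical, since the schema's field names, endpoint, and authentication details are not shown on this page; only the example values and the temperature/top_p guidance come from the text. The sketch builds the body locally and does not send it.

```python
import json


def build_request(text, voice="en-US-AvaNeural", output_format="mp3",
                  temperature=1.0, top_p=1.0):
    """Build a request body as a dict. Key names are assumptions, not the documented schema."""
    if not text:
        raise ValueError("text must be a non-empty string")
    # The schema recommends altering temperature or top_p, but not both.
    # Treating that as a hard check is a choice made for this sketch.
    if temperature != 1.0 and top_p != 1.0:
        raise ValueError("alter temperature or top_p, not both")
    return {
        "text": text,             # hypothetical key: content to convert to speech
        "voice": voice,           # hypothetical key: name of the voice
        "format": output_format,  # hypothetical key: output encoding
        "temperature": temperature,
        "top_p": top_p,
    }


payload = build_request(
    "Hello! This is a demo of Microsoft MAI-Voice-2 text to speech."
)
body = json.dumps(payload)  # serialized form, ready to hand to an HTTP client
```

Per the schema, the response carries the URL of the generated audio file; how that field is named is not specified here, so the sketch stops at serialization.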