MAI-Transcribe 1.5
Model Overview
Set up your API Key
API Schemas
Creating and sending a speech-to-text conversion task to the server
URL of the input audio file. Provide either url or audio: exactly one is required, not both.
https://example.com/audio/sample.mp3
The audio file to transcribe. Provide either url or audio: exactly one is required, not both.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
4096
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
1
Example: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
1
Example: 1
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
4096
Requesting the result of the task from the server using the generation_id
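Retrieving a result by generation_id usually means asking the server repeatedly until the task has finished. The sketch below shows one way to structure that loop. It makes no claim about the real endpoint or response format: `fetch` and `is_done` are hypothetical callables that you supply, and the interval and timeout defaults are arbitrary illustrative values.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch: Callable[[str], Dict[str, Any]],
    generation_id: str,
    is_done: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(generation_id) until is_done(response) is true.

    fetch   -- hypothetical callable that requests the task by its id
               and returns the parsed response
    is_done -- hypothetical predicate that decides whether the response
               is final; the real status field is not specified here
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = fetch(generation_id)
        if is_done(response):
            return response
        # Give up once the deadline has passed instead of polling forever.
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {generation_id} did not finish within {timeout_s} s"
            )
        sleep(interval_s)
```

Passing `fetch` and `is_done` in as arguments keeps the loop independent of the HTTP client and of the response schema, both of which should be taken from the API schema rather than from this sketch.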
Code Example: Processing a Speech Audio File via URL
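The following is a minimal sketch of creating a transcription task from an audio URL, not the provider's official example. Only the `url` field comes from the schema above. The endpoint `CREATE_URL`, the `MODEL` identifier, the `model` field name, and the Bearer authorization header are assumptions made for illustration and must be replaced with the values from the actual API reference.

```python
import json
import urllib.request
from typing import Any, Dict

# Hypothetical placeholders: take the real values from the API reference.
CREATE_URL = "https://api.example.com/stt/create"
MODEL = "your-model-id"


def build_create_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build a POST request that asks the server to transcribe audio_url.

    Only `url` is sent, never `audio`, because exactly one of the two
    may be provided.
    """
    payload = {"model": MODEL, "url": audio_url}  # "model" key is assumed
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would be `submit(build_create_request(api_key, "https://example.com/audio/sample.mp3"))`. The response is expected to carry the generation_id that is then used to request the result of the task.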