vibevoice-1.5b
Designed to produce rich, multi-speaker conversations from text, the model is well-suited for podcasts and other long-form audio content.
Set up your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
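Once you have a key, one common way to keep it out of source code is to read it from an environment variable. The sketch below is only an illustration: the variable name `AIMLAPI_KEY` and the helper `build_auth_headers` are assumptions made for this example, not names defined by the API. The Bearer header format matches the code example in this page.

```python
import os


def build_auth_headers(env=None):
    """Return the Authorization header, reading the key from the environment.

    AIMLAPI_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("AIMLAPI_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set the AIMLAPI_KEY environment variable first.")
    # Same Bearer scheme as the request example on this page.
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the `headers` argument of `requests.post` in place of the hard-coded placeholder.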
Code Example
import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "microsoft/vibevoice-1.5b",
        "script": "Speaker 1: Wow, whats happening, Alice? \nSpeaker 2: Oh, just the usual… a full-blown AI revolution. Nothing to worry about",
        "speakers": [
            { "preset": "Frank [EN]" },
            { "preset": "Alice [EN]" }
        ]
    }
    try:
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        response_data = response.json()
        audio_url = response_data["audio"]["url"]
        file_name = response_data["audio"]["file_name"]
        audio_response = requests.get(audio_url, stream=True)
        audio_response.raise_for_status()
        # Save with the original file extension from the API
        # dist = os.path.join(os.path.dirname(__file__), file_name)  # if you run this code as a .py file
        dist = "audio.wav"  # if you run this code in Jupyter Notebook
        with open(dist, "wb") as write_stream:
            for chunk in audio_response.iter_content(chunk_size=8192):
                if chunk:
                    write_stream.write(chunk)
        print("Audio saved to:", dist)
        print(f"Duration: {response_data['duration']} seconds")
        print(f"Sample rate: {response_data['sample_rate']} Hz")
    except requests.exceptions.RequestException as e:
        print(f"Error making request: {e}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
Listen to the dialogue we generated:
API Schema
POST https://api.aimlapi.com/v1/tts
Authorizations
Body
model · enum · Required
script · string · min: 1 · max: 5000 · Required
The script to convert to speech. Can be formatted with "Speaker X:" prefixes for multi-speaker dialogues.
seed · integer · Optional
If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.
cfg_scale · number · min: 0.1 · max: 2 · Optional · Default: 1.3
The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt.
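The body parameters above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: `build_script` and `build_payload` are hypothetical helpers, the 1 to 5000 limit is assumed to count characters, and the `speakers` field is taken from the code example on this page rather than from the schema listing.

```python
def build_script(turns):
    """Join (speaker_number, text) pairs into 'Speaker N: text' lines."""
    return "\n".join(f"Speaker {number}: {text.strip()}" for number, text in turns)


def build_payload(script, speaker_presets, seed=None, cfg_scale=None):
    """Assemble a request body, enforcing the limits listed in the schema."""
    # Assumption: the min/max limits on script are measured in characters.
    if not 1 <= len(script) <= 5000:
        raise ValueError("script must be 1 to 5000 characters long")
    payload = {
        "model": "microsoft/vibevoice-1.5b",
        "script": script,
        # Field shape copied from the code example on this page.
        "speakers": [{"preset": preset} for preset in speaker_presets],
    }
    if seed is not None:
        payload["seed"] = int(seed)
    if cfg_scale is not None:
        if not 0.1 <= cfg_scale <= 2:
            raise ValueError("cfg_scale must be between 0.1 and 2")
        payload["cfg_scale"] = cfg_scale
    return payload


script = build_script([(1, "Wow, what is happening?"), (2, "Just the usual.")])
payload = build_payload(script, ["Frank [EN]", "Alice [EN]"], seed=42, cfg_scale=1.3)
```

Optional parameters are left out of the body when they are not set, so the server-side default for `cfg_scale` applies. Passing the same `seed` on repeated requests is a best-effort way to get repeatable audio; as noted above, determinism is not guaranteed.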
Responses
201 Success
application/json
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-09-16T15:07:37.094Z",
    "duration": 1,
    "channels": 1,
    "models": [
      "text"
    ],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {
        "name": "text",
        "version": "text",
        "arch": "text"
      }
    }
  }
}
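As an illustration of how the `metadata` object might be consumed, the sketch below pulls out a few fields and parses the `created` timestamp. It assumes the response carries `metadata` in the shape shown above; `summarize_metadata` is a hypothetical helper, and every field is read defensively because the actual response may differ.

```python
from datetime import datetime


def summarize_metadata(response_data):
    """Extract a few fields from the 'metadata' object of a parsed response."""
    meta = response_data.get("metadata", {})
    created_raw = meta.get("created")
    created = None
    if created_raw:
        # Replace the trailing 'Z' so older Python versions can parse the value.
        created = datetime.fromisoformat(created_raw.replace("Z", "+00:00"))
    return {
        "request_id": meta.get("request_id"),
        "created": created,
        "duration": meta.get("duration"),
        "channels": meta.get("channels"),
        "models": meta.get("models", []),
    }


sample = {
    "metadata": {
        "request_id": "text",
        "created": "2025-09-16T15:07:37.094Z",
        "duration": 1,
        "channels": 1,
        "models": ["text"],
    }
}
summary = summarize_metadata(sample)
```

Missing keys come back as `None` (or an empty list for `models`), so the helper does not raise when a field is absent.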