vibevoice-7b

This documentation is valid for the following model: microsoft/vibevoice-7b

Designed to produce rich, multi-speaker conversations from text, the model is well-suited for podcasts and other long-form audio content. This is the 7-billion-parameter version of the model.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

API Schema

POST
Authorizations
Authorization · string · Required

Bearer key

Body
model · enum · Required
Possible values: microsoft/vibevoice-7b

script · string · min: 1 · max: 5000 · Required

The script to convert to speech. Can be formatted with "Speaker X:" prefixes for multi-speaker dialogues.
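As a minimal sketch of the "Speaker X:" convention, the helper below joins dialogue turns into one script and checks it against the schema's length limits. The function name `build_script` is hypothetical, and the sketch assumes that X is a speaker number, that turns are separated by newlines, and that the 1 to 5000 limit counts characters; none of these details are confirmed by this page.

```python
MAX_SCRIPT_CHARS = 5000  # upper bound from the schema (assumed to count characters)


def build_script(turns):
    """Join (speaker_number, text) pairs into a 'Speaker X:' formatted script.

    Hypothetical helper: the newline separator between turns is an assumption.
    """
    lines = [f"Speaker {speaker}: {text.strip()}" for speaker, text in turns]
    script = "\n".join(lines)
    if not 1 <= len(script) <= MAX_SCRIPT_CHARS:
        raise ValueError(
            f"script must be 1 to {MAX_SCRIPT_CHARS} characters, got {len(script)}"
        )
    return script


script = build_script([
    (1, "Welcome back to the show."),
    (2, "Thanks, glad to be here."),
])
```

Validating the length locally avoids a round trip for a request that the schema would reject anyway.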

seed · integer · Optional

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.

cfg_scale · number · min: 0.1 · max: 2 · Optional

The CFG (Classifier-Free Guidance) scale controls how closely the model sticks to your prompt.

Default: 1.3
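The body fields above can be assembled and range-checked before sending. This is a sketch only: `build_payload` is a hypothetical helper, the `model` value is taken from the model name at the top of this page, and the character-based length check on `script` is an assumption.

```python
def build_payload(script, seed=None, cfg_scale=1.3):
    """Assemble a request body using the limits listed in the schema.

    Hypothetical helper; the model identifier is assumed from this page's header.
    """
    if not 1 <= len(script) <= 5000:
        raise ValueError("script must be 1 to 5000 characters long")
    if not 0.1 <= cfg_scale <= 2:
        raise ValueError("cfg_scale must be between 0.1 and 2")

    payload = {
        "model": "microsoft/vibevoice-7b",
        "script": script,
        "cfg_scale": cfg_scale,
    }
    # seed is optional; include it only when the caller wants best-effort determinism
    if seed is not None:
        if isinstance(seed, bool) or not isinstance(seed, int):
            raise TypeError("seed must be an integer")
        payload["seed"] = seed
    return payload


payload = build_payload("Speaker 1: Hello there.", seed=42)
```

Omitting `seed` entirely, rather than sending a null value, keeps the body limited to fields the schema describes.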
Responses
201 · Success

Code Example
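The following sketch shows one way to prepare the POST request with the Python standard library. The endpoint path is not given on this page, so `API_URL` is a placeholder that must be replaced with the real endpoint; `make_request` is a hypothetical helper, and the JSON content type is an assumption.

```python
import json
import urllib.request

# Placeholder only: this page does not state the endpoint URL.
API_URL = "https://api.example.com/REPLACE_WITH_REAL_ENDPOINT"


def make_request(api_key, payload, url=API_URL):
    """Build (but do not send) a POST request with a Bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed body encoding
        },
    )


request = make_request(
    "<YOUR_API_KEY>",
    {
        "model": "microsoft/vibevoice-7b",
        "script": "Speaker 1: Hello.\nSpeaker 2: Hi, how are you?",
        "cfg_scale": 1.3,
    },
)
```

Passing the prepared object to `urllib.request.urlopen` would send it; a successful call is documented as returning status 201, but the shape of the response body is not described here.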

Response
