stable-audio

This documentation applies to the following models:

  • stable-audio

An advanced audio generation model designed to create high-quality audio tracks from textual prompts.

How to Make a Call

Step-by-Step Instructions

Generating audio with this model involves calling two endpoints in sequence:

  • The first creates an audio generation task and sends it to the server (returns a generation ID).

  • The second requests the generated audio from the server using the generation ID returned by the first endpoint.

Below are the API schemas and examples for both endpoint calls.


If you want to learn how to call AI models via API from the very basics, feel free to use our Quickstart guide.

API Schemas

POST /v2/generate/audio

Body

  • model (string, enum, required)

  • prompt (string, required): The prompt to generate audio.

  • seconds_start (integer, min: 1, max: 47, optional): The start point of the audio clip to generate.

  • seconds_total (integer, min: 1, max: 47, optional, default: 30): The duration of the audio clip to generate.

  • steps (integer, min: 1, max: 1000, optional, default: 100): The number of steps to denoise the audio.

Responses

200 Success (application/json)

  • id (string, required): The ID of the generated audio. Example: 60ac7c34-3224-4b14-8e7d-0aa0db708325

  • status (string, enum, required): The current status of the generation task. Example: completed
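
The request body can be assembled and checked against the documented ranges before it is sent. The following is a minimal sketch, not an official client: the helper name build_audio_payload is hypothetical, the model value "stable-audio" is taken from the model list at the top of this page, and rejecting an empty prompt is an assumption (the schema only says the prompt is a required string).

```python
def build_audio_payload(prompt, seconds_start=None, seconds_total=None,
                        steps=None, model="stable-audio"):
    """Build the JSON body for POST /v2/generate/audio (hypothetical helper)."""
    # Assumption: an empty prompt is treated as invalid on the client side.
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")

    payload = {"model": model, "prompt": prompt}

    # (name, value, minimum, maximum) as listed in the body schema.
    optional_fields = [
        ("seconds_start", seconds_start, 1, 47),
        ("seconds_total", seconds_total, 1, 47),
        ("steps", steps, 1, 1000),
    ]
    for name, value, low, high in optional_fields:
        if value is None:
            continue  # omitted optional fields are left for the server to decide
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            raise ValueError(f"{name} must be an integer in [{low}, {high}]")
        payload[name] = value
    return payload


payload = build_audio_payload(
    "lo-fi hip hop beat with vinyl crackle", seconds_total=30, steps=100
)
print(payload)
```

Validating locally only catches out-of-range values early; the server remains the authority on what it accepts.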

Retrieve the generated music sample from the server

After you send a request for music generation, the task is added to the queue. This endpoint lets you check the status of an audio generation task using its id, obtained from the endpoint described above. If the task status is completed, the response includes the final result: the generated audio URL and additional metadata.

GET /v2/generate/audio

Authorizations

  • Authorization (string, required): Bearer key

Query parameters

  • generation_id (string, required). Example: <REPLACE_WITH_YOUR_GENERATION_ID>

Responses

200 Success (application/json)

  • id (string, required): The ID of the generated audio. Example: 60ac7c34-3224-4b14-8e7d-0aa0db708325

  • status (string, enum, required): The current status of the generation task. Example: completed
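
A status request combines the generation_id query parameter with the Bearer authorization header. The sketch below only builds the request object using Python's standard library and does not send it. BASE_URL is a placeholder, since the host is not given on this page, and build_status_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Placeholder host: replace with the base URL of your API provider.
BASE_URL = "https://api.example.com"


def build_status_request(generation_id, api_key, base_url=BASE_URL):
    """Build a GET /v2/generate/audio request for one generation task."""
    query = urllib.parse.urlencode({"generation_id": generation_id})
    url = f"{base_url}/v2/generate/audio?{query}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_status_request("60ac7c34-3224-4b14-8e7d-0aa0db708325", "<YOUR_API_KEY>")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call; the JSON response then carries the id and status fields described in the schema.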

Full Example: Generating and Retrieving the Audio From the Server

The code below creates an audio generation task, then automatically polls the server every 10 seconds until it receives the audio URL.
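
A minimal sketch of this create-then-poll flow follows, written so that the HTTP calls are passed in as functions and the logic can be tried offline. The names generate_and_wait, create_task, and fetch_status are hypothetical, as is the max_attempts limit. Only the "completed" status appears on this page, so the loop treats every other status as "not ready yet"; the "pending" value in the demonstration is made up.

```python
import time


def generate_and_wait(create_task, fetch_status, payload,
                      interval=10, max_attempts=60, sleep=time.sleep):
    """Create a generation task, then poll until its status is 'completed'.

    create_task(payload) -> dict with an "id" key (the POST response).
    fetch_status(generation_id) -> dict with a "status" key (the GET response).
    """
    generation_id = create_task(payload)["id"]
    for attempt in range(max_attempts):
        result = fetch_status(generation_id)
        if result.get("status") == "completed":
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(
        f"generation {generation_id} not completed after {max_attempts} polls"
    )


# Offline demonstration with stand-in callables instead of real HTTP calls.
fake_id = "60ac7c34-3224-4b14-8e7d-0aa0db708325"
fake_statuses = iter(["pending", "completed"])  # "pending" is a made-up status

result = generate_and_wait(
    create_task=lambda payload: {"id": fake_id, "status": "pending"},
    fetch_status=lambda gid: {"id": gid, "status": next(fake_statuses)},
    payload={"model": "stable-audio", "prompt": "lo-fi hip hop beat"},
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
print(result["status"])  # prints "completed"
```

In real use, create_task would POST the payload to /v2/generate/audio and fetch_status would GET the same path with the generation_id query parameter, both with the Bearer authorization header. The attempt limit keeps the loop from running forever if a task never completes.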
