music-01

This documentation is valid for the following list of our models:

  • music-01

An advanced AI model that generates diverse, high-quality audio compositions by analyzing and reproducing the musical patterns, rhythms, and vocal styles of a reference track. You can refine the result with a text prompt.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

API Schemas

Upload a reference sample

This endpoint uploads a reference music piece to the server, analyzes it, and returns identifiers for the voice and/or instrumental patterns to use later.

post
Authorizations
Authorization · string · Required

Bearer key

Body
file · string · binary · Required

Local path of the audio file to upload. WAV and MP3 formats are supported. The audio must be longer than 10 seconds and no longer than 10 minutes.

purpose · string · enum · Required
  1. If purpose is song:
  • You need to upload a music file containing both acapella (vocals) and accompaniment.
  • The acapella must be in singing form; normal speech is not supported.
  • Outputs: voice_id and instrumental_id.
  2. If purpose is voice:
  • You need to upload a file containing only acapella in singing form (normal speech audio is not supported).
  • Output: voice_id.
  3. If purpose is instrumental:
  • You need to upload a file containing only accompaniment.
  • Output: instrumental_id.
Possible values: song, voice, instrumental
Responses
default
application/json
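
The constraints on file and purpose described above can be checked locally before uploading. The sketch below is a hypothetical helper, not part of the API: the function name `check_upload` is an assumption, and the caller is assumed to already know the audio duration in seconds, since measuring it requires an audio library.

```python
from __future__ import annotations

from pathlib import Path

# Values taken from the parameter descriptions on this page.
OUTPUTS_BY_PURPOSE = {
    "song": ("voice_id", "instrumental_id"),
    "voice": ("voice_id",),
    "instrumental": ("instrumental_id",),
}
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}
MIN_SECONDS = 10       # the audio must be longer than this
MAX_SECONDS = 10 * 60  # and no longer than this


def check_upload(path: str, duration_seconds: float, purpose: str) -> tuple[str, ...]:
    """Hypothetical helper: validate the inputs and return the identifiers to expect."""
    if purpose not in OUTPUTS_BY_PURPOSE:
        raise ValueError(f"purpose must be one of {sorted(OUTPUTS_BY_PURPOSE)}")
    if Path(path).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError("only WAV and MP3 files are supported")
    if not (MIN_SECONDS < duration_seconds <= MAX_SECONDS):
        raise ValueError("audio must be longer than 10 s and no longer than 10 minutes")
    return OUTPUTS_BY_PURPOSE[purpose]


if __name__ == "__main__":
    # A 95-second MP3 containing vocals and accompaniment.
    print(check_upload("reference.mp3", 95.0, "song"))
```

Failing early on the client side avoids spending an upload on a file the server would reject.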

Generate music sample

This endpoint generates a new music piece based on the voice and/or instrumental pattern identifiers returned by the upload endpoint above. Generation typically completes in 50-60 seconds but may take longer.

post
Authorizations
Authorization · string · Required

Bearer key

Body
lyrics · string · Required

Lyrics with optional formatting. Use a newline to separate lines of lyrics and two newlines to add a pause between lines. Place double hash marks (##) at the beginning and end of the lyrics to add accompaniment. Maximum 600 characters.

Example: ##Swift and Boundless In the realm of innovation, where visions align, AIML API's the name, making tech shine. Intelligent solutions, breaking the mold, Swift inference power, bold and untold. ##
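
The formatting rules for lyrics can be applied programmatically. The following is a minimal sketch under stated assumptions: `format_lyrics` is a hypothetical helper, the `##` markers are placed directly against the text, and the 600-character limit is assumed to count the whole string including the markers (this page does not say either way).

```python
from __future__ import annotations

MAX_LYRICS_CHARS = 600  # limit stated in the lyrics description


def format_lyrics(stanzas: list[list[str]], accompaniment: bool = True) -> str:
    """Hypothetical helper: build a lyrics string from groups of lines.

    Lines within a group are joined by one newline; groups are joined by
    two newlines, which adds a pause between them.
    """
    body = "\n\n".join("\n".join(lines) for lines in stanzas)
    if accompaniment:
        # Double hash marks at the beginning and end request accompaniment.
        body = f"##{body}##"
    if len(body) > MAX_LYRICS_CHARS:
        raise ValueError(f"lyrics are {len(body)} characters; the maximum is {MAX_LYRICS_CHARS}")
    return body


if __name__ == "__main__":
    print(format_lyrics([
        ["In the realm of innovation, where visions align,",
         "AIML API's the name, making tech shine."],
        ["Intelligent solutions, breaking the mold,",
         "Swift inference power, bold and untold."],
    ]))
```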
model · enum · Required
Possible values: music-01
refer_voice · string · Optional

The voice_id returned by the upload endpoint. At least one of refer_voice or refer_instrumental is required. When only refer_voice is provided, the system still outputs music: an a cappella vocal hum that follows the provided refer_voice and the lyrics, without any instrumental accompaniment.

Example: vocal-2025010100000000-a0AAAaaa
refer_instrumental · string · Optional

The instrumental_id returned by the upload endpoint. At least one of refer_voice or refer_instrumental is required. When only refer_instrumental is provided, the system still outputs music: a purely instrumental track that follows the provided refer_instrumental, without any vocals.

Example: instrumental-2025010100000000-Aaa0aAaA
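
The rule that at least one of refer_voice or refer_instrumental is required can be enforced when assembling the request body. The sketch below uses a hypothetical helper, `build_generate_payload`; the field names come from this page, and the identifiers are the placeholder examples shown above.

```python
from __future__ import annotations


def build_generate_payload(
    lyrics: str,
    refer_voice: str | None = None,
    refer_instrumental: str | None = None,
) -> dict[str, str]:
    """Hypothetical helper: assemble the JSON body for the generate endpoint."""
    if not refer_voice and not refer_instrumental:
        raise ValueError("at least one of refer_voice or refer_instrumental is required")
    payload = {"model": "music-01", "lyrics": lyrics}
    # Optional fields are included only when a value was supplied.
    if refer_voice:
        payload["refer_voice"] = refer_voice
    if refer_instrumental:
        payload["refer_instrumental"] = refer_instrumental
    return payload


if __name__ == "__main__":
    # Vocals only: per the description above, the result has no accompaniment.
    print(build_generate_payload("##la la la##", refer_voice="vocal-2025010100000000-a0AAAaaa"))
```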
Responses
default
application/json

POST /v2/generate/audio/minimax/generate

Quick Code Example

Here is an example of generating an audio file based on a sample and a prompt using the music model music-01.
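
A minimal sketch of the generation request using only the Python standard library. Several details are assumptions rather than facts from this page: the base URL `https://api.aimlapi.com`, the helper names `make_generate_request` and `send`, and the placeholder identifiers. The response schema is not described here, so the sketch returns the raw response body instead of parsing specific fields.

```python
import json
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # assumption: not stated on this page
GENERATE_PATH = "/v2/generate/audio/minimax/generate"  # path from the API schema


def make_generate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the generate endpoint."""
    return urllib.request.Request(
        BASE_URL + GENERATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_seconds: float = 300.0) -> str:
    """Send the request and return the raw response body as text.

    Generation usually takes about a minute, so the timeout is generous.
    """
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")


# Placeholder identifiers in the format shown in the schema examples.
example_payload = {
    "model": "music-01",
    "lyrics": "##Swift and Boundless\nIn the realm of innovation, where visions align##",
    "refer_voice": "vocal-2025010100000000-a0AAAaaa",
    "refer_instrumental": "instrumental-2025010100000000-Aaa0aAaA",
}
```

To run it, call `send(make_generate_request("<YOUR_API_KEY>", example_payload))` with identifiers returned by your own upload.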

Listen to the track we generated:
