music-01

This documentation is valid for the following list of our models:

  • music-01

Model Overview

An advanced AI model that generates diverse, high-quality audio compositions by analyzing and reproducing the musical patterns, rhythms, and vocal styles of a reference track. You can refine the result with a text prompt.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

API Schemas

Upload a reference music sample

This endpoint uploads a reference music piece to the server, analyzes it, and returns identifiers for the voice and/or instrumental patterns to use later.

POST https://api.aimlapi.com/v2/generate/audio/minimax/upload
Authorization: Bearer <YOUR_AIMLAPI_KEY>

Body

file (string, binary, required)

The audio file to upload, in WAV or MP3 format. The audio must be longer than 10 seconds and no longer than 10 minutes.

purpose (string, enum, required)

Possible values: song, voice, instrumental.

  1. If purpose is song:
  • Upload a music file containing both acapella (vocals) and accompaniment.
  • The acapella must be sung; normal speech is not supported.
  • Outputs: voice_id and instrumental_id.
  2. If purpose is voice:
  • Upload a file containing only sung acapella (normal speech audio is not supported).
  • Output: voice_id.
  3. If purpose is instrumental:
  • Upload a file containing only accompaniment.
  • Output: instrumental_id.
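
The constraints above can be checked locally before uploading. The following is a minimal sketch, not part of the API: `check_upload_inputs` and `wav_duration_seconds` are hypothetical helper names, and only WAV duration is measured because the Python standard library cannot decode MP3.

```python
import wave
from pathlib import Path

# ID names each purpose is documented to return.
EXPECTED_IDS = {
    "song": ("voice_id", "instrumental_id"),
    "voice": ("voice_id",),
    "instrumental": ("instrumental_id",),
}
ALLOWED_SUFFIXES = {".wav", ".mp3"}
MIN_SECONDS = 10       # duration must be longer than this
MAX_SECONDS = 10 * 60  # and no longer than this


def wav_duration_seconds(path):
    """Return the duration of a WAV file using only the standard library."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_upload_inputs(path, purpose):
    """Hypothetical pre-flight check; returns the ID names to expect."""
    path = Path(path)
    if purpose not in EXPECTED_IDS:
        raise ValueError(f"purpose must be one of {sorted(EXPECTED_IDS)}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("only WAV and MP3 files are supported")
    # MP3 duration needs a third-party decoder, so only WAV is measured here.
    if path.suffix.lower() == ".wav":
        seconds = wav_duration_seconds(path)
        if not (MIN_SECONDS < seconds <= MAX_SECONDS):
            raise ValueError(f"duration {seconds:.1f}s is outside the allowed range")
    return EXPECTED_IDS[purpose]
```

The returned tuple tells the caller which identifiers to look for in the upload response. The server remains the authority on whether a file is accepted.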
Response (application/json)
{
  "voice_id": "vocal-2025011003141025-d5ZEMxmp",
  "instrumental_id": "instrumental-2025011003141125-Akz9eWnD",
  "base_resp": {
    "status_code": 1,
    "status_msg": "text"
  }
}
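
A small helper can pull the identifiers out of this response. This is a sketch under one assumption: `base_resp.status_code` equal to 0 indicates success, as in the sample output of the Quick Code Example; the meaning of other codes is not documented here. `extract_reference_ids` is a hypothetical name.

```python
def extract_reference_ids(upload_json):
    """Return (voice_id, instrumental_id) from an upload response dict.

    Assumption: base_resp.status_code == 0 means success; any other
    value is treated as a failure.
    """
    base = upload_json.get("base_resp", {})
    if base.get("status_code") != 0:
        raise RuntimeError(f"upload failed: {base.get('status_msg')!r}")
    voice_id = upload_json.get("voice_id")
    instrumental_id = upload_json.get("instrumental_id")
    if voice_id is None and instrumental_id is None:
        raise RuntimeError("response contains neither voice_id nor instrumental_id")
    return voice_id, instrumental_id
```

Either element of the returned pair may be None, depending on the purpose used for the upload.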

Generate music sample

This endpoint generates a new music piece from the voice and/or instrumental identifiers returned by the upload endpoint above. Generation typically takes 50-60 seconds but can take longer.

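POST https://api.aimlapi.com/v2/generate/audio/minimax/generate
Authorization: Bearer <YOUR_AIMLAPI_KEY>

Body (optional)

The Quick Code Example on this page sends refer_voice, refer_instrumental, lyrics, and model.

Response (application/json)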
{
  "data": {
    "status": 1,
    "audio": "text"
  },
  "extra_info": {
    "audio_length": 1,
    "audio_size": 1,
    "audio_bitrate": 1,
    "audio_sample_rate": 1
  },
  "trace_id": "text",
  "base_resp": {
    "status_code": 1,
    "status_msg": "text"
  }
}
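
The Quick Code Example decodes `data.audio` with `bytes.fromhex`, which implies the audio is returned as a hex string. The sketch below isolates that step; `decode_generated_audio` is a hypothetical helper name, and the units of the `extra_info` fields are not stated on this page, so they are returned untouched.

```python
def decode_generated_audio(response_json):
    """Return (audio_bytes, extra_info) from a generation response dict."""
    audio_hex = response_json["data"]["audio"]
    # bytes.fromhex raises ValueError if the string is not valid hex.
    audio_bytes = bytes.fromhex(audio_hex)
    return audio_bytes, response_json.get("extra_info", {})


# Usage with a tiny stand-in payload (not real API output):
audio, info = decode_generated_audio(
    {"data": {"audio": "494433"}, "extra_info": {"audio_sample_rate": 44100}}
)
```

The decoded bytes can then be written to a file opened in binary mode.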

Quick Code Example

Here is an example of generating an audio file from a reference sample and a text prompt using the music model music-01.

import requests

# Insert your AI/ML API key here:
aimlapi_key = "<YOUR_AIMLAPI_KEY>"

# Input data
audio_url = "https://tand-dev.github.io/audio-hosting/spinning-head-271171.mp3"
file_name = "spinning-head-271171.mp3"
purpose = "song"  # Possible values: 'song', 'voice', 'instrumental'


def upload_reference_file():
    """Download file from URL and upload it to AIML API"""

    url = "https://api.aimlapi.com/v2/generate/audio/minimax/upload"

    try:
        # Step 1: Download the file (the timeout keeps the call from hanging indefinitely)
        response = requests.get(audio_url, timeout=60)
        response.raise_for_status()

        # Step 2: Upload to AIML API
        payload = {"purpose": purpose}
        files = {"file": (file_name, response.content, "audio/mpeg")}
        headers = {"Authorization": f"Bearer {aimlapi_key}"}

        upload_response = requests.post(
            url, headers=headers, files=files, data=payload, timeout=120
        )
        upload_response.raise_for_status()

        data = upload_response.json()
        print("Upload successful:", data)
        return data  # return JSON with file ids

    except requests.exceptions.RequestException as error:
        print(f"Error during upload: {error}")
        return None


def generate_audio(voice_id=None, instrumental_id=None):
    """Send audio generation request and save result"""

    url = "https://api.aimlapi.com/v2/generate/audio/minimax/generate"
    lyrics = (
        "##Side by side, through thick and thin, \n\n"
        "With a laugh, we always win. \n\n"
        "Storms may come, but we stay true, \n\n"
        "Friends forever—me and you!##"
    )

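    # Send only the reference IDs that the upload step actually returned
    references = {"refer_voice": voice_id, "refer_instrumental": instrumental_id}
    payload = {
        **{key: value for key, value in references.items() if value is not None},
        "lyrics": lyrics,
        "model": "music-01",
    }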
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {aimlapi_key}",
    }

    # Generation can take a minute or more, so allow a generous timeout
    response = requests.post(url, headers=headers, json=payload, timeout=300)
    response.raise_for_status()

    audio_hex = response.json()["data"]["audio"]
    decoded_hex = bytes.fromhex(audio_hex)

    out_name = "generated_audio.mp3"
    with open(out_name, "wb") as f:
        f.write(decoded_hex)

    print(f"Generated audio saved as {out_name}")


def main():
    uploaded = upload_reference_file()
    if not uploaded:
        return

    # Extract IDs depending on purpose
    voice_id = uploaded.get("voice_id")
    instrumental_id = uploaded.get("instrumental_id")

    generate_audio(voice_id, instrumental_id)


if __name__ == "__main__":
    main()
Response
Upload successful: {'voice_id': 'vocal-2025082518145625-6XW9wCOF', 'instrumental_id': 'instrumental-2025082518145625-vCCEiiES', 'trace_id': '04fb6a8721abeee5b66edd452b4d0f33', 'base_resp': {'status_code': 0, 'status_msg': 'success'}}
Generated audio saved as generated_audio.mp3
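
The lyrics string in the example wraps the text in `##` and separates lines with a blank line. This page does not explain that format, so the helper below simply mirrors the example's shape (without its trailing spaces); `format_lyrics` is a hypothetical name, and the exact rules the API applies are an assumption.

```python
def format_lyrics(lines):
    """Join lyric lines as in the example: '##', lines split by blank lines, '##'."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty lyric line is required")
    return "##" + "\n\n".join(cleaned) + "##"


lyrics = format_lyrics([
    "Side by side, through thick and thin,",
    "With a laugh, we always win.",
])
```

The result can be passed as the `lyrics` field of the generation request.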
