minimax-music [legacy]


This documentation is valid for the following list of our models:

  • minimax-music

Model Overview

An advanced AI model that generates diverse, high-quality audio compositions by analyzing and reproducing the musical patterns, rhythms, and vocal styles of a reference track. You can refine the process with a text prompt.

How to Make a Call

1. Setup You Can’t Skip

  • Create an account: Visit the AI/ML API website and create an account (if you don’t have one yet).
  • Generate an API key: After logging in, navigate to your account dashboard and generate your API key. Make sure the key is enabled in the UI.
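
Hard-coding the key is fine for a quick test, but a common alternative is to keep it in an environment variable. A minimal sketch (the variable name AIMLAPI_KEY here is our own choice, not something the API requires):

import os

# Read the key from an environment variable instead of hard-coding it
# (the variable name AIMLAPI_KEY is just an example):
aimlapi_key = os.environ.get("AIMLAPI_KEY")
if not aimlapi_key:
    raise RuntimeError("Set the AIMLAPI_KEY environment variable first.")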

2. Copy the code example

At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

Generating an audio sample using this model involves sequentially calling two endpoints:

  • The first one is for creating and sending a music generation task to the server (returns a generation ID).

  • The second one is for requesting the generated audio sample from the server using the generation ID received from the first endpoint.

The code example combines both endpoint calls.
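
In outline, the whole flow is just these two HTTP calls (a condensed sketch of the full, runnable example at the bottom of this page; the placeholder values need to be filled in):

import requests

API_URL = "https://api.aimlapi.com/v2/generate/audio"
headers = {"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"}

# Call 1: submit the generation task; the response carries a generation ID.
task = requests.post(
    API_URL,
    json={"model": "minimax-music", "prompt": "...", "reference_audio_url": "..."},
    headers=headers,
).json()

# Call 2: request the result using that ID (in practice, poll until the
# status is no longer 'queued', 'waiting', or 'generating').
result = requests.get(API_URL, params={"generation_id": task["id"]}, headers=headers).json()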

3. Modify the code example

Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. Provide your lyrics via the prompt parameter; the model will use them to generate a song.

Keep in mind that the maximum length of the generated audio is 1 minute. If you provide a prompt that’s too long (the model tries to use it in full as song lyrics), the track might exceed the time limit and result in a "Downstream service error."

Via the reference_audio_url parameter, provide a URL to a reference track; from it, the model will extract the genre, style, tempo, vocal and instrument timbres, and the overall mood of the piece.
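
Putting both parameters together, the request body takes this shape (illustrative values; the reference URL is the royalty-free sample used in the example below):

payload = {
    "model": "minimax-music",
    # Lyrics for the model to sing:
    "prompt": "##Side by side, through thick and thin, \n\nWith a laugh, we always win.##",
    # Reference track: a .wav or .mp3 file with music and vocals, longer than 15 seconds:
    "reference_audio_url": "https://tand-dev.github.io/audio-hosting/spinning-head-271171.mp3",
}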

4. Run your modified code

Run your modified code in your development environment. Response time depends on various factors, but it rarely exceeds 1 minute.

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

API Schemas

Generate a music sample

This endpoint creates and sends a music generation task to the server, and returns a generation ID and the task status.

POST /v2/generate/audio

Headers:

  • Authorization (required): Bearer <YOUR_AIMLAPI_KEY>
  • Content-Type: application/json

Body parameters:

  • model (string, required): Possible value: minimax-music.
  • prompt (string, required): The lyrics for the song to generate.
  • reference_audio_url (string · uri, required): Reference song; should contain music and vocals. Must be a .wav or .mp3 file longer than 15 seconds.

Example request:

POST /v2/generate/audio HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer <YOUR_AIMLAPI_KEY>
Content-Type: application/json
Accept: */*

{
  "model": "minimax-music",
  "prompt": "text",
  "reference_audio_url": "https://example.com"
}

Example response (application/json):

{
  "id": "text",
  "status": "queued"
}

Retrieve the generated music sample from the server

After a music generation request is sent, the task is added to the queue. Depending on the service's load, generation is usually completed in 50–60 seconds, but it can take somewhat longer.

GET /v2/generate/audio

Headers:

  • Authorization (required): Bearer <YOUR_AIMLAPI_KEY>

Query parameters:

  • generation_id (string, required): The ID returned by the generation endpoint.

Example request:

GET /v2/generate/audio HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer <YOUR_AIMLAPI_KEY>
Accept: */*

Example response (application/json):

{
  "audio_file": {
    "url": "https://example.com"
  },
  "id": "text",
  "status": "queued",
  "error": null
}

Quick Code Example

Here is an example of generating an audio file based on a reference sample and a prompt, using the music model minimax-music.

Full example explanation

As an example, we will generate a song using the popular minimax-music model from the Chinese company MiniMax. As you can see in its API Schemas above, this model accepts an audio sample as input (extracting information about its vocals and instruments for use in the generation process), along with a text prompt where we can provide lyrics for our song.

We used a publicly available sample from a royalty-free sample database and generated some lyrics with ChatGPT:

Side by side, through thick and thin, With a laugh, we always win. Storms may come, but we stay true, Friends forever—me and you!

To turn this into a model-friendly prompt (as a single string), we added hash symbols and line breaks.

'''
##Side by side, through thick and thin, \n\nWith a laugh, we always win. \n\n Storms may come, but we stay true, \n\nFriends forever—me and you!##
'''
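
If you prefer to build that string programmatically, something like the following produces the same shape (the ## markers and blank-line separators simply mirror the format above):

# Assemble the lyric lines into a single model-friendly prompt string:
lyrics = [
    "Side by side, through thick and thin,",
    "With a laugh, we always win.",
    "Storms may come, but we stay true,",
    "Friends forever—me and you!",
]
prompt = "##" + "\n\n".join(lyrics) + "##"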

A notable feature of our audio and video models is that uploading the prompt or sample, generating the content, and retrieving the final file from the server are handled through separate API calls. (AIML API tokens are only consumed during the first step—i.e., the actual content generation.)

We’ve written a complete code example that sequentially calls both endpoints — you can view and copy it below. Don’t forget to replace <YOUR_AIMLAPI_KEY> with your actual AIML API Key from your account!

The structure of the code is simple: there are two separate functions for calling each endpoint, and a main function that orchestrates everything.

Execution starts automatically from main(). It first runs the function that creates and sends a music generation task to the server — this is where you pass your prompt describing the desired musical fragment. This function returns a generation ID and the initial task status:

Generation: {'id': '906aec79-b0af-40c4-adae-15e6c4410e29:minimax-music', 'status': 'queued'}

This indicates that the file upload and our generation task have been queued on the server (queuing took 7 seconds in our case).

Next, main() launches the second function — the one that checks the task status and, once ready, retrieves the download URL from the server. This second function is called in a loop every 10 seconds.

During execution, you’ll see messages in the output:

  • If the file is not yet ready:

Still waiting... Checking again in 10 seconds.
  • Once the file is ready, a completion message appears with the download info. In our case, after five reruns of the second code block (waiting a total of about 50-60 seconds), we saw the following output:

Generation: {'id': '906aec79-b0af-40c4-adae-15e6c4410e29:minimax-music', 'status': 'completed', 'audio_file': {'url': 'https://cdn.aimlapi.com/squirrel/files/koala/Oa2XHFE1hEsUn1qbcAL2s_output.mp3', 'content_type': 'audio/mpeg', 'file_name': 'output.mp3', 'file_size': 1014804}}

As you can see, the 'status' is now 'completed', and further along in the output we have a URL where the generated audio file can be downloaded.
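
Once you have that URL, saving the track locally takes only a few lines; a minimal sketch using requests (the output filename is our own choice):

import requests

# URL taken from the 'audio_file' field of the final response above:
audio_url = "https://cdn.aimlapi.com/squirrel/files/koala/Oa2XHFE1hEsUn1qbcAL2s_output.mp3"

# Download the generated track and write it to disk:
audio = requests.get(audio_url, timeout=60)
audio.raise_for_status()
with open("output.mp3", "wb") as f:
    f.write(audio.content)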


Listen to the track we generated below the code and response blocks.

Python

import time
import requests

# Insert your AI/ML API key instead of <YOUR_AIMLAPI_KEY>:
aimlapi_key = '<YOUR_AIMLAPI_KEY>'

# Creating and sending an audio generation task to the server (returns a generation ID):
def generate_audio():
    url = "https://api.aimlapi.com/v2/generate/audio"
    payload = {
        "model": "minimax-music",
        "reference_audio_url": 'https://tand-dev.github.io/audio-hosting/spinning-head-271171.mp3',
        "prompt": '''
##Side by side, through thick and thin, \n\nWith a laugh, we always win. \n\n Storms may come, but we stay true, \n\nFriends forever—me and you!##
''',
    }
    headers = {"Authorization": f"Bearer {aimlapi_key}", "Content-Type": "application/json"}

    response = requests.post(url, json=payload, headers=headers)

    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
        return None

    response_data = response.json()
    print("Generation:", response_data)
    return response_data


# Requesting the result of the generation task from the server using the generation ID:
def retrieve_audio(gen_id):
    url = "https://api.aimlapi.com/v2/generate/audio"
    params = {
        "generation_id": gen_id,
    }
    headers = {"Authorization": f"Bearer {aimlapi_key}", "Content-Type": "application/json"}

    response = requests.get(url, params=params, headers=headers)
    return response.json()


# This is the main function of the program. From here, we sequentially call the audio
# generation and then repeatedly request the result from the server every 10 seconds:
def main():
    generation_response = generate_audio()
    if generation_response is None:
        return None

    gen_id = generation_response.get("id")
    if gen_id:
        start_time = time.time()

        timeout = 600  # stop polling after 10 minutes
        while time.time() - start_time < timeout:
            response_data = retrieve_audio(gen_id)

            if response_data is None:
                print("Error: No response from API")
                break

            status = response_data.get("status")

            if status in ("generating", "queued", "waiting"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Generation:", response_data)
                return response_data

        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
JavaScript

// Insert your AI/ML API key instead of <YOUR_AIMLAPI_KEY>:
const AIMLAPI_KEY = '<YOUR_AIMLAPI_KEY>';

async function generateAudio() {
  const url = 'https://api.aimlapi.com/v2/generate/audio';
  const payload = {
    model: 'minimax-music',
    reference_audio_url: 'https://tand-dev.github.io/audio-hosting/spinning-head-271171.mp3',
    prompt: `##Side by side, through thick and thin,

With a laugh, we always win.

Storms may come, but we stay true,

Friends forever—me and you!##`
  };

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${AIMLAPI_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });

  if (!response.ok) {
    console.error(`Error: ${response.status} - ${await response.text()}`);
    return null;
  }

  const data = await response.json();
  console.log('Generation:', data);
  return data;
}

async function retrieveAudio(generationId) {
  const url = `https://api.aimlapi.com/v2/generate/audio?generation_id=${generationId}`;

  const response = await fetch(url, {
    method: 'GET',
    headers: {
      'Authorization': `Bearer ${AIMLAPI_KEY}`,
      'Content-Type': 'application/json'
    }
  });

  if (!response.ok) {
    console.error(`Error: ${response.status} - ${await response.text()}`);
    return null;
  }

  return await response.json();
}

async function main() {
  const generationResponse = await generateAudio();

  if (!generationResponse || !generationResponse.id) {
    console.error('No generation ID received.');
    return;
  }

  const genId = generationResponse.id;
  const timeout = 600000; // 10 minutes
  const interval = 10000; // 10 seconds
  const start = Date.now();

  const intervalId = setInterval(async () => {
    if (Date.now() - start > timeout) {
      console.log('Timeout reached. Stopping.');
      clearInterval(intervalId);
      return;
    }

    const result = await retrieveAudio(genId);

    if (!result) {
      console.error('No response from API.');
      clearInterval(intervalId);
      return;
    }

    const status = result.status;
    if (['generating', 'queued', 'waiting'].includes(status)) {
      console.log('Still waiting... Checking again in 10 seconds.');
    } else {
      console.log('Generation complete:\n', result);
      clearInterval(intervalId);
    }
  }, interval);
}

main();
Response
Generation: {'id': '906aec79-b0af-40c4-adae-15e6c4410e29:minimax-music', 'status': 'queued'}
Still waiting... Checking again in 10 seconds.
Still waiting... Checking again in 10 seconds.
Still waiting... Checking again in 10 seconds.
Still waiting... Checking again in 10 seconds.
Still waiting... Checking again in 10 seconds.
Generation: {'id': '906aec79-b0af-40c4-adae-15e6c4410e29:minimax-music', 'status': 'completed', 'audio_file': {'url': 'https://cdn.aimlapi.com/squirrel/files/koala/Oa2XHFE1hEsUn1qbcAL2s_output.mp3', 'content_type': 'audio/mpeg', 'file_name': 'output.mp3', 'file_size': 1014804}}

Listen to the track we generated: https://cdn.aimlapi.com/squirrel/files/koala/Oa2XHFE1hEsUn1qbcAL2s_output.mp3
