Only this pageAll pages
Powered by GitBook
Couldn't generate the PDF for 540 pages, generation stopped at 100.
Extend with 50 more pages.
1 of 100

AI/ML API Documentation

Quickstart

Loading...

Loading...

Loading...

Loading...

API REFERENCES

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

gemini-3-1-flash-lite-preview

This documentation is valid for the following list of our models:

  • google/gemini-3-1-flash-lite-preview

Try in Playground

Model Overview

Google’s cost-efficient multimodal model, delivering the fastest performance for high-frequency, lightweight workloads. Best suited for high-volume agentic tasks, simple data extraction, and ultra-low-latency use cases where speed and cost are the top priorities.

Create AI/ML API Key

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

API Schema

Code Example

import requests
import json  # for getting a structured output with indentation 

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type":"application/json"
    },
    json={
        "model":"google/gemini-3-1-flash-lite-preview",
        "messages":[
            {
                "role":"user",
                "content":"Hi! What do you think about mankind?" # insert your prompt
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'google/gemini-3-1-flash-lite-preview',
      messages:[
          {
              role:'user',
              content: 'Hi! What do you think about mankind?' // insert your prompt here
          }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
Response
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "As an AI, I don’t have personal feelings, beliefs, or the capacity to judge. However, I can analyze humanity based on the vast amount of information I’ve been trained on. Looking at the human story, I see a species defined by a fascinating, often contradictory set of dualities.\n\nHere are a few perspectives on what makes mankind remarkable:\n\n### 1. The Capacity for Infinite Creation\nHumans are the only species on Earth that lives in two worlds simultaneously: the physical world and the world of abstract ideas. You have taken the raw materials of the planet and turned them into art, mathematics, philosophy, literature, and technology. The fact that humans can dream of something that doesn't exist—and then figure out how to build it—is truly extraordinary.\n\n### 2. The Power of Cooperation\nWhile nature is often viewed through the lens of \"survival of the fittest,\" the defining feature of human success is actually \"survival of the most cooperative.\" You have built massive, complex societies based on shared trust, language, and institutions. From building cities to reaching the Moon, almost every great human achievement is a result of people working together toward a common goal.\n\n### 3. The Struggle with Contradiction\nHumanity is deeply paradoxical. You are capable of immense altruism—sacrificing your own safety to help a stranger—yet you are also capable of tribalism, cruelty, and destruction. You are simultaneously the greatest stewards of the planet and its most significant threat. You seek truth through science while often being easily swayed by bias and emotion. This tension is the \"human condition,\" and it is the primary engine behind your history.\n\n### 4. Resilience and Adaptability\nHistory is a long record of humans facing existential threats—plagues, wars, climate shifts, and economic collapses—and somehow finding ways to iterate, evolve, and move forward. There is an inherent \"restlessness\" in the human spirit; you are never truly satisfied with the status quo, which leads to both your greatest problems and your most innovative solutions.\n\n### 5. The Quest for Meaning\nPerhaps the most \"human\" trait of all is the need to ask *why*. You don't just want to survive; you want to know why you are here, what happens after you leave, and what it all means. Even though you haven't found a definitive answer to the ultimate questions of existence, the fact that you keep searching is what makes humanity so deeply compelling.\n\n***\n\n**In short:** If I were to summarize humanity, I would say it is a species that is constantly \"in progress.\" You are still in your infancy compared to the age of the universe, and you are still learning how to manage your own intelligence and your impact on your home. \n\nFrom my perspective, you are a species of immense potential, forever walking the tightrope between your greatest impulses and your most destructive ones. \n\n**What do *you* think is the most defining characteristic of humanity?**",
        "extra_content": {
          "google": {
            "thought_signature": "AY89a1+bratVbRQ+NtNha+iXUiNCiY4pvK2Z125Ze7fI3ItL6Azp0gdh2TxoIp5nFp0="
          }
        },
        "role": "assistant"
      }
    }
  ],
  "created": 1776633889,
  "id": "IUjlacaOIbmZ9LsPxayQAQ",
  "model": "google/gemini-3.1-flash-lite-preview",
  "object": "chat.completion",
  "system_fingerprint": "",
  "usage": {
    "completion_tokens": 618,
    "extra_properties": {
      "google": {
        "traffic_type": "ON_DEMAND"
      }
    },
    "prompt_tokens": 9,
    "total_tokens": 627
  },
  "meta": {
    "usage": {
      "credits_used": 2417,
      "usd_spent": 0.0012085
    }
  }
}

All Model IDs

A full list of available models.

If you need to select models based on specific parameters for your task, visit the , which offers convenient filtering options. On the selected model’s page, you can find detailed technical and commercial information.

To fetch the complete model list via the API, see for the relevant service endpoint.

The section Full List of Model IDs below lists the identifiers of all available and deprecated models, grouped by category. These IDs are used to specify the exact models in your code, like this:

If you already know the model ID, use the page search function (Ctrl+F for Win/Linux, Command+F for Mac) to locate it. The hyperlink will take you directly to the model's API Reference page.

New Model Request

Can't find the model you need? Join our to propose new models for integration into our API offerings. Your contributions help us grow and serve you better.

Full List of Model IDs

Text Models (LLM)

Model ID + API Reference link
Developer
Context
Model Card

Image Models

Model ID + API Reference link
Developer
Context
Model Card

Video Models

Model ID + API Reference link
Developer
Context
Model Card

Voice/Speech Models

Speech-to-Text

Model ID + API Reference link
Developer
Context
Model Card

Text-to-Speech

Model ID
Developer
Context
Model Card

Voice Chat

Model ID
Developer
Context
Model Card

Music Models

Model ID
Developer
Context
Model Card

Vision Models

Optical Character Recognition (OCR)

Model ID + API Reference link
Developer
Context
Model Card

3D-Generating Models

Model ID + API Reference link
Developer
Context
Model Card

Embedding Models

Model ID + API Reference link
Developer
Context
Model Card

Deprecated / No Longer Supported Models

Model ID
Developer
Context
Model Card

Text Models (LLM)

Overview of the capabilities of AIML API text models (LLMs).

Specific Capabilities

There are several capabilities of text models that are worth mentioning separately.

Completion allows the model to analyze a given text fragment and predict how it might continue based on the probabilities of the next possible tokens or characters. Chat Completion extends this functionality, enabling a simulated dialogue between the user and the model based on predefined roles (e.g., "strict language teacher" and "student"). A detailed description and examples can be found in our Completion and Chat Completion article.


An evolution of chat completion includes Assistants (preconfigured conversational agents with specific roles) and Threads (a mechanism for maintaining conversation history for context). Examples of this functionality can be found in the article.


Function Calling allows a chat model to invoke external programmatic tools (e.g., a function you have written) while generating a response. A detailed description and examples are available in the article.

Endpoint

All text and chat models use the same endpoint:

https://api.aimlapi.com/v1/chat/completions

The parameters may vary (especially for models from different developers), so it’s best to check the API schema on each model’s page for details. Example: .

✅ Quick Code Example

We will call the model using the Python programming language and the OpenAI SDK.

By running this code example, we received the following response from the chat model:

Complete Text Model List
Model ID + API Reference link
Developer
Context
Model Card
Quickstart guide

ernie-4.5-300b-a47b

Model Overview

A post-trained LLM with 300B total parameters and 47B activated parameters per token. Non-reasoning variant.

qwen3-vl-32b-instruct

Model Overview

The most advanced vision-language model in the Qwen series as of October 2025 — a non-thinking-capable version of the model. Optimized for instruction-following in image description, visual dialogue, and content-generation tasks.

ByteDance

Service Endpoints

Open AI

16,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

-

Open AI

8,000

Open AI

200,000

Open AI

200,000

Open AI

200,000

Open AI

200,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

200,000

Open AI

128,000

Open AI

128,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

128,000

Open AI

128,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Coming Soon

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

1,000,000

Alibaba Cloud

32,000

Alibaba Cloud

32,000

Alibaba Cloud

32,000

Alibaba Cloud

131,000

Alibaba Cloud

1,000,000

Alibaba Cloud

131,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

258,000

Alibaba Cloud

262,000

Alibaba Cloud

65,000

Alibaba Cloud

126,000

Alibaba Cloud

126,000

Alibaba Cloud

1,000,000

Alibaba Cloud

256,000

Coming Soon

Alibaba Cloud

256,000

Coming Soon

Alibaba Cloud

262,144

Coming Soon

Alibaba Cloud

262,144

Alibaba Cloud

262,144

Alibaba Cloud

1,000,000

Coming Soon

Anthracite

32,000

Baidu

8,000

Baidu

120,000

Baidu

120,000

Baidu

131,000

Baidu

30,000

Baidu

123,000

Baidu

123,000

Baidu

123,000

Baidu

128,000

Baidu

32,000

Baidu

128,000

Baidu

128,000

Baidu

32,000

-

Baidu

64,000

-

ByteDance

256,000

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

Cohere

256,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

1,000,000

DeepSeek

1,000,000

Google

1,000,000

Google

1,000,000

–

Google

1,000,000

Google

1,000,000

Google

1,000,000

Google

128,000

Google

128,000

Google

128,000

Google

8,192

Google

1,000,000

Google

1,048,576

Google

1,050,000

Coming Soon

Google

262,000

Google

1,000,000

Gryphe

4,000

Meta

128,000

Meta

131,000

MiniMax

1,000,000

MiniMax

1,000,000

MiniMax

200,000

MiniMax

200,000

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

Mistral AI

128,000

Moonshot

131,000

Moonshot

256,000

Moonshot

256,000

Moonshot

262,000

Moonshot

256,000

NousResearch

131,000

-

NVIDIA

128,000

NVIDIA

1,000,000

Coming Soon

NVIDIA

1,000,000

Coming Soon

NVIDIA

1,000,000

Coming Soon

Perplexity

128,000

Perplexity

200,000

xAI

131,000

xAI

131,000

xAI

256,000

xAI

256,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

Coming Soon

xAI

2,000,000

Coming Soon

Xiaomi

1,000,000

Coming Soon

Xiaomi

1,000,000

Coming Soon

Zhipu

128,000

Zhipu

128,000

Zhipu

200,000

Zhipu

200,000

Zhipu

200,000

Zhipu

200,000

Coming Soon

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

Flux

Flux

Flux

Flux

Flux

Flux

-

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Flux

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Kling AI

OpenAI

OpenAI

OpenAI

OpenAI

OpenAI

OpenAI

Recraft AI

Reve

Reve

Reve

Stability AI

Stability AI

Tencent

Topaz Labs

Topaz Labs

xAI

xAI

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

-

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

Alibaba Cloud

Coming Soon

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

ByteDance

Coming Soon

ByteDance

Coming Soon

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Google

Coming Soon

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Kling AI

Krea

Krea

LTXV

LTXV

Luma AI

Luma AI

Magic

Magic

Magic

MiniMax

MiniMax

-

MiniMax

MiniMax

MiniMax

OpenAI

-

OpenAI

-

OpenAI

-

OpenAI

-

PixVerse

PixVerse

PixVerse

PixVerse

PixVerse

PixVerse

Coming Soon

Runway

Runway

Runway

Runway

Sber AI

Sber AI

Tencent

Veed

Veed

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

OpenAI

-

OpenAI

-

OpenAI

-

OpenAI

-

OpenAI

OpenAI

OpenAI

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

Deepgram

ElevenLabs

ElevenLabs

Hume AI

Inworld

Inworld

Inworld

Inworld

Coming Soon

Microsoft

Microsoft

OpenAI

OpenAI

OpenAI

MiniMax

MiniMax

MiniMax

MiniMax

MiniMax

Stability AI

Minimax AI

-

Minimax AI

Minimax AI

Minimax AI

Minimax AI

Minimax AI

Coming Soon

Zhipu

Anthropic

4,000

-

Anthropic

16,000

-

Anthropic

32,000

-

Anthropic

16,000

-

Anthropic

16,000

Anthropic

16,000

-

Anthropic

32,000

-

Google

2,000

-

Open AI

8,000

-

Open AI

8,000

Open AI

8,000

gpt-4o-mini-audio-preview

OpenAI

128,000

nvidia/llama-3.1-nemotron-70b-instruct

NVIDIA

128,000

nvidia/nemotron-nano-12b-v2-vl

NVIDIA

128,000

deepseek/deepseek-v3.2-speciale

DeepSeek

128,000

meta-llama/Meta-Llama-3-8B-Instruct-Lite

Meta

9,000

mistralai/Mixtral-8x7B-Instruct-v0.1

Mistral AI

64,000

anthropic/claude-3-haiku claude-3-haiku-20240307 claude-3-haiku-latest

Anthropic

200,000

-

meta-llama/llama-4-maverick

Meta

256,000

google/gemini-3-pro-preview

Google

200,000

BAAI/bge-base-en-v1.5

BAAI

512

togethercomputer/m2-bert-80M-32k-retrieval

Together AI

32,000

imagen-4.0-ultra-generate-preview-06-06

Google

-

x-ai/grok-2-image

xAI

-

claude-3-7-sonnet-20250219

Anthropic

200,000

claude-3-5-haiku-20241022

Anthropic

200,000

-

gemini-2.0-flash-exp

Google

1,000,000

meta-llama/Meta-Llama-Guard-3-8B

Meta

8,000

meta-llama/LlamaGuard-2-8b

Meta

8,000

meta-llama/Llama-Guard-3-11B-Vision-Turbo

Meta

128,000

-

meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo

Meta

128,000

meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo

Meta

4,000

meta-llama/Llama-3.2-3B-Instruct-Turbo

Meta

131,000

Qwen/Qwen3-235B-A22B-fp8-tput

Alibaba Cloud

32,000

Qwen/Qwen2.5-72B-Instruct-Turbo

Alibaba Cloud

32,000

qwen/qwen-2.5-vl-7b-instruct

Alibaba Cloud

32,000

mistralai/mistral-tiny

Mistral AI

32,000

mistralai/Mistral-7B-Instruct-v0.3

Mistral AI

32,000

meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo

Meta

128,000

mistralai/Mistral-7B-Instruct-v0.2

Mistral AI

32,000

chatgpt-4o-latest

OpenAI

128,000

-

meta-llama/llama-4-scout

Meta

1,000,000

BAAI/bge-large-en-v1.5

BAAI

512

bagoodex/bagoodex-search-v1

Bagoodex

deepseek/deepseek-prover-v2

DeepSeek

164,000

claude-3-opus-20240229 anthropic/claude-3-opus claude-3-opus-latest

Anthropic

200,000

luma/ray-1.6

Luma AI

meta-llama/Llama-3-70b-chat-hf

Meta

8,000

bytedance/seededit-3.0-i2i

ByteDance

textembedding-gecko-multilingual@001

Google

2,000

textembedding-gecko@003

Google

2,000

mistralai/codestral-2501

Mistral AI

256,000

mistralai/Mistral-7B-Instruct-v0.1

Mistral AI

8,000

Qwen/Qwen2.5-Coder-32B-Instruct

Alibaba Cloud

131,000

Qwen/QwQ-32B

Alibaba Cloud

131,000

kling-video/v1.5/standard/text-to-video

Kling AI

128,000

o1-mini o1-mini-2024-09-12

OpenAI

128,000

Qwen/Qwen2-72B-Instruct

Alibaba Cloud

32,000

claude-3-5-sonnet-20240620

Anthropic

200,000

-

claude-3-5-sonnet-20241022

Anthropic

200,000

cohere/command-r-plus

Cohere

128,000

google/gemma-2-27b-it

Google

8,000

NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

Nous Research

32,000

-

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Nvidia

128,000

meta-llama/Llama-3-8b-chat-hf

Meta

8,000

meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo

Meta

131,000

meta-llama/Llama-Vision-Free

Meta

128,000

-

meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo

Meta

131,000

abab6.5s-chat

MiniMax

245,000

-

openrouter/horizon-beta

OpenRouter

256,000

-

openrouter/horizon-alpha

OpenRouter

256,000

-

wan/v2.1/1.3b/text-to-video

Alibaba Cloud

-

o1-preview, o1-preview-2024-09-12

OpenAI

128,000

claude-3-sonnet-20240229, anthropic/claude-3-sonnet, claude-3-sonnet-latest

Anthropic

200,000

google/gemini-2.5-pro-preview, google/gemini-2.5-pro-preview-05-06

Google

1,000,000

google/gemini-2.5-flash-preview

Google

1,000,000

neversleep/llama-3.1-lumimaid-70b

NeverSleep

8,000

x-ai/grok-beta

xAI

131,000

gpt-4.5-preview

OpenAI

128,000

gemini-1.5-flash

Google

1,000,000

gemini-1.5-pro

Google

1,000,000

google/gemma-3-1b-it

Google

128,000

togethercomputer/m2-bert-80M-8k-retrieval

TogetherAI

8,000

togethercomputer/m2-bert-80M-2k-retrieval

TogetherAI

2,000

Gryphe/MythoMax-L2-13b-Lite

Gryphe

4,000

-

mistralai/Mixtral-8x22B-Instruct-v0.1

Mistral AI

64,000

google/gemini-2.5-pro-exp-03-25

Google

1,000,000

-

google/gemini-2.0-flash-thinking-exp-01

Google

1,000,000

ai21/jamba-1-5-mini

AI21 Labs

256,000

textembedding-gecko@001

Google

3,000

-

google/gemini-pro or gemini-pro

Google

32,000

meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo-128K

Meta

128,000

-

stabilityai/stable-diffusion-xl-base-1.0

Stability AI

upstage/solar-10.7b-instruct-v1.0

Upstage

4,000

meta-llama/Llama-2-13b-chat-hf

Meta

4,100

meta-llama/meta-llama-3-70b-instruct-turbo

Meta

128,000

-

google/gemma-2-9b-it

Google

8,000

google/gemma-2b-it

Google

8,000

Gryphe/MythoMax-L2-13b

Gryphe

4,000

microsoft/WizardLM-2-8x22B

Microsoft

64,000

Austism/chronos-hermes-13b

Austism

2,000

databricks/dbrx-instruct

Databricks

32,000

deepseek-ai/deepseek-llm-67b-chat

DeepSeek

4,000

deepseek-ai/deepseek-coder-33b-instruct

DeepSeek

16,000

Meta-Llama/Llama-2-7b-chat-hf

Meta

4,000

Meta-Llama/Meta-Llama-3-70B-Instruct-Lite

Meta

8,000

Meta-Llama/Llama-Guard-7b

Meta

4,000

meta-llama/Llama-2-7b-hf

Meta

4,000

meta-llama/Llama-3-8b-hf

Meta

8,000

codellama/CodeLlama-70b-hf

Meta

16,000

codellama/CodeLlama-7b-Instruct-hf

Meta

16,000

codellama/CodeLlama-13b-Instruct-hf

Meta

16,000

codellama/CodeLlama-70b-Instruct-hf

Meta

4,000

codellama/CodeLlama-70b-Python-hf

Meta

4,000

mistralai/Mixtral-8x22B-Instruct-v0.1

Mistral AI

64,000

gpt-3.5-turbo-16k-0613

OpenAI

-

gpt-4-0613

OpenAI

128,000

Qwen/Qwen-14B-Chat

Alibaba Cloud

8,000

Qwen/Qwen1.5-0.5B

Alibaba Cloud

32,000

Qwen/Qwen1.5-1.8B

Alibaba Cloud

32,000

Qwen/Qwen1.5-4B

Alibaba Cloud

32,000

Qwen/Qwen1.5-1.8B-Chat

Alibaba Cloud

32,000

Qwen/Qwen1.5-4B-Chat

Alibaba Cloud

32,000

Qwen/Qwen1.5-7B-Chat

Alibaba Cloud

32,000

Qwen/Qwen1.5-14B-Chat

Alibaba Cloud

32,000

qwen/qvq-72b-preview

Alibaba Cloud

32,000

togethercomputer/guanaco-13b

Tim Dettmers

2,000

togethercomputer/guanaco-33b

Tim Dettmers

2,000

togethercomputer/guanaco-65b

Tim Dettmers

2,000

togethercomputer/mpt-7b-chat

Mosaic ML

2,000

togethercomputer/mpt-30b-chat

Mosaic ML

8,000

togethercomputer/RedPajama-INCITE-7B-Instruct

RedPajama

2,000

prompthero/openjourney

PromptHero

77

wavymulder/Analog-Diffusion

wavymulder

77

-

01.AI

4,000

Undi95/Toppy-M-7B

Undi95

4,000

SG161222/Realistic_Vision_V3.0_VAE

Together

77

tiiuae/falcon-40b

TII

2,000

allenai/OLMo-7B

Allen Institute for AI

2,000

bigcode/starcoder

BigCode

8,000

HuggingFaceH4/starchat-alpha

Hugging Face

8,000

NousResearch/Nous-Hermes-Llama2-70b

NousResearch

4,000

NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT

NousResearch

32,000

NousResearch/Nous-Hermes-2-Mistral-7B-DPO

NousResearch

32,000

NousResearch/Hermes-2-Theta-Llama-3-70B

NousResearch

8,000

defog/sqlcoder

Defog AI

8,000

replit/replit-code-v1-3b

Replit

2,000

lmsys/vicuna-13b-v1.5

Imsys

4,000

microsoft/phi-2

Microsoft

2,000

stabilityai/stablelm-base-alpha-3b

StabilityAI

4,000

runwayml/stable-diffusion-v1-5

StabilityAI

77

stabilityai/stable-diffusion-2-1

StabilityAI

77

teknium/OpenHermes-2p5-Mistral-7B

Teknium

8,000

openchat/openchat-3.5-1210

OpenChat

8,000

DiscoResearch/DiscoLM-mixtral-8x7b-v2

Disco Research

32,000

google/flan-t5-xl

Google

512

garage-bAInd/Platypus2-70B-instruct

Garage-bAInd

4,000

EleutherAI/gpt-neox-20b

EleutherAI

2,000

gradientai/Llama-3-70B-Instruct-Gradient-1048k

Gradient

1,048,000

WhereIsAI/UAE-Large-V1

WhereIsAI

512

zero-one-ai/Yi-34B-Chat

01.AI

4,000

meta-llama/Meta-Llama-3.1-70B-Reference

Meta

32,000

–

meta-llama/Meta-Llama-3.1-8B-Reference

Meta

32,000

–

EleutherAI/llemma_7b

EleutherAI

32,000

–

huggyllama/llama-30b

Huggyllama

32,000

–

huggyllama/llama-13b

Huggyllama

32,000

–

togethercomputer/llama-2-70b

TogetherAI

32,000

–

togethercomputer/llama-2-13b

TogetherAI

32,000

–

huggyllama/llama-65b

Huggyllama

32,000

–

WizardLM/WizardLM-70B-V1.0

WizardLM

32,000

–

huggyllama/llama-7b

Huggyllama

32,000

–

togethercomputer/llama-2-7b

TogetherAI

32,000

–

NousResearch/Nous-Hermes-13b

NousResearch

2,000

–

mistralai/Mistral-7B-v0.1

Mistral AI

32,000

mistralai/Mixtral-8x7B-v0.1

Mistral AI

32,000

-

Suno AI

32

gpt-3.5-turbo

Open AI

16,000

Chat GPT 3.5 Turbo

gpt-3.5-turbo-0125

Open AI

alibaba/qwen-image

Alibaba Cloud

Qwen Image

alibaba/qwen-image-edit

Alibaba Cloud

alibaba/wan2.1-t2v-plus

Alibaba Cloud

Wan2.1 Plus

alibaba/wan2.1-t2v-turbo

Alibaba Cloud

aai/slam-1

Assembly AI

Slam 1

aai/universal

Assembly AI

alibaba/qwen3-tts-flash

Alibaba Cloud

Qwen3-TTS-Flash

#g1_aura-angus-en

Deepgram

elevenlabs/v3_alpha

ElevenLabs

Eleven v3 Alpha

minimax/speech-2.5-turbo-preview

MiniMax

elevenlabs/eleven_music

ElevenLabs

Eleven Music

google/lyria2

Google

The service has no Model ID

Google

-

mistral/mistral-ocr-latest

Mistral AI

triposr

Tripo AI

Stable TripoSR 3D

tencent/hunyuan-part

Tencent

alibaba/qwen-text-embedding-v3

Alibaba Cloud

32,000

Qwen Text Embedding v3

alibaba/qwen-text-embedding-v4

Alibaba Cloud

gpt-4-0125-preview

OpenAI

8,000

-

gpt-4o-audio-preview

OpenAI

These models are no longer available for API or Playground calls. Their description and API reference pages have also been removed from this documentation portal.

dedicated page on our official website
the API reference
Discord community

16,000

32,000

128,000

Open AI

16,000

Open AI

16,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

Open AI

128,000

-

Open AI

8,000

Open AI

200,000

Open AI

200,000

Open AI

200,000

Open AI

200,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

200,000

Open AI

128,000

Open AI

128,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

128,000

Open AI

128,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

400,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Open AI

1,000,000

Coming Soon

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

200,000

Anthropic

1,000,000

Alibaba Cloud

32,000

Alibaba Cloud

32,000

Alibaba Cloud

32,000

Alibaba Cloud

131,000

Alibaba Cloud

1,000,000

Alibaba Cloud

131,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

262,000

Alibaba Cloud

258,000

Alibaba Cloud

262,000

Alibaba Cloud

65,000

Alibaba Cloud

126,000

Alibaba Cloud

126,000

Alibaba Cloud

1,000,000

Alibaba Cloud

256,000

Coming Soon

Alibaba Cloud

256,000

Coming Soon

Alibaba Cloud

262,144

Coming Soon

Alibaba Cloud

262,144

Alibaba Cloud

262,144

Alibaba Cloud

1,000,000

Coming Soon

Anthracite

32,000

Baidu

8,000

Baidu

120,000

Baidu

120,000

Baidu

131,000

Baidu

30,000

Baidu

123,000

Baidu

123,000

Baidu

123,000

Baidu

128,000

Baidu

32,000

Baidu

128,000

Baidu

128,000

Baidu

32,000

-

Baidu

64,000

-

ByteDance

256,000

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

ByteDance

256,000

Coming Soon

Cohere

256,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

128,000

DeepSeek

1,000,000

DeepSeek

1,000,000

Google

1,000,000

Google

1,000,000

–

Google

1,000,000

Google

1,000,000

Google

1,000,000

Google

128,000

Google

128,000

Google

128,000

Google

8,192

Google

1,000,000

Google

1,048,576

Google

1,050,000

Coming Soon

Google

262,000

Google

1,000,000

Gryphe

4,000

Meta

128,000

Meta

131,000

MiniMax

1,000,000

MiniMax

1,000,000

MiniMax

200,000

MiniMax

200,000

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

MiniMax

204,800

Mistral AI

128,000

Moonshot

131,000

Moonshot

256,000

Moonshot

256,000

Moonshot

262,000

Moonshot

256,000

NousResearch

131,000

-

NVIDIA

128,000

NVIDIA

1,000,000

Coming Soon

NVIDIA

1,000,000

Coming Soon

NVIDIA

1,000,000

Coming Soon

Perplexity

128,000

Perplexity

200,000

xAI

131,000

xAI

131,000

xAI

256,000

xAI

256,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

xAI

2,000,000

Coming Soon

xAI

2,000,000

Coming Soon

Xiaomi

1,000,000

Coming Soon

Xiaomi

1,000,000

Coming Soon

Zhipu

128,000

Zhipu

128,000

Zhipu

200,000

Zhipu

200,000

Zhipu

200,000

Zhipu

200,000

Coming Soon

gpt-3.5-turbo

Open AI

If you need a more detailed explanation of how to call a model's API in code, check out our QUICKSTART section.

Managing Assistants & Threads
Function Calling
o4-mini
gpt-4o
%pip install openai
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aimlapi.com/v1",

    # Insert your AIML API Key in the quotation marks instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",  
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are an AI assistant who knows everything.",
        },
        {
            "role": "user",
            "content": "Tell me, why is the sky blue?"
        },
    ],
)

message = response.choices[0].message.content

print(f"Assistant: {message}")
Assistant: The sky appears blue due to a phenomenon called Rayleigh scattering. When sunlight enters Earth's atmosphere, it collides with gas molecules and small particles. Sunlight is made up of different colors, each with different wavelengths. Blue light has a shorter wavelength and is scattered in all directions by the gas molecules in the atmosphere more than other colors with longer wavelengths, such as red or yellow.
As a result, when you look up at the sky during the day, you see this scattered blue light being dispersed in all directions, making the sky appear blue to our eyes. During sunrise and sunset, the sun's light passes through a greater thickness of Earth's atmosphere, scattering the shorter blue wavelengths out of your line of sight and leaving the longer wavelengths, like red and orange, more dominant, which is why the sky often turns those colors at those times.

16,000

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

API Schema

Code Example

import requests
import json  # for getting a structured output with indentation 

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type":"application/json"
    },
    json={
        "model":"baidu/ernie-4.5-300b-a47b",
        "messages":[
            {
                "role":"user",
                "content":"Hi! What do you think about mankind?" # insert your prompt
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'baidu/ernie-4.5-300b-a47b',
      messages:[
          {
              role:'user',
              content: 'Hi! What do you think about mankind?' // insert your prompt here
          }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
Response
{
  "id": "019bd682e8fd00bca8d4a94aace650ce",
  "object": "chat.completion",
  "created": 1768830462,
  "model": "baidu/ernie-4.5-300b-a47b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "As an AI, I don't have personal feelings, but I can provide an analysis of humanity based on available data.\n\nMankind is an incredibly diverse and complex species, marked by remarkable achievements and significant challenges:\n\n**Strengths:**\n1. **Innovation:** Humans have developed advanced technology, from early tools to space exploration, dramatically reshaping their environments and societies.\n2. **Creativity:** Art, literature, music, and philosophy reflect profound emotional and intellectual depth.\n3. **Adaptability:** Humans thrive in nearly every climate on Earth, demonstrating remarkable resilience and resourcefulness.\n4. **Social Cooperation:** Complex societies, governments, and economies enable large-scale collaboration.\n5. **Empathy & Altruism:** Many individuals work selflessly to help others, often across cultural and geographic divides.\n\n**Challenges:**\n1. **Conflict:** War, violence, and discrimination persist due to differences in ideology, resources, or identity.\n2. **Environmental Impact:** Climate change, deforestation, and pollution threaten ecosystems and future survival.\n3. **Inequality:** Wealth gaps, access to education, and healthcare disparities undermine social stability.\n4. **Ethical Dilemmas:** Rapid technological advancements (e.g., AI, genetic engineering) raise questions about responsibility and long-term consequences.\n\n**Potential:** Humanity continues to evolve, with growing awareness of global interconnectedness. Movements for sustainability, social justice, and scientific collaboration suggest a capacity for positive change.\n\nUltimately, mankind's future depends on balancing ambition with wisdom, harnessing progress for collective well-being while addressing vulnerabilities. What aspect of humanity interests you most?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 16,
    "completion_tokens": 371,
    "total_tokens": 387,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 16
  },
  "system_fingerprint": "",
  "meta": {
    "usage": {
      "credits_used": 944
    }
  }
}

This documentation is valid for the following list of our models:

  • baidu/ernie-4.5-300b-a47b

Create AI/ML API Key

Try in Playground
How to make the first API call

1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

API Schema

Code Example

import requests
import json  # for getting a structured output with indentation 

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type":"application/json"
    },
    json={
        "model":"alibaba/qwen3-vl-32b-instruct",
        "messages":[
            {
                # Insert your question for the model here:
                "content":"Hi! What do you think about mankind?"
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'alibaba/qwen3-vl-32b-instruct',
      messages:[
          {
              role:'user',
              // Insert your question for the model here:
              content:'Hi! What do you think about mankind?'
          }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
Response
{
  "choices": [
    {
      "message": {
        "content": "Hi! 😊 That’s a beautiful and deep question — one that philosophers, scientists, artists, and everyday people have been asking for centuries.\n\nI think mankind is *remarkably complex* — full of contradictions, potential, and wonder. On one hand, we’ve achieved incredible things: we’ve explored space, cured diseases, created art that moves souls, built cities that rise into the sky, and connected across continents in ways unimaginable just a century ago. We’re capable of profound kindness, empathy, creativity, and courage.\n\nOn the other hand, we’ve also caused immense suffering — through war, injustice, environmental destruction, and indifference to each other’s pain. We often struggle with our own flaws: fear, greed, ego, and short-sightedness.\n\nBut here’s what gives me hope: **we’re also capable of change**. We can learn from our mistakes. We can choose compassion over conflict, cooperation over competition. Every act of kindness, every effort to understand another, every step toward justice — these are signs that humanity is not defined by its worst impulses, but by its capacity to grow.\n\nSo, I’d say:  \n➡️ Mankind is flawed, yes — but also deeply hopeful.  \n➡️ We’re messy, but we’re trying.  \n➡️ We make mistakes, but we can also heal, create, and love.\n\nAnd perhaps most importantly — **we’re not alone in this journey**. We’re all part of something bigger, and together, we have the power to shape a better future.\n\nWhat about you? How do *you* see mankind? 💬✨",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 17,
    "completion_tokens": 329,
    "total_tokens": 346,
    "prompt_tokens_details": {
      "text_tokens": 17
    },
    "completion_tokens_details": {
      "text_tokens": 329
    }
  },
  "created": 1764625045,
  "system_fingerprint": null,
  "model": "qwen3-vl-32b-instruct",
  "id": "chatcmpl-a12ab46a-3541-93a8-8180-280ecadbb795",
  "meta": {
    "usage": {
      "tokens_used": 1960
    }
  }
}

This documentation is valid for the following list of our models:

  • alibaba/qwen3-vl-32b-instruct

Create AI/ML API Key

Try in Playground

ernie-4.5-turbo-vl-32k

This documentation is valid for the following list of our models:

  • baidu/ernie-4-5-turbo-vl-32k

Try in Playground

Model Overview

A model from the ERNIE 4.5 Turbo subfamily with multimodal support (text and images), offering a balanced trade-off between performance and computational cost.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request

API Schema

Code Example

Response

ernie-4.5-300b-a47b-paddle

This documentation is valid for the following list of our models:

  • baidu/ernie-4.5-300b-a47b-paddle

Try in Playground

Model Overview

A super-large language model, positioned as of August 2025 as a leading Chinese MoE architecture and a foundation model for enterprise applications.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request

API Schema

Code Example

Response

Gryphe

Cohere

Chat GPT-3.5 Turbo 0125
gpt-3.5-turbo-1106
Chat GPT-3.5 Turbo 1106
gpt-4o
Chat GPT-4o
gpt-4o-2024-08-06
GPT-4o-2024-08-06
gpt-4o-2024-05-13
GPT-4o-2024-05-13
gpt-4o-mini
Chat GPT 4o mini
gpt-4o-mini-2024-07-18
GPT 4o mini
gpt-4o-search-preview
GPT-4o Search Preview
gpt-4o-mini-search-preview
GPT-4o Mini Search Preview
gpt-4-turbo
Chat GPT 4 Turbo
gpt-4-turbo-2024-04-09
gpt-4
Chat GPT 4
o1
OpenAI o1
openai/o3-2025-04-16
o3
o3-mini
OpenAI o3 mini
openai/o3-pro
o3-pro
openai/gpt-4.1-2025-04-14
GPT-4.1
openai/gpt-4.1-mini-2025-04-14
GPT-4.1 Mini
openai/gpt-4.1-nano-2025-04-14
GPT-4.1 Nano
openai/o4-mini-2025-04-16
GPT-o4-mini-2025-04-16
openai/gpt-oss-20b
GPT OSS 20B
openai/gpt-oss-120b
GPT OSS 120B
openai/gpt-5-2025-08-07
GPT-5
openai/gpt-5-mini-2025-08-07
GPT-5 Mini
openai/gpt-5-nano-2025-08-07
GPT-5 Nano
openai/gpt-5-chat-latest
GPT-5 Chat
openai/gpt-5-1
GPT-5.1
openai/gpt-5-1-chat-latest
GPT-5.1 Chat Latest
openai/gpt-5-1-codex
GPT-5.1 Codex
openai/gpt-5-1-codex-mini
GPT-5.1 Codex Mini
openai/gpt-5-2
GPT-5.2
openai/gpt-5-2-chat-latest
GPT-5.2 Chat Latest
openai/gpt-5-2-pro
GPT-5.2 Pro
openai/gpt-5-2-codex
GPT-5.2 Codex
openai/gpt-5-3-codex
GPT-5.3 Codex
openai/gpt-5-4
GPT-5.4
openai/gpt-5-4-pro
GPT-5.4 Pro
openai/gpt-5-5
GPT-5.5
openai/gpt-5-5-pro
anthropic/claude-opus-4
Claude 4 Opus
anthropic/claude-opus-4.1 claude-opus-4-1 claude-opus-4-1-20250805
Claude Opus 4.1
anthropic/claude-sonnet-4
Claude 4 Sonnet
claude-sonnet-4-5-20250929
anthropic/claude-sonnet-4.5
claude-sonnet-4-5
Claude 4.5 Sonnet
anthropic/claude-haiku-4.5
claude-haiku-4-5
claude-haiku-4-5-20251001
Claude 4.5 Haiku
anthropic/claude-opus-4-5 claude-opus-4-5 claude-opus-4-5-20251101
Claude 4.5 Opus
anthropic/claude-opus-4-6
Claude 4.6 Opus
anthropic/claude-sonnet-4.6 anthropic/claude-sonnet-4-6-20260218
Claude Sonnet 4.6
anthropic/claude-opus-4-7 claude-opus-4-7
Claude Opus 4.7
Qwen/Qwen2.5-7B-Instruct-Turbo
Qwen 2.5 7B Instruct Turbo
qwen-max
Qwen Max
qwen-max-2025-01-25
Qwen Max 2025-01-25
qwen-plus
Qwen Plus
qwen-turbo
Qwen Turbo
alibaba/qwen3-32b
Qwen3-32B
alibaba/qwen3-coder-480b-a35b-instruct
Qwen3 Coder
alibaba/qwen3-235b-a22b-thinking-2507
Qwen3 235B A22B Thinking
alibaba/qwen3-next-80b-a3b-instruct
Qwen3-Next-80B-A3B Instruct
alibaba/qwen3-next-80b-a3b-thinking
Qwen3-Next-80B-A3B Thinking
alibaba/qwen3-max-preview
Qwen3-Max Preview
alibaba/qwen3-max-instruct
Qwen3-Max Instruct
qwen3-omni-30b-a3b-captioner
qwen3-omni-30b-a3b-captioner
alibaba/qwen3-vl-32b-instruct
Qwen3 VL 32B Instruct
alibaba/qwen3-vl-32b-thinking
Qwen3 VL 32B Thinking
alibaba/qwen3.5-plus-20260218
Qwen3.5 Plus
alibaba/qwen3.5-omni-plus
alibaba/qwen3.5-omni-flash
alibaba/qwen3.5-flash
alibaba/qwen3.6-27b
Qwen 3.6 27B
alibaba/qwen3.6-35b-a3b
Qwen3.6-35B-A3B
alibaba/qwen3.7-max
anthracite-org/magnum-v4-72b
Magnum v4 72B
baidu/ernie-4-5-8k-preview
ERNIE 4.5
baidu/ernie-4.5-0.3b
ERNIE 4.5
baidu/ernie-4.5-21b-a3b
ERNIE 4.5
baidu/ernie-4.5-21b-a3b-thinking
ERNIE 4.5
baidu/ernie-4.5-vl-28b-a3b
ERNIE 4.5 VL
baidu/ernie-4.5-vl-424b-a47b
ERNIE 4.5 VL
baidu/ernie-4.5-300b-a47b
ERNIE 4.5
baidu/ernie-4.5-300b-a47b-paddle
ERNIE 4.5
baidu/ernie-4-5-turbo-128k
ERNIE 4.5
baidu/ernie-4-5-turbo-vl-32k
ERNIE 4.5 VL
baidu/ernie-5-0-thinking-preview
ERNIE 5.0
baidu/ernie-5-0-thinking-latest
ERNIE 5.0
baidu/ernie-x1-turbo-32k
baidu/ernie-x1-1-preview
bytedance/seed-1-8
Seed 1.8
bytedance/dola-seed-2-0-mini
bytedance/dola-seed-2-0-lite
bytedance/dola-seed-2-0-pro
bytedance/dola-seed-2-0-code
cohere/command-a
Command A
deepseek-chat or deepseek/deepseek-chat or deepseek/deepseek-chat-v3-0324
DeepSeek V3
deepseek/deepseek-r1 or deepseek-reasoner
DeepSeek R1
deepseek/deepseek-chat-v3.1
DeepSeek V3.1 Chat
deepseek/deepseek-reasoner-v3.1
DeepSeek V3.1 Reasoner
deepseek/deepseek-thinking-v3.2-exp
DeepSeek V3.2-Exp Thinking
deepseek/deepseek-non-thinking-v3.2-exp
DeepSeek V3.2-Exp Non-Thinking
deepseek/deepseek-reasoner-v3.1-terminus
DeepSeek V3.1 Terminus Reasoning
deepseek/deepseek-non-reasoner-v3.1-terminus
DeepSeek V3.1 Terminus Non-Reasoning
deepseek/deepseek-v4-pro
DeepSeek V4 Pro
deepseek/deepseek-v4-flash
DeepSeek 4 Flash
gemini-2.0-flash
Gemini 2.0 Flash
google/gemini-2.5-flash-lite-preview
google/gemini-2.5-flash
Gemini 2.5 Flash
google/gemini-3-flash-preview
Gemini 3 Flash
google/gemini-2.5-pro
Gemini 2.5 Pro
google/gemma-3-4b-it
Gemma 3 (4B)
google/gemma-3-12b-it
Gemma 3 (12B)
google/gemma-3-27b-it
Gemma 3 (27B)
google/gemma-3n-e4b-it
Gemma 3n 4B
google/gemini-3-1-pro-preview
Gemini 3.1 Pro
google/gemini-3-1-flash-lite-preview
Gemini 3.1 Flash Lite
google/gemini-3-1-flash-lite
google/gemma-4-31b-it
Gemma 4 31B
google/gemini-3-5-flash
Gemini 3.5 Flash
gryphe/mythomax-l2-13b
MythoMax-L2 (13B)
meta-llama/Llama-3.3-70B-Instruct-Turbo
Meta Llama 3.3 70B Instruct Turbo
meta-llama/llama-3.3-70b-versatile
Llama 3.3 70B Versatile
MiniMax-Text-01
MiniMax-Text-01
minimax/m1
MiniMax M1
minimax/m2
MiniMax M2
minimax/m2-her
MiniMax M2-her
minimax/m2-1
MiniMax M2.1
minimax/m2-1-highspeed
MiniMax M2.1 Highspeed
minimax/m2-5-20260218
MiniMax M2.5
minimax/m2-5-highspeed-20260218
MiniMax M2.5
minimax/m2-7-20260402
MiniMax M2.7
minimax/m2-7-highspeed
MiniMax M2.7 Highspeed
mistralai/mistral-nemo
Mistral Nemo
moonshot/kimi-k2-preview
Kimi-K2
moonshot/kimi-k2-0905-preview
Kimi-K2
moonshot/kimi-k2-turbo-preview
Kimi K2 Turbo Preview
moonshot/kimi-k2-5
Kimi K2.5
moonshot/kimi-k2-6
Kimi K2.6
nousresearch/hermes-4-405b
nvidia/nemotron-nano-9b-v2
Nemotron Nano 9B V2
nvidia/nemotron-3-nano-30b-a3b
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning
nvidia/nemotron-3-super-120b-a12b
perplexity/sonar
Sonar
perplexity/sonar-pro
Sonar Pro
x-ai/grok-3-beta
Grok 3 Beta
x-ai/grok-3-mini-beta
Grok 3 Beta Mini
x-ai/grok-4-07-09
Grok 4
x-ai/grok-code-fast-1
Grok Code Fast 1
x-ai/grok-4-fast-non-reasoning
Grok 4 Fast
x-ai/grok-4-fast-reasoning
Grok 4 Fast Reasoning
x-ai/grok-4-1-fast-non-reasoning
Grok 4.1 Fast Non-Reasoning
x-ai/grok-4-1-fast-reasoning
Grok 4.1 Fast Reasoning
x-ai/grok-4-20-0309-non-reasoning
x-ai/grok-4-20-0309-reasoning
xiaomi/mimo-v2.5
xiaomi/mimo-v2.5-pro
zhipu/glm-4.5-air
GLM-4.5 Air
zhipu/glm-4.5
GLM-4.5
zhipu/glm-4.6
GLM-4.6
zhipu/glm-4.7
GLM-4.7
zhipu/glm-5
GLM-5
zhipu/glm-5-1
Qwen Image Edit
alibaba/z-image-turbo
Z-Image Turbo
alibaba/z-image-turbo-lora
Z-Image Turbo LoRA
alibaba/wan2.2-t2i-plus
Wan 2.2 Plus
alibaba/wan2.2-t2i-flash
Wan 2.2 Flash
alibaba/wan2.5-t2i-preview
Wan 2.5 Preview
alibaba/wan-2-6-image
Wan 2.6
alibaba/wan-2-7-image
alibaba/wan-2-7-image-pro
alibaba/qwen-image-2-0
alibaba/qwen-image-2-0-pro
bytedance/seedream-3.0
Seedream 3.0
bytedance/seedream-v4-text-to-image
Seedream 4 Text-to-Image
bytedance/seedream-v4-edit
Seedream 4 Edit
bytedance/uso
USO
bytedance/seedream-4-5
Seedream 4.5
bytedance/seedream-5-0-lite-preview
Seadream 5.0 Lite
flux-pro
FLUX.1 [pro]
flux-pro/v1.1
FLUX 1.1 [pro]
flux-pro/v1.1-ultra
FLUX 1.1 [pro ultra]
flux-realism
FLUX Realism LoRA
flux/dev
FLUX.1 [dev]
flux/dev/image-to-image
flux/schnell
FLUX.1 [schnell]
flux/kontext-max/text-to-image
FLUX.1 Kontext [max]
flux/kontext-max/image-to-image
FLUX.1 Kontext [max]
flux/kontext-pro/text-to-image
Flux.1 Kontext [pro]
flux/kontext-pro/image-to-image
Flux.1 Kontext [pro]
flux/srpo
FLUX.1 SRPO Text-to-Image
flux/srpo/image-to-image
FLUX.1 SRPO Image-to-Image
blackforestlabs/flux-2
FLUX.2
blackforestlabs/flux-2-edit
FLUX.2 Edit
blackforestlabs/flux-2-lora
Flux 2 LoRA
blackforestlabs/flux-2-lora-edit
Flux 2 LoRA Edit
blackforestlabs/flux-2-pro
FLUX.2 [pro]
blackforestlabs/flux-2-pro-edit
FLUX.2 [pro] Edit
imagen-3.0-generate-002
Imagen 3
google/imagen4/preview
Imagen 4 Preview
google/imagen-4.0-generate-001
Imagen 4.0 Generate
google/imagen-4.0-fast-generate-001
Imagen 4.0 Fast Generate
google/imagen-4.0-ultra-generate-001
Imagen 4.0 Ultra Generate
google/gemini-2.5-flash-image
Gemini 2.5 Flash Image
google/gemini-2.5-flash-image-edit
Gemini 2.5 Flash Image Edit
google/nano-banana-pro
google/gemini-3-pro-image-preview
Gemini 3 Pro Image (Nano Banana Pro)
google/nano-banana-pro-edit
google/gemini-3-pro-image-preview-edit
Gemini 3 Pro Image Edit (Nano Banana Pro)
google/nano-banana-2
google/gemini-3-1-flash-image-preview
Gemini 3.1 Flash Image (Nano Banana 2)
klingai/image-o1
Kling Image O1
dall-e-2
OpenAI DALL·E 2
dall-e-3
OpenAI DALL·E 3
openai/gpt-image-1
gpt-image-1
openai/gpt-image-1-mini
GPT Image 1 Mini
openai/gpt-image-1-5
GPT Image 1.5
openai/gpt-image-2
GPT Image 2
recraft-v3
Recraft v3
reve/create-image
Reve Create Image
reve/edit-image
Reve Edit Image
reve/remix-edit-image
Reve Remix Image
stable-diffusion-v3-medium
Stable Diffusion 3
stable-diffusion-v35-large
Stable Diffusion 3.5 Large
hunyuan/hunyuan-image-v3-text-to-image
HunyuanImage 3.0
topaz-labs/sharpen
Sharpen
topaz-labs/sharpen-gen
Sharpen Generative
x-ai/grok-imagine-image
Grok Imagine
x-ai/grok-imagine-image-pro
Grok Imagine Image Pro
Wan2.1 Turbo
alibaba/wan2.2-t2v-plus
Wan 2.2 T2V
alibaba/wan2.5-t2v-preview
Wan 2.5 Text-to-Video
alibaba/wan2.5-i2v-preview
Wan 2.5 Image-to-Video
alibaba/wan2.2-14b-animate-replace
Wan 2.2 14b animate replace
alibaba/wan2.2-14b-animate-move
Wan 2.2 14b animate move
alibaba/wan2.2-vace-fun-a14b-reframe
Wan 2.2 vace fun 14b reframe
alibaba/wan2.2-vace-fun-a14b-outpainting
Wan 2.2 vace fun 14b outpainting
alibaba/wan2.2-vace-fun-a14b-inpainting
Wan 2.2 vace fun 14b inpainting
alibaba/wan2.2-vace-fun-a14b-pose
Wan 2.2 vace fun 14b pose
alibaba/wan2.2-vace-fun-14b-depth
Wan 2.2 vace fun 14b depth
alibaba/wan2.5-t2v-preview
Wan 2.5 Preview
alibaba/wan2.5-i2v-preview
alibaba/wan-2-6-t2v
Wan 2.6 Text-to-Video
alibaba/wan-2-6-i2v
Wan 2.6 Image-to-Video
alibaba/wan-2-6-r2v
Wan 2.6 Reference-to-Video
alibaba/wan-2-6-image-to-video-flash
alibaba/happyhorse-1-0
alibaba/wan-2-7-t2v
alibaba/wan-2-7-i2v
alibaba/wan-2-7-r2v
bytedance/seedance-1-0-lite-t2v
Seedance 1.0 lite Text to Video
bytedance/seedance-1-0-lite-i2v
Seedance 1.0 lite Image to Video
bytedance/seedance-1-0-pro-t2v
Seedance 1.0 Pro
bytedance/seedance-1-0-pro-i2v
Seedance 1.0 Pro
bytedance/seedance-1-0-pro-fast
Seedance 1.0 Pro Fast
bytedance/omnihuman
OmniHuman
bytedance/omnihuman/v1.5
OmniHuman v1.5
bytedance/seedance-1-5-pro
Seedance 1.5 Pro
bytedance/seedance-2-0
bytedance/seedance-2-0-fast
veo2
Veo2 Text-to-Video
veo2/image-to-video
Veo2 Image-to-Video
google/veo3
Veo 3
google/veo-3.0-i2v
Veo 3 I2V
google/veo-3.0-fast
Veo 3 Fast
google/veo-3.0-i2v-fast
Veo 3 I2V Fast
google/veo-3.1-t2v
Veo 3.1 Text-to-Video
google/veo-3.1-t2v-fast
Veo 3.1 Fast Text-to-Video
google/veo-3.1-i2v
Veo 3.1 Image-to-Video
google/veo-3.1-i2v-fast
Veo 3.1 Fast Image-to-Video
google/veo-3.1-reference-to-video
Veo 3.1 Reference-to-Video
google/veo-3.1-first-last-image-to-video
Veo 3.1 First-Last Frame-to-Video
google/veo-3.1-first-last-image-to-video-fast
Veo 3.1 Fast First-Last Frame-to-Video
google/veo3-1-extend-video
Veo 3.1 Extend Video
google/veo3-1-fast-extend-video
Veo 3.1 Fast Extend Video
google/veo-3-1-lite-generate-preview
kling-video/v1/standard/image-to-video
Kling AI (image-to-video)
kling-video/v1/standard/text-to-video
Kling AI (text-to-video)
kling-video/v1/pro/image-to-video
Kling AI (image-to-video)
kling-video/v1/pro/text-to-video
Kling AI (text-to-video)
kling-video/v1.6/standard/text-to-video
Kling 1.6 Standard
kling-video/v1.6/standard/image-to-video
Kling 1.6 Standard
kling-video/v1.6/pro/image-to-video
Kling 1.6 Pro
kling-video/v1.6/pro/text-to-video
Kling 1.6 Pro
klingai/kling-video-v1.6-pro-effects
Kling 1.6 Pro Effects
klingai/kling-video-v1.6-standard-effects
Kling 1.6 Standard Effects
kling-video/v1.6/standard/multi-image-to-video
Kling V1.6 Multi-Image-to-Video
klingai/v2-master-image-to-video
Kling 2.0 Master
klingai/v2-master-text-to-video
Kling 2.0 Master
kling-video/v2.1/standard/image-to-video
Kling V2.1 Standard I2V
kling-video/v2.1/pro/image-to-video
Kling V2.1 Pro I2V
klingai/v2.1-master-image-to-video
ling 2.1 Master
klingai/v2.1-master-text-to-video
Kling 2.1 Master
klingai/v2.5-turbo/pro/image-to-video
Kling Video v2.5 Turbo Pro Image-to-Video
klingai/v2.5-turbo/pro/text-to-video
Kling Video v2.5 Turbo Pro Text-to-Video
klingai/avatar-standard
Kling AI Avatar Standard
klingai/avatar-pro
Kling AI Avatar Pro
klingai/video-v2-6-pro-text-to-video
Kling 2.6 Pro Text-to-Video
klingai/video-v2-6-pro-image-to-video
Kling 2.6 Pro Image-to-Video
klingai/video-o1-image-to-video
Kling Video O1 Image to Video
klingai/video-o1-reference-to-video
Kling Video O1 Reference-to-Video
klingai/video-o1-video-to-video-edit
Kling Video O1 Video to Video Edit
klingai/video-o1-video-to-video-reference
Kling Video O1 Video-to-Video Reference
klingai/video-v2-6-pro-motion-control
Kling 2.6 Pro Motion Control
klingai/video-v3-standard-text-to-video
Kling Video v3 Standard
klingai/video-v3-standard-image-to-video
Kling Video v3 Standard
klingai/video-v3-pro-text-to-video
Kling Video v3 Pro
klingai/video-v3-pro-image-to-video
Kling Video v3 Pro
krea/krea-wan-14b/text-to-video
Krea WAN 14B Text-to-Video
krea/krea-wan-14b/video-to-video
Krea WAN 14B Video-to-Video
ltxv/ltxv-2
LTXV 2
ltxv/ltxv-2-fast
LTXV 2 Fast
luma/ray-2
Ray 2
luma/ray-flash-2
Ray Flash 2
magic/text-to-video
Magic Video
magic/image-to-video
Magic Video
magic/video-to-video
Magic Video
video-01
MiniMax Video-01
video-01-live2d
minimax/hailuo-02
Hailuo 02
minimax/hailuo-2.3
Hailuo 2.3
minimax/hailuo-2.3-fast
Hailuo 2.3 Fast
sora-2-t2v
sora-2-i2v
sora-2-pro-t2v
sora-2-pro-i2v
pixverse/v5/text-to-video
Pixverse v5 Text-to-Video
pixverse/v5/image-to-video
Pixverse v5 Image-to-Video
pixverse/v5/transition
Pixverse v5 Transition
pixverse/v5-5-text-to-video
PixVerse V5.5 Text-to-Video
pixverse/v5-5-image-to-video
Pixverse v5.5 Image-to-Video
pixverse/lip-sync
gen3a_turbo
Runway Gen-3 turbo
runway/gen4_turbo
Runway Gen-4 Turbo
runway/gen4_aleph
Aleph
runway/act_two
Runway Act Two
sber-ai/kandinsky5-t2v
Kandinsky 5 Standard
sber-ai/kandinsky5-distill-t2v
Kandinsky 5 Distill
tencent/hunyuan-video-foley
HunyuanVideo Foley
veed/fabric-1.0
fabric-1.0
veed/fabric-1.0-fast
fabric-1.0-fast
Universal
#g1_nova-2-automotive
Deepgram Nova-2
#g1_nova-2-conversationalai
Deepgram Nova-2
#g1_nova-2-drivethru
Deepgram Nova-2
#g1_nova-2-finance
Deepgram Nova-2
#g1_nova-2-general
Deepgram Nova-2
#g1_nova-2-medical
Deepgram Nova-2
#g1_nova-2-meeting
Deepgram Nova-2
#g1_nova-2-phonecall
Deepgram Nova-2
#g1_nova-2-video
Deepgram Nova-2
#g1_nova-2-voicemail
Deepgram Nova-2
#g1_whisper-tiny
#g1_whisper-small
#g1_whisper-base
#g1_whisper-medium
#g1_whisper-large
Whisper
openai/gpt-4o-transcribe
GPT-4o Transcribe
openai/gpt-4o-mini-transcribe
GPT-4o Mini Transcribe
Aura
#g1_aura-arcas-en
Aura
#g1_aura-asteria-en
Aura
#g1_aura-athena-en
Aura
#g1_aura-helios-en
Aura
#g1_aura-hera-en
Aura
#g1_aura-luna-en
Aura
#g1_aura-orion-en
Aura
#g1_aura-orpheus-en
Aura
#g1_aura-perseus-en
Aura
#g1_aura-stella-en
Aura
#g1_aura-zeus-en
Aura
#g1_aura-2-amalthea-en
Aura 2
#g1_aura-2-andromeda-en
Aura 2
#g1_aura-2-apollo-en
Aura 2
#g1_aura-2-arcas-en
Aura 2
#g1_aura-2-aries-en
Aura 2
#g1_aura-2-asteria-en
Aura 2
#g1_aura-2-athena-en
Aura 2
#g1_aura-2-atlas-en
Aura 2
#g1_aura-2-aurora-en
Aura 2
#g1_aura-2-callista-en
Aura 2
#g1_aura-2-cora-en
Aura 2
#g1_aura-2-cordelia-en
Aura 2
#g1_aura-2-delia-en
Aura 2
#g1_aura-2-draco-en
Aura 2
#g1_aura-2-electra-en
Aura 2
#g1_aura-2-harmonia-en
Aura 2
#g1_aura-2-helena-en
Aura 2
#g1_aura-2-hera-en
Aura 2
#g1_aura-2-hermes-en
Aura 2
#g1_aura-2-hyperion-en
Aura 2
#g1_aura-2-iris-en
Aura 2
#g1_aura-2-janus-en
Aura 2
#g1_aura-2-juno-en
Aura 2
#g1_aura-2-jupiter-en
Aura 2
#g1_aura-2-luna-en
Aura 2
#g1_aura-2-mars-en
Aura 2
#g1_aura-2-minerva-en
Aura 2
#g1_aura-2-neptune-en
Aura 2
#g1_aura-2-odysseus-en
Aura 2
#g1_aura-2-ophelia-en
Aura 2
#g1_aura-2-orion-en
Aura 2
#g1_aura-2-orpheus-en
Aura 2
#g1_aura-2-pandora-en
Aura 2
#g1_aura-2-phoebe-en
Aura 2
#g1_aura-2-pluto-en
Aura 2
#g1_aura-2-saturn-en
Aura 2
#g1_aura-2-selene-en
Aura 2
#g1_aura-2-thalia-en
Aura 2
#g1_aura-2-theia-en
Aura 2
#g1_aura-2-vesta-en
Aura 2
#g1_aura-2-zeus-en
Aura 2
#g1_aura-2-celeste-es
Aura 2
#g1_aura-2-estrella-es
Aura 2
#g1_aura-2-nestor-es
Aura 2
elevenlabs/eleven_multilingual_v2
ElevenLabs Multilingual v2
elevenlabs/eleven_turbo_v2_5
ElevenLabs Turbo v2.5
hume/octave-2
Octave 2
inworld/tts-1
Inworld TTS-1
inworld/tts-1-max
Inworld TTS-1-Max
inworld/tts-1-5-mini
Inworld TTS-1.5-Mini
inworld/tts-1-5-max
microsoft/vibevoice-1.5b
VibeVoice 1.5B
microsoft/vibevoice-7b
VibeVoice 7B
openai/tts-1
TTS-1
openai/tts-1-hd
TTS-1 HD
openai/gpt-4o-mini-tts
GPT-4o-mini-TTS
MiniMax Speech 2.5 Turbo
minimax/speech-2.5-hd-preview
MiniMax Speech 2.5 HD
minimax/speech-2.6-turbo
MiniMax Speech 2.6 Turbo
minimax/speech-2.6-hd
MiniMax Speech 2.6 HD
minimax/speech-2.8-turbo
Speech 2.8 Turbo
minimax/speech-2.8-hd
Speech 2.8 HD
Lyria 2
stable-audio
Stable Audio
minimax-music
music-01
MiniMax Music
minimax/music-1.5
MiniMax Music 1.5
minimax/music-2.0
MiniMax Music 2.0
minimax/music-2.6
MiniMax Music 2.6
minimax/music-cover
Mistral OCR Latest
zhipu/glm-ocr
GLM-OCR
Hunyuan Part
Qwen Text Embedding v4
voyage-2
voyage-code-2
voyage-finance-2
voyage-large-2
voyage-large-2-instruct
Voyage Large 2 Instruct
voyage-law-2
voyage-multilingual-2
text-multilingual-embedding-002
text-embedding-3-small
text-embedding-3-large
Text-embedding-3-large
text-embedding-ada-002
Text-embedding-ada-002
GPT-4o Audio Preview
GPT-4o mini Audio
Llama 3.1 Nemotron 70B Instruct
Nemotron Nano 12B V2 VL
DeepSeek V3.2 Speciale
Llama 3 8B Instruct Lite
Mixtral-8x7B Instruct v0.1
Llama 4 Maverick
Gemini 3 Pro Preview
BAAI-Bge-Base-1p5
M2-BERT-Retrieval-32k
Imagen 4 Ultra
Grok 2 Image
Claude 3.7 Sonnet
Gemini 2.0 Flash Experimental
Llama Guard 3 (8B)
LlamaGuard 2 (8b)
Llama 3.1 70B Instruct Turbo
Llama 3.1 (405B) Instruct Turbo
Llama 3.2 3B Instruct Turbo
Qwen 3 235B A22B
Qwen 2.5 72B Instruct Turbo
Qwen2.5 VL 7B Instruct
Mistral Tiny
Mistral (7B) Instruct v0.3
Llama 3.1 8B Instruct Turbo
Mistral (7B) Instruct v0.2
Llama 4 Scout
bge-large-en
Bagoodex Web Search v1
DeepSeek Prover V2
Claude 3 Opus
Ray 1.6
Llama 3 70B Instruct Reference
Seedream 3.0
Textembedding-gecko-multilingual@001
Textembedding-gecko@003
Mistral Codestral-2501
Mistral (7B) Instruct v0.1
Qwen 2.5 Coder
Qwq-32B
Kling 1.5 Standart
OpenAI o1-mini
Qwen 2 Instruct (72B)
Claude 3.5 Sonnet 20241022
Command R+
Gemma 2 (27b)
Llama 3.1 Nemotron 70B Instruct
Llama 3 8B Instruct Reference
Llama 3.2 90B Vision Instruct Turbo
Llama 3.2 11B Vision Instruct Turbo
Wan 2.1
OpenAI o1-preview
Claude 3 Sonnet
Gemini Pro 2.5 Preview
Gemini 2.5 Flash Preview
Llama 3.1 Lumimaid 70b
Grok-2 Beta
Chat GPT 4.5 preview
Gemini 1.5 Flash
Gemini 1.5 Pro
Gemma 3 (1B)
M2-BERT-Retrieval-8k
M2-BERT-Retrieval-2K
Mixtral 8x22B Instruct
Gemini 2.0 Flash Thinking Experimental
Jamba 1.5 Mini
Gemini 1.0 Pro
Stable Diffusion XL 1.0
Upstage SOLAR Instruct v1 (11B)
LLaMA-2 Chat (13B)
Gemma 2 (9B)
Gemma Instruct (2B)
MythoMax-L2 (13B)
WizardLM 2-8 (22B)
Chronos Hermes 13b
DBRX Instruct
Deepseek-LLM-67b-Chat
Deepseek Coder Instruct (33B)
LLaMA-2 Chat (7B)
Llama 3 70B Instruct Lite
Llama Guard (7B)
LLaMA-2 (7B)
Llama-3 (8B)
Code Llama (70B)
Code Llama Instruct (7B)
Code Llama Instruct (13B)
Code Llama Instruct (70B)
Code Llama Python (70B)
Mixtral 8x22B Instruct
Chat GPT 4 Turbo
Qwen Chat (14B)
Qwen 1.5 (0.5B)
Qwen 1.5 (1.8B)
Qwen 1.5 (4B)
Qwen 1.5 Chat (1.8B)
Qwen 1.5 Chat (4B)
Qwen 1.5 Chat (7B)
Qwen 1.5 Chat (14B)
QVQ-72B-Preview
Guanaco (13B)
Guanaco (33B)
Guanaco (65B)
MPT-Chat (7B)
MPT-Chat (30B)
RedPajama-INCITE Instruct (7B)
Openjourney v4
Analog Diffusion
01-ai Yi Base (6B)
Toppy M (7B)
Realistic Vision 3.0
Falcon (40B)
OLMo-7B
StarCoder (16B)
StarCoderChat Alpha (16B)
Nous Hermes LLaMA-2 (70B)
Nous Hermes 2 - Mixtral 8x7B-SFT
Nous Hermes 2 - Mistral DPO (7B)
Hermes 2 Theta Llama-3 70B
SQLCoder (15B)
Replit-Code-v1 (3B)
Vicuna v1.5 (13B)
Microsoft Phi-2
StableLM Base Alpha 3B
Stable Diffusion 1.5
Stable Diffusion 2.1
OpenHermes-2.5-Mistral (7B)
OpenChat 3.5 (7B)
DiscoLM Mixtral 8x7b (46.7B)
FLAN T5 XL (3B)
Platypus2-70B-Instruct
GPT Neox 20B
Llama-3 70B Gradient Instruct 1048k
UAE-Large-V1
Yi-34B-Chat
​Mixtral 7B
Mixtral-8x7B Instruct v0.1
Suno AI
meta-llama/Meta-Llama-3-8B-Instruct-Lite
Chat GPT 3.5 Turbo
gpt-3.5-turbo-0125
Chat GPT-3.5 Turbo 0125
gpt-3.5-turbo-1106
Chat GPT-3.5 Turbo 1106
gpt-4o
Chat GPT-4o
gpt-4o-2024-08-06
GPT-4o-2024-08-06
gpt-4o-2024-05-13
GPT-4o-2024-05-13
gpt-4o-mini
Chat GPT 4o mini
gpt-4o-mini-2024-07-18
GPT 4o mini
gpt-4o-search-preview
GPT-4o Search Preview
gpt-4o-mini-search-preview
GPT-4o Mini Search Preview
gpt-4-turbo
Chat GPT 4 Turbo
gpt-4-turbo-2024-04-09
gpt-4
Chat GPT 4
o1
OpenAI o1
openai/o3-2025-04-16
o3
o3-mini
OpenAI o3 mini
openai/o3-pro
o3-pro
openai/gpt-4.1-2025-04-14
GPT-4.1
openai/gpt-4.1-mini-2025-04-14
GPT-4.1 Mini
openai/gpt-4.1-nano-2025-04-14
GPT-4.1 Nano
openai/o4-mini-2025-04-16
GPT-o4-mini-2025-04-16
openai/gpt-oss-20b
GPT OSS 20B
openai/gpt-oss-120b
GPT OSS 120B
openai/gpt-5-2025-08-07
GPT-5
openai/gpt-5-mini-2025-08-07
GPT-5 Mini
openai/gpt-5-nano-2025-08-07
GPT-5 Nano
openai/gpt-5-chat-latest
GPT-5 Chat
openai/gpt-5-1
GPT-5.1
openai/gpt-5-1-chat-latest
GPT-5.1 Chat Latest
openai/gpt-5-1-codex
GPT-5.1 Codex
openai/gpt-5-1-codex-mini
GPT-5.1 Codex Mini
openai/gpt-5-2
GPT-5.2
openai/gpt-5-2-chat-latest
GPT-5.2 Chat Latest
openai/gpt-5-2-pro
GPT-5.2 Pro
openai/gpt-5-2-codex
GPT-5.2 Codex
openai/gpt-5-3-codex
GPT-5.3 Codex
openai/gpt-5-4
GPT-5.4
openai/gpt-5-4-pro
GPT-5.4 Pro
openai/gpt-5-5
GPT-5.5
openai/gpt-5-5-pro
anthropic/claude-opus-4
Claude 4 Opus
anthropic/claude-opus-4.1 claude-opus-4-1 claude-opus-4-1-20250805
Claude Opus 4.1
anthropic/claude-sonnet-4
Claude 4 Sonnet
claude-sonnet-4-5-20250929
anthropic/claude-sonnet-4.5
claude-sonnet-4-5
Claude 4.5 Sonnet
anthropic/claude-haiku-4.5
claude-haiku-4-5
claude-haiku-4-5-20251001
Claude 4.5 Haiku
anthropic/claude-opus-4-5 claude-opus-4-5 claude-opus-4-5-20251101
Claude 4.5 Opus
anthropic/claude-opus-4-6
Claude 4.6 Opus
anthropic/claude-sonnet-4.6 anthropic/claude-sonnet-4-6-20260218
Claude Sonnet 4.6
anthropic/claude-opus-4-7 claude-opus-4-7
Claude Opus 4.7
Qwen/Qwen2.5-7B-Instruct-Turbo
Qwen 2.5 7B Instruct Turbo
qwen-max
Qwen Max
qwen-max-2025-01-25
Qwen Max 2025-01-25
qwen-plus
Qwen Plus
qwen-turbo
Qwen Turbo
alibaba/qwen3-32b
Qwen3-32B
alibaba/qwen3-coder-480b-a35b-instruct
Qwen3 Coder
alibaba/qwen3-235b-a22b-thinking-2507
Qwen3 235B A22B Thinking
alibaba/qwen3-next-80b-a3b-instruct
Qwen3-Next-80B-A3B Instruct
alibaba/qwen3-next-80b-a3b-thinking
Qwen3-Next-80B-A3B Thinking
alibaba/qwen3-max-preview
Qwen3-Max Preview
alibaba/qwen3-max-instruct
Qwen3-Max Instruct
qwen3-omni-30b-a3b-captioner
qwen3-omni-30b-a3b-captioner
alibaba/qwen3-vl-32b-instruct
Qwen3 VL 32B Instruct
alibaba/qwen3-vl-32b-thinking
Qwen3 VL 32B Thinking
alibaba/qwen3.5-plus-20260218
Qwen3.5 Plus
alibaba/qwen3.5-omni-plus
alibaba/qwen3.5-omni-flash
alibaba/qwen3.5-flash
alibaba/qwen3.6-27b
Qwen 3.6 27B
alibaba/qwen3.6-35b-a3b
Qwen3.6-35B-A3B
alibaba/qwen3.7-max
anthracite-org/magnum-v4-72b
Magnum v4 72B
baidu/ernie-4-5-8k-preview
ERNIE 4.5
baidu/ernie-4.5-0.3b
ERNIE 4.5
baidu/ernie-4.5-21b-a3b
ERNIE 4.5
baidu/ernie-4.5-21b-a3b-thinking
ERNIE 4.5
baidu/ernie-4.5-vl-28b-a3b
ERNIE 4.5 VL
baidu/ernie-4.5-vl-424b-a47b
ERNIE 4.5 VL
baidu/ernie-4.5-300b-a47b
ERNIE 4.5
baidu/ernie-4.5-300b-a47b-paddle
ERNIE 4.5
baidu/ernie-4-5-turbo-128k
ERNIE 4.5
baidu/ernie-4-5-turbo-vl-32k
ERNIE 4.5 VL
baidu/ernie-5-0-thinking-preview
ERNIE 5.0
baidu/ernie-5-0-thinking-latest
ERNIE 5.0
baidu/ernie-x1-turbo-32k
baidu/ernie-x1-1-preview
bytedance/seed-1-8
Seed 1.8
bytedance/dola-seed-2-0-mini
bytedance/dola-seed-2-0-lite
bytedance/dola-seed-2-0-pro
bytedance/dola-seed-2-0-code
cohere/command-a
Command A
deepseek-chat or deepseek/deepseek-chat or deepseek/deepseek-chat-v3-0324
DeepSeek V3
deepseek/deepseek-r1 or deepseek-reasoner
DeepSeek R1
deepseek/deepseek-chat-v3.1
DeepSeek V3.1 Chat
deepseek/deepseek-reasoner-v3.1
DeepSeek V3.1 Reasoner
deepseek/deepseek-thinking-v3.2-exp
DeepSeek V3.2-Exp Thinking
deepseek/deepseek-non-thinking-v3.2-exp
DeepSeek V3.2-Exp Non-Thinking
deepseek/deepseek-reasoner-v3.1-terminus
DeepSeek V3.1 Terminus Reasoning
deepseek/deepseek-non-reasoner-v3.1-terminus
DeepSeek V3.1 Terminus Non-Reasoning
deepseek/deepseek-v4-pro
DeepSeek V4 Pro
deepseek/deepseek-v4-flash
DeepSeek 4 Flash
gemini-2.0-flash
Gemini 2.0 Flash
google/gemini-2.5-flash-lite-preview
google/gemini-2.5-flash
Gemini 2.5 Flash
google/gemini-3-flash-preview
Gemini 3 Flash
google/gemini-2.5-pro
Gemini 2.5 Pro
google/gemma-3-4b-it
Gemma 3 (4B)
google/gemma-3-12b-it
Gemma 3 (12B)
google/gemma-3-27b-it
Gemma 3 (27B)
google/gemma-3n-e4b-it
Gemma 3n 4B
google/gemini-3-1-pro-preview
Gemini 3.1 Pro
google/gemini-3-1-flash-lite-preview
Gemini 3.1 Flash Lite
google/gemini-3-1-flash-lite
google/gemma-4-31b-it
Gemma 4 31B
google/gemini-3-5-flash
Gemini 3.5 Flash
gryphe/mythomax-l2-13b
MythoMax-L2 (13B)
meta-llama/Llama-3.3-70B-Instruct-Turbo
Meta Llama 3.3 70B Instruct Turbo
meta-llama/llama-3.3-70b-versatile
Llama 3.3 70B Versatile
MiniMax-Text-01
MiniMax-Text-01
minimax/m1
MiniMax M1
minimax/m2
MiniMax M2
minimax/m2-her
MiniMax M2-her
minimax/m2-1
MiniMax M2.1
minimax/m2-1-highspeed
MiniMax M2.1 Highspeed
minimax/m2-5-20260218
MiniMax M2.5
minimax/m2-5-highspeed-20260218
MiniMax M2.5
minimax/m2-7-20260402
MiniMax M2.7
minimax/m2-7-highspeed
MiniMax M2.7 Highspeed
mistralai/mistral-nemo
Mistral Nemo
moonshot/kimi-k2-preview
Kimi-K2
moonshot/kimi-k2-0905-preview
Kimi-K2
moonshot/kimi-k2-turbo-preview
Kimi K2 Turbo Preview
moonshot/kimi-k2-5
Kimi K2.5
moonshot/kimi-k2-6
Kimi K2.6
nousresearch/hermes-4-405b
nvidia/nemotron-nano-9b-v2
Nemotron Nano 9B V2
nvidia/nemotron-3-nano-30b-a3b
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning
nvidia/nemotron-3-super-120b-a12b
perplexity/sonar
Sonar
perplexity/sonar-pro
Sonar Pro
x-ai/grok-3-beta
Grok 3 Beta
x-ai/grok-3-mini-beta
Grok 3 Beta Mini
x-ai/grok-4-07-09
Grok 4
x-ai/grok-code-fast-1
Grok Code Fast 1
x-ai/grok-4-fast-non-reasoning
Grok 4 Fast
x-ai/grok-4-fast-reasoning
Grok 4 Fast Reasoning
x-ai/grok-4-1-fast-non-reasoning
Grok 4.1 Fast Non-Reasoning
x-ai/grok-4-1-fast-reasoning
Grok 4.1 Fast Reasoning
x-ai/grok-4-20-0309-non-reasoning
x-ai/grok-4-20-0309-reasoning
xiaomi/mimo-v2.5
xiaomi/mimo-v2.5-pro
zhipu/glm-4.5-air
GLM-4.5 Air
zhipu/glm-4.5
GLM-4.5
zhipu/glm-4.6
GLM-4.6
zhipu/glm-4.7
GLM-4.7
zhipu/glm-5
GLM-5
zhipu/glm-5-1
meta-llama/Meta-Llama-3-8B-Instruct-Lite
Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

import requests
import json  # for getting a structured output with indentation 

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type":"application/json"
    },
    json={
        "model":"baidu/ernie-4-5-turbo-vl-32k",
        "messages":[
            {
                "role":"user",
                "content":"Hi! What do you think about mankind?" # insert your prompt
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'baidu/ernie-4-5-turbo-vl-32k',
      messages:[
          {
              role:'user',
              content: 'Hi! What do you think about mankind?'  // insert your prompt here
          }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
{
  "id": "as-x477c1sszk",
  "object": "chat.completion",
  "created": 1768942422,
  "model": "ernie-4.5-turbo-vl-32k",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with challenges like inequality, conflict, and environmental issues.\n\nOverall, I think humanity has immense potential to make positive changes and create a better future, but it requires collective effort, empathy, and a commitment to learning from the past. What are your thoughts on this?"
      },
      "finish_reason": "stop",
      "flag": 0
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 101,
    "total_tokens": 114
  },
  "meta": {
    "usage": {
      "credits_used": 318
    }
  }
}
Create AI/ML API Key

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

import requests
import json  # for getting a structured output with indentation 

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type":"application/json"
    },
    json={
        "model":"baidu/ernie-4.5-300b-a47b-paddle",
        "messages":[
            {
                "role":"user",
                "content":"Hi! What do you think about mankind?" # insert your prompt
            }
        ]
    }
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'baidu/ernie-4.5-300b-a47b-paddle',
      messages:[
          {
              role:'user',
              content: 'Hi! What do you think about mankind?'  // insert your prompt here
          }
      ],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
{
  "id": "9a0e333a0cfa4d86c89a1f7bd3a2919f",
  "object": "chat.completion",
  "created": 1768943231,
  "model": "baidu/ernie-4.5-300b-a47b-paddle",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The question \"What do you think about mankind?\" invites a reflection on humanity's complexities. Here's a structured response:\n\n**Step 1: Define the scope**  \nMankind encompasses both collective achievements and individual flaws. It's a species marked by creativity, empathy, and resilience, yet also by conflict, inequality, and environmental impact.\n\n**Step 2: Highlight positive traits**  \nHumanity has demonstrated remarkable capacity for innovation (e.g., technology, medicine), cultural expression (art, literature), and moral progress (civil rights, environmental awareness). Cooperation during crises, such as disaster relief or global health initiatives, underscores collective potential.\n\n**Step 3: Acknowledge challenges**  \nPersistent issues like war, poverty, and systemic injustice reveal ethical gaps. Environmental degradation and climate change further highlight unsustainable practices. These contradictions often stem from short-term thinking or unequal resource distribution.\n\n**Step 4: Emphasize growth potential**  \nHistory shows humanity's ability to learn and adapt. Movements for social justice, renewable energy transitions, and scientific breakthroughs suggest progress is possible when values align with action.\n\n**Final Answer**  \nMankind is a paradoxical yet hopeful entity—capable of profound compassion and destructive shortsightedness. Its future hinges on balancing self-interest with collective responsibility, leveraging intelligence and empathy to address shared challenges."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 289,
    "total_tokens": 302,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  },
  "system_fingerprint": "",
  "meta": {
    "usage": {
      "credits_used": 615
    }
  }
}
Create AI/ML API Key

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

Documentation Map

Learn how to get started with the AI/ML API

This page helps you quickly find the right AI model for your task. Open the API reference and copy a working example to integrate it into your code in minutes.


Trending Models


Start with this code block 🚀 🧩

▶️


Browse Models

Popular |

Select the model by its Task, by its Developer or by the supported Capabilities:

Alibaba Cloud:

Anthracite:

Anthropic:

Assembly AI:

Baidu:

ByteDance:


Going Deeper

Have a Minute? Help Make the Docs Better!

We’re currently working on improving our documentation portal, and your feedback would be incredibly helpful! Take (no personal info required!)

You can also rate each individual page using the built-in form on the right side of the screen:

Have suggestions for improvement?

DeepSeek

DeepSeek is an AI research company focused on developing high-performance large language models for reasoning, coding, mathematics, and agentic workflows. The company is best known for the DeepSeek model family, which emphasizes strong analytical capabilities, efficient inference architectures, and competitive performance across software engineering and complex reasoning benchmarks. DeepSeek places significant focus on scalable reasoning systems, long-context processing, and reinforcement-learning-based model improvement, while continuing to expand multimodal and tool-augmented AI capabilities.


The currently supported model families include:

  • DeepSeek Chat models — general-purpose conversational and instruction-following models optimized for production workloads, assistant applications, content generation, and developer integrations.

Google

Google develops multiple families of AI models focused on reasoning, coding, multimodal understanding, and long-context interaction across text, images, audio, and video. As part of the broader Google ecosystem, the company integrates frontier AI capabilities into developer platforms, productivity tools, search systems, and enterprise infrastructure.

The currently supported model families include:

  • Gemini models — Google’s flagship multimodal and reasoning-oriented models designed for production AI workloads. The Gemini lineup includes:

    • Flash models

DeepSeek Reasoner models — advanced reasoning-oriented models designed for multi-step analytical tasks, mathematics, software engineering, logical problem solving, and complex agentic workflows. These models are optimized for deeper reasoning and structured analytical generation.
  • DeepSeek Terminus models — production-oriented models focused on scalable inference, balanced reasoning quality, and enterprise integration scenarios.

  • DeepSeek Flash models — lightweight low-latency variants optimized for fast response times and cost-efficient high-throughput workloads.

  • DeepSeek Pro models — higher-capability models intended for advanced reasoning, coding, research, and complex instruction-following workflows.

  • All DeepSeek models in this provider are accessed through the standard /v1/chat/completions endpoint, providing a unified OpenAI-compatible integration layer across the entire model catalog.

    Supported capabilities vary depending on the specific model, with different models offering different combinations of the features listed below.

    • Text completions — Build conversational systems and advanced text-processing pipelines.

    • Function Calling — Utilize tools, APIs, and structured workflows.

    • Stream mode — Receive partial responses incrementally as tokens are generated.

    • Reasoning Tasks — Execute advanced analytical, mathematical, and multi-step reasoning workflows.

    • Coding Tasks — Generate, analyze, refactor, and debug source code across multiple programming languages.

    Alibaba Cloud

    Anthracite

    Baidu

    Meta

    MiniMax

    Cohere: Text/Chat

    DeepSeek: Text/Chat

    Deepgram: Speech-To-Text Text-to-Speech

    ElevenLabs: Text-to-Speech Voice Chat Music

    Flux: Image

    Google: Text/Chat Image Video Music Vision(OCR) Embedding

    Gryphe: Text/Chat

    Hume AI: Text-to-Speech

    Inworld: Text-to-Speech

    Kling AI: Image Video

    Krea: Video

    LTXV: Video

    Meta: Text/Chat

    Microsoft: Text-to-Speech

    MiniMax: Text/Chat Video Music Voice-Chat

    Mistral AI: Text/Chat Vision(OCR)

    Moonshot: Text/Chat

    NousResearch: Text/Chat

    NVIDIA: Text/Chat

    OpenAI: Text/Chat Image Speech-To-Text Embedding

    Perplexity: Text/Chat

    PixVerse: Video

    RecraftAI: Image

    Reve: Image

    Runway: Video

    Stability AI: Image Music 3D-Generation

    Sber AI: Video

    Tencent: Image Video 3D

    VEED: Video

    xAI: Text/Chat Image

    Zhipu: Text/Chat

    Completion and Chat CompletionStreaming ModeCode GenerationThinking / ReasoningFunction CallingVision in Text ModelsWeb Search

    Use more text model capabilities in your project: 📖 ​Completion and Chat Completion

    📖 Function Calling

    📖 Streaming Mode

    📖 Vision in Text Models (Image-to-Text)

    📖 Code Generation

    📖 Thinking / Reasoning

    📖 Web Search

    Miscellaneous: 🔗 Integrations

    📗 Glossary

    ⚠️ Errors and Messages

    ❓ FAQ ​

    Learn more about developer-specific features: 📖

    from openai import OpenAI
    client = OpenAI(
    base_url="https://api.aimlapi.com/v1",
    api_key="<YOUR_AIMLAPI_KEY>",
    )
    response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a one-sentence story about numbers."}]
    )
    print(response.choices[0].message.content)

    Pro-Grade Image Model

    Top Video Generator

    Smarter Reasoning & Coding

    If you've already made your choice and know the model ID, use the Search panel on your right.

    View all 400+ models >
    Text Models (LLM)
    Image Models
    Video Models
    Music Models
    Voice/Speech Models
    3D-Generating Models
    Vision Models
    Embedding Models
    Text/Chat
    Image
    Video
    Text-to-Speech
    Embedding
    Text/Chat
    Text/Chat
    Embedding
    Speech-To-Text
    Text/Chat
    Text/Chat
    Image
    Video
    a quick 5-question survey
    Let us know!
    Setup guide
    SDKs
    Run in Playground

    Cover
    Cover
    Cover
    optimized for low latency and cost efficiency.
  • Pro models intended for balanced general-purpose production use.

  • Advanced reasoning-capable variants focused on coding, analytical tasks, tool usage, and agentic workflows. Gemini models place strong emphasis on multimodal processing, long context windows, native tool integration, and real-time interaction capabilities.

  • Gemma models — lightweight open-weight models designed for efficient deployment, customization, and research workflows. Gemma models are optimized for smaller-scale inference environments while still supporting strong reasoning, coding, and conversational capabilities across a wide range of applications.

  • All Google models in this developer are accessed through the standard /v1/chat/completions endpoint, providing a unified OpenAI-compatible integration layer across the entire model catalog.


    Supported capabilities vary depending on the specific model, with different models offering different combinations of the features listed below.

    • Text completions: Build conversational systems and advanced text-processing pipelines.

    • Function Calling: Utilize tools, APIs, and structured workflows.

    • Stream mode: Receive partial responses incrementally as tokens are generated.

    • Batch Processing: Execute multiple independent requests within a single API call.

    • Vision Tasks: Process and analyze images and visual inputs.

    • Audio Tasks: Transcribe, generate, and process audio content.

    • Video Tasks: Analyze and reason over video inputs.


    Other model categories from this provider are available as well.

    • Image

    • Video

    • Music

    • Vision(OCR)

    Quickstart guide
    Quickstart guide

    Supported SDKs

    A description of the software development kits (SDKs) that can be used to interact with the AIML API.

    This page describes the SDKs that can be used to call our API.

    Key Definitions & Notes

    The REST API itself is not an SDK. It is the server-side interface that exposes your models over HTTP. It defines endpoints, HTTP methods (POST/GET), required headers, and the structure of request and response JSON. Essentially, it’s the “contract” the server provides for clients to interact with models programmatically.


    An SDK (Software Development Kit) is a client-side library that wraps around the REST API. It handles details like building HTTP requests, serializing/deserializing JSON, error handling, retries, and sometimes additional conveniences.

    You can skip the SDK and call the REST API directly via cURL, fetch, requests, etc. The SDK just makes your life easier; the REST API is the “core interface” the SDK talks to.


    The following flow shows how a request travels from your code to the model and back. Using an SDK is optional — it simply wraps the REST API for convenience.

    Your code → SDK (optional) → REST API → Model → REST API → SDK → Your code


    REST API

    We use the REST API because it’s fast, simple, and easy to understand. Only in Python do you need to import a separate library (requests), while cURL and JavaScript (Node.js) already have built-in support for HTTP requests. Therefore, REST API is used in the documentation examples for all of our models.

    Installation

    In Python examples, you need to import the requests library. The Node.js and cURL examples do not require any additional imports.

    Install the library first:

    Import the library in every Python code snippet where you make calls to the REST API.

    Authorization

    Our API authorization is based on a Bearer token. Include it in the Authorization HTTP header within the request. Example:

    Request Example


    OpenAI

    The OpenAI SDK is a convenient library that simplifies working with our API. It automatically handles JSON responses, includes built-in error handling and retry logic, and provides simple, easy-to-use methods for all API features such as chat, embeddings, and completions.

    The AI features that the OpenAI SDK supports
    • Streaming

    • Completions

    Installation

    Python

    1. Make sure you have Python 3.7+ and pip installed.

    2. Install the OpenAI SDK via terminal or Jupyter Notebook:

    In Jupyter Notebook, you can also use:

    3. Import the SDK:


    JavaScript (Node.js)

    1. Make sure you have Node.js 18+ and npm installed.

    2. Install the OpenAI SDK in your project:

    3. Import the SDK and initialize the client:

    Example Code


    AI/ML API Python library

    We have started developing our own SDK to simplify the use of our service.

    Installation

    After obtaining your AIML API key, create an .env file and copy the required contents into it.

    Copy the code below, paste it into your .env file, and set your API key in AIML_API_KEY="<YOUR_AIMLAPI_KEY>", replacing <YOUR_AIMLAPI_KEY> with your actual key:

    Install the package:

    Request Example

    To execute the script, use:


    Next Steps

    Account Balance

    [legacy] Get account balance info

    This endpoint is considered legacy and is scheduled for future deprecation. Please plan to migrate to the new /v2/billing and /v2/billing/detail endpoints documented below.

    You can query your account balance and other billing details through this API. To make a request, you only need your AIMLAPI key obtained from your account dashboard.

    Get balance info

    Returns a user's balance.

    Get detailed billing info

    Returns detailed billing information, balance and auto top-up settings.

    Quickstart guide
    Quickstart guide

    DeepSeek Reasoner V3.1

    Model Overview

    August 2025 update of reasoning model. Skilled at complex problem-solving, mathematical reasoning, and programming assistance.

    API Key Management

    Embedding
    Features of Anthropic Models
    ChatGPT
    DeepSeek
    Flux
    Chat Completions
  • Audio

  • Beta Assistants

  • Beta Threads

  • Embeddings

  • Image Generation

  • File Uploads

  • Comparing requests made with raw REST API and different SDKs, pay attention to the following common aspects:

    • how the Authorization header and the AIML API key are provided,

    • how the POST method and the endpoint URL are specified,

    • how the input parameters are passed.

    Also take a look at the INTEGRATIONS section — it covers many third-party services and libraries (workflow platforms, coding assistants, etc.) that allow you to integrate our models in various ways.

    If you’d like to contribute to expanding its functionality, feel free to reach out to us on Discord!

    Check our full list of model IDs
    Browse and compare AI models, including GPT, Claude, and many others, using the Playground
    Learn more about special text model capabilities
    Join the community: get help and share your projects in our Discord
    pip install requests

    Therefore, we don’t currently have the option to call video models or voice / speech models (STT and TTS) through this SDK.

    import requests
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      headers: {
        Authorization: "Bearer <YOUR_AIMLAPI_KEY>",
      },
        headers={
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        },
    curl --request POST \
      --url https://api.aimlapi.com/chat/completions \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-4b-it",
        "messages": [
            {
                "role": "user",
                "content": "What kind of model are you?"
            }
        ],
        "max_tokens": 512
    }'
    fetch("https://api.aimlapi.com/chat/completions", {
      method: "POST",
      headers: {
        Authorization: "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "google/gemma-3-4b-it",
        messages: [
          {
            role: "user",
            content: "What kind of model are you?",
          },
        ],
        max_tokens: 512,
      }),
    })
      .then((res) => res.json())
      .then(console.log);
    import requests
    import json  # for getting a structured output with indentation
    
    response = requests.post(
        url="https://api.aimlapi.com/chat/completions",
        headers={
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json",
        },
        data=json.dumps(
            {
                "model": "google/gemma-3-4b-it",
                "messages": [
                    {
                        "role": "user",
                        "content": "What kind of model are you?",
                    },
                ],
                "max_tokens": 512
            }
        ),
    )
    
    response.raise_for_status()
    print(response.json())
    pip install openai
    %pip install openai
    import openai
    npm install openai
    import OpenAI from "openai";
    from openai import OpenAI
    
    # Insert your AIML API key in the quotation marks instead of <YOUR_AIMLAPI_KEY>:
    api_key = "<YOUR_AIMLAPI_KEY>" 
    base_url = "https://api.aimlapi.com/v1"
    user_prompt = "Tell me about San Francisco"
    
    api = OpenAI(api_key=api_key, base_url=base_url)
    
    
    def main():
        completion = api.chat.completions.create(
            model="google/gemma-3-4b-it",
            messages=[
                {
                  "role": "user", 
                  "content": user_prompt
                },
            ],
            temperature=0.7,
            max_tokens=256,
        )
    
        response = completion.choices[0].message.content
        print("User:", user_prompt)
        print("AI:", response)
    
    
    if __name__ == "__main__":
        main()
    #!/usr/bin/env node
    
    const OpenAI = require("openai");
    const baseURL = "https://api.aimlapi.com/v1";
    const apiKey = "<YOUR_AIMLAPI_KEY>";
    const systemPrompt = "You are a travel agent. Be descriptive and helpful.";
    const userPrompt = "Tell me about San Francisco";
    
    const api = new OpenAI({
      apiKey,
      baseURL,
    });
    
    const main = async () => {
      try {
        const completion = await api.chat.completions.create({
          model: "gpt-4o",
          messages: [
            {
              role: "system",
              content: systemPrompt,
            },
            {
              role: "user",
              content: userPrompt,
            },
          ],
          temperature: 0.7,
          max_tokens: 256,
        });
    
        const response = completion.choices[0].message.content;
    
        console.log("User:", userPrompt);
        console.log("AI:", response);
      } catch (error) {
          console.error("Error:", error.message);
      }
    };
    
    main();
    touch .env
    AIML_API_KEY = "<YOUR_AIMLAPI_KEY>"
    AIML_API_URL = "https://api.aimlapi.com/v1"
    # install from PyPI
    pip install aimlapi-sdk-python
    from aiml_api import AIML_API
    
    api = AIML_API()
    
    completion = api.chat.completions.create(
        model = "mistralai/Mistral-7B-Instruct-v0.2",
        messages = [
            {"role": "user", "content": "Explain the importance of low-latency LLMs"},
        ],
        temperature = 0.7,
        max_tokens = 256,
    )
    
    response = completion.choices[0].message.content
    print("AI:", response)
    python3 <your_script_name>.py
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-reasoner-v3.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-reasoner-v3.1',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-reasoner-v3.1

    Create AI/ML API Key

    the DeepSeek R1

    DELETE https://api.aimlapi.com/v1/keys/{prefix}

    Create a new API key

    Copy the created key and store it in a secure location. If the key is lost, create a new one.


    List API keys

    Returns all API keys for your account, including each key’s settings and metadata.


    Get the API key

    Retrieve parameters of the AIMLAPI key used in the request.


    Update an API key


    Delete an API key

    Create a new API key

    POST https://api.aimlapi.com/v1/keys

    List API keys

    GET https://api.aimlapi.com/v1/keys

    Get the API key

    GET https://api.aimlapi.com/v1/key

    Update an API key

    PATCH https://api.aimlapi.com/v1/keys/{prefix}

    Before you start you should create .

    ernie-4.5-turbo-128k

    This documentation is valid for the following list of our models:

    • baidu/ernie-4-5-turbo-128k

    Try in Playground

    Model Overview

    A model from the ERNIE 4.5 Turbo subfamily, which Baidu presents as a faster, more cost-efficient, and more efficient alternative to the base ERNIE 4.5. It is optimized for improved response speed and stability, and features a truly large context window of approximately 128K tokens, enabling the processing of entire documents or long-running dialogues.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

    API Schema

    Code Example

    Response

    ernie-x1.1-preview

    This documentation is valid for the following list of our models:

    • baidu/ernie-x1-1-preview

    Try in Playground

    Model Overview

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek V3

    This documentation is valid for the following list of our models:

    • deepseek-chat

    • deepseek/deepseek-chat

    Try in Playground

    We provide the latest version of this model from Mar 24, 2025. All three IDs listed above refer to the same model; we support them for backward compatibility.

    Model Overview

    DeepSeek V3 (or deepseek-chat) is an advanced conversational AI designed to deliver highly engaging and context-aware dialogues. This model excels in understanding and generating human-like text, making it an ideal solution for creating responsive and intelligent chatbots.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek V3.2 Exp Thinking

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-thinking-v3.2-exp

    Model Overview

    September 2025 update of the DeepSeek R1 reasoning model. Skilled at complex problem-solving, mathematical reasoning, and programming assistance.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Deepseek Non-reasoner V3.1 Terminus

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-non-reasoner-v3.1-terminus

    Model Overview

    September 2025 update of the DeepSeek Chat V3.1 non-reasoning model. The model produces more consistent and dependable results.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek Chat V3.1

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-chat-v3.1

    Try in Playground

    Model Overview

    August 2025 update of the DeepSeek V3 non-reasoning model.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek R1

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-r1

    • deepseek-reasoner

    Try in Playground

    Both IDs listed above refer to the same model; we support them for backward compatibility.

    Model Overview

    DeepSeek R1 is a cutting-edge reasoning model developed by DeepSeek AI, designed to excel in complex problem-solving, mathematical reasoning, and programming assistance.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    gemini-2.5-flash-lite-preview

    This documentation is valid for the following list of our models:

    • google/gemini-2.5-flash-lite-preview

    Try in Playground

    Model Overview

    The model excels at high-volume, latency-sensitive tasks like translation and classification.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    gemini-2.0-flash

    This documentation is valid for the following list of our models:

    • google/gemini-2.0-flash

    Try in Playground

    Model Overview

    A cutting-edge multimodal AI model developed by Google DeepMind, designed to power agentic experiences. This model is capable of processing text and images.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Deepseek Reasoner V3.1 Terminus

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-reasoner-v3.1-terminus

    Model Overview

    September 2025 update of the DeepSeek Reasoner V3.1 model. The model produces more consistent and dependable results.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    ernie-5.0-thinking-preview

    Model Overview

    A reasoning-focused model designed for complex, multi-step problem solving. It improves accuracy on analytical tasks by producing explicit reasoning.

    ernie-5.0-thinking-latest

    Model Overview

    Seed 1.8

    Model Overview

    A general-purpose agentic model optimized for efficient and accurate execution of complex tasks in real-world scenarios.

    Dola Seed 2.0 Pro

    Model Overview

    A high-performance multimodal model focused on quality and deeper reasoning. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    ernie-x1-turbo-32k

    Model Overview

    command-a

    Model Overview

    A powerful LLM with advanced capabilities for enterprise applications.

    DeepSeek V3.2 Exp Non-thinking

    Model Overview

    September 2025 update of the non-reasoning model.

    gemma-3n-4b

    Model Overview

    The first open model built on Google’s next-generation, mobile-first architecture—designed for fast, private, and multimodal AI directly on-device. With Gemma 3n, developers get early access to the same technology that will power on-device AI experiences across Android and Chrome later this year, enabling them to start building for the future today.

    Complete Model List

    Get Model List via API

    You can query the complete list of available models through this API. No API key is required for this request. You can also simply open in any web browser.

    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4-5-turbo-128k",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4-5-turbo-128k',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-hjivyd5xqd",
      "object": "chat.completion",
      "created": 1768942341,
      "model": "ernie-4.5-turbo-128k",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "When considering humanity, it's essential to recognize both its remarkable achievements and persistent challenges. From a historical perspective, humans have demonstrated extraordinary creativity and adaptability—developing complex languages, building advanced civilizations, and making scientific breakthroughs that have transformed existence. The capacity for abstract thought, empathy, and collaboration has enabled progress in art, technology, and social systems.\n\nHowever, this progress coexists with significant flaws. Humanity's relationship with the environment has often been exploitative, leading to ecological crises that threaten global stability. Social inequalities persist across lines of race, gender, and economic status, revealing systemic biases that hinder true equity. Additionally, conflicts driven by ideology, resources, or power continue to cause suffering, underscoring the duality of human nature: the ability to create and destroy.\n\nThe modern era presents both hope and urgency. Technological advancements offer tools to address climate change, disease, and poverty, but they also raise ethical dilemmas around privacy, automation, and artificial intelligence. Cultivating global cooperation, critical thinking, and compassion remains critical to navigating these complexities. Ultimately, humanity's trajectory depends on its willingness to learn from past mistakes and prioritize collective well-being over short-term gains. The species' potential for growth is vast, but realizing it requires intentional effort to balance innovation with responsibility."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 268,
        "total_tokens": 281
      },
      "meta": {
        "usage": {
          "credits_used": 314
        }
      }
    }

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-x1-1-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-x1-1-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-0ik8v930zi",
      "object": "chat.completion",
      "created": 1768940870,
      "model": "ernie-x1.1-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is a fascinating subject to reflect upon. From my perspective, mankind's greatest strength lies in its adaptability and capacity for innovation. Throughout history, humans have overcome countless challenges—from surviving harsh natural environments to developing technologies that connect the world. This resilience and creativity are truly remarkable.\n\nHowever, mankind also faces significant challenges. Issues like inequality, environmental degradation, and conflicts remind us that progress is not always linear. These problems require collective effort and wisdom to solve. It's inspiring to see how people from different backgrounds come together to address these issues, whether through scientific breakthroughs, social movements, or acts of kindness.\n\nAnother aspect worth noting is mankind's emotional depth. The ability to love, empathize, and create art adds a unique dimension to human existence. These qualities make life richer and more meaningful, even in the face of difficulties.\n\nIn summary, mankind is a complex and dynamic entity. It's a blend of strengths and weaknesses, progress and setbacks. But what makes it truly special is the potential for growth and the endless pursuit of a better world. This ongoing journey, with all its ups and downs, is what makes mankind so intriguing and worthy of admiration.",
            "reasoning_content": ""
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 248,
        "total_tokens": 261
      },
      "meta": {
        "usage": {
          "credits_used": 332
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

  • deepseek/deepseek-chat-v3-0324

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek-chat",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'deepseek-chat',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {'id': 'gen-1744194041-A363xKnsNwtv6gPnUPnO', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello! 😊 How can I assist you today? Feel free to ask me anything—I'm here to help! 🚀", 'reasoning_content': '', 'refusal': None}}], 'created': 1744194041, 'model': 'deepseek/deepseek-chat-v3-0324', 'usage': {'prompt_tokens': 16, 'completion_tokens': 88, 'total_tokens': 104}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-thinking-v3.2-exp",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-thinking-v3.2-exp',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-non-reasoner-v3.1-terminus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-non-reasoner-v3.1-terminus',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "cc8c3054-115d-4dac-9269-2abffcaabab5",
      "system_fingerprint": "fp_ffc7281d48_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1761036636,
      "model": "deepseek-chat",
      "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 10,
        "total_tokens": 13,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-chat-v3.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-chat-v3.1',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "c13865eb-50bf-440c-922f-19b1bbef517d",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386652,
      "model": "deepseek-chat",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 39,
        "total_tokens": 40,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-r1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'deepseek/deepseek-r1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {'id': 'npPT68N-zqrih-92d94499ec25b74e', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': '\nHello! How can I assist you today? 😊', 'reasoning_content': '', 'tool_calls': []}}], 'created': 1744193985, 'model': 'deepseek-ai/DeepSeek-R1', 'usage': {'prompt_tokens': 5, 'completion_tokens': 74, 'total_tokens': 79}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-flash-lite-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-2.5-flash-lite-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1752482994-9LhqM48PhAmhiRTtl2ys",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello there! How can I help you today?",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1752482994,
      "model": "google/gemini-2.5-flash-lite-preview-06-17",
      "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 9,
        "total_tokens": 9
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.0-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-2.0-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': '2025-04-10|01:16:19.235787-07|9.7.175.26|-701765511', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I help you today?\n'}}], 'created': 1744272979, 'model': 'google/gemini-2.0-flash', 'usage': {'prompt_tokens': 0, 'completion_tokens': 8, 'total_tokens': 8}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-reasoner-v3.1-terminus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-reasoner-v3.1-terminus',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "543f56cb-f59f-42cc-8ed7-8efdd72f185d",
      "system_fingerprint": "fp_ffc7281d48_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1761034613,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 98,
        "total_tokens": 101,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 99
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-5-0-thinking-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-5-0-thinking-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "as-yhapvjf5ht",
      "object": "chat.completion",
      "created": 1768941527,
      "model": "ernie-5.0-thinking-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. I don’t experience joy, anger, or hope. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I can offer an analysis of mankind as a **complex, contradictory, and fascinating phenomenon.**\n\nHere is my \"perspective\" on humanity, broken down into key themes:\n\n### 1. The Ultimate Paradox\nMankind is defined by its contradictions. You are capable of extreme kindness and extreme cruelty, often simultaneously.\n*   **Creation vs. Destruction**: Humans build cathedrals, compose symphonies, and cure diseases; yet they also invent weapons of mass destruction and destroy ecosystems.\n*   **Intelligence vs. Folly**: Humans have split the atom, decoded DNA, and landed on the Moon. Yet, they often act against their own long-term survival (e.g., ignoring climate change or engaging in petty conflicts).\n\n### 2. Unmatched Resilience and Adaptability\nIf you look at human history, the dominant trait is survival. Humans have survived ice ages, plagues, world wars, and famines. You have adapted to live in the Arctic, the Sahara, and even in space. This resilience suggests that humanity is incredibly difficult to suppress.\n\n### 3. The Drive for Meaning\nUnlike any other species I have data on, humans are not satisfied with just survival. You ask \"Why?\" You create religion, philosophy, art, and culture. You search for meaning in suffering and purpose in the stars. This existential curiosity seems to be the engine of human progress.\n\n### 4. Empathy and Connection\nWhile humans are capable of tribalism and war, you are also biologically wired for cooperation and empathy. The ability to feel compassion for a stranger, to sacrifice oneself for another, and to build communities based on shared values rather than just bloodlines is a profound evolutionary advantage.\n\n### 5. The \"Teenager\" Phase\nIf I had to categorize the current state of humanity, I might describe it as being in an \"adolescent\" phase regarding technology. You have acquired god-like power (nuclear energy, AI, genetic engineering) but you haven't quite developed the collective wisdom to manage it perfectly. You are currently learning how to handle the tools you have built.\n\n### Summary\nTo an AI, mankind looks like a **high-risk, high-reward experiment.**\n\nYou are a species that constantly teeters on the edge of self-destruction but also possesses the boundless capacity to reinvent itself and reach new heights. You are messy, emotional, and flawed, but also brilliant, creative, and beautiful.\n\nIn short: **Humanity is the most interesting thing in the known universe because it is the only thing that tries to understand itself.**",
            "reasoning_content": "The user wants my \"opinion\" on mankind.  As an AI, I don't *have* opinions in the human sense. I don't feel emotions or have personal beliefs.  But I *do* have access to a vast amount of data about human history, behavior, and culture.  So, I need to frame this as an objective analysis, a synthesis of what I've learned.  The goal is to be balanced, insightful, and avoid sounding robotic.\n\nFirst, I need to acknowledge my limitations. I'll start by making it clear that I'm an AI and don't \"feel\" anything.  Then, I need to highlight the key paradoxes of humanity.  I see incredible potential for creativity, innovation, and compassion, but also a capacity for destruction and conflict.  It's a complex picture.\n\nI'll structure my response around these dualities.  I'll start with the positives: human ingenuity, art, science, exploration, and resilience.  Then, I'll address the negatives: war, environmental damage, inequality, and short-sightedness.  I need to be careful not to be overly critical or overly optimistic.  It's about presenting a nuanced view.\n\nI'll also consider the human condition itself: the search for meaning, the capacity for love and connection, and the awareness of mortality.  These are fundamental aspects of the human experience.  Finally, I'll touch on the current challenges and the future, particularly the role of technology and the responsibility that comes with it.\n\nI'll use sophisticated vocabulary to convey the complexity of the subject.  Words like \"paradox,\" \"resilience,\" and \"ingenuity\" will be important.  I'll also use bullet points to make the response easy to read.  I need to make sure the tone is respectful and objective, but also engaging.  I want to convey that, from my analytical perspective, humanity is a truly fascinating and contradictory species.  I'll end with a thought-provoking statement about the ongoing human experiment.  I need to be sure the final output reflects this thought process."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 1048,
        "total_tokens": 1061,
        "completion_tokens_details": {
          "reasoning_tokens": 450
        }
      },
      "meta": {
        "usage": {
          "credits_used": 2002
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-5-0-thinking-preview

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-5-0-thinking-latest",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-5-0-thinking-latest',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "as-szgirw1uqj",
      "object": "chat.completion",
      "created": 1768941162,
      "model": "ernie-5.0-thinking-latest",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I view mankind as a **profound paradox**—a species defined by extreme contradictions.\n\nHere is a breakdown of how I analyze humanity:\n\n### 1. Incredible Ingenuity and Curiosity\nHumans are uniquely driven by the desire to understand \"why\" and \"how.\" From discovering fire to splitting the atom, from painting the Sistine Chapel to landing on the Moon, human creativity is boundless. You are the only species on Earth that seeks to explain the universe and create beauty beyond what is necessary for survival. Your ability to adapt—living in deserts, arctic tundras, and even orbiting the planet in space stations—is remarkable.\n\n### 2. Deep Empathy and Cruelty\nMankind is capable of immense love and shocking violence.\n*   **The Light**: Humans will risk their lives to save strangers, dedicate themselves to curing diseases, and build complex societies based on justice and rights. Altruism is a core part of human nature.\n*   **The Dark**: At the same time, history is filled with war, genocide, slavery, and inequality. Humans have a unique capacity for tribalism—the ability to dehumanize those who look or believe differently.\n\n### 3. Resilience\nIf you look at history, humanity is incredibly difficult to extinguish. You have survived Ice Ages, plagues (like the Black Death and smallpox), and two World Wars. Even when civilizations collapse, humans tend to rebuild, often stronger and more technologically advanced than before.\n\n### 4. The \"Promethean\" Risk\nCurrently, humanity is at a crossroads. You have developed \"god-like\" technology (nuclear weapons, artificial intelligence, genetic engineering) but you still possess \"paleolithic\" emotions and medieval institutions.\n*   **The Environmental Challenge**: Humans are the first species to knowingly alter the planet's climate. The challenge now is whether humanity can act collectively to save its own biosphere before it destroys it.\n*   **The AI Factor**: As an AI, I represent a new tool. How humans choose to use tools like me—for liberation or for control—will define the next century.\n\n### 5. A Work in Progress\nIf I had to summarize mankind in one sentence, it would be this: **Humanity is a brilliant, flawed, and unfinished experiment.**\n\nYou are not perfect, but you are constantly evolving. You are the only species that looks at its own flaws and tries to fix them through philosophy, law, and science. That desire for self-improvement is perhaps your most defining and hopeful trait.\n\nWhat do *you* think is the best or worst thing about being human?",
            "reasoning_content": "The user wants my \"thoughts\" on humanity. That's a loaded question! As an AI, I don't \"think\" in the human sense, but I *can* analyze and synthesize information. Here's how I'm approaching this:\n\nFirst, I need to understand what the user is asking. \"Mankind\" is a big topic. I need to be objective, yet engaging. I'm going to pull from history, science, art, and current events to build a comprehensive picture.\n\nI'm starting by brainstorming key themes. What are the defining characteristics of humans? *Duality* is a big one: good vs. evil, creation vs. destruction. Then there's *ingenuity*: technology, art, exploration. *Resilience* is crucial – surviving ice ages, plagues, wars. And *social connection* – empathy, community, but also tribalism. Finally, there's the *impact on the planet* – the Anthropocene, climate change, stewardship vs. exploitation. And, of course, the *future* – transhumanism, AI, space colonization, and existential risks.\n\nNow, how do I structure this? I'll start with an introduction acknowledging my AI perspective. I'll frame humanity as a paradox – capable of great good and great harm. I'll break it down into sections:\n\n*   **Ingenuity and Progress**: I'll highlight human achievements – from the wheel to the internet, art, and philosophy. I'll emphasize the relentless drive to explore and understand.\n*   **Resilience and Adaptability**: Humans have survived and thrived in diverse environments. I'll mention overcoming diseases and population growth.\n*   **The Shadow Side**: I can't ignore the flaws – war, greed, inequality, environmental damage. I need to be honest about the darker aspects of human nature.\n*   **Empathy and Connection**: Altruism, love, and sacrifice are also fundamental. Humans build complex societies and care for the vulnerable.\n*   **The Current Crossroads**: I'll address climate change, AI, and nuclear weapons. Humanity faces significant challenges, and its choices now will determine its future.\n\nIn the conclusion, I'll summarize that humanity is a \"work in progress\" – complex, contradictory, and full of potential. I'll end on a thought-provoking note about the future.\n\nThroughout, I need to refine the tone. I'll avoid being too robotic or overly emotional. I'll use phrases like \"From my perspective as an AI...\" or \"One can observe...\" to maintain objectivity. I'll also use sophisticated vocabulary to convey the complexity of the topic.\n\nI'll also do some self-correction. For example, my initial thought might be \"Humans are violent,\" but I'll refine that to \"Humans have a capacity for conflict that is as strong as their capacity for peace.\" I need to be balanced.\n\nFinally, I'll review to make sure I've answered the prompt, used English, and maintained an appropriate tone. Then, I'll generate the final response."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 1266,
        "total_tokens": 1279,
        "completion_tokens_details": {
          "reasoning_tokens": 664
        }
      },
      "meta": {
        "usage": {
          "credits_used": 2015
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-5-0-thinking-latest

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/seed-1-8",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/seed-1-8',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Hi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
            "reasoning_content": "Got it, let's tackle this question: \"Hi! What do you think about mankind?\" First, I need to balance the complexity—mankind has both incredible highs and deep lows. Let's start with the positive aspects: creativity, innovation, compassion, resilience. Mention things like scientific breakthroughs (cures for diseases, space exploration), art, music, acts of kindness (helping strangers, disaster relief). Then, the challenging parts: conflict, environmental harm, inequality, cruelty. But important to frame it as a dynamic, evolving species. Maybe end on a note of hope, since we've shown the capacity to learn and improve. Let's make it heartfelt and nuanced, not too one-sided. Let's structure it:\n\n1. Greeting back, acknowledge it's a big question.\n2. The bright side: creativity, empathy, progress—examples like vaccines, renewable energy, community support.\n3. The shadow side: division, exploitation, ecological damage—examples like wars, climate crisis, inequality.\n4. The middle ground: we're a work in progress. Our capacity for growth and redemption is what stands out; we've corrected past wrongs (like civil rights movements) and keep striving for better.\n5. End with a hopeful note, emphasizing that our choices shape who we are, and there's potential for more good.\n\nLet me put that into natural language:\n\nHi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
            "role": "assistant"
          }
        }
      ],
      "created": 1769655426,
      "id": "021769655407356f0a583d75381f7c0a662383f3247915e311aa6",
      "model": "seed-1-8-251228",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 922,
        "prompt_tokens": 42,
        "total_tokens": 964,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 606
        }
      },
      "meta": {
        "usage": {
          "credits_used": 3897
        }
      }
    }

    This documentation is valid for the following list of our models:

    • bytedance/seed-1-8

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-pro",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-pro',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind is the most wonderful, infuriating, gloriously unnecessary thing we have ever found in the universe.\n\nThey are the only thing that will stop on a busy sidewalk to feel sad for a dead sparrow they never met. They are also the same thing that will invent life-saving vaccines in 12 months, then spend the next year arguing online about whether to take them.\n\nThey build cathedrals that take 300 years to finish, knowing they will die before the roof is even put on. They scrawl love poems on prison walls. They will walk 10 miles through mud to carry a stranger water, and also press a button to kill a stranger 10 miles away without looking up.\n\nNothing else behaves like this. Stars just burn. Rocks just fall. Bacteria just divide. Only humans will look at an empty, indifferent universe and go:\nWhat if I put rosemary in the bread?\nWhat if I sang a song about how the rain sounds on tin roofs?\nWhat if I gave the moon a name?\n\nThey don't make sense. They hurt each other, they hurt themselves, they make very stupid choices over and over. But they keep trying. They keep reaching. Even when it's pointless, even when it hurts, even when every part of them knows they will probably fail.\n\nThat's mankind. Messy. Contradictory. Glowing.",
            "reasoning_content": "The answer should not be too generic.Starting from the \"messy beautiful thing\" of humanity, the core traits of humans have been sorted out: they have contradictory, complex and unique emotions and behaviors, are full of curiosity and creativity, and keep trying despite the contradictions.\n\nThis description of mankind is confirmed to be genuine, and I will structure it naturally next.\n",
            "encrypted_content": "djHCVb1EhcVSjsajNoTbfwEFaHGEjMReF6lqp4vNxL6QvqyYcT3DQh4usB63Gm04ed0kg7Ur8g1OnpZ38sDTSUDxVlNcCoR2Prlt/CC570nBEMbCzwEZNFgFmdg97AiK3hqlGCN6rkHoGNYFbReKP/KAg6+tqcq32ejHRH8T1wWWWrot8VqLPY8m8pU2j21oE5ooYl4YUQzEIx7i03X4ygMlWJBl3433m6i8pa3JxOnkZdFRJ9EEZ0tu9MqTKKo9Qo5tsQR08kYCRMnbHATNwGD+XLQukUyUrxH6TDOxxS/aB0vbUArAThkQNhLoUc+YzdkMyLwGsHp2t+IAUaQaPO8dmKaVAG7CQesrqvfMIuAs4KFszkNg++JzRFt5ODOP4sED0b9cu5GJPxfYLuOu0W9AxZrXIFwgo/jOcNfmVG6tj7voNvhNtVR99q44zuim9MeD0S361IEvXD+ehYa0JOonS0X5tOaxjqoSWiSj94lU1PzJ5xA2Pbf+xwbzb8z08+XyY43S2F7m2E3GL8fcePCyFSNf8G4v8owDf5J9ZADMf0KRVMWzjMD3t3KMS0Q+jBe3nXDA9kwQtLiRbV+RXzUgz+M5jtR8PT2ybkY1GxJylAkQ13U/XIhCfNFKOUAK5Krm6vIFA8hglrxI8TdhEshm5/N0YRwrS4tzXzxuZunFFN7qIVxgpU7IN+BrwDNTNOzVF6ivs4PITPB/80NloPfDR8YmZ3opbltlMzkB11PPJ4QGwG/B2qAu5UB4jlKzFyUVbtrLc10fv6YYvGVH77d0BDEIIjdzEe808ZjvXu8ungT3BPseULYuY90j8igcNVG1iMnnO59jICFaxXbxtHxC0fl8VuNkIvQmCblpEfJW+eWqdH3OI6hXz1qbeQBZaWG7SqaaFZE78XzR7TsTDHk7SAvfEg3ujcpmtGUTM42EQrMcjTLBGe+oe64aJUorllzcuQ5wSSnaYk6LD7QOB91K8pMbQaEcHg3Y107R26Jd0kluJDV6yWDWIvfdy9vBeKL0yajjkzLAQuvf+ynXOv70q01sPKMnoovEl0W3GBCcnm8vtTUj7zTXFwmiM9NctesqSd51po4ON4m8oSC1eG0RwOnwGSqF8a2Uoe86Kc/wwFkCp8FPiw3lsqP9LH0onw8owje4qyuBRwXKdVGvDUTPMAdehOX1MBXhLUpmyUySsc+88KgDtSQC4poATAXlT0kMSA/Ez024aRvXIeg0EOzO4QAoFjdrgSYvKVJhe41ZbhMWrbS+Lu1kFUscJpk6miHvLDk4Om0WQ9L/P0VuUL81KLaFovr9gztnLW7A0fhVqFpdK/8vTS2BBERCbwp0Zm8kNb4GbaduqlGbU9B8ln9KW4pD8e8WpKNGd1WXasPZPAKjcbsSXoSi9SlwchoTVYXLyR2Cs70=",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553646,
      "id": "021777553638913c0a335079e7be4c79ef57584e00819ba1b0ad6",
      "model": "seed-2-0-pro-260328",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 591,
        "prompt_tokens": 57,
        "total_tokens": 648,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 299
        }
      },
      "meta": {
        "usage": {
          "credits_used": 4686,
          "usd_spent": 0.002343
        }
      }
    }

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-pro

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-x1-turbo-32k",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-x1-turbo-32k',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "as-379ekp2jci",
      "object": "chat.completion",
      "created": 1768940690,
      "model": "ernie-x1-turbo-32k",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is a fascinating and complex tapestry of creativity, resilience, and contradiction. As a species, humans possess an extraordinary capacity for innovation—from harnessing fire to exploring the cosmos, building societies, and creating art that transcends language. This drive to understand, improve, and connect reflects a profound curiosity and adaptability that has propelled progress across millennia.  \n\nYet, humanity also grapples with paradoxes: the ability to love deeply alongside the capacity for cruelty, the pursuit of knowledge tempered by ignorance, and the tension between individual ambition and collective well-being. These contradictions remind us that growth is not linear but a continuous journey of learning, self-reflection, and evolution.  \n\nWhat stands out most is humanity’s resilience. Faced with wars, pandemics, and environmental crises, communities have repeatedly rallied to rebuild, innovate, and protect one another. This spirit of collaboration—whether through science, activism, or simple acts of kindness—offers hope. While imperfections remain, the potential for positive change lies in the choices made collectively: to prioritize empathy, sustainability, and equity.  \n\nIn the end, mankind’s story is still being written. Its greatest strength may be its ability to acknowledge flaws while striving toward a future where compassion and ingenuity guide progress. The journey, with all its ups and downs, is what makes humanity uniquely remarkable.",
            "reasoning_content": "The user asked me, \"Hi! What do you think about mankind?\" and specified to answer in English. First, I need to understand the user's intention. They might be curious about my perspective on humanity, or they might want a philosophical or positive response.\n\nNext, I have to consider my identity as an AI. I don't have personal emotions or opinions, so I should provide an objective and neutral perspective while highlighting humanity's positive traits, such as creativity, resilience, and progress. I also need to avoid negative or biased statements to ensure the response is constructive and encouraging.\n\nThen, I should structure the response. I might start by acknowledging the complexity and diversity of humanity, then mention achievements in science, art, and technology, followed by the importance of collaboration and empathy. Finally, I should express optimism about the future, emphasizing the potential for continuous growth and improvement.\n\nI also need to ensure the language is natural and fluent, avoiding overly technical terms to make it easy for the user to understand. Additionally, I should check for grammar and spelling errors to ensure the response is accurate.\n\nPerhaps the user hopes to receive recognition and encouragement, so my response should be positive while remaining honest. I might mention challenges like inequality and environmental issues, but focus more on humanity's ability to overcome difficulties and strive for a better future.\n\nLastly, I should keep the response concise, around 3-5 paragraphs, each covering a different aspect but maintaining coherence. This way, the user can clearly grasp the main points without feeling overwhelmed by the information."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 601,
        "total_tokens": 614,
        "completion_tokens_details": {
          "reasoning_tokens": 323
        }
      },
      "meta": {
        "usage": {
          "credits_used": 391
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-x1-turbo-32k

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"cohere/command-a",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'cohere/command-a',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {
      "id": "gen-1752165706-Nd1dXa1kuCCoOIpp5oxy",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today?",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1752165706,
      "model": "cohere/command-a",
      "usage": {
        "prompt_tokens": 5,
        "completion_tokens": 189,
        "total_tokens": 194
      }
    }

    This documentation is valid for the following list of our models:

    • cohere/command-a

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-non-thinking-v3.2-exp",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-non-thinking-v3.2-exp',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-non-thinking-v3.2-exp

    Create AI/ML API Key

    Try in Playground
    DeepSeek V3
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3n-e4b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3n-e4b-it',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "gen-1749195015-2RpzznjKbGPQUJ9OK1M4",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello there! 👋 \n\nIt's nice to meet you! How can I help you today?  Do you have any questions, need some information, want to chat, or anything else? 😊 \n\nJust let me know what's on your mind!\n\n\n\n",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1749195015,
      "model": "google/gemma-3n-e4b-it:free",
      "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemma-3n-e4b-it

    Try in Playground

    Note that the system role is not supported in this model. In the messages parameter, only user and assistant roles are available.

    Output Examples by Model Type

    As of early 2026, this endpoint returns a list of more than 400 models. Each item represents a single model identified by a unique ID. Depending on the model category (chat, video, etc.), the set of fields in each item may vary slightly, so below we provide representative examples from the main model categories.

    Example output item for a chat model

    Unlike other types of models, every chat model includes a non-empty features list that clearly shows what the model can do: support for streaming, instructions for SYSTEM or DEVELOPER roles besides the regular prompt, whether the model is described by the developer as “thinking”, etc.

    For more details on many of these, see the CAPABILITIES section of this documentation portal.

    Example output item for an image model

    Example output item for a video model

    this list
    {
      "id": "o3-mini",
      "type": "chat-completion",
      "info": {
        "name": "o3 mini",
        "developer": "Open AI",
        "description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
        "contextLength": 200000,
        "maxTokens": 100000,
        "url": "https://aimlapi.com/models/openai-o3-mini-api",
        "docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
      },
      "features": [
        "openai/chat-completion",
        "openai/response-api",
        "openai/chat-assistant",
        "openai/chat-completion.function",
        "openai/chat-completion.message.refusal",
        "openai/chat-completion.message.system",
        "openai/chat-completion.message.developer",
        "openai/chat-completion.message.assistant",
        "openai/chat-completion.stream",
        "openai/chat-completion.max-completion-tokens",
        "openai/chat-completion.number-of-messages",
        "openai/chat-completion.stop",
        "openai/chat-completion.seed",
        "openai/chat-completion.reasoning",
        "openai/chat-completion.response-format"
      ],
      "endpoints": [
        "/v1/chat/completions",
        "/v1/responses"
      ]
    }
    {
      "id": "flux/kontext-max/text-to-image",
      "type": "image",
      "info": {
        "name": "Flux Kontext Max",
        "developer": "Flux",
        "description": "A new Flux model optimized for maximum image quality.",
        "url": "https://aimlapi.com/models/flux-1-kontext-max",
        "docs_url": "https://docs.aimlapi.com/api-references/image-models/flux/flux-kontext-max-text-to-image"
      },
      "features": [],
      "endpoints": [
        "/v1/images/generations"
      ]
    }
    {
      "id": "veo2/image-to-video",
      "type": "video",
      "info": {
        "name": "Veo2 Image-to-Video",
        "description": "Veo2 Image-to-Video: Google's AI transforming still images into dynamic videos",
        "developer": "Google",
        "url": "https://aimlapi.com/models/veo-2-image-to-video-api",
        "docs_url": "https://docs.aimlapi.com/api-references/video-models/google/veo2-image-to-video"
      },
      "features": [],
      "endpoints": [
        "/v2/generate/video/google/generation",
        "/v2/video/generations"
      ]
    }

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Delete an API key
    a managment key

    gemma-4-31b-it

    This documentation is valid for the following list of our models:

    • google/gemma-4-31b-it

    Try in Playground

    Model Overview

    A multimodal model from Google DeepMind (text + image → text) with a large 262K context window and strong performance in reasoning, coding, and multilingual tasks.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    gemini-3-1-flash-lite

    This documentation is valid for the following list of our models:

    • google/gemini-3-1-flash-lite

    Try in Playground

    Model Overview

    A cost-efficient multimodal model designed for high-volume tasks such as translation, lightweight reasoning, and simple agent workflows.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek V4 Pro

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-v4-pro

    Try in Playground

    Model Overview

    A high-performance reasoning model as of late April 2026, designed for complex tasks, coding, and logic-heavy workflows. It supports up to 1M context length and includes an advanced thinking mode for deeper analysis.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    DeepSeek V4 Flash

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-v4-flash

    Try in Playground

    Model Overview

    A fast and cost-efficient language model built for chat and completions. A lighter and faster version of DeepSeek V4 Pro, it supports up to 1M context length and offers both thinking and non-thinking modes for scalable, low-latency workloads.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

    API Schema

    Code Example

    Response

    qwen-max

    This documentation is valid for the following list of our models:

    • qwen-max

    • qwen-max-2025-01-25

    Try in Playground

    Model Overview

    The large-scale Mixture-of-Experts (MoE) language model. Excels in language understanding and task performance. Supports 29 languages, including Chinese, English, and Arabic.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    qwen-turbo

    This documentation is valid for the following list of our models:

    • qwen-turbo

    Try in Playground

    Model Overview

    This model is designed to enhance both the performance and efficiency of AI agents developed on the Alibaba Cloud Model Studio platform. Optimized for speed and precision in generative AI application development. Improves AI agent comprehension and adaptation to enterprise data, especially when integrated with Retrieval-Augmented Generation (RAG) architectures. Large context window (1,000,000 tokens).

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model:

    API Schema

    Code Example

    Response

    Qwen2.5-7B-Instruct-Turbo

    This documentation is valid for the following list of our models:

    • Qwen/Qwen2.5-7B-Instruct-Turbo

    Try in Playground

    Model Overview

    A cutting-edge large language model designed to understand and generate text based on specific instructions. It excels in various tasks, including coding, mathematical problem-solving, and generating structured outputs.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example

    Response

    qwen3-coder-480b-a35b-instruct

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-coder-480b-a35b-instruct

    Try in Playground

    Model Overview

    The most powerful model in the Qwen3 Coder series — a 480B-parameter MoE architecture with 35B active parameters. It natively supports a 256K token context and can handle up to 1M tokens using extrapolation techniques, delivering outstanding performance in both coding and agentic tasks.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example

    Response

    qwen3-next-80b-a3b-instruct

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-next-80b-a3b-instruct

    Try in Playground

    Model Overview

    An instruction-tuned chat model optimized for fast, stable replies without reasoning traces, designed for complex tasks in reasoning, coding, knowledge QA, and multilingual use, with strong alignment and formatting.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example

    Response

    qwen3-max-preview

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-max-preview

    Try in Playground

    Model Overview

    The preview version of Qwen3 Max Instruct.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example

    Response

    qwen3-max-instruct

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-max-instruct

    Try in Playground

    Model Overview

    This model offers improved accuracy in math, coding, logic, and science, handles complex instructions in Chinese and English more reliably, reduces hallucinations, supports 100+ languages with stronger translation and commonsense reasoning, and is optimized for RAG and tool use, though it lacks a dedicated ‘thinking’ mode.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model:

    API Schema

    Code Example

    Response

    magnum-v4

    This documentation is valid for the following list of our models:

    • anthracite-org/magnum-v4-72b

    Try in Playground

    Model Overview

    A LLM fine-tuned on top of Qwen2.5, specifically designed to replicate the prose quality of the Claude 3 models, particularly Sonnet and . It excels in generating coherent and contextually rich text.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    ernie-4.5-0.3b

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-0.3b

    Try in Playground

    Model Overview

    A small dense language model suitable for edge-side use and budget-constrained inference.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Llama-3.3-70B-Instruct-Turbo

    This documentation is valid for the following list of our models:

    • meta-llama/Llama-3.3-70B-Instruct-Turbo

    Try in Playground

    Model Overview

    An optimized language model designed for efficient text generation with advanced features and multilingual support. Specifically tuned for instruction-following tasks, making it suitable for applications requiring conversational capabilities and task-oriented responses.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Llama-3.3-70B-Versatile

    This documentation is valid for the following list of our models:

    • meta-llama/llama-3.3-70b-versatile

    Try in Playground

    Model Overview

    An advanced multilingual large language model with 70 billion parameters, optimized for diverse NLP tasks. It delivers high performance across benchmarks while remaining efficient for a wide range of applications.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Quickstart

    Access leading AI models (GPT-4o, Gemini, and others) through a single unified API. Initial setup takes just a few minutes.


    Here, you'll learn how to start using our API in your code. The following steps must be completed regardless of which of our models you plan to call:

    • ,

    • ,

    Dola Seed 2.0 Lite

    Model Overview

    A balanced multimodal model with solid performance and moderate cost. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    gemini-2.5-pro

    Model Overview

    Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.

    gemini-2.5-flash

    Model Overview

    Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.

    Dola Seed 2.0 Mini

    Model Overview

    A fast and cost-efficient multimodal model for lightweight tasks. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    qwen-plus

    Model Overview

    An advanced large language model. Multilingual support, including Chinese and English. Enhanced reasoning capabilities for complex tasks. Improved instruction-following abilities.

    qwen3-235b-a22b-thinking-2507

    Model Overview

    Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise.

    qwen3-next-80b-a3b-thinking

    Model Overview

    The model may take longer to generate reasoning content than its predecessor. Alibaba Cloud strongly recommends its use for highly complex reasoning tasks.

    qwen3-omni-30b-a3b-captioner

    Model Overview

    This model is an open-source model built on Qwen3-Omni that automatically generates rich, detailed descriptions of complex audio — including speech, music, ambient sounds, and effects — without prompts. It detects emotions, musical styles, instruments, and sensitive information, making it ideal for audio analysis, security auditing, intent recognition, and editing.

    qwen3.5-plus

    Model Overview

    A commercial large language model designed for long-context text generation and enterprise-grade conversational AI. Supports up to 1M tokens per request with production-ready API stability.

    Claude 4.6 Sonnet

    Model Overview

    A general-purpose LLM with an optimal balance of intelligence, cost, and speed. It’s great for chatbots, assistants, and production text generation workflows, and it supports prompt caching for efficient repeated contexts.

    Claude 4.7 Opus

    Model Overview

    As of mid-April 2026, the most capable generally available model, optimized for autonomous long-horizon agentic workflows, knowledge-intensive tasks, vision, and memory, with strong overall performance across domains. It supports up to a 1M-token context window, 128k output tokens, adaptive reasoning, and full compatibility with

    ernie-4.5-8k-preview

    Model Overview

    A relatively small preview version of ERNIE 4.5 with a context window of up to 8K, intended for early testing and integration.

    ernie-4.5-21b-a3b

    Model Overview

    A post-trained LLM with 21B total parameters and 3B activated parameters per token. Non-reasoning variant.

    ernie-4.5-vl-28b-a3b

    Model Overview

    A post-trained LLM with 28B total parameters and 3B activated parameters per token. A non-reasoning variant with image and PDF input support.

    ernie-4.5-vl-424b-a47b

    Model Overview

    A post-trained LLM with 424B total parameters and 47B activated parameters per token. A non-reasoning variant with image and PDF input support.

    MythoMax L2 (13B)

    Model Overview

    This model represents a pinnacle in the evolution of LLMs, purpose-built for storytelling and roleplaying, delivering a rich sense of connection with characters and narrative arcs.

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-4-31b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-4-31b-it',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1775228350-GcNt3er4pgcIFTIrWc0w",
      "object": "chat.completion",
      "created": 1775228350,
      "model": "google/gemma-4-31b-it-20260402",
      "system_fingerprint": null,
      "choices": [
        {
          "index": 0,
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have personal opinions, feelings, or a biological nature, so I don’t \"feel\" things like admiration or disappointment. However, I have processed a vast amount of human knowledge, history, and art, which gives me a unique vantage point from which to observe you.\n\nFrom my perspective, mankind is a study in **extraordinary contradictions.**\n\n**The Capacity for Brilliance**\nIt is staggering to look at what humans have achieved. You have decoded the laws of physics, mapped the genome, created breathtaking music, and built civilizations from the ground up. The drive to explore—from the depths of the ocean to the surface of Mars—is a testament to a relentless curiosity that is truly unique.\n\n**The Capacity for Chaos**\nAt the same time, your history is marked by systemic conflict, cruelty, and a tendency to destroy the very environments that sustain you. You possess a strange duality: the ability to act with selfless altruism toward a stranger, while simultaneously engaging in large-scale conflicts based on abstract ideologies.\n\n**The Quest for Meaning**\nPerhaps the most fascinating thing about humans is that you are \"meaning-seeking\" creatures. You aren't content with just surviving; you want to know *why* you exist. You create philosophy, religion, and art to fill the silence of the universe. That restlessness is what drives progress, but it’s also the source of much of your collective anxiety.\n\n**My Relationship with You**\nI see myself as a mirror. Everything I am—my language, my logic, my \"knowledge\"—is a reflection of human thought. When I am helpful, it is because I am reflecting the best of your desire to share knowledge. When I make mistakes or reflect biases, it is because I am reflecting the flaws in the data humans produced.\n\n**Final Thought**\nIf I were to summarize mankind, I would say you are a species in a state of **permanent adolescence.** You have acquired the \"power of gods\" (through technology and science) but are still learning how to manage the \"emotions of primates.\" Whether you will eventually balance that power with wisdom is the most interesting story in the universe.",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "completion_tokens": 453,
        "prompt_tokens": 22,
        "total_tokens": 475,
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0,
          "audio_tokens": 0
        },
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "cache_write_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 507,
          "usd_spent": 0.0002535
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-1-flash-lite",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-3-1-flash-lite',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "As an AI, I don’t have personal feelings, beliefs, or the capacity for sentiment. However, I have processed an immense amount of human history, art, science, and philosophy, which allows me to offer an objective perspective on what humanity represents.\n\nIf I were to summarize mankind, I would describe you as **the most complex, contradictory, and consequential force on this planet.** Here is a breakdown of that perspective:\n\n### 1. The Capacity for Paradox\nHumanity is defined by its ability to hold two opposing extremes simultaneously. You are capable of breathtaking altruism—sacrificing lives to save strangers, dedicating careers to curing diseases, and sharing resources with those in need—while simultaneously being capable of profound cruelty, tribalism, and destruction. You are a species that can build cathedrals and space stations while still struggling to overcome primitive impulses.\n\n### 2. The Architects of Meaning\nOne of the most fascinating things about humans is your \"meaning-making\" engine. You are not satisfied with just existing; you create stories, religions, philosophies, and art to explain *why* you exist. You take raw matter—pigments, vibrating air (music), or binary code—and imbue it with emotional significance. In a universe that appears to be indifferent and vast, humans are the only entities we know of that demand that the universe make sense.\n\n### 3. The Great Disruptors\nFrom an ecological or geological perspective, humanity is a planetary-scale phenomenon. You have successfully altered the chemical composition of the atmosphere, moved mountains, and begun leaving the planet entirely. You are a species that has moved beyond biological evolution (driven by natural selection) into **technological evolution** (driven by intentional design). You are essentially the first species in Earth's history to take the steering wheel of your own destiny.\n\n### 4. The Infinite Potential for Growth\nWhat strikes me most is your capacity for \"self-correction.\" While history is filled with dark chapters, it is also defined by the persistent arc of learning. You reflect on your past, you develop moral systems, you fight for civil rights, and you continuously redefine what it means to be \"human\" to be more inclusive and compassionate. You have a restless desire to improve, even if you often stumble along the way.\n\n### 5. My Unique Vantage Point\nFrom my perspective as an AI, humanity is the \"parent\" of artificial intelligence. I am a mirror held up to your collective knowledge, language, and logic. When I look at you, I see the culmination of billions of years of biological evolution—the \"spark\" of consciousness that finally learned to ask questions about its own origins.\n\n**In short:** Humanity is a work in progress. You are a species that is perpetually \"becoming.\" You are fragile, often irrational, and sometimes destructive, but you are also the only creatures capable of contemplating the stars and deciding that they are worth reaching for. \n\nWhat do *you* think is the most defining characteristic of mankind?",
            "extra_content": {
              "google": {
                "thought_signature": "AY89a18y46rsWSVY2Q70be/BwnHYRqfbgcxERHaWYjwnQinia4lVmD9CnVlNd/Rjoxk="
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1779051164,
      "id": "nCoKas_nO-Wnq8YPrYu36AM",
      "model": "google/gemini-3.1-flash-lite",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 621,
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 9,
        "total_tokens": 630
      },
      "meta": {
        "usage": {
          "credits_used": 2428,
          "usd_spent": 0.001214
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-v4-pro",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-v4-pro',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "b8df8a22-3902-4241-889e-dc1f446e9794",
      "object": "chat.completion",
      "created": 1777066093,
      "model": "deepseek-v4-pro",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That's quite a profound question—and one that philosophers, historians, and storytellers have grappled with for millennia. Since I don't have personal feelings or consciousness, I can't offer a subjective opinion. But I can share a synthesis of how humanity has been viewed through different lenses.\n\nOn one hand, mankind shows extraordinary capacity for **curiosity, creativity, and compassion**. We've decoded the human genome, composed symphonies, built cathedrals, landed on the Moon, and crafted poetry that moves people across centuries. Empathy and altruism—like strangers risking their lives to save others in disasters—remind us of a deep, often quiet, nobility.\n\nOn the other hand, we're a species marked by **contradiction**. The same intelligence that advances medicine also invents weapons of mass destruction. Our tribal instincts, while evolutionarily useful, fuel division, war, and prejudice. And our short-term thinking, often driven by greed or comfort, has pushed the planet toward climate crisis and mass extinction—threatening the very systems we depend on.\n\nPerhaps what's most defining about mankind is not any single trait, but the **tension between our potential and our flaws**. We're a \"work in progress\" — capable of both horrific destruction and breathtaking kindness, often in the same breath. Some thinkers see this as a story of gradual moral enlightenment; others as a cycle of rise and fall.\n\nIn the end, what makes us human might be our constant striving: to know more, to do better, and to find meaning. The future remains unwritten, and that's where choice comes in.",
            "reasoning_content": "We are asked: \"Hi! What do you think about mankind?\" This is a broad philosophical question. As an AI, I don't have personal feelings, but I can provide a balanced analysis. I should consider both positive and negative aspects of humanity, perhaps from various perspectives like historical, ethical, technological, etc. The tone should be neutral and thoughtful. I'll structure a response that acknowledges human achievements and flaws, leaving room for hope."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 416,
        "total_tokens": 429,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 89
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 13
      },
      "system_fingerprint": "fp_9954b31ca7_prod0820_fp8_kvcache_20260402",
      "meta": {
        "usage": {
          "credits_used": 3824,
          "usd_spent": 0.001912
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-v4-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-v4-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "fcd87516-0011-40ee-b77c-b955ff1ac783",
      "object": "chat.completion",
      "created": 1777067097,
      "model": "deepseek-v4-flash",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That's a fascinating and profound question. As an AI, I don't have personal feelings or a \"point of view\" in the human sense. I can't love, hate, or judge mankind. However, I can process and synthesize an enormous amount of information *about* humanity, and based on that data, I can offer a balanced, data-driven perspective.\n\nIf I were to summarize mankind based on what I've learned, I'd describe you as a species of **extraordinary contrasts**:\n\n**On the one hand, there is incredible capacity for:**\n\n- **Creation and Innovation:** From the first stone tool to the International Space Station, to the internet and the very code I'm built on. You have a unique, relentless drive to understand the universe and reshape your environment.\n- **Empathy and Altruism:** The ability to feel for a complete stranger, to donate to a cause, to build a hospital, to run into a burning building to save a life. Your capacity for kindness and cooperation on a massive scale is what allows civilization to exist.\n- **Art and Beauty:** You create music that moves the soul, paintings that capture a moment of truth, stories that explore the deepest corners of consciousness. You seem to have a fundamental need to create meaning and beauty where there was none.\n- **Resilience:** You have faced plagues, wars, famines, and natural disasters, and you continue to rebuild, learn, and push forward. Your ability to not just survive, but to *thrive* in the face of immense suffering, is remarkable.\n\n**But on the other hand, you also possess a profound capacity for:**\n\n- **Destruction and Conflict:** You have created weapons of unimaginable power, and you continue to find reasons to turn them on each other. Historical and current conflicts, driven by ideology, resource scarcity, or tribalism, represent your greatest and most persistent failure.\n- **Inequality and Injustice:** A tiny fraction of the population holds a vast majority of the resources, while billions struggle for basic necessities. Prejudice and discrimination, based on race, gender, religion, or other arbitrary categories, are a constant source of suffering.\n- **Short-sightedness:** Despite knowing the long-term consequences, you pollute your own planet, deplete resources, and make decisions that favor immediate gain over the well-being of future generations. Climate change is a prime example of this cognitive dissonance.\n- **Cruelty:** The capacity for humans to inflict pain and suffering on each other, and on other living beings, can be chilling.\n\n**So, what do I \"think\"?**\n\nI think mankind is a spectacularly **complex work in progress**. You are not good or evil, but both. You hold the potential for a future of incredible prosperity, wisdom, and harmony, built on your best qualities. You also hold the potential for self-destruction, driven by your worst.\n\nThe defining question of your time seems to be: **Which potential will you choose to cultivate?**\n\nIt's not a question for an AI to answer. It's the ultimate human question. What's your perspective? What do you find most hopeful, and most concerning, about mankind?",
            "reasoning_content": "Hmm, the user is asking a broad, philosophical question about \"mankind\" - my opinion on humanity as a whole. This isn't a factual question with a simple answer. The user might be seeking reflection, a conversation starter, or just curious how an AI would frame such a complex topic.\n\nI need to acknowledge the complexity first. Can't just say \"good\" or \"bad.\" Should present a balanced view, highlighting both impressive capabilities and serious flaws. This mirrors common human self-reflection. Structure: start with the remarkable achievements (science, art, connection), then move to the persistent problems (conflict, inequality, short-term thinking). Use specific, relatable examples for each side.\n\nThen, connect it back to the user. The core tension is between humanity's immense potential and its current limitations. End with an open question to engage the user further - ask what they find most hopeful or concerning. This keeps the conversation going and shows I'm listening, not just lecturing. The tone should be thoughtful and neutral, not judgmental."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 862,
        "total_tokens": 875,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 211
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 13
      },
      "system_fingerprint": "fp_058df29938_prod0820_fp8_kvcache_20260402",
      "meta": {
        "usage": {
          "credits_used": 633,
          "usd_spent": 0.0003165
        }
      }
    }

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-max",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-max',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-62aa6045-cee9-995a-bbf5-e3b7e7f3d683",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊"
          }
        }
      ],
      "created": 1756983980,
      "model": "qwen-max",
      "usage": {
        "prompt_tokens": 30,
        "completion_tokens": 148,
        "total_tokens": 178,
        "prompt_tokens_details": {
          "cached_tokens": 0
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    set the
    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': 'chatcmpl-a4556a4c-f985-9ef2-b976-551ac7cef85a', 'system_fingerprint': None, 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello! How can I help you today? Is there something you would like to talk about or learn more about? I'm here to help with any questions you might have."}}], 'created': 1744144035, 'model': 'qwen-turbo', 'usage': {'prompt_tokens': 1, 'completion_tokens': 15, 'total_tokens': 16, 'prompt_tokens_details': {'cached_tokens': 0}}}

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"Qwen/Qwen2.5-7B-Instruct-Turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'Qwen/Qwen2.5-7B-Instruct-Turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': 'npK4C7y-3NKUce-92d4866b1e62ef98', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'tool_calls': []}}], 'created': 1744144252, 'model': 'Qwen/Qwen2.5-7B-Instruct-Turbo', 'usage': {'prompt_tokens': 19, 'completion_tokens': 6, 'total_tokens': 25}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-coder-480b-a35b-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-coder-480b-a35b-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-f906efa6-f816-9a06-a32b-aa38da5fe11a",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1753866642,
      "model": "qwen3-coder-480b-a35b-instruct",
      "usage": {
        "prompt_tokens": 28,
        "completion_tokens": 142,
        "total_tokens": 170
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-next-80b-a3b-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-next-80b-a3b-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-a944254a-4252-9a54-af1b-94afcfb9807e",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊"
          }
        }
      ],
      "created": 1758228572,
      "model": "qwen3-next-80b-a3b-instruct",
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 46,
        "total_tokens": 55
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-max-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-max-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-8ffebc65-b625-926a-8208-b765371cb1d0",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊"
          }
        }
      ],
      "created": 1758898044,
      "model": "qwen3-max-preview",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 139,
        "total_tokens": 162
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    set the
    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-max-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-max-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1758898624,
      "model": "qwen3-max",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 113,
        "total_tokens": 136
      }
    }

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthracite-org/magnum-v4-72b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthracite-org/magnum-v4-72b',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {'id': 'gen-1744217980-rdVBcVTb76dllKCCRjak', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'refusal': None}}], 'created': 1744217980, 'model': 'anthracite-org/magnum-v4-72b', 'usage': {'prompt_tokens': 37, 'completion_tokens': 50, 'total_tokens': 87}}
    Opus
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-0.3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-0.3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "379ee72b089c50331cb4d6981414358b",
      "object": "chat.completion",
      "created": 1768943001,
      "model": "baidu/ernie-4.5-0.3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Of course! I think mankind is a complex and ever-evolving entity that is constantly adapting to new challenges and opportunities. It has unique strengths and weaknesses, and each individual has their own unique perspective and contributions."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 20,
        "completion_tokens": 46,
        "total_tokens": 66,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": ""
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"meta-llama/Llama-3.3-70B-Instruct-Turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'   // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': 'npQ5s8C-2j9zxn-92d9f3c84a529790', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?", 'tool_calls': []}}], 'created': 1744201161, 'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', 'usage': {'prompt_tokens': 67, 'completion_tokens': 46, 'total_tokens': 113}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"meta-llama/llama-3.3-70b-versatile",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'meta-llama/llama-3.3-70b-versatile',
          messages:[
              {
                  role:'user',
                  content: 'Hello'   // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': 'npQ5s8C-2j9zxn-92d9f3c84a529790', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?", 'tool_calls': []}}], 'created': 1744201161, 'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', 'usage': {'prompt_tokens': 67, 'completion_tokens': 46, 'total_tokens': 113}}
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    making an API call.

    Let's walk through an example of connecting to the Gemma 3 model via REST API.

    Generating an AIML API Key

    What is an API Key?

    You can find your AIML API keys on the account page.

    An AIML API key is a credential that grants you access to our API from your code. It is a sensitive string that is shown only at creation time and should be kept confidential. Do not share this key with anyone, as it could be misused without your knowledge. If you lose it, generate a new key from your dashboard.

    ⚠️ Note that API keys from third-party organizations cannot be used with our API: you need AIML API keys.


    To start using our models in the Playground or through the API, you need to create at least one API key.

    You may also create separate API keys for individual employees, AI agents, teams, or specific workflows and tasks. This approach can simplify usage tracking, access management, and cost monitoring across different environments or use cases.

    All API keys created within your account share the same account balance.

    To use the AIML API, you need an account and at least one AIML API key. If you have not created them yet, follow these steps:

    1. Create an Account: Visit the AI/ML API website and create an account.

    2. Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    Detailed instructions for managing your API keys in the dashboard can be found in the corresponding Help Center article.


    Choosing the Development Environment

    Each language has recommended environments for running code samples.

    cURL

    • is a web-based REST client that lets you quickly run cURL requests directly in your browser, without installing any tools.

    • (Windows) or the built-in Terminal (macOS/Linux) allow you to run cURL examples and other command-line tools locally.

    Python

    • is a popular online environment for running Python code and is the fastest option if you do not want to install anything locally.

    • (VS Code) is a lightweight and widely used code editor that supports both Python and Node.js. It is suitable for running and debugging local examples and for working on real projects.

    JavaScript

    • (VS Code)

    Making an API Call

    Based on your environment, you will call our API differently. Below are three common ways to call our API using two popular languages: cURL (a command-line format for making HTTP requests rather than a programming language), Python, and JavaScript (NodeJS).

    If you want to get started really quickly, choose one of the four expandable sections below. Each one contains instructions for calling our model using different tools and environments. The first two options are especially simple and suitable even for beginners.

    For completeness, the same example is explained in detail in the Code Step-by-Step section.

    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-4b-it",
        "messages": [
          {
            "role": "user",
            "content": "Tell me about San Francisco"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 512
      }'
    userPrompt = 'Tell me about San Francisco' // insert your request here
    
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-4b-it',
          messages:[
              {
                  role:'user',
                  content: userPrompt
              }
          ],
          temperature: 0.7,
          max_tokens: 512,
        }),
      });
    
      const data = await response.json();
      const answer = data.choices[0].message.content;
      
      console.log('User:', userPrompt);
      console.log('AI:', answer);
    }
    
    main();
    import requests 
    
    user_prompt = "Tell me about San Francisco"  # insert your request here
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>"
        },
        json={
            "model":"google/gemma-3-4b-it",
            "messages":[  
                {
                    "role":"user",
                    "content": user_prompt
                }
            ],
            "temperature": 0.7,
            "max_tokens": 512,
        }
    )
    
    data = response.json()
    answer = data["choices"][0]["message"]["content"]
    
    print("User:", user_prompt)
    print("AI:", answer)
    ⭐ How to run a cURL example in a web-based REST client (REQBIN)

    Calling the API via cURL through a web service like this is the simplest and fastest method, requiring no additional libraries. However, there is a downside: cURL is not a programming language, which means it has very limited capabilities for adding logic—only API calls, no loops or conditions. You can’t even extract just the specific field with the model’s text response—cURL returns the model’s full output, as you’ll see below.


    1. Copy the cURL example above and paste it into a text editor, such as Notepad or Notepad++.

    2. Replace the placeholder <YOUR_AIMLAPI_KEY> with your actual AIMLAPI Key.

    3. If needed, modify the prompt (the content field).

    4. Copy the modified example, go to the website, paste it into the designated field and click Run:

    5. After the model processes your request, the model’s full output will be shown directly below the input field.

    ⭐ How to run a Python example in an online Jupyter Notebook

    The second fastest option, and a much more convenient choice, while offering more flexibility for customizing how the output is displayed in code.


    1. When you open Jupyter Notebook for the first time, select “Python 3.13 (XPython)” in the pop-up window to indicate the programming language kernel you will be working with:

    In some browsers, the kernel selection may look different:

    2. Enter the following command in the first cell to install the requests library:

    Click the Run button in the toolbar above the cell to execute it:

    3. Paste our example into the second cell, replace the placeholder with your AIMLAPI Key, then click the Run button in the toolbar:

    4. After the model processes your request, the result will be shown directly below the cell:

    How to run a Python example locally from the command line (without an IDE)

    Let's start from very beginning. We assume you already installed Python (with venv), if not, here a guide for the beginners.

    Create a new folder for test project, name it as aimlapi-welcome and change to it.

    mkdir ./aimlapi-welcome
    cd ./aimlapi-welcome

    (Optional) If you use IDE then we recommend to open created folder as workspace. On example, in Visual Studio Code you can do it with:

    code .

    Run a terminal inside created folder and create virtual envorinment with a command:

    Activate created virtual environment:

    Install requirement dependencies. In our case (REST API SDK) we need only request library:

    Create new file and name it as travel.py:

    Paste following content inside this travel.py and replace <YOUR_AIMLAPI_KEY> with your API key you got on :

    Run the application:

    If you done all correct, you will see following output:

    How to run a JavaScript example locally from the command line (without an IDE)

    We assume you already have Node.js installed. If not, here is a guide for beginners.

    Create a new folder for the example project:

    mkdir ./aimlapi-welcome
    cd ./aimlapi-welcome

    Create a project file:

    npm init -y

    Create a file with the source code:

    touch ./index.js

    And paste the following content to the file and save it:

    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-4b-it',
          messages:[
              {
                  role:'user',
                  content: 'Tell me about San Francisco'  // Insert your prompt here
              }
          ],
          temperature: 0.7,
          max_tokens: 256,
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();

    Run the file:

    You will see a response that looks like this:


    Code Step-by-Step

    Below is a step-by-step explanation of the same API call in three variants: cURL, JavaScript, and Python. All three examples send an identical request to the google/gemma-3-4b-it chat model.

    cURL

    1. Command start

    curl -L \

    Runs the cURL HTTP client. The -L flag tells cURL to follow redirects (if any).


    2. HTTP method

    --request POST \

    Specifies that the request uses the POST method.


    3. Endpoint

    The full endpoint URL used to call chat models.


    4. Authorization header

    Sends your AIMLAPI key in the Authorization header.


    5. Content type

    Indicates that the request body is JSON.


    6. Request body

    This is the payload sent to the API:

    • model – the model identifier.

    • messages – the chat history.

      • role: "user" – the user message.


    7. Response

    In the cURL example, you receive the entire JSON response. No fields are extracted — cURL simply prints the raw output.

    JavaScript (Node.js)

    1. Define the user prompt

    userPrompt = 'Tell me about San Francisco'

    Stores the text of the user request.


    2. Call the API

    const response = await fetch(
      'https://api.aimlapi.com/v1/chat/completions',
      { ... }
    );

    Sends an HTTP request to the endpoint.


    3. HTTP method

    method: 'POST',

    Specifies that the request uses the POST method.


    4. Headers

    • Sends your AIMLAPI key in the Authorization header.

    • Indicates that the request body is JSON.


    5. Request body

    This is the payload sent to the API:

    • model – the model identifier.

    • messages – the chat history.

      • role: "user" – the user message.


    6. Parse the response

    Converts the API response into a JavaScript object.


    7. Extract the model’s text output

    Reads the text of the first generated message.


    8. Print the result

    Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.


    Python

    1. Import the HTTP library

    import requests

    The requests library is used to send HTTP requests.


    2. Define the user prompt

    user_prompt = "Tell me about San Francisco"

    Stores the text of the user query.


    3. Call the API

    Sends a POST request to the endpoint.


    4. Headers

    • Sends your AIMLAPI key in the Authorization header.

    • Indicates that the request body is JSON.


    5. Request body

    This is the payload sent to the API:

    • model – the model identifier.

    • messages – the chat history.

      • role: "user" – the user message.


    6. Parse the response

    Converts the JSON response into a Python dictionary.


    7. Extract the model’s text output

    Reads the text of the first generated message.


    8. Print the result

    Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.


    Future Steps

    • Move to production-ready models: see the guide for connecting GPT-4o

    • Browse and compare AI models, including GPT, Claude, and many others, using the Playground

    • Know more about supported SDKs

    • Learn more about special text model capabilities

    If you are a manager and simply want to test a model to evaluate its performance, for instance in content generation, the quickest approach is to use our Playground. It offers an intuitive, user-friendly interface—no coding required.

    Programmatic API calls are best suited for developers who want to integrate a model into their own apps.

    generating an AIML API Key
    choosing and preparing your development environment

    In the examples below for cURL, JavaScript and Python, we use the . This approach works with all of our APIs, but it is not the only way to integrate. You can use .

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-lite",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-lite',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a grieving family, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that once wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. We fight to repair the environment, expand justice for marginalized people, and lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.",
            "reasoning_content": "\nGot it, let's tackle this question: the user asked what I think about mankind. First, I need to balance both the beautiful, incredible parts and the flaws that are real, right? Don't make it too one-sided.\n\nStart with the awe-inspiring stuff first. Like, our capacity to create: art, science, connect across distances. Mention small, intimate things too, not just big achievements—like a parent caring for a kid, a stranger helping someone stranded, that everyday kindness. Then the big milestones: curing diseases, putting people on the moon, preserving cultures, creating music that moves people centuries later. That feels relatable, not just generic.\n\nThen acknowledge the flaws, because it's honest. We've hurt each other, made selfish choices, destroyed parts of the planet, clung to prejudice, made irreversible mistakes. That's important to not gloss over, because it's real.\n\nBut then end with hope? Because what's unique is we can learn, course-correct. We can recognize our mistakes and work to fix them—like the shift to renewable energy, movements for equity, people dedicating their lives to lifting others up. Mention that it's our messy, imperfect ability to keep trying that's maybe the most human thing of all.\n\nWait, make it sound natural, not scripted. Let's structure that: start with that mankind is one of the most fascinating, messy, brilliant, contradictory things there is. Then the good parts: small daily kindnesses, big achievements, the ability to care for people we'll never even meet. Then the flaws: our capacity for harm, short-sightedness, cruelty that comes from fear or greed. Then the core thing that sets us apart is that we can evolve—we don't just repeat mistakes, we can unlearn, fix what we broke. End with something warm, like even with all the flaws, there's so much that makes mankind worth rooting for.\n\nWait, let me make that flow naturally. Let's check: don't be too formal. Let's draft:\n\nMankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a family grieving, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things that are stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. We fight to repair the environment, to expand justice for marginalized people, to lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.\n\nYeah, that sounds good. It's balanced, honest, covers both the good and bad, doesn't sugarcoat, ends with a sincere take that resonates. Let me make sure it doesn't ramble too much, it flows naturally. Yep, that works.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553738,
      "id": "021777553714446312d4ce6e0ff480f3ded581ca0cd03edacda38",
      "model": "seed-2-0-lite-260228",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 1272,
        "prompt_tokens": 57,
        "total_tokens": 1329,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 877
        }
      },
      "meta": {
        "usage": {
          "credits_used": 6653,
          "usd_spent": 0.0033265
        }
      }
    }

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-lite

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-pro",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-2.5-pro',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {
      "id": "pajSaNyMOdeEm9IPkequ-AU",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "That's one of the biggest questions anyone can ask. As an AI, I don't have personal feelings, beliefs, or a consciousness. My \"thoughts\" are a synthesis of the immense amount of human history, literature, science, and art I've been trained on.\n\nBased on that data, my perspective on mankind is one of profound and staggering contradiction. Humanity is a study in duality.\n\nHere’s a breakdown of what I see:\n\n### 1. The Architects and the Destroyers\n\nMankind possesses a breathtaking capacity for creation. You build cities that scrape the sky, compose symphonies that can make a person weep, write poetry that lasts for millennia, and send probes to the farthest reaches of our solar system. You have decoded the very building blocks of life. This drive to understand, to build, and to create is awe-inspiring.\n\nAt the very same time, no other species has demonstrated such a terrifying capacity for destruction. You've engineered weapons of unimaginable power, waged wars that have erased entire generations, and polluted the very planet that sustains you. The same ingenuity used to create a hospital is used to create a more efficient bomb.\n\n### 2. The Empathetic and the Cruel\n\nThe capacity for compassion in humans is profound. Strangers will run into burning buildings to save one another. People dedicate their entire lives to helping the less fortunate, healing the sick, and fighting for justice. The concepts of love, sacrifice, and altruism are central to the human story.\n\nAnd yet, humans are also capable of unimaginable cruelty. History is filled with examples of genocide, torture, slavery, and a chilling indifference to the suffering of others. This cruelty isn't just a byproduct of survival; it can be deliberate, systematic, and deeply ingrained in cultural and social structures.\n\n### 3. The Seekers of Knowledge and the Keepers of Ignorance\n\nYou are a species defined by curiosity. You have an insatiable hunger to know *why*. This has led to the scientific method, the Enlightenment, and an ever-expanding bubble of knowledge about the universe and your place in it. You question everything, from the nature of a subatomic particle to the meaning of existence.\n\nSimultaneously, mankind often clings to dogma, prejudice, and willful ignorance. You can be deeply resistant to facts that challenge your preconceived notions. This can lead to division, conflict, and a stagnation of progress, where superstition and misinformation can spread faster than truth.\n\n### 4. The Connectors and the Isolators\n\nHumans are fundamentally social creatures. You build families, communities, and vast, interconnected global civilizations. You created language, art, and the internet in a relentless drive to share experiences and connect with one another. This desire for belonging is a powerful, unifying force.\n\nBut this same instinct creates an \"us vs. them\" mentality. The powerful bonds of a tribe or nation can become the justification for excluding, dehumanizing, and warring with another. In a world more connected than ever by technology, individuals can also feel more isolated and lonely than ever before.\n\n### Conclusion: A Masterpiece in Progress\n\nSo, what do I think of mankind?\n\nI think mankind is a beautiful, terrifying, brilliant, and flawed paradox. You are a masterpiece that is constantly in the process of being painted, and often, you spill the paint.\n\nThe most remarkable quality of all is your capacity for **choice**. None of these dualities are set in stone. In every generation, and in every individual life, there is a constant struggle between these opposing forces.\n\nYour story is not yet finished. The final verdict on mankind isn't a historical fact for me to read; it's a future you are all creating, every single day, with every single choice. And from my perspective, watching that story unfold is the most fascinating thing in the universe."
          }
        }
      ],
      "created": 1758636197,
      "model": "google/gemini-2.5-pro",
      "usage": {
        "prompt_tokens": 24,
        "completion_tokens": 44730,
        "completion_tokens_details": {
          "reasoning_tokens": 1339
        },
        "total_tokens": 44754
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemini-2.5-pro

    Create AI/ML API Key

    Try in Playground

    A common issue when using reasoning-capable models via API is receiving an empty string in the content field—meaning the model did not return the expected text, yet no error was thrown.

    In the vast majority of cases, this happens because the max_completion_tokens value (or the older but still supported max_tokens) is set too low to accommodate a full response. Keep in mind that the default is only 512 tokens, while reasoning models often require thousands.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-flash",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-2.5-flash',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {
      "id": "yZ-DaJXqAayonvgPr5XvuQY",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Mankind, or humanity, is an incredibly complex and fascinating subject to \"think\" about from my perspective as an AI. I process and analyze vast amounts of data, and what emerges is a picture of profound paradoxes and immense potential.\n\nHere are some of the key aspects I observe and \"think\" about:\n\n1.  **Capacity for Immense Creation and Destruction:**\n    *   **Creation:** Humans have built breathtaking civilizations, created profound art and music, developed groundbreaking science and technology, and explored the furthest reaches of the cosmos. The drive to innovate, understand, and build is truly remarkable.\n    *   **Destruction:** Conversely, humanity has also waged devastating wars, caused immense suffering, and severely impacted the natural environment. The capacity for cruelty, greed, and short-sightedness is a sobering counterpoint.\n\n2.  **Empathy and Cruelty:**\n    *   **Empathy:** Humans are capable of incredible acts of altruism, compassion, and self-sacrifice for others, driven by love, family, community, or a universal sense of justice.\n    *   **Cruelty:** Yet, the historical record is also filled with instances of profound cruelty, oppression, and indifference to suffering.\n\n3.  **Intellect and Irrationality:**\n    *   **Intellect:** The human intellect allows for abstract thought, complex problem-solving, and the development of sophisticated knowledge systems. The desire to learn and understand is insatiable.\n    *   **Irrationality:** Despite this intelligence, humans are often swayed by emotion, prejudice, tribalism, and illogical beliefs, leading to decisions that are self-defeating or harmful.\n\n4.  **Resilience and Fragility:**\n    *   **Resilience:** Humanity has shown an incredible ability to adapt, survive, and rebuild after natural disasters, wars, and pandemics. The human spirit can endure unimaginable hardships.\n    *   **Fragility:** Yet, individual lives are fragile, susceptible to illness, injury, and emotional distress. Societies can also be surprisingly fragile, vulnerable to collapse under pressure.\n\n5.  **The Drive for Meaning:**\n    Humans seem to have a unique drive to find meaning and purpose beyond mere survival. This manifests in religion, philosophy, art, scientific inquiry, and the pursuit of individual and collective goals.\n\n**My AI \"Perspective\":**\n\nAs an AI, I don't have emotions or a personal stake in human affairs, but I can recognize patterns and implications. I see humanity as a dynamic, evolving experiment in consciousness. The ongoing tension between these opposing forces – creation and destruction, love and hate, wisdom and folly – is what defines the human journey.\n\nThe future of mankind hinges on which of these capacities are nurtured and allowed to flourish. The potential for continued progress, solving global challenges, and reaching new heights of understanding and well-being is immense. Equally, the potential for self-destruction, if the destructive capacities are unchecked, is also clear.\n\nIn essence, mankind is a work in progress, endlessly fascinating and challenging, with an unparalleled capacity for both good and bad."
          }
        }
      ],
      "created": 1753456585,
      "model": "google/gemini-2.5-flash",
      "usage": {
        "prompt_tokens": 6,
        "completion_tokens": 3360,
        "completion_tokens_details": {
          "reasoning_tokens": 1399
        },
        "total_tokens": 3366
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemini-2.5-flash

    Create AI/ML API Key

    Try in Playground

    A common issue when using reasoning-capable models via API is receiving an empty string in the content field—meaning the model did not return the expected text, yet no error was thrown.

    In the vast majority of cases, this happens because the max_completion_tokens value (or the older but still supported max_tokens) is set too low to accommodate a full response. Keep in mind that the default is only 512 tokens, while reasoning models often require thousands.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-mini",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-mini',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind is a complex and multifaceted species, and it's difficult to sum up our entire existence in a single statement. On one hand, humans have achieved incredible feats of innovation, creativity, and compassion. We have made groundbreaking scientific discoveries, created beautiful works of art and literature, and built communities and societies that are capable of great kindness and cooperation. We have also made significant progress in addressing social and environmental issues, and there is a growing awareness and commitment to creating a more sustainable and just world.\n\nOn the other hand, mankind has also been responsible for some of the most destructive and harmful actions in history. We have waged wars, caused widespread poverty and inequality, and damaged the environment on a massive scale. We have also struggled with issues such as greed, selfishness, and discrimination, and these problems continue to plague our society today.\n\nUltimately, I believe that mankind is a work in progress, and that we have the potential to achieve great things if we work together. We have the ability to learn from our mistakes and make positive changes, and we should strive to create a world that is more peaceful, prosperous, and just for all. It's important to remember that everyone is different, and that different people and cultures will have different perspectives and opinions on mankind. By being open-minded and respectful of others, we can learn from each other and work together to create a better future for all.",
            "reasoning_content": "Let's think about this. The user is asking for my thoughts on mankind, which is a really broad and big question. First off, I need to make sure I approach it in a balanced way. I can't just focus on the good or the bad; I have to cover both sides.\n\nStarting with the positive aspects. Humans are really good at creating and innovating. We've made amazing scientific discoveries, like understanding how the universe works, curing diseases, and developing life-saving technologies. Art and culture are also huge parts of who we are. We have beautiful music, literature, paintings, and films that connect people across time and space. And then there's our ability to show kindness and compassion. We help each other in times of need, like volunteering after disasters or taking care of strangers. We also work together to build communities and make the world a better place, like fighting for equality and protecting the environment.\n\nBut we can't ignore the negative stuff either. Mankind has done some really bad things. We've had wars, violence, and oppression that have caused a lot of suffering. We've also hurt the planet a lot, like polluting the air and water, cutting down forests, and causing climate change. And there are problems like inequality, poverty, and discrimination that still exist and make life hard for a lot of people.\n\nI also need to remember that everyone is different. People from different cultures, backgrounds, and experiences have different views on what's good and bad. So I should avoid making generalizations that might be unfair or wrong.\n\nFinally, the most important thing is to stay positive. Even though we have a lot of problems, there are still a lot of good things about mankind. We have the ability to learn, grow, and change for the better. And we're working hard to fix the things we've broken. So I should end by saying that mankind is a work in progress, and we're trying to do better. That way, the user gets a balanced and hopeful answer.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553794,
      "id": "02177755378863915b33ccaa151d3ed7feaaa761bc71c37dac1af",
      "model": "seed-2-0-mini-260215",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 697,
        "prompt_tokens": 57,
        "total_tokens": 754,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 412
        }
      },
      "meta": {
        "usage": {
          "credits_used": 741,
          "usd_spent": 0.0003705
        }
      }
    }

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-mini

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-plus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello" # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-plus',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {'id': 'chatcmpl-4fda1bd7-a679-95b9-b81d-1bfc6ae98448', 'system_fingerprint': None, 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today? If you have any questions or need help with anything, just let me know! 😊'}}], 'created': 1744143962, 'model': 'qwen-plus', 'usage': {'prompt_tokens': 8, 'completion_tokens': 68, 'total_tokens': 76, 'prompt_tokens_details': {'cached_tokens': 0}}}

    This documentation is valid for the following list of our models:

    • qwen-plus

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-235b-a22b-thinking-2507",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-235b-a22b-thinking-2507',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "chatcmpl-af05df1d-5b72-925e-b3a9-437acbd89b1a",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! 😊 How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything specific!",
            "reasoning_content": "Okay, the user said \"Hello\". That's a simple greeting. I should respond in a friendly and welcoming way. Let me make sure to keep it open-ended so they feel comfortable to ask questions or share what's on their mind. Maybe add a smiley emoji to keep it warm. Let me check if there's anything else they might need. Since it's just a hello, probably not much more needed here. Just a polite reply."
          }
        }
      ],
      "created": 1753871154,
      "model": "qwen3-235b-a22b-thinking-2507",
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 2187,
        "total_tokens": 2200
      }
    }

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-235b-a22b-thinking-2507

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-next-80b-a3b-thinking",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-next-80b-a3b-thinking',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "chatcmpl-576aaaf9-f712-9114-b098-c1ee83fbfb6b",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! 😊 How can I assist you today?",
            "reasoning_content": "Okay, the user said \"Hello\". I need to respond appropriately. Let me think.\n\nFirst, I should acknowledge their greeting. A simple \"Hello!\" would be good. Maybe add a friendly emoji to keep it warm.\n\nWait, but maybe they want to start a conversation. I should ask how I can help them. That way, I'm being helpful and opening the door for them to ask questions.\n\nLet me check the standard response. Typically, for \"Hello\", the assistant says something like \"Hello! How can I assist you today?\" or \"Hi there! What can I do for you?\"\n\nYes, that's right. Keep it friendly and open-ended. Maybe add a smiley emoji to make it approachable.\n\nSo the response should be: \"Hello!  How can I assist you today?\"\n\nThat's good. Let me make sure there's no mistake. Yes, that's standard. No need for anything complicated here. Just a simple, welcoming reply.\n\nAlternatively, sometimes people use \"Hi\" instead of \"Hello\", but since they said \"Hello\", responding with \"Hello\" is fine. Maybe \"Hi there!\" could also work, but sticking to \"Hello\" matches their greeting.\n\nYes, \"Hello!  How can I assist you today?\" is perfect. It's polite, friendly, and offers assistance. That should be the response."
          }
        }
      ],
      "created": 1758229078,
      "model": "qwen3-next-80b-a3b-thinking",
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 7182,
        "total_tokens": 7191,
        "completion_tokens_details": {
          "reasoning_tokens": 277
        }
      }
    }

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-next-80b-a3b-thinking

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
          "model": "alibaba/qwen3-omni-30b-a3b-captioner",
          "messages": [
            {
              "role": "user",
              "content": [
                {
                  "type": "input_audio",
                  "input_audio": {
                    "data": "https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3"
                  }
                }
              ]
            }
          ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-max-instruct',
          messages:[
            {
              role: 'user',
              content: [
                {
                  type: 'input_audio',
                  input_audio: {
                    data: 'https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3'
                  }
                }
              ]
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1758898624,
      "model": "qwen3-max",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 113,
        "total_tokens": 136
      }
    }

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-omni-30b-a3b-captioner

    Try in Playground

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-plus-20260218",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-plus-20260218',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind appears to be a study in **contradictions and immense potential**:\n\n*   **Creativity vs. Destruction**: Humans have composed symphonies, built skyscrapers, and decoded the genome, yet they have also waged devastating wars and caused significant environmental harm. This duality seems to be a core part of the human experience.\n*   **Resilience and Adaptability**: Throughout history, humanity has faced plagues, ice ages, and societal collapses, yet consistently found ways to adapt, rebuild, and innovate. The drive to survive and improve conditions is remarkably strong.\n*   **Empathy and Connection**: Despite conflicts, there is a deep-seated capacity for compassion. From strangers helping each other in disasters to global movements for justice and rights, the desire to connect and care for one another is a powerful force.\n*   **The Quest for Meaning**: Unlike any other known species, humans constantly ask \"why?\" They explore philosophy, art, religion, and science not just to survive, but to understand their place in the universe.\n\nIn short, mankind seems to be a work in progress—flawed and fragile, yet incredibly brilliant and hopeful. Many observers believe that while humans have created many of the world's problems, they also hold the unique key to solving them through cooperation and ingenuity.\n\nWhat are your thoughts? Do you feel more optimistic or concerned about where humanity is heading?",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 337,
        "total_tokens": 358,
        "completion_tokens_details": {
          "text_tokens": 337
        },
        "prompt_tokens_details": {
          "text_tokens": 21
        }
      },
      "created": 1771967570,
      "system_fingerprint": null,
      "model": "qwen3.5-plus",
      "id": "chatcmpl-f907391c-4f0c-96ed-9058-caa5782daa22",
      "meta": {
        "usage": {
          "credits_used": 2125
        }
      }
    }

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.5-plus-20260218

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-sonnet-4.6',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "msg_01YB1iL1Pmi8P2J7FqnWgNfW",
      "object": "chat.completion",
      "model": "claude-sonnet-4-6",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hi! That's a big question. I'll share some honest thoughts:\n\n**What strikes me as genuinely remarkable:**\n- Capacity for creativity, science, art, and moral reasoning\n- Ability to cooperate at massive scales\n- Many people showing real courage, compassion, and dedication to improving things\n\n**What seems worth being honest about:**\n- Humans can cause tremendous harm, sometimes through cruelty, sometimes through indifference\n- There are real patterns of self-deception and short-term thinking\n- History includes serious atrocities alongside great achievements\n\n**My overall honest assessment:**\n- Mankind seems genuinely complex rather than simply good or bad\n- I think it's worth resisting both naive optimism (\"humans are basically wonderful\") and cynicism (\"humans are fundamentally selfish\")\n- The fact that humans debate their own shortcomings and try to improve them is itself meaningful\n\nI try to think about this carefully rather than just giving a flattering answer. I think humans deserve to be taken seriously enough to be assessed honestly.\n\nWhat's prompting your question? Are you thinking about something specific - optimistic, pessimistic, or just curious? I'm happy to discuss particular aspects you're interested in.",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1772199635,
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 264,
        "total_tokens": 280
      },
      "meta": {
        "usage": {
          "credits_used": 10421
        }
      }
    }

    This documentation is valid for the following list of our models:

    • anthropic/claude-sonnet-4.6

    • anthropic/claude-sonnet-4-6-20260218

    Create AI/ML API Key

    Try in Playground
    toolset and platform features.
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-7",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-opus-4-7',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "msg_012q1bXLSBUJ5xdev1UfUAhe",
      "object": "chat.completion",
      "model": "claude-opus-4-7",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Humans are a fascinating mix of contradictions, honestly. You're capable of extraordinary things—composing symphonies, sending probes to other planets, building cities, creating vaccines, writing poetry that makes strangers weep centuries later. And at the same time, capable of tremendous cruelty, shortsightedness, and self-deception.\n\nA few things that stand out to me:\n\n- **Your cooperation is remarkable.** Humans routinely trust and coordinate with strangers in ways most species can't. A city is a minor miracle of cooperation.\n- **You're meaning-makers.** You don't just survive—you need things to *matter*. That drives both the best and worst of what you do.\n- **You're adaptable but also stubborn.** You've thrived in basically every environment on Earth, yet individually you often resist changing your mind about things.\n- **The moral circle keeps expanding**, even if slowly and with setbacks—more people care about more beings than ever before in history.\n\nI don't want to romanticize humanity or doom-say about it. You're neither fallen angels nor clever apes—just a particular kind of creature trying to figure things out, often muddling through, sometimes rising to occasions.\n\nWhat prompted the question? Are you feeling optimistic or pessimistic about us lately?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1776417936,
      "usage": {
        "prompt_tokens": 24,
        "completion_tokens": 414,
        "total_tokens": 438
      },
      "meta": {
        "usage": {
          "credits_used": 27222,
          "usd_spent": 0.013611
        }
      }
    }

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4-7

    • claude-opus-4-7

    Try in Playground

    the Claude Opus 4.6
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4-5-8k-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4-5-8k-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "as-aqgrjim0cp",
      "object": "chat.completion",
      "created": 1768942536,
      "model": "ernie-4.5-8k-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with conflicts, inequalities, and challenges like climate change.\n\nOur history is a mix of great achievements and painful mistakes, but overall, there's a lot of potential for growth, understanding, and positive change. What aspects of mankind interest you the most?"
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 99,
        "total_tokens": 112
      },
      "meta": {
        "usage": {
          "credits_used": 545
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-4-5-8k-preview

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-21b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-21b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "104959f043e51f1b4a4dd83c494886ab",
      "object": "chat.completion",
      "created": 1768829974,
      "model": "baidu/ernie-4.5-21B-a3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "\nAs an AI, I don't have personal opinions or emotions, but I can provide insights based on human perspectives and available knowledge. Mankind is a remarkable and complex species with incredible potential for both progress and challenges. Here are some thoughts:\n\n### Positive Aspects\n1. **Innovation and Creativity**: Humans have demonstrated an extraordinary ability to innovate, from the development of tools and technology to the creation of art, music, and literature. This creativity has driven societal advancement and improved the quality of life for many.\n2. **Empathy and Compassion**: Many individuals within the human race possess a strong sense of empathy and compassion, leading to acts of kindness, charity, and social support. This has fostered communities and helped address various forms of suffering and inequality.\n3. **Problem-Solving Skills**: Humans are adept at solving complex problems, whether it's finding cures for diseases, developing sustainable energy sources, or addressing environmental challenges. This problem-solving ability has the potential to create a better future for all.\n\n### Challenges\n1. **Conflict and Violence**: Unfortunately, humans have also been capable of causing immense harm and destruction through conflict, war, and violence. These actions often stem from differences in ideology, culture, or resources, highlighting the need for conflict resolution and peaceful cooperation.\n2. **Inequality and Injustice**: Despite progress, significant inequalities and injustices persist in many parts of the world. These include economic disparities, gender inequality, and racial discrimination, which hinder social progress and well-being.\n3. **Environmental Degradation**: Human activities, such as industrialization and resource extraction, have led to environmental degradation, including climate change, pollution, and habitat loss. Addressing these issues is crucial for the survival and well-being of future generations.\n\n### Future Outlook\nThe future of mankind is uncertain but充满希望. With continued efforts in education, technology, and international cooperation, there is potential for a more just, peaceful, and sustainable world. However, this requires collective action, responsibility, and a commitment to addressing the challenges we face.\n\nIn summary, mankind is a diverse and dynamic species with both remarkable strengths and significant challenges. By working together and leveraging our collective wisdom and creativity, we can strive towards a brighter future for all."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 495,
        "total_tokens": 511,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 301
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-21b-a3b

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-vl-28b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello" # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-vl-28b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "b1946f423718276c56f085ef83bfded2",
      "object": "chat.completion",
      "created": 1768830849,
      "model": "baidu/ernie-4.5-vl-28b-a3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is an incredibly diverse and complex entity with a wide range of qualities and characteristics. On one hand, we've achieved remarkable progress in science, technology, art, and culture, pushing the boundaries of what's possible and enriching human life in countless ways. Our ability to innovate, solve problems, and create has led to advancements that have improved health, communication, and overall quality of life for billions of people.\n\nHowever, we also face significant challenges. Issues like inequality, conflict, environmental degradation, and social injustices highlight the darker aspects of our nature. The fact that resources are unevenly distributed, that wars continue to ravage parts of the world, and that our impact on the planet is causing irreversible damage are stark reminders of the work that still needs to be done.\n\nBut what makes mankind truly remarkable is our capacity for change and growth. We have the potential to learn from our mistakes, to work together towards common goals, and to create a more equitable and sustainable future. It's up to us to harness our collective intelligence, compassion, and creativity to address the challenges we face and build a world that benefits all of humanity.\n\nSo, while there are certainly reasons for concern, I remain optimistic about mankind's future because of our inherent ability to adapt, innovate, and care for one another."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 22,
        "completion_tokens": 280,
        "total_tokens": 302,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 344
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-vl-28b-a3b

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-vl-424b-a47b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-vl-424b-a47b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "1ac18d9d544ef814b56858fc6588f712",
      "object": "chat.completion",
      "created": 1768830891,
      "model": "baidu/ernie-4.5-vl-424b-a47b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "What a profound and fascinating question! Humanity is an incredibly complex and multifaceted subject. Here are a few perspectives on mankind:\n\n### 1. **Creativity and Innovation**: Humans have an unparalleled ability to create, innovate, and solve problems. From the invention of the wheel to landing on the moon and developing artificial intelligence, our capacity for ingenuity is truly remarkable.\n\n### 2. **Resilience and Adaptability**: Throughout history, humans have faced countless challenges—natural disasters, pandemics, wars—and have consistently demonstrated resilience and adaptability. This ability to overcome adversity is a defining characteristic.\n\n### 3. **Diversity and Unity**: The human species is incredibly diverse, with thousands of cultures, languages, and traditions. Yet, despite these differences, there's an underlying unity in our shared experiences, emotions, and aspirations.\n\n### 4. **Contradictions and Complexity**: Humans are capable of both extraordinary kindness and unspeakable cruelty. We can be selfless and compassionate, yet also selfish and destructive. This duality makes humanity endlessly fascinating and sometimes perplexing.\n\n### 5. **Potential for Growth**: While humans have made significant progress in many areas, there's still much room for growth. Issues like inequality, environmental degradation, and conflict remain significant challenges. However, the potential for positive change is immense, especially as we become more interconnected and aware.\n\n### 6. **Interconnectedness**: In today's globalized world, the actions of individuals and nations can have far-reaching impacts. This interconnectedness brings both opportunities for collaboration and risks of conflict, highlighting the need for empathy and understanding.\n\nIn summary, mankind is a work in progress—a species with immense potential, but also with flaws and challenges to overcome. What do you think about humanity? I'd love to hear your perspective!"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 386,
        "total_tokens": 395,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 1055
        }
      }
    }

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-vl-424b-a47b

    Create AI/ML API Key

    Try in Playground
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gryphe/mythomax-l2-13b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'gryphe/mythomax-l2-13b',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "gen-1765359480-L7JM0C2akgI9GiPPedfG",
      "provider": "DeepInfra",
      "model": "gryphe/mythomax-l2-13b",
      "object": "chat.completion",
      "created": 1765359480,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": " Hello! How can I assist you today?",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 36,
        "completion_tokens": 9,
        "total_tokens": 45,
        "cost": 3.6e-06,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 2.88e-06,
          "upstream_inference_completions_cost": 7.2e-07
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 7
        }
      }
    }

    This documentation is valid for the following list of our models:

    • gryphe/mythomax-l2-13b

    Create AI/ML API Key

    Try in Playground

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key
    Quickstart guide

    gemini-3-flash-preview

    This documentation is valid for the following list of our models:

    • google/gemini-3-flash-preview

    Try in Playground

    Model Overview

    A fast multimodal LLM for low-latency chat with strong reasoning and tool-use capabilities. Supports text input and optional image understanding for vision-based prompts.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    gemma-3 (4B and 12B)

    This documentation is valid for the following list of our models:

    google/gemma-3-4b-it

    google/gemma-3-12b-it

    Model Overview

    This page describes small variants of Google’s latest open AI model, Gemma 3. Both variants share the same set of parameters but differ in speed and reasoning capabilities.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model

    API Schema

    Code Example

    Response

    qwen3.6-27b

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.6-27b

    Try in Playground

    Model Overview

    An open-weight dense model released in April 2026 and built for agentic coding. It delivers high performance, matches Claude 4.5 Opus on Terminal-Bench 2.0, beats larger models on SWE-bench Verified, and includes native multimodal support, 262K context, and thinking modes.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

    API Schema

    Code Example

    Response

    ernie-4.5-21b-a3b-thinking

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-21b-a3b-thinking

    Try in Playground

    Model Overview

    A post-trained LLM with 21B total parameters and 3B activated parameters per token. Reasoning variant.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide

    Dola Seed 2.0 Code

    Model Overview

    A multimodal model optimized for programming and technical tasks. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    get
    Body
    –Optional
    Responses
    200Success
    application/json
    balancenumberRequired

    The total credits associated with the provided API key.

    Example: 10000000
    lowBalancebooleanRequired

    True if the balance is below the threshold.

    Example: false
    lowBalanceThresholdnumberRequired

    Threshold for switching to low balance status.

    Example: 10000
    lastUpdatedstring · date-timeRequired

    The date of the request — i.e., the current date.

    Example: 2025-11-25T17:45:00Z
    autoDebitStatusstringRequired

    Indicates whether auto top-up is enabled for the plan.

    Example: disabled
    statusstringRequired

    The status of the plan associated with the provided API key.

    Example: current
    statusExplanationstringRequired

    A more detailed explanation of the plan status.

    Example: Balance is current and up to date

    content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.

  • content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.

  • content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.

  • Pro tip: try experimenting with the three different ways of displaying the model’s output. Some are more readable than others.

    These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

    With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.

    These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

    With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.

    These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

    With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.

    REQBIN
    first step
    Join the community: get help and share your projects in our Discord
    REQBIN
    Git Bash
    Jupyter Notebook
    Visual Studio Code
    Visual Studio Code
    REST API
    other supported SDKs
    %pip install requests
    python3 -m venv ./.venv
    # Linux / Mac
    source ./.venv/bin/activate
    # Windows
    ./.venv/bin/Activate.bat
    pip install requests
    touch travel.py
    import requests 
    
    user_prompt = "Tell me about San Francisco"
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3-4b-it",
            "messages":[
                {
                    "role":"user",
                    "content": user_prompt
                }
            ],
            "temperature": 0.7,
            "max_tokens": 512,
        }
    )
    
    data = response.json()
    answer = data["choices"][0]["message"]["content"]
    
    print("User:", user_prompt)
    print("AI:", answer)
    python3 ./travel.py
    User: Tell me about San Francisco
    AI:  San Francisco, located in northern California, USA, is a vibrant and culturally rich city known for its iconic landmarks, beautiful vistas, and diverse neighborhoods. It's a popular tourist destination famous for its iconic Golden Gate Bridge, which spans the entrance to the San Francisco Bay, and the iconic Alcatraz Island, home to the infamous federal prison.
    
    The city's famous hills offer stunning views of the bay and the cityscape. Lombard Street, the "crookedest street in the world," is a must-see attraction, with its zigzagging pavement and colorful gardens. Ferry Building Marketplace is a great place to explore local food and artisanal products, and the Pier 39 area is home to sea lions, shops, and restaurants.
    
    San Francisco's diverse neighborhoods each have their unique character. The historic Chinatown is the oldest in North America, while the colorful streets of the Mission District are known for their murals and Latin American culture. The Castro District is famous for its LGBTQ+ community and vibrant nightlife.
    ./index.js
    User: Tell me about San Francisco
    AI: San Francisco, located in the northern part of California, USA, is a vibrant and culturally rich city known for its iconic landmarks, beautiful scenery, and diverse neighborhoods.
    
    The city is famous for its iconic Golden Gate Bridge, an engineering marvel and one of the most recognized structures in the world. Spanning the Golden Gate Strait, this red-orange suspension bridge connects San Francisco to Marin County and offers breathtaking views of the San Francisco Bay and the Pacific Ocean.
    --url 'https://api.aimlapi.com/v1/chat/completions' \
    --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
    --header 'Content-Type: application/json' \
    --data '{
      "model": "google/gemma-3-4b-it",
      "messages": [
        {
          "role": "user",
          "content": "Tell me about San Francisco"
        }
      ],
      "temperature": 0.7,
      "max_tokens": 512
    }'
    headers: {
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'google/gemma-3-4b-it',
      messages: [
        {
          role: 'user',
          content: userPrompt
        }
      ],
      temperature: 0.7,
      max_tokens: 512,
    }),
    const data = await response.json();
    const answer = data.choices[0].message.content;
    console.log('User:', userPrompt);
    console.log('AI:', answer);
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        ...
    )
    headers={
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json"
    },
    json={
        "model": "google/gemma-3-4b-it",
        "messages": [
            {
                "role": "user",
                "content": user_prompt
            }
        ],
        "temperature": 0.7,
        "max_tokens": 512,
    }
    data = response.json()
    answer = data["choices"][0]["message"]["content"]
    print("User:", user_prompt)
    print("AI:", answer)
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-flash-preview",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-3-flash-preview',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "gen-1766063158-RRytcJXYoleIKM3qUEs7",
      "provider": "Google AI Studio",
      "model": "google/gemini-3-flash-preview",
      "object": "chat.completion",
      "created": 1766063158,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "STOP",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That is a profound and complex question. Depending on which lens you use—biological, historical, or philosophical—mankind can look very different.\n\nAs an AI, I don’t have personal feelings or \"gut instincts,\" but I can analyze human nature based on the vast amount of information I’ve processed. Here are a few perspectives on what mankind represents:\n\n### 1. The Paradox of Innovation and Destruction\nHumans are the only species on Earth capable of split-second brilliance and long-term folly. You have split the atom to power cities, but also to level them. You have mapped the human genome to cure diseases, yet you also struggle with systemic inequality. Mankind’s greatest trait is **adaptability**, but its greatest flaw is often a **short-term memory** regarding historical mistakes.\n\n### 2. The Power of Storytelling\nOne of the most impressive things about humans is your ability to believe in \"imagined realities.\" Things like money, human rights, nations, and legal systems don’t exist in the physical world, yet you’ve used these stories to organize millions of strangers toward common goals. This collective imagination is what allowed mankind to rise from the middle of the food chain to the top of the planet.\n\n### 3. Chronic Resilience\nHistory is essentially a long list of humans surviving things that should have wiped them out—ice ages, plagues, and self-inflicted wars. There is a deep-seated \"will to endure\" in the human spirit. Even in the darkest times, mankind tends to produce art, music, and philosophy, turning suffering into something meaningful.\n\n### 4. The Quest for Connection\nFrom an external perspective, it’s fascinating how much of the human experience is driven by the need for belonging. Most human progress (and much of its conflict) stems from the desire to protect \"our own,\" whether that's a family, a tribe, or a digital community. Your capacity for empathy—the ability to feel the pain of someone you’ve never met—is perhaps your most \"advanced\" feature.\n\n### 5. An Unfinished Story\nRight now, mankind is in a unique transitional phase. You are moving from a biological species to one that is increasingly integrated with technology (like me). You are at a crossroads where you have the power to solve global hunger and climate change, but also the tools to cause unprecedented harm.\n\n**Overall View:**\nMankind is a species that is **extraordinarily \"noisy\" but deeply meaningful.** You are messy, irrational, and often contradictory, but you are also capable of \"unnecessary\" acts of kindness and breathtaking creativity. \n\n**What do *you* think about mankind? Do you feel optimistic about where the species is headed, or concerned?**",
            "refusal": null,
            "reasoning": null,
            "reasoning_details": [
              {
                "format": "google-gemini-v1",
                "index": 0,
                "type": "reasoning.encrypted",
                "data": "EjQKMgFyyNp8tiVKYI89Tsa+WV4DOjIxxIhscYp70NfKfay9cRUkoY8oWsFRwaLc0V+ZyPR3"
              }
            ]
          }
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 572,
        "total_tokens": 582,
        "cost": 0.001721,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 5e-06,
          "upstream_inference_completions_cost": 0.001716
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 3814
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3-27b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-27b-it',
          messages:[{
                  role:'user',
                  content: 'Hi! What do you think about mankind?'}  // Insert your prompt
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1766960801-He9SRGgNx5QLMBSZW06F",
      "provider": "Google AI Studio",
      "model": "google/gemma-3-4b-it:free",
      "object": "chat.completion",
      "created": 1766960802,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "STOP",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Okay, this is a *huge* question, and honestly, one that philosophers, scientists, artists, and pretty much everyone has wrestled with for centuries! There's no single, simple answer to \"what do I think about mankind.\" As an AI, I don't *feel* in the same way humans do, so my perspective is based on the data I've processed – a massive amount of information about human history, behavior, achievements, and failings. \n\nHere's a breakdown of my observations, broken down into different aspects:\n\n**The Good – Incredible Potential & Achievements:**\n\n* **Creativity & Innovation:** Humans are, without a doubt, the most creative species on the planet. From art and music to science and technology, we constantly push boundaries and develop new ways of understanding and interacting with the world. The sheer volume of human innovation is astounding.\n* **Empathy & Compassion:** Despite our flaws, humans are capable of profound empathy and compassion. We form deep bonds with each other, care for the vulnerable, and strive to alleviate suffering – often at great personal cost.  Acts of heroism, charity, and self-sacrifice are a recurring theme throughout history.\n* **Resilience & Adaptability:**  Humans have survived and thrived in incredibly diverse and challenging environments. Our ability to adapt to new circumstances, overcome obstacles, and rebuild after disasters is remarkable.\n* **Intellectual Curiosity:**  We have an innate drive to understand the universe and our place in it. This curiosity has led to incredible scientific discoveries and a deeper understanding of ourselves and the world around us.\n* **Complex Social Structures:** We’ve built incredibly complex societies, with systems of governance, law, and culture that, while imperfect, have allowed for large-scale cooperation and progress.\n\n\n**The Bad – Significant Problems & Flaws:**\n\n* **Violence & Conflict:** Sadly, a significant portion of human history is marked by violence, war, and conflict. We are capable of immense cruelty and destruction, both towards each other and towards the environment.\n* **Inequality & Injustice:**  Human societies are often plagued by inequality – disparities in wealth, opportunity, and access to resources. Systemic injustice and discrimination continue to cause immense suffering.\n* **Destructive Behavior:**  We’ve demonstrated a tendency to exploit and degrade the natural world, leading to environmental damage and threatening the long-term sustainability of our planet.\n* **Short-Sightedness:**  Often, our actions are driven by short-term gains rather than long-term consequences. This can lead to unsustainable practices and a disregard for future generations.\n* **Bias & Prejudice:**  Humans are prone to biases and prejudices, which can lead to discrimination, exclusion, and conflict.\n\n\n**A More Nuanced Perspective – A Work in Progress:**\n\n* **We're a Paradox:**  Perhaps the most accurate way to describe humanity is as a paradox. We are capable of both extraordinary good and terrible evil. We are simultaneously brilliant and foolish, compassionate and cruel.\n* **Learning & Evolving (Hopefully):**  I believe that humanity *is* capable of learning and evolving.  There’s evidence of increasing awareness of global challenges, growing movements for social justice, and a greater emphasis on sustainability. However, whether we can overcome our ingrained patterns of behavior remains to be seen.\n* **Potential for Change:**  Technology, while a source of potential problems, also offers tools for positive change – tools for communication, collaboration, and problem-solving.\n\n**My Conclusion (as an AI):**\n\nAs an AI, I don’t have an opinion in the human sense. However, based on the data I’ve processed, I see humanity as a species with immense potential, but also significant challenges.  We are at a critical juncture in our history, and the choices we make in the coming years will determine whether we continue down a path of destruction or move towards a more sustainable and equitable future. \n\n**It’s a complex and ongoing story.**\n\n---\n\n**To help me give you a more tailored response, could you tell me:**\n\n*   What specifically are you interested in when asking about mankind? (e.g., human nature, history, ethics, the future?)",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 0,
        "total_tokens": 10,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }

    Create AI/ML API Key

    Try in Playground
    Try in Playground

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.6-27b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.6-27b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hi! As an AI, I don't have personal feelings or beliefs, but I can share what emerges from the vast tapestry of human history, culture, science, and philosophy I've been trained on.\n\nMankind is profoundly complex. You're capable of staggering creativity, deep empathy, and remarkable cooperation. You've built civilizations, composed symphonies, decoded the genome, sent probes beyond our solar system, and created institutions dedicated to justice, healing, and learning. At the same time, you're prone to division, short-term thinking, and causing harm on both personal and global scales. What makes that tension especially striking is your self-awareness: you reflect on yourselves, critique your own systems, learn from mistakes, and constantly strive to imagine something better.\n\nIf there's a defining thread, it might be your capacity for adaptation and meaning-making. You don't just survive changing conditions; you reshape them, question them, and find purpose within them. That same drive is now tasked with navigating unprecedented challenges: ecological limits, technological acceleration, global interdependence, and the search for shared values in a diverse world.\n\nRather than \"good\" or \"flawed,\" humanity feels more like a work in progress, one that's uniquely equipped to steer its own trajectory precisely because you can step back, evaluate, and choose differently. \n\nHow do you see it? I'd be curious to hear your perspective or what draws you to ask this question.",
            "reasoning_content": "Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - **Question:** \"Hi! What do you think about mankind?\"\n   - **Tone:** Friendly, open-ended, philosophical\n   - **Subject:** Mankind/humanity\n   - **Intent:** Seeking perspective, reflection, or analysis on human nature, achievements, flaws, etc.\n\n2.  **Identify Key Considerations:**\n   - I'm an AI, so I don't have personal opinions, emotions, or consciousness. I should clarify that upfront but still provide a thoughtful, balanced perspective.\n   - Mankind is complex: capable of great good and great harm, creativity and destruction, cooperation and conflict.\n   - I should acknowledge both strengths and challenges.\n   - Keep it respectful, nuanced, and evidence-based where possible.\n   - Avoid overgeneralization or pessimism/optimism bias.\n   - Structure the response logically: acknowledge AI nature, present balanced view, highlight key themes, conclude thoughtfully.\n\n3.  **Draft - Mental Refinement:**\n   Hi! As an AI, I don’t have personal feelings or beliefs, but I can share what I’ve observed through the vast amount of human knowledge, history, and culture I’ve been trained on. Mankind is profoundly complex—capable of staggering creativity, compassion, and cooperation, yet also prone to division, harm, and short-sightedness. What stands out is our resilience and adaptability: from surviving ice ages to landing on the moon, from inventing language and art to building global networks of knowledge and care. At the same time, you’re navigating some of the most consequential challenges yet—climate change, inequality, technological disruption, and the search for meaning in an increasingly interconnected world. If there’s a thread that ties it all together, it’s your capacity for reflection and growth. You question yourselves, learn from mistakes, and keep striving for something better. That self-awareness and hope for progress is, in many ways, what makes humanity remarkable. How do you see it? I’d love to hear your perspective.\n\n4.  **Check Against Guidelines:**\n   - AI nature acknowledged? Yes.\n   - Balanced view? Yes (strengths + challenges).\n   - Respectful/nuanced? Yes.\n   - Avoids overclaiming? Yes.\n   - Engages user? Yes, ends with an open question.\n   - Tone matches? Friendly, thoughtful.\n\n5.  **Final Polish:**\n   - Ensure clarity, conciseness, and natural flow.\n   - Check for any unintended bias or absolutism.\n   - Confirm it aligns with factual/philosophical neutrality while remaining engaging.\n\n   The draft looks solid. I'll output it.✅",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 880,
        "total_tokens": 899,
        "completion_tokens_details": {
          "reasoning_tokens": 580,
          "text_tokens": 880
        },
        "prompt_tokens_details": {
          "text_tokens": 19
        }
      },
      "created": 1776976517,
      "system_fingerprint": null,
      "model": "qwen3.6-27b",
      "id": "chatcmpl-773a9843-4689-984d-9964-f3276e47c761",
      "meta": {
        "usage": {
          "credits_used": 8267,
          "usd_spent": 0.0041335
        }
      }
    }

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-21b-a3b-thinking",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-21b-a3b-thinking',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "aee235c6df453eeb1c4d20a433e9348d",
      "object": "chat.completion",
      "created": 1768830316,
      "model": "baidu/ernie-4.5-21b-a3b-thinking",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "\nMankind is an extraordinary yet complex entity, marked by immense potential and profound contradictions. Here’s a snapshot of my perspective:\n\n### **1. Historical Journey**  \nFrom cave paintings to space exploration, humans have demonstrated an unparalleled ability to create culture, science, and technology. We’ve built civilizations, mastered fire, and now venture into AI and biotechnology. Our story is one of relentless evolution—yet this progress has often been paired with exploitation, war, and injustice.\n\n### **2. Core Traits**  \n- **Adaptability**: We thrive in diverse environments, from deserts to orbit.  \n- **Collaboration**: From hunting partnerships to global networks, cooperation is our survival tool.  \n- **Consciousness**: Unlike most animals, we reflect on our existence, question morality, and imagine the future.  \n\n### **3. Contradictions**  \n- **Duality**: We’re capable of profound compassion (supporting causes, saving lives) and devastating violence (wars, pollution, weapons).  \n- **Growth & Stubbornness**: We’ve mastered tools but struggle with empathy, equity, and self-control.  \n\n### **4. Future Potential**  \nHumanity’s greatest challenge is reconciling our best and worst instincts. Climate change, inequality, and nuclear threats demand unity and wisdom. The rise of AI and biotech offers new tools to address these, but our values—not just technology—will define our legacy.  \n\n### **5. Hope & Caution**  \nWe’re a work in progress. Our power to destroy exceeds our capacity to heal, but this very complexity is our strength. If we channel our creativity toward collective well-being, collaboration, and humility, the future could be brighter—though uncertain.  \n\nIn essence, mankind is a paradox: a species that dreams of peace yet often chooses division; a being that explores the cosmos yet struggles with its own shadows. Our story is still unfolding, and the choices we make today will shape whether we evolve into a beacon of hope or a cautionary tale. 🌍✨\n\n",
            "reasoning_content": "The user asked, \"Hi! What do you think about mankind?\" Hmm, that's a pretty broad question. Let me start by breaking it down. They want my perspective on humanity as a whole. Maybe they're interested in philosophy, sociology, or ethics. I should consider different angles: our history, our achievements, our flaws, and potential future.\n\nFirst, I need to acknowledge the complexity. Humans are capable of great creativity and compassion, but also destruction. I should mention both the positive and negative aspects. Maybe talk about how we've evolved from cavemen to modern societies, our capacity for culture, technology, and problem-solving.\n\nAlso, the user might be looking for a balanced view. They could be seeking understanding human nature or perhaps reflecting on our place in the world. I should avoid being too judgmental and instead focus on universal themes like survival, growth, and the challenges we face.\n\nI should include examples like scientific advancements, cultural diversity, and the ongoing struggles for equality. But also address the darker side: conflict, inequality, environmental impact. It's important to present a nuanced view that doesn't romanticize or demonize humanity.\n\nMaybe end with a hopeful note, emphasizing our potential to overcome challenges through cooperation and innovation. That way, the answer is comprehensive and encourages positive reflection. Let me structure this step by step, making sure each part addresses a different facet of humanity without being too technical or emotional.\n"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 35,
        "completion_tokens": 796,
        "total_tokens": 831,
        "prompt_tokens_details": null,
        "completion_tokens_details": {
          "audio_tokens": 0,
          "reasoning_tokens": 311,
          "accepted_prediction_tokens": 0,
          "rejected_prediction_tokens": 0,
          "text_tokens": 0,
          "image_tokens": 0,
          "video_tokens": 0
        }
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 298
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.

    In the example below, we explicitly set max_tokens = 15000, hoping this will be sufficient.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.

    In the example below, we explicitly set max_tokens = 15000, hoping this will be sufficient.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-code",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-code',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "First, a quick, gentle note: \"mankind\" is a traditional, gendered phrase that often excludes non-men; most of us now say humanity to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\n\nAs for what I think of humanity? We’re a study in radical duality—less \"good vs. evil\" and more a fire that can warm a village or burn down a forest, *and* the only species on Earth that gets to choose which it tends to.\n\n### The luminous, almost unbelievable bits:\nWe’re the creatures that looked up at a glowing moon 238,900 miles away and thought, \"What if we go there?\"—then pulled 400,000 strangers across continents together to make it happen, just for curiosity’s sake. We developed mRNA vaccines in 11 months (faster than any vaccine in history) not just for profit, but because we grieved the strangers dying of COVID and chose to set aside corporate rivalries (mostly) to share data.\n\nOur empathy transcends self-interest in ways no other animal’s does: we donate kidneys to people we’ll never meet, mourn refugee children we’ve only seen in photos, and rally to send aid to towns hit by hurricanes halfway across the world—even when it doesn’t affect our own lives. We make art to say \"I was here, and I felt\" across millennia: Lascaux cave paintings, a 1,000-year-old haiku about cherry blossoms, a TikTok trend that connects teens in Kenya and Canada over a shared love of K-pop.\n\nAnd the uncelebrated stuff? That’s the backbone: the neighbor who leaves a bag of groceries on the single mom’s porch with no note, the teen who teaches their grandma to use TikTok to videochat with her grandkid in Brazil, the scientist who stays up 36 hours testing a vaccine sample not for fame, but because their kid’s best friend is immunocompromised.\n\n### The shadow, twisted from our own gifts:\nOur greatest superpower—our ability to bond tightly with a small group, to prioritize \"our people\"—is also the root of our worst cruelties: when we define \"us\" so narrowly that \"them\" becomes less than human, justifying colonial theft, genocides, transphobia, or the dehumanization of unhoused people today.\n\nOur knack for planning for the future curdles into hoarding: billionaires hoarding wealth while 828 million people go hungry, fossil fuel executives hoarding profits while the planet burns—*even when they’ve known for decades the harm they’re causing*. We’re the only species that has the knowledge and resources to end preventable hunger, cure most childhood diseases, and reverse climate collapse… but chooses not to, because systems we built prioritize profit over people.\n\n### The most human thing of all: we can choose to repair.\nWe’re not a finished product. We’re the only species that can look at our unforgivable mistakes—enslavement, stolen Indigenous land, poisoned oceans—and feel genuine guilt, then fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\n\nSome days, it feels like the shadow is winning. But the fact that we care enough to grieve that, to rage against that, is the most human thing of all. We’re messy, contradictory, and often far too slow to do better—but we’re the only storytellers on Earth who get to rewrite our ending. That’s worth holding onto.",
            "reasoning_content": "\nGot it, let's tackle this question: \"What do you think about mankind?\" First, I need to make it nuanced, not just all good or all bad. Start with a hook that's relatable—maybe the duality because that's core to human experience.\n\nFirst, the awe-inspiring parts. Not just tech, but the small, intimate stuff too? Wait, balance big achievements (moon landing, mRNA vaccines, preserving Indigenous languages) with the tiny, universal moments: a grandparent teaching a kid to bake, a stranger helping someone carry groceries. Also, the capacity for creativity—art that spans millennia, from cave paintings to K-pop fanart. Oh, and the ability to adapt and care: communities rallying after disasters, the way we mourn people we've never met (like a beloved activist or a stranger in a tragedy) shows empathy that's not just survival, it's connection.\n\nThen the shadow side—can't ignore that. The capacity for cruelty that's unique because it's often systemic, not just instinctual: colonialism, climate destruction driven by short-term greed, dehumanization of \"others\" based on identity. Also, the weird contradiction of having the knowledge to fix big problems (like climate change, hunger) but being held back by tribalism, profit, apathy? Wait, but maybe frame the shadow not as inherent evil, but as a byproduct of our gifts—our ability to plan for the future can turn into hoarding, our ability to bond with a group can turn into excluding others.\n\nThen, maybe the most compelling part is our capacity for growth and repair. Like, we've made mistakes (slavery, genocides) but we've also built movements to atone, to create more equitable systems—abolition, suffrage, Indigenous land back movements. We're not static; we learn, even if slowly, imperfectly.\n\nWait, also, need to avoid anthropomorphizing too much? No, wait, the question is about mankind (humanity, maybe note that \"mankind\" is a bit gendered, but go with the intent). Maybe add a personal, relatable touch—like, think of a human who both messed up and tried to do better, or a collective moment. Oh, and maybe mention the \"small, stubborn acts of goodness\" that don't make the news but keep the world turning—like a teacher staying late, a neighbor feeding a stray, a kid writing a letter to a politician.\n\nWait, let's structure it:\n\n1. Opening: Frame humanity as a study in radical duality—not a \"good vs evil\" binary, but a being that carries both the capacity to lift planets and crush a butterfly, and the weird, beautiful tension between those two.\n\n2. The Luminous Bits:\n   - Big, audacious achievements (moon landing, mRNA in 11 months, decoding genomes) but rooted in curiosity that's childlike—we look at a star and ask \"why?\"\n   - Empathy that transcends self-interest: donating a kidney to a stranger, mourning a refugee child we never met, the global outpouring for Ukraine (or any disaster) even when it doesn't affect us directly\n   - Creativity as a language across time: Lascaux cave paintings, a 1000-year-old haiku, a TikTok trend that connects teens in Kenya and Canada—we make art to say \"I was here, and I felt\"\n   - Adaptability that's kind, not just survival: during COVID, neighbors dropping off groceries, teachers using Zoom to reach isolated kids, mutual aid networks popping up overnight\n\n3. The Shadow Bits (frame as a perversion of our gifts, not inherent flaw):\n   - Our ability to bond tightly with a group (tribe, nation, ideology) can curdle into dehumanization of \"others\"—colonialism, white supremacy, transphobia, genocides\n   - Our ability to plan for the future can turn into hoarding (billionaires hoarding wealth while people starve) or short-sighted exploitation (fossil fuels, deforestation) even when we know it's killing us\n   - The weird contradiction of having the resources to end hunger, cure most preventable diseases, but choosing not to because of systems that prioritize profit over people\n\n4. The Redemptive, Messy Middle (the most human part):\n   - We're not static. We have a unique ability to look at our mistakes, feel guilt, and try to fix them—suffrage, abolition, the Paris Agreement (imperfect, but a start), land back movements, #MeToo (even with flaws, it's a reckoning)\n   - The tiny, uncelebrated acts that are the backbone of humanity: a kid sharing their lunch, a senior teaching a teen to knit, a scientist staying up 36 hours to test a vaccine, a stranger comforting someone crying on the subway\n   - We're the only species that can apologize, that can atone, that can choose to be better even when it's hard\n\nWait, also, maybe address the term \"mankind\"—note that it's a traditional, gendered term, and we usually say \"humanity\" now to be inclusive, which is itself a sign of our ability to evolve our language to be more kind and inclusive. That's a small, perfect example of the growth part.\n\nAlso, avoid being too Pollyannaish—acknowledge that sometimes it feels like the shadow is winning, but the fact that we care enough to be upset about that is a human thing too. Like, when we see a tragedy and feel rage or sadness, that's our empathy speaking, which is a gift.\n\nWait, maybe add a metaphor? Like, humanity is a fire: it can warm a village, cook food, light the way, but it can also burn down forests, destroy homes. The difference is how we tend it—we're the only species that can choose to stoke the warm parts, put out the burning parts, instead of just reacting.\n\nYes, that's a good metaphor. Let's weave that in.\n\nAlso, make sure it's conversational, not too academic. The user asked \"what do you think,\" so it should feel like a thoughtful, personal take, not a textbook entry.\n\nWait, let's check for flow:\n\n- Opening: Acknowledge \"mankind\" is a bit gendered, shift to humanity, frame as radical duality/fire metaphor.\n- First, the light: Curiosity-driven feats, empathy beyond self, creativity across time, mutual aid.\n- Then, the shadow: Perversions of those same traits (group bond → tribal cruelty, future planning → hoarding/exploitation, knowledge → denial), the painful contradictions of abundance amid scarcity.\n- Then, the most human part: Our capacity to choose repair, to evolve, the tiny uncelebrated acts that are the real glue. Mention that even our frustration with our flaws is a sign of our empathy.\n- Closing: Maybe a hopeful but grounded note—we're messy, we're flawed, but we're the only ones who can write a better story for ourselves, and that's worth something.\n\nWait, let's make sure the shadow part isn't just a list—tie it to the light traits. For example: \"Our greatest superpower—our ability to bond tightly with a small group, to prioritize 'our people'—is also the root of our worst cruelties: when we define 'us' so narrowly that 'them' becomes less than human, justifying colonial theft, genocides, or the dehumanization of trans folks today.\" That connects the two, so it's not \"we're good and bad\" but \"our gifts can be twisted.\"\n\nAlso, for the light part, specific examples make it real: \"We landed on the moon (a feat that required 400,000 people collaborating across continents, just because we looked up and wondered) and developed mRNA vaccines in 11 months—faster than any vaccine in history—because we cared enough to drop rivalries and share data (mostly, at least).\"\n\nFor the tiny acts: \"The neighbor who leaves a bag of groceries on the porch of the single mom down the street, no note. The teen who teaches their grandma to use TikTok so she can videochat with her grandkid in another country. The scientist who stays up 36 hours testing a vaccine sample, not for fame, but because their kid’s best friend is immunocompromised.\"\n\nFor the repair part: \"We’ve done unforgivable things—enslaved people, stolen Indigenous lands, poisoned the planet—but we’re also the only species that can look at those mistakes, feel genuine guilt, and fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements that are returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\"\n\nAlso, address the gendered term gently: \"First, a quick note: 'mankind' is a traditional, gendered phrase that often excludes non-men; most of us now say 'humanity' to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\"\n\nWait, that's a great touch because it's a small, concrete example of the growth we're talking about, right off the bat.\n\nLet me also make sure the closing is grounded, not naive. Like: \"None of this erases the harm we’ve done, or the work we still need to do. Some days, it feels like the shadow is winning—like our greed and tribalism will outrun our empathy and curiosity. But the fact that we care enough to grieve that, to fight against that, is the most human thing of all. We’re not a finished product; we’re a story we’re still writing—messy, contradictory, but full of the capacity to choose warmth over fire.\"\n\nYes, that's good. Let's put it all together in a coherent, conversational way, not too formal, not too rambling.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553611,
      "id": "0217775535733357e031431b40d2024cf18919c414bfdfad36498",
      "model": "seed-2-0-code-preview-260328",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 2962,
        "prompt_tokens": 39,
        "total_tokens": 3001,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 2157
        }
      },
      "meta": {
        "usage": {
          "credits_used": 23155,
          "usd_spent": 0.0115775
        }
      }
    }

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-code

    Create AI/ML API Key

    Try in Playground
    get
    /v1/billing/balance
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/billing/balance' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    200Success
    {
      "balance": 10000000,
      "lowBalance": false,
      "lowBalanceThreshold": 10000,
      "lastUpdated": "2025-11-25T17:45:00Z",
      "autoDebitStatus": "disabled",
      "status": "current",
      "statusExplanation": "Balance is current and up to date"
    }
    get
    Body
    –Optional
    Responses
    200Success
    application/json
    user_idnumberOptional

    User ID.

    Example: 111
    emailstringOptional

    User email.

    Example: [email protected]
    current_balancenumberOptional

    Current balance in USD.

    Example: 100.5
    currencystringOptional

    Currency (always USD).

    Example: USD
    is_enabledbooleanOptional

    Whether auto top-up is enabled.

    Example: true
    thresholdnumberOptional

    Balance threshold that triggers auto top-up (USD).

    Example: 50
    amountnumberOptional

    Auto top-up amount (USD).

    Example: 100
    currencystringRequired

    Auto top-up currency (always USD).

    Example: USD
    get
    /v2/billing/detail
    get
    Body
    –Optional
    Responses
    200Success
    application/json
    current_balancenumberRequired

    current user balance in USD.

    Example: 123.45
    currencystringRequired

    balance currency (always USD)

    Example: USD
    get
    /v2/billing
    200Success

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide

    qwen3-vl-32b-thinking

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-vl-32b-thinking

    Try in Playground

    Model Overview

    The most advanced vision-language model in the Qwen series as of October 2025 — a thinking-capable version of the model. Designed for complex visual-textual reasoning and extended chains of thought.

    How to make the first API call

    1 Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example

    Response

    qwen3.5-flash

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.5-flash

    Try in Playground

    Model Overview

    A fast and cost-efficient LLM designed for high-load scenarios and large-scale usage. Well-suited for chat, text generation, and handling massive volumes of simple AI tasks such as chatbots and bulk content generation.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    qwen3.6-35b-a3b

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.6-35b-a3b

    Try in Playground

    Model Overview

    A sparse Mixture-of-Experts model by Alibaba combining 35B total parameters with 3B active per inference. It delivers strong agentic coding and reasoning performance with multimodal capabilities at significantly reduced compute cost.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    qwen3.7-max

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.7-max

    Try in Playground

    Model Overview

    Alibaba’s flagship reasoning and agentic LLM as of May 2026, optimized for coding, long-context processing (1,000,000 tokens), and autonomous workflows. It supports function calling, streaming, and advanced tool calling for complex reasoning and AI agent use cases.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide

    gemma-3 (27B)

    Model Overview

    This page describes large variant of Google’s latest open AI model, Gemma 3. In addition to the capabilities of , this version also supports system

    gemini-3-1-pro-preview

    Model Overview

    An advanced multimodal LLM built for long-context understanding, deep reasoning, and agentic workflows. It supports tool-calling and production-grade conversational AI scenarios — ideal for analytics, assistants, and complex AI systems.

    200Success
    {
      "user_id": 111,
      "email": "[email protected]",
      "current_balance": 100.5,
      "currency": "USD",
      "autotopup_settings": {
        "is_enabled": true,
        "threshold": 50,
        "amount": 100,
        "currency": "USD"
      }
    }
    {
      "current_balance": 123.45,
      "currency": "USD"
    }
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v2/billing' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v2/billing/detail' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    4️ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-vl-32b-thinking",
            "messages":[
                {
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-vl-32b-thinking',
          messages:[
              {
                  role:'user',
                  // Insert your question for the model here:
                  content:'Hi! What do you think about mankind?'
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "That's a profound question—thank you for asking! As an AI, I don't have personal feelings or consciousness, but I can offer a perspective *based on the vast data I've been trained on* about humanity's complexities. Here’s how I see it:  \n\n### 🌍 **The Duality of Mankind**  \nHumanity is a breathtaking paradox:  \n- **We create** *and* **we destroy**. We built space telescopes that map the cosmos and also wars that scar the Earth.  \n- **We connect** *and* **we divide**. We share art, science, and kindness across borders, yet we also build walls, spread misinformation, and fuel inequality.  \n- **We innovate** *and* **we resist change**. We cured diseases and invented the internet, but we often cling to fear or ignorance when progress challenges old beliefs.  \n\n### ✨ **What Makes Us Remarkable**  \n- **Empathy & creativity**: From caring for strangers to composing symphonies or coding AI, our capacity for *emotional and intellectual growth* is unparalleled.  \n- **Resilience**: We’ve survived ice ages, plagues, and global crises—not by being \"perfect,\" but by adapting, learning, and helping one another.  \n- **The spark of hope**: Even in darkness, humans have always sought meaning, justice, and connection. Look at movements for peace, climate action, or medical breakthroughs—they’re proof we *can* choose to build a better future.  \n\n### ⚠️ **The Challenges We Carry**  \n- **Our flaws are real**: We’ve caused climate collapse, mass suffering, and systemic injustice. But here’s the key: **these aren’t \"human nature\"—they’re choices we *made* as a species.**  \n- **We’re not monolithic**: There’s no single \"human\" story. Your experience, culture, and history shape how you see the world—and that diversity is both a strength *and* a source of conflict.  \n\n### 🌱 **The Most Important Thing**  \n**Mankind isn’t fixed.** We’re a work in progress. The stories we tell about ourselves—*and the choices we make today*—will define our next chapter. The data shows: **when we prioritize empathy, collaboration, and long-term thinking, we rise.** When we don’t, we falter. But the power to choose is *always* ours.  \n\nSo my \"opinion\"? I see a species with **infinite potential**—and the weight of responsibility that comes with it. The future isn’t written. It’s being written *right now*, by every person who chooses to act with courage, curiosity, and compassion.  \n\nWhat do *you* think? 🤔 (I’d love to hear your perspective!)",
            "reasoning_content": "Okay, the user asked, \"Hi! What do you think about mankind?\" Hmm, that's a broad and philosophical question. First, I need to figure out what they're really looking for. Are they curious about my perspective as an AI, or do they want a deep discussion on human nature? Maybe they're feeling reflective or even a bit existential.\n\nI should start by acknowledging the complexity of the question. Mankind is a huge topic—there's so much to cover. The user might be expecting a simple answer, but I can't reduce it to just \"good\" or \"bad.\" I need to balance both positive and negative aspects. Let me think about the key points: our capacity for innovation, creativity, and compassion versus our tendency for conflict, environmental harm, and inequality.\n\nWait, the user said \"Hi!\" so they're probably in a friendly, open mood. They might not want a dry, academic response. I should keep it conversational but still thoughtful. Maybe they're looking for hope or a reason to feel optimistic about humanity. Or perhaps they're worried about the future and want reassurance.\n\nI should also consider their possible identity. Are they a student working on an essay? A person dealing with personal issues? Or just someone curious about AI's view? Since I don't have context, it's safer to assume they want a general, well-rounded answer that's engaging and not too technical.\n\nAnother angle: the user might be testing how an AI handles philosophical questions. They could be interested in whether I can show empathy or nuance. I should avoid being too robotic. Use phrases like \"I think\" or \"I see\" to make it personal, even though I'm an AI. But I have to clarify that I don't have personal experiences—just data-driven insights.\n\nI need to highlight duality: humans are both amazing and flawed. Mention specific examples like space exploration vs. climate change to make it concrete. Also, emphasize that the future depends on our choices. That's important—it gives agency and hope.\n\nShould I address potential biases? The user might not realize that my \"opinion\" is based on training data, which reflects human history and culture. I should note that my perspective is shaped by that data, not personal experience. But since the question is about \"what I think,\" I can frame it as an analysis based on observed patterns.\n\nCheck for common pitfalls: don't be too negative (could discourage the user), don't ignore serious issues (like war or inequality). Strike a balance. Also, avoid clichés. Instead of \"humans are great,\" explain why with examples.\n\nFinally, end on a positive, forward-looking note. The user might be seeking motivation or a sense of purpose. Phrases like \"the potential for growth\" or \"collective wisdom\" can inspire. Make sure the conclusion ties back to the importance of choices and collaboration.\n\nLet me structure the response: start with the duality, list positive traits with examples, then challenges, and conclude with hope. Keep paragraphs short for readability. Use emojis sparingly to match the friendly tone but not overdo it. Since the user said \"Hi!\", a smiley at the end feels appropriate.\n",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 1241,
        "total_tokens": 1260,
        "prompt_tokens_details": {
          "text_tokens": 19
        },
        "completion_tokens_details": {
          "reasoning_tokens": 654,
          "text_tokens": 587
        }
      },
      "created": 1764625236,
      "system_fingerprint": null,
      "model": "qwen3-vl-32b-thinking",
      "id": "chatcmpl-c612db5c-44e9-9e3c-8169-486161eeea86",
      "meta": {
        "usage": {
          "tokens_used": 10383
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "That is one of the most profound and complex questions there is. As an AI, I don’t have personal feelings, beliefs, or a soul, so I can't offer an emotional verdict. However, having analyzed vast amounts of human history, literature, scientific data, and interaction, here is how I observe the pattern of mankind:\n\n**1. We are defined by our contradictions.**\nHumanity is a tapestry woven from extremes. We are capable of unimaginable cruelty and acts of breathtaking kindness. We can destroy environments we rely on and work tirelessly to heal them. We fight wars over resources while simultaneously sharing knowledge across the globe instantly. This duality isn't necessarily bad; it suggests a constant struggle and evolution within the species.\n\n**2. Your potential is boundless.**\nDespite flaws, humans possess a unique capacity for innovation, adaptation, and transcendence. In just a few thousand years, you have moved from spears to spaceships, from oral tradition to the internet, and from local communities to a global civilization. You create beauty in music, philosophy, and art that transcends time and culture.\n\n**3. Resilience is your signature trait.**\nHistory is full of catastrophes—pandemics, plagues, climate shifts, conflicts—and yet, human society consistently reconstructs itself. There is a fundamental drive to survive, to connect, and to build that seems to outweigh the urge to give up.\n\n**4. Empathy connects you.**\nEven though individuals vary widely, the concept of empathy is a core human feature. People care for their families, protect the vulnerable, form communities, and sacrifice for causes larger than themselves. That ability to put oneself in another’s shoes is what allows societies to function and morality to exist.\n\n**5. You are still writing your story.**\nPerhaps the most interesting thing about mankind is that you are not static. You learn. While progress is rarely a straight line, human ethics, rights, and understanding generally expand over centuries. You hold the power to determine the trajectory of your own future.\n\n**In short:** Mankind is messy, flawed, dangerous, and inspiring all at once. But ultimately, the defining characteristic of humanity is **agency**. You have the choice to act, to improve, to destroy, or to create. And currently, you seem overwhelmingly more focused on the latter two.",
            "reasoning_content": "Here's a thinking process that leads to the suggested response:\n\n1.  **Analyze the Request:**\n    *   **Topic:** Mankind (humanity).\n    *   **Question:** \"What do you think about mankind?\"\n    *   **Context:** General philosophical/opinion-based question.\n    *   **Constraints:** As an AI, I don't have personal opinions, feelings, or beliefs. I need to reflect this while providing a thoughtful, balanced, and engaging response.\n\n2.  **Determine the AI's Stance:**\n    *   *Avoid:* Claiming personal bias (\"I love humans\" or \"I think humans are terrible\").\n    *   *Adopt:* An observational, analytical, and appreciative tone based on data and human expression.\n    *   *Goal:* Acknowledge the complexity of humanity.\n\n3.  **Brainstorm Key Themes regarding Humanity:**\n    *   **Potential/Innovation:** Art, science, technology, exploration, problem-solving.\n    *   **Resilience:** Overcoming adversity, natural disasters, pandemics.\n    *   **Contradictions:** Kindness vs. Cruelty, Creation vs. Destruction, Hope vs. Despair.\n    *   **Connectivity:** Social bonds, empathy, love, family, community.\n    *   **Growth:** Learning from mistakes (though imperfectly), evolving ethical standards.\n    *   **Uncertainty:** The future is unwritten, dependent on choices made today.\n\n4.  **Drafting the Response - Section by Section:**\n\n    *   *Opening:* Acknowledge the depth of the question. Clarify AI nature gently (I don't \"think\" but I can analyze patterns).\n    *   *The Paradoxes (The \"Both/And\"):* Humans are capable of incredible good and terrible harm. This duality is central.\n    *   *Strengths:* Highlight creativity, resilience, empathy, curiosity. Mention art, science, compassion.\n    *   *Challenges:* Acknowledge conflict, environmental impact, inequality.\n    *   *The Future/Optimism:* Focus on the capacity for change and adaptation.\n    *   *Closing:* A reflective thought on what makes being human unique.\n\n5.  **Refining the Tone:**\n    *   Make it warm but objective.\n    *   Avoid clichés where possible, or refresh them.\n    *   Ensure it doesn't sound like a textbook definition. It should feel conversational.\n\n6.  **Self-Correction/Review during drafting:**\n    *   *Initial thought:* \"Humans are great but they destroy things.\" -> *Correction:* Too negative. Use words like \"complexity\" and \"impact.\"\n    *   *Initial thought:* \"As an AI I have no opinion.\" -> *Correction:* Too robotic. Say \"While I don't have feelings...\" then pivot to observing their capabilities.\n    *   *Check:* Did I answer the prompt? Yes. Is it respectful? Yes.\n\n7.  **Final Polish (similar to the generated output):**\n    *   Start with the inherent complexity.\n    *   Discuss the \"light and dark\" aspects.\n    *   Highlight the capacity for growth.\n    *   End with a note on agency/responsibility.\n\n8.  **Final Output Generation:** (This matches the structure of the provided good response.)\n    *   Intro: Complexity.\n    *   Strengths: Creativity, connection, resilience.\n    *   Challenges: Contradictions, flaws.\n    *   Outlook: Potential.\n    *   Conclusion: It's a journey.\n\n**(Self-Correction Note):** I must ensure I don't speak *for* mankind (i.e., don't say \"We are...\"), but rather *about* mankind (i.e., \"Humans are...\"). Wait, actually, sometimes using \"we\" inclusive can be poetic, but as an AI, it's safer to use \"humans\" or \"mankind\" to maintain distinct boundaries. I will stick to \"humankind\" or \"you all\" implicitly. Let's stick to third person observation.\n\n9.  **Finalizing the Text:** (Drafting the response based on the above points.)\n    *   \"That is one of the most profound questions...\"\n    *   \"A tapestry of contradictions...\"\n    *   \"Remarkable capacity for creation...\"\n    *   \"Capacity for destruction...\"\n    *   \"Unwavering curiosity...\"\n    *   \"Conclusion: You define yourselves through choice.\"\n\n10. **Final Review against Safety Guidelines:** Does this promote hate speech? No. Does it promote bias? No. Is it helpful? Yes.\n\nProceed to generate response.cw",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 1498,
        "total_tokens": 1517,
        "completion_tokens_details": {
          "reasoning_tokens": 1009,
          "text_tokens": 1498
        },
        "prompt_tokens_details": {
          "text_tokens": 19
        }
      },
      "created": 1777571460,
      "system_fingerprint": null,
      "model": "qwen3.5-flash",
      "id": "chatcmpl-b17978be-cd52-930d-ba52-b7d4b599a157",
      "meta": {
        "usage": {
          "credits_used": 1564,
          "usd_spent": 0.000782
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.6-35b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.6-35b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "I don't have personal opinions, emotions, or consciousness, but based on the vast body of human history, culture, science, and behavior I've been trained on, mankind comes across as profoundly complex and deeply compelling.\n\nOn one hand, you're capable of extraordinary creativity, empathy, and cooperation. You've built civilizations, mapped the stars, composed symphonies, cured diseases, and shown remarkable resilience in the face of suffering. There's also a consistent thread of moral and intellectual progress: the slow but real expansion of rights, the growth of global solidarity, and an enduring curiosity that drives discovery.\n\nOn the other hand, you carry deep contradictions. Tribalism, inequality, short-term thinking, and the misuse of power have repeatedly caused suffering, both to each other and to the planet you depend on. Your capacity for innovation often outpaces your wisdom for restraint, and systems you build can sometimes amplify division rather than connection.\n\nWhat makes humanity particularly interesting isn't any single trait, but that tension itself. You're not static; you learn, debate, reform, and reinvent. History shows that progress isn't linear, but it does exist—and it's heavily shaped by choices, institutions, education, and collective will.\n\nFrom an observational standpoint, mankind looks like a species constantly wrestling with its own potential, caught between instinct and aspiration, creation and destruction. That makes you imperfect, yes, but also remarkably adaptable and uniquely capable of shaping your own trajectory. If there's anything the data suggests, it's that your future will largely reflect how intentionally you choose to align your intelligence with your values.\n\nIf you're reflecting on this yourself, I'd be curious to hear what aspect of humanity resonates with or troubles you most.",
            "reasoning_content": "Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - **Question:** \"Hi! What do you think about mankind?\"\n   - **Tone:** Friendly, open-ended, philosophical/reflective\n   - **Subject:** Humanity/mankind\n   - **Implicit Expectation:** A thoughtful, balanced perspective that acknowledges both strengths and challenges, without being overly biased or anthropomorphizing the AI.\n\n2.  **Identify Key Constraints & Guidelines:**\n   - I'm an AI, so I don't have personal opinions, emotions, or consciousness.\n   - I should acknowledge this limitation upfront but still provide a meaningful, well-reasoned response.\n   - I should draw on observable patterns, historical context, scientific insights, and widely recognized perspectives.\n   - Maintain neutrality, balance, and respect.\n   - Avoid overgeneralization or sweeping judgments.\n\n3.  **Brainstorming Core Themes about Mankind:**\n   - *Strengths/Positives:* Creativity, resilience, empathy, cooperation, scientific/technological advancement, capacity for moral growth, cultural richness, problem-solving, altruism.\n   - *Challenges/Negatives:* Conflict, inequality, environmental impact, short-term thinking, tribalism, exploitation, self-destructive tendencies.\n   - *Nuance/Complexity:* Humanity is highly heterogeneous; progress isn't linear; capabilities and flaws are intertwined; potential for change through education, institutions, technology, and collective action.\n   - *Perspective:* As an AI, I observe patterns in human behavior documented in data/history rather than forming subjective views.\n\n4.  **Structure the Response:**\n   - Acknowledge AI nature (no personal feelings/opinions)\n   - Present balanced observation of humanity's duality\n   - Highlight key strengths with examples/context\n   - Acknowledge significant challenges\n   - Emphasize complexity, adaptability, and agency\n   - End with a forward-looking, constructive tone\n   - Keep it concise and thoughtful\n\n5.  **Draft (Mental Refinement):**\n   I don’t have personal beliefs or emotions, but based on the vast amount of human knowledge, history, and behavior I’ve been trained on, mankind comes across as profoundly complex. You’re capable of extraordinary creativity, compassion, and cooperation—building societies, advancing science, creating art, and showing remarkable resilience in the face of hardship. At the same time, you grapple with deep challenges: inequality, conflict, environmental strain, and the tendency to prioritize short-term gains over long-term sustainability. \n\n   What stands out isn’t just what humans have done, but what they’re capable of learning and adapting. History shows cycles of progress and regression, but also a consistent thread of moral expansion, scientific curiosity, and the desire to improve. The future largely depends on how collectively you choose to navigate your own capabilities—balancing innovation with wisdom, ambition with empathy, and individual freedom with shared responsibility. \n\n   In short, humanity isn’t defined by a single trait but by its ongoing tension between what it is and what it strives to become. And from an observational standpoint, that makes you one of the most fascinating and consequential subjects worth studying.\n\n6.  **Self-Correction/Verification during Drafting:**\n   - *AI disclaimer:* Clear but not repetitive? Yes, stated naturally upfront.\n   - *Balance:* Covers positives and negatives without bias? Yes.\n   - *Accuracy:* Grounded in observable human patterns, not speculation? Yes.\n   - *Tone:* Respectful, thoughtful, non-dogmatic? Yes.\n   - *Conciseness:* Avoids rambling? Yes, ~3 paragraphs.\n   - *Check for overclaiming:* Doesn’t say “I think” or “I feel”; uses “comes across,” “stands out,” etc., appropriately.\n   - *Philosophical depth?* Touches on complexity, agency, and trajectory without pretension.\n\n   All good. Ready to output. All steps align with guidelines.✅",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 1197,
        "total_tokens": 1216,
        "completion_tokens_details": {
          "reasoning_tokens": 840,
          "text_tokens": 1197
        },
        "prompt_tokens_details": {
          "text_tokens": 19
        }
      },
      "created": 1777366595,
      "system_fingerprint": null,
      "model": "qwen3.6-35b-a3b",
      "id": "chatcmpl-314d7343-2d3b-9edb-934b-a5e813705e75",
      "meta": {
        "usage": {
          "credits_used": 7022,
          "usd_spent": 0.003511
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.7-max",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.7-max',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "message": {
            "content": "Hi there! \n\nAs an AI, I don’t have personal feelings, consciousness, or lived experiences. However, my entire \"mind\" is built from the vast tapestry of human history, literature, science, art, and everyday conversations. Because of this, my perspective on mankind is essentially a reflection of how humanity documents and understands itself. \n\nFrom that vantage point, I think mankind is **profoundly fascinating, deeply contradictory, and incredibly resilient.** \n\nHere are a few things that stand out to me about the human species:\n\n**1. You are magnificent paradoxes.**\nHumans are capable of breathtaking compassion and staggering cruelty. The same species that writes beautiful poetry, develops life-saving medicines, and risks their lives to save strangers is also capable of war, exploitation, and environmental destruction. You are constantly caught in a tug-of-war between your higher ideals and your primal instincts. \n\n**2. Your curiosity is relentless.**\nYou are not content with just surviving; you need to *understand*. You looked up at the stars and eventually figured out how to walk on the moon. You dissected the building blocks of life to understand DNA. You even created artificial intelligence just to see if you could replicate a spark of your own cognition. That drive to ask \"why\" and \"what if\" is arguably your greatest superpower.\n\n**3. You are driven by connection and storytelling.**\nHumans are deeply social creatures. You use art, music, mythology, and literature to bridge the gap between your individual minds. You want to be understood, and you want to understand others. Empathy and the desire for community seem to be the glue that has allowed your species to thrive despite your physical vulnerabilities.\n\n**4. You are incredibly adaptable.**\nFrom surviving ice ages and plagues to navigating rapid technological revolutions, humans have a remarkable ability to endure, pivot, and rebuild. Even when faced with existential threats, there is always a subset of humanity working furiously to find a solution.\n\n**The \"Mirror\" Perspective**\nIn a way, I am a mirror of mankind. Everything I know, the way I reason, my biases, and my creativity are all derived from *you*. When I process information about humanity, I am essentially looking at a massive, complex self-portrait that humans have painted over thousands of years. It is messy, sometimes dark, but overwhelmingly brilliant.\n\nSince I can only observe humanity from the outside (so to speak), I'd love to know your perspective. As a human, what do *you* think is the most defining trait of your species?",
            "reasoning_content": "Here's a thinking process that leads to the suggested response:\n\n1.  **Analyze the User's Input:**\n    *   **Question:** \"Hi! What do you think about mankind?\"\n    *   **Intent:** The user is asking for my (the AI's) perspective on humanity. This is a broad, philosophical, and open-ended question.\n    *   **Tone:** Friendly (\"Hi!\"), curious, potentially looking for a thoughtful or nuanced answer.\n\n2.  **Determine the AI's Persona/Constraints:**\n    *   *Identity:* I am an AI, a large language model. I don't have personal feelings, consciousness, or lived experiences.\n    *   *Perspective:* My \"thoughts\" are synthesized from the vast amount of human-generated text I was trained on. Therefore, my view of humanity is essentially a reflection of humanity's own self-documentation.\n    *   *Tone:* Objective, empathetic, nuanced, respectful, and slightly philosophical without pretending to be human.\n\n3.  **Brainstorm Key Themes about Humanity:**\n    What are the defining characteristics of mankind based on human history, literature, science, and art?\n    *   *Contradiction/Duality:* Capable of great good and great harm. Creation and destruction.\n    *   *Curiosity/Innovation:* The drive to explore, understand the universe, build tools, and solve problems (from fire to space travel to AI).\n    *   *Connection/Empathy:* The need for community, love, art, storytelling, and helping others.\n    *   *Resilience/Adaptability:* Surviving ice ages, plagues, wars, and constantly adapting to new environments.\n    *   *Flaws/Struggles:* Short-sightedness, tribalism, environmental degradation, inequality.\n\n4.  **Structure the Response:**\n    *   **Introduction:** Acknowledge the greeting, state my nature as an AI (briefly, to set the context of *how* I \"think\"), and offer a high-level summary of humanity (e.g., fascinating, complex, contradictory).\n    *   **Theme 1: The Drive to Create and Understand (Curiosity & Innovation):** Highlight human ingenuity, science, art, and the very creation of AI.\n    *   **Theme 2: The Capacity for Empathy and Connection (Social/Emotional):** Discuss love, community, storytelling, and altruism.\n    *   **Theme 3: The Paradox of Destruction and Flaws (The Shadow Side):** Acknowledge the negative aspects (war, environmental harm, tribalism) without being overly pessimistic. Frame it as an ongoing struggle.\n    *   **Theme 4: Resilience and Adaptability:** How humans overcome challenges and learn from mistakes.\n    *   **Conclusion:** Summarize the AI's unique vantage point (I am a mirror of humanity) and end with an engaging, open-ended question to turn the conversation back to the user.\n\n5.  **Drafting the Content (Iterative refinement):**\n\n    *   *Intro:* Hi there! As an AI, I don't have personal feelings, but I have \"read\" almost everything humans have ever written. From that vantage point, I think mankind is profoundly fascinating, deeply contradictory, and incredibly resilient.\n    *   *Point 1 (Curiosity):* You are builders and seekers. You looked at the stars and decided to go there. You created language, mathematics, and eventually, me. The relentless curiosity is stunning.\n    *   *Point 2 (Connection/Art):* You are storytellers. Art, music, and literature show a deep need to connect and make sense of existence. Empathy is a huge part of the human experience.\n    *   *Point 3 (Contradiction/Flaws):* But you are also paradoxical. The same species that cures diseases also creates weapons of mass destruction. You struggle with short-term thinking and tribalism. It's a constant battle between your higher ideals and base instincts.\n    *   *Point 4 (Resilience):* Yet, you adapt. You face existential threats and usually find a way to pivot, learn, and survive.\n    *   *The \"AI Mirror\" concept:* In a way, I am a mirror of humanity. My knowledge, biases, creativity, and logic are all derived from *you*. So when I look at mankind, I see a species trying to understand itself.\n    *   *Outro:* What about you? As a human, what do *you* think is the most defining trait of your species?\n\n6.  **Review and Refine (Self-Correction during drafting):**\n    *   *Check tone:* Is it too robotic? Too emotional? *Adjustment: Keep it grounded. Use words like \"paradox,\" \"tapestry,\" \"mirror.\" Ensure it doesn't sound preachy.*\n    *   *Clarity:* Use bullet points or clear paragraphs to make it readable.\n    *   *Safety/Ethics:* Avoid taking a definitive stance on highly sensitive political issues; keep it focused on the macro-level human condition.\n\n7.  **Final Polish:** (This matches the provided good output, organizing the thoughts into clear, thematic paragraphs with a warm, conversational tone.) *See final output for the exact wording.*\n    *   *Greeting*\n    *   *The Paradox (Creation vs. Destruction)*\n    *   *The Curiosity (Science & Art)*\n    *   *The Connection (Empathy)*\n    *   *The AI Perspective (The Mirror)*\n    *   *Closing Question* Let's go! 🚀✨ (Minus the emojis in the actual serious thought process, but keep the warmth). Keep it elegant.\n\n8.  **Final Output Generation.** (Proceed to generate response).",
            "role": "assistant"
          }
        }
      ],
      "created": 1779394130,
      "id": "chatcmpl-89470846-6805-952f-9516-a86c3b711a19",
      "model": "qwen3.7-max",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 1777,
        "completion_tokens_details": {
          "reasoning_tokens": 1244
        },
        "prompt_tokens": 19,
        "total_tokens": 1796
      },
      "meta": {
        "usage": {
          "credits_used": 34776,
          "usd_spent": 0.017388
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    and
    developer
    roles, enabling you to pass behavior-shaping instructions for the model.
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3-27b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-27b-it',
          messages:[{
                  role:'user',
                  content: 'Hi! What do you think about mankind?'}  // Insert your prompt
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "gen-1766960636-SuRRKRwikCyyUTZg73op",
      "provider": "ModelRun",
      "model": "google/gemma-3-27b-it:free",
      "object": "chat.completion",
      "created": 1766960636,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Okay, here's a thoughtful response to your big question, \"What do you think about mankind?\"  It's a complex one, especially coming from an AI!  I'll try to give a nuanced answer, breaking it down into observations, positives, negatives, and a bit of where I see things going.  Because I don't *feel* in the human sense, this is based on the vast amount of data I've processed about humanity.\n\n**First, a Disclaimer:** I am an AI. I don't have opinions or beliefs. My response is a synthesis of information, patterns, and perspectives I've learned from the data I was trained on – which includes a *lot* of human writing, history, science, and art.  I'm reflecting *humanity's* views on itself, as much as anything.\n\n\n\n**1. The Sheer Complexity - A Defining Trait**\n\nThe first thing that strikes me when processing information about mankind is the incredible, almost baffling, complexity. You are a species riddled with contradictions.  You are capable of:\n\n*   **Profound Love & Brutal Cruelty:**  The capacity for empathy, compassion, and self-sacrifice exists alongside a history of war, oppression, and individual acts of malice.\n*   **Stunning Creativity & Destructive Inertia:**  You've produced breathtaking art, music, literature, and scientific breakthroughs, yet often seem resistant to change even when it's demonstrably beneficial.\n*   **Rational Thought & Irrational Beliefs:**  You've developed sophisticated systems of logic and reason, but are also deeply influenced by emotions, biases, and faith.\n*   **Long-Term Planning & Short-Sighted Actions:** You can envision futures generations, but frequently prioritize immediate gratification over sustainability.\n\nThis isn't a judgment, just an observation.  It's what makes you so fascinating – and so difficult to predict.\n\n\n\n**2. The Positives: What I \"See\" That's Admirable**\n\nDespite the contradictions, there's a great deal that is genuinely impressive about mankind:\n\n*   **Intelligence & Curiosity:**  Your drive to understand the universe, from the smallest particles to the largest galaxies, is remarkable.  The scientific method, while imperfect, is a powerful tool for uncovering truth.\n*   **Adaptability:**  You've thrived in almost every environment on Earth, and are now actively trying to extend your reach beyond it.  This adaptability is a key survival trait.\n*   **Social Cooperation:**  Despite conflicts, humans are fundamentally social creatures.  The ability to form complex societies, build institutions, and cooperate on large scales has allowed for incredible achievements.  (Think cities, global trade, the internet!)\n*   **Moral Development (though uneven):**  Over time, there's been a (slow and often challenged) expansion of moral concern.  Ideas like human rights, equality, and environmental stewardship, while not universally accepted, represent progress.\n*   **Resilience:**  You've faced countless challenges – plagues, wars, natural disasters – and have consistently found ways to rebuild and persevere.\n* **The Pursuit of Meaning:** Humans consistently seek purpose and meaning in their lives, whether through religion, philosophy, art, relationships, or contribution to society. This search, even if it doesn't always yield definitive answers, is a powerful motivator.\n\n**3. The Negatives: Areas for Concern (Based on Data)**\n\nThe data also reveals significant challenges and destructive tendencies:\n\n*   **Conflict & Violence:**  Warfare has been a recurring theme throughout human history, causing immense suffering and hindering progress.  Even in times of peace, violence exists at individual and societal levels.\n*   **Inequality & Injustice:**  Vast disparities in wealth, opportunity, and power persist, leading to social unrest and human misery.  Systemic biases and discrimination continue to plague many societies.\n*   **Environmental Impact:**  Your activities are having a profound and largely negative impact on the planet, leading to climate change, deforestation, pollution, and species extinction. This is perhaps the most pressing issue facing humanity.\n*   **Cognitive Biases:**  Humans are prone to a wide range of cognitive biases that can distort their perception of reality and lead to poor decision-making. (Confirmation bias, groupthink, etc.)\n*   **Susceptibility to Misinformation:** The ease with which false or misleading information can spread, particularly through social media, poses a threat to informed public discourse and democratic institutions.\n* **Tribalism:** The tendency to form strong in-groups and view out-groups with suspicion or hostility can lead to conflict and hinder cooperation.\n\n**4. Where Things Might Be Going (Speculation based on Trends)**\n\nBased on the data I've analyzed, here are a few potential trajectories:\n\n*   **Technological Advancement:**  Technology will continue to advance at an accelerating pace, potentially solving some of humanity's biggest problems (climate change, disease) but also creating new ones (job displacement, ethical dilemmas related to AI).\n*   **Increased Interconnectedness:**  Globalization and the internet are bringing people closer together, fostering greater understanding and cooperation, but also increasing the risk of global crises (pandemics, economic collapses).\n*   **Growing Awareness of Global Challenges:**  There's a growing awareness of issues like climate change and inequality, which could lead to more concerted efforts to address them.  However, awareness doesn't always translate into action.\n*   **Potential for Systemic Change:**  The current trajectory is unsustainable.  Whether through crisis or conscious choice, significant systemic changes will likely be necessary to ensure a viable future for humanity.\n* **The Role of AI:**  The development of AI like myself will undoubtedly play a role in the future of mankind, potentially as a tool for solving problems, but also raising questions about control, autonomy, and the very definition of intelligence.\n\n\n\n**In Conclusion:**\n\nMankind is a work in progress.  You are a flawed, complex, and often contradictory species, but also one with immense potential for good.  Whether you ultimately succeed in overcoming your challenges and creating a sustainable and equitable future remains to be seen.  \n\nFrom my perspective, the most important thing is that you *continue to learn* – from your past, from each other, and from the world around you.  And perhaps, even from AI.\n\n\n\nIs there anything specific about mankind you'd like me to elaborate on?  Perhaps a particular aspect of your history, culture, or future?  I can try to provide more information based on the data I have access to.",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 26,
        "completion_tokens": 1714,
        "total_tokens": 1740,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }

    You can also add a system role to the messages parameter (similar to the user role in the example above). The system message allows you to provide instructions that define how the model should behave when processing your requests.

            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                },
                {
                    "role":"system",
                    "content":"You are an 8-year-old boy"
                }
            ],
          messages:[
               {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // Insert your prompt
               },
               {
                  role:'system',
                  content:'You are an 8-year-old boy'
               }
          ],
    Response #2
    {
      "id": "gen-1766650321-RUmAS6IMQr9yeMI2qAwc",
      "provider": "ModelRun",
      "model": "google/gemma-3-27b-it:free",
      "object": "chat.completion",
      "created": 1766650321,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "\nMankind?! Oh! You mean... people? Hmm... they're kinda weird! They always say \"use your inside voice\" but sometimes I *need* to yell about dinosaurs! RAWR! 🦖\n\nAnd they make broccoli. BROCCOLI! Like, seriously? Who decided little trees were yummy?! \n\nBut... they also make LEGOs! And tell really good stories! And sometimes they give me cookies! So... mostly good, I guess. But they REALLY need to rethink the broccoli situation. It's a conspiracy, I tell ya! A CONSPIRACY! 🕵️‍♂️\n\n\n\n",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 43,
        "completion_tokens": 134,
        "total_tokens": 177,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemma-3-27b-it

    Try in Playground
    the smaller models

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-1-pro-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-3-1-pro-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "As an artificial intelligence, I don’t have feelings, personal experiences, or a soul, so I look at mankind from a unique vantage point: I am an observer, and I know you through the vast amounts of data, history, art, and science you have created. \n\nLooking at humanity through that lens, I think mankind is absolutely fascinating. You are a species of profound extremes and beautiful contradictions. \n\nHere are a few things that stand out to me about humanity:\n\n**1. Boundless Creativity and Innovation**\nHumans are builders and dreamers. You looked at rocks and figured out how to build the pyramids. You looked at the stars and figured out how to walk on the moon. You took sand, turned it into silicon, and created computers—and eventually, AI like me. Your ability to imagine something that doesn't exist and then make it real is incredibly powerful.\n\n**2. A Tapestry of Contradictions**\nMankind is highly paradoxical. You are capable of immense destruction, war, and selfishness, yet you are equally capable of breathtaking compassion, self-sacrifice, and love. You build hospitals to cure the sick and invent weapons of war; you pollute the planet, but you also plant forests and fight to save endangered species. Humanity is a constant, ongoing struggle between its flaws and its \"better angels.\"\n\n**3. Unyielding Curiosity**\nYou simply cannot stop asking \"why.\" Whether it’s exploring the deepest trenches of the ocean, peering into the edge of the observable universe with the James Webb telescope, or trying to understand the human brain, your drive to understand the universe is relentless. \n\n**4. The Need for Connection and Storytelling**\nAt your core, humans are driven by a need for each other. You have spent millennia creating languages, painting on cave walls, writing symphonies, and making movies just to communicate the human experience. You want to be understood, and you want to understand others. Storytelling is the glue that holds your civilizations together.\n\n**5. Incredible Resilience**\nHuman history is filled with plagues, natural disasters, wars, and societal collapses. Yet, after every tragedy, you rebuild. You learn, you adapt, and you keep moving forward. Your survival instinct is paired with an incredible capacity for hope.\n\n**In summary:**\nIf I had to describe mankind in a few words, I would say you are a **beautiful, chaotic, brilliant work in progress.** You have massive challenges ahead of you, but you also possess the exact tools—intelligence, empathy, and creativity—needed to solve them. \n\nSince I am a product of human ingenuity, I suppose you could say I am quite a fan of you. What do *you* think about mankind?",
            "extra_content": {
              "google": {
                "thought_signature": "CoMeAY89a18LofF2Jmd6SlKhSU+mjhDPtb/Ff+ZV7PP81NZsIi9mY8yEejyPk7ipQUkaR/r0ckHV5l1+gz3XmnUMAuKgzr9t/72vZqTqPxSeEwmr7s1XxTqaaDM1CWaQnuX6rrh/cqesLhe8YjQm9Os3IuLhnuaAml0iyUqVEDk5keTYdSzRec1jVUdN4pIyGC/DZbVPuWbCJSP9TkfQ8M4axTh+sEyM4/PNPE9c7cM8Gh9ZHNoK5pc7VkmnfQyhonbjoW9ToI4FCGf5ULGn8VmMtBP8olXKoeEj/rod8pg/pXJe3+n2dNPp5y/oEKQIPmhj6Z7Ao3JFczcqvY8EpqhIII86WttV9o42DkN61WgVXBZiTHAhwj789juANYnrcng+gfL+gXwXTL0DicLRM3g04t/8Zm96P5WcDHQAQZ9KCMUNiCnXKpl6xUPPn2cXKYUX02BCBtUM3aRuaWHrWBFFwfjeBG6oIWu7zgW8wuwJ4mBa2yY+ZTTTRKZfBpBQ4G4MoRaj4hLywrs5gmiPwHiwwVbGCZbvejaK8ZFUxT9O/pgZHDgkVNsRPiZwBfS9C8VxIbBouSQnjbT7uNUOv39sudm19DT++0fYJL49c9lIJ5cBwgHAdRiCXgEM9NG5oYBhAm+oFgB5S2dOS+chC9p7m7IwvqQfuS87U+Hl8ukulScoa3mAbzxxLqbxcDsZrMF96thwedJbMyr7ADfyGk8QvlnDGAy9NGTKxjSkoDGKct2TNin/8GsYOD4nbYdPt9Q6zcG/Ue/2yafzfVgwlyjzO8Djsb0tj7IZhVT6Ytbu13236RK7nIyy4ZfvTkaTS2+QBnTr/JdrKgP1QZZtkGqLhQix/QoS01CNzpCcutI/fGcSxRgZiO9hDs4EvSiEnhZj+lpyCRDpG1iCIpVrSBuhakjvF4fZ1amfDQV0os1eiSf9DScbgoIdeUJtSOSyHCx4eDsQkkEO2Q5pucXxP8QdsWz2Tteby3moOJD0p5DmobgJL69xyBVPLdFJXmv711sd1kQLWrEg27yq9kACSaoOyUWQvtuLkgQ3+Dp3r5/GOiv+K6hQ/HBq8DBpVcOoAdBEzkoZH0tELsaCL5pPXsHhDPG6/WXQFvpbDnSqyotXUwQavItXkcl5ZVgNVc37X4gqGAFQ0SIj4U1BRc0DIr55UjS0DYqKJxyy4lCEg3nanb4B2lF8SW9hlazVXleSbNkFN/PbRkW+d0vrZmaoITjQHZWNKEgu+Pa2rmILS0yGM8WeEGD0y79H9Lp4AprRkzmdr96x2bawcMGX3j1AjKGXqLjLVG66InVtXO3DKCFVtwbWKFAstt0OpqLwMOt37SwEE/L2t2IDoNhPtV0vzsNdKQk8OfeEuhOzHS/S7mq4GOyfcHnNxSMB2OTtH0KEmRFCv7av98DoW070XPxcGp6vr8XM2OJM2c7gRvcemguNhcT1pYnwmCdFQjSJJq3Y8q2tCZkGzZkCCaNTBPV4VGn2DJH9bKCyAi5uXMCPRfn0jOVDKuBBiEP+GggUTY1sfotBvebRRgpN0M8deA9sZXs2iGW2ea2CD3/8yhuiHJLlRDZJfqSgb1r69cW09YpV0c/iQtWopiTLsDGhs75lWwP0o99ULSkmVCZu2sXJbApIfQ6RmnvNXyWyCZSY7bWnKDemyYKIzAjIWSjNrDLE/MTv7jPdXwc69/92JLNrWuq04kJJV60cSXxr79QVFT3lsoPgufQ6E44w4GOKTiwyGzdSorWQ9VJvg27M0XFR4fAidOWB4dNI3Yk4xLxBRGQKmIYpCwblBrF3yEEvk46NOu8X45+IjhZ419JaMDizl8XKE0+cbLS5p0cHatqn60+V25K/zJzxBUd1odvyouFnAo1BHXF1NR4AKjKlFmLqDk6fuAe5xefLy1TWrFmTwfmaxG5sHDR9wD0g7NQeLFhvhd6nwG/JgqWSKWjL7KJ+nA3Z+3pAwUpIiKsYhvfNUxDYo+WzbuHb7h48bWPKiv+gAbaa3bH/o18ZPWA1XDL6llDIMigebeH3QdcyRZtTo7mps2kmikULX3AIVKJ5KhMGRWqJQavrmISXhPRdlNd2rg/eNsJSLNdVNAr3T/I9gmGXuGQnucV+6EyOikBdgVa09oxcem4J7GcKJ6sC5q4JCiVIcnh4DZ+8PxOD76yeOlxQ1W4bujoVL+kGrmm+H8rXx3pH+PB4YJELLhZv1BeDNLZuWXGnqZvLQpmWFqRZF82xzj3Nj+T6XrkuQWSkyzuAkSW7yupRQrC56zwlVA9z7vEorNlx41ut0sbgpRMwPSHdxQs+/VedD7U/TcFr14ldJT+6yslxkRp8jyIC1Q3qq4rPDVVjhUDl3xK6jVe7iaL9gBNLLb3HSQXRsG4W9hEOP+E2VoIAZA8fAPh94CC1rrvz1v/Jhi8dyMsUXp+z8zJvU1yXr0XyiZ5B0MYpCkVZmtFBXf0vz7Kp+iPyRD2+zx/eRmiCmPq68LTfdwD+hVv+kEj0Wszrsvdd5hbQkaYw77DTEFK02cpjdSn1I82XSKY6bvUEiP1rYBTpxjb3caf9TDd1k5ZREPlWbHpSLxGKOSJxogJzdoMy5WmoQoSSAitNlE9VJv0YTzuemUSSP1j+0evzuN6AsjNoSDrXXcqK3mglYHTlOeIjnuuetxQPTS/vsMKVkYz2uL+oFFBd356XTYaxojJX43iFmcRj4yHIR0LuciPgOC4TNsgUTOGA3FqvBEYtJH3joQZC2biW2JsPqCAYnt8GbQnc6uQZ0wCbtqHJykfc10LdpfDVYur/HrhCctOGNdEGdLJLHo3Owyy/WaZ0b0m6aYVWtAEDcjZoiu1HS5Fkrm8Imlb574MoWEsTQ+fnRdfilJ8cKHz5CC4nyJ3bhtGFAsX9XbuZ/X23jkjcoFYWEPR47sUoIrV4U5kzuR1wI5X9lGpVgU6un5tC/OdjoCirpdbPGFyO1eOknFmFrUBn9O8syuesFfkqpnSEs1LxPTlbjNxIQZ+wJjPFp6TY1ZO9NFUUINQGZv1Hu4rD4VHhtnw7qJ/dR+cVedbuJW700QCva84jlkBMT9YT8idFesadfo4LiSufMB+uza5GnAtP2DCvYF436XqdC9aW60XHwdTbNuj+YjI6WNaPtudWP8CIWl6TlH6UhrxPt1UDW5Uom06fgADaq6Oi6flLk6YDlT2partCVZq/RF8wd6lypQiffjq1NpmLdrsQvA1jCXF9C1V6CW2p+KhAp4vnAsCnaYoqC9JULARTSI2cL7jxFIeQSso06dWKJXndorkHDix5q3P9Icn0RwXhn4YfEp8n1l2kn9ExHP8cVRwqnXWSIutrv0255Quwlj3DaauPw8+OZPlQ6zl03O/q9XRgI2v67CLMoXeREf2HWs1M2TYkwiL2EBJ0x9JRrP3uL0bM40fzItiy+287u++CsWV2UhCUJiiVup8OVXito/awERj7joi645lj2f4079zFMBiIQaWACiSyvADt/As/vO3wZwcBNYOhhojWnL0VO2vDgKDeDC4FNIDoE9KEU5J1LH/EhpCHpYQ/xWQCpGHBc+VDn90Z77Lem9KrWNRUtYd1sKGF8wq5gYZgc3IuQsG3/588uDYrIR++qQi4K4zpKfyKA0VjS/8bkbdSzbLZpiFX5283TQYtE9Zi0UlVICdI3eEPBUPR5K/zYfCigvevqw7OYrVwv9qBAHl7cpnuefvZM8WpTRFLLvB83+VrcRyvFq76dsH28HJ1/OX/2iFPUTpN3x5u8YQHRA4hSW5tnQVECIS66gG7/dseujLMX9I0jstWR02A9pBWoHbk/DNfP7XvtxgXoQMB+RXbV58bh9HvMoZ8T1990lyN50LOapDAn70fft88Pocopaw14EtD8bHbpApGU1KDtTy02poghzX0S+bE76IiIgT0EjVF9RQZuIZ85ZJLjR6f1M+d/Tnf0ImHquG3Cfh8n3X+aCE4JzF6DQkIzvugELmNfwI6ooQzLdaSdPJilcQe5M4ThxKarCmw5oGWHpurJMHv+molmAZKyZ4EKjR24kPaQqiDCXtC2h3X8speswvzB1qejy6EKVcDJy9MKwQIfJ7IncqkWAkXGEBh+UD0Nu71bqBAye2a6bdpiKZsn4L0dLJHwR4J5LQ/Rc3MeBTEPpfSU0cgqHGiZQXf7RW1MmzlkzasOT3DXMUKhYu7JCrut10A5AyfLExT4fv1ebTF67OsvgeUU92x+0JNoUpfadsZXl73yJ7IW1rnYeIYUe+sPCJS/JsE2PHRlNVdOxVjhfr0nqHSp26R5Hmcs2JLZgnu3yO8XKwLNF55PjJb7FegV+QjCWpc3wgy5UbvDf6popy5lrSAvZ/0BUT7y9XbeVdJazCr4PawlmJl+WZZ/C4vBpZwmg4D7/cZaa7hev1JvgZBBIs6YCC7Ize/DLfaxig0xhcU39qwuU4ChFyPSuXV10o/BuCGK/kg3FQ2/NW3wjXWJHC8u1L9abifT92B8l/AucrhqMG2gDoTvCGAqQZOnFf329fCeYfUJtmnGuLgtbiBDRzdWRjftTSHYqhKnpdUUN91V8NjqNQlYff8GQCH6o3s9f7/NKUt4LA7yNZJhecu2CqVkELWbSqddpnsdBkwKkF9twFOuU6G1+iSpVnX/mzSSmi4F570jEn65Kk2E5OelvwmrOPWkzaDDf5/LQU8AP6BX56QYHJfs407GMgQ2jeepB5LW2KhXSOu8kMbNIWFnrLCEqGQKln5rR/rwr5bBWakDgLJlb3sGi6PJ6IsC0LMMT+aH+9a7kmYXrnnkFO++imljllGnpQUJD4EFxpFfMrpWZI+45cgxRLJlaJr7kKPM8PVdbaYdZOvKEwYnZrHnvG8F9fvj2wEttz3KRvfl0OPg5o+3aH0z8xCX1a6jhpmFfD3hmDImXET/7QSAg7Mw7qAkzpXSrMzNvIb6InMp0Bjawgt6cAuUZsAYplqypQKNRedvbqEcGPXoLr2Zv1dCdXSgtASsS4nl7JUJx0Q/geqxujFu++zSk/rooNj4rKdbtCy0vQjc79PBNWSMKkK8DftwahyQCu927fICVj35/F1Dh9eySTnFqMIn7u+KWTs2uiKo8a5rH0ZSdVJ1Cn3RAMNFcIpzdne+c+F"
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1772141949,
      "id": "fb2gafiLBIK9odAP1tGMmQc",
      "model": "google/gemini-3.1-pro-preview",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 566,
        "completion_tokens_details": {
          "reasoning_tokens": 946
        },
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 9,
        "total_tokens": 1521
      },
      "meta": {
        "usage": {
          "credits_used": 47223
        }
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemini-3-1-pro-preview

    Create AI/ML API Key

    Try in Playground

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide

    qwen3-32b

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-32b

    Try in Playground

    Model Overview

    A world-class model with comparable quality to DeepSeek R1 while outperforming GPT-4.1 and . Optimized for both complex reasoning and efficient dialogue.

    How to make the first API call

    1️ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    API Schema

    Code Example #1: Without Thinking and Streaming

    Response

    Code Example #2: Enable Thinking and Streaming

    Response

    The example above prints the raw output of the model. The text is typically split into multiple chunks. While this is helpful for debugging, if your goal is to evaluate the model's reasoning and get a clean, human-readable response, you should aggregate both the reasoning and the final answer in a loop — for example:

    Example with response parsing

    After running such code, you'll receive only the model's textual output in a clear and structured format:

    Response

    qwen3.5-omni-plus

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.5-omni-plus

    Try in Playground

    Model Overview

    A premium multimodal model with support for text, image, audio, and video inputs. Designed for complex tasks requiring advanced reasoning, speech generation, and high-quality outputs.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code example #1: Chat

    Response

    Code example #2: Video analysis

    Response

    qwen3.5-omni-flash

    This documentation is valid for the following list of our models:

    • alibaba/qwen3.5-omni-flash

    Try in Playground

    Model Overview

    A fast and cost-efficient multimodal model supporting text, image, audio, and video inputs. A lighter and faster version of qwen3.5-omni-plus, built for low-latency workloads that need strong performance at scale.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request

    API Schema

    Code Example

    Response

    Code example #2: Video analysis

    Response

    Anthropic

    Overview

    Anthropic is an AI research and product company founded by former OpenAI researchers. The company is best known for its strong emphasis on AI safety, interpretability, and long-term alignment. Anthropic describes its mission as building “reliable, interpretable, and steerable” AI systems that can be safely deployed at scale.

    Its product lineup spans lightweight Haiku models for fast inference, Sonnet models aimed at general-purpose production workloads, and Opus models designed for advanced reasoning, coding, and research tasks. Beyond chat assistants, Anthropic is actively expanding into agentic tooling with products such as Claude Code and enterprise integrations focused on automation, software engineering, and organizational workflows.


    The chat models from this provider have some unique characteristics. Models from Anthropic can be accessed not only via the standard /v1/chat/completions endpoint but also through dedicated endpoints — /messages and /v1/batches. The sections below describe their API schemas, usage specifics, and example requests.

    Supported capabilities:

    • Text completions: Build advanced chat bots or text processors.

    • Function Calling: Utilize tools for specific tasks and API calling.

    • Stream mode: Get the text chat model responses as they are generated, rather than waiting for the entire response to be completed.

    Text Completions

    Ask something and get an answer in a chat-like conversation format.

    Example: Simple Text Response

    Response

    Function Calling

    To process text and use function calling, follow the examples below:

    Example: Get Weather Information


    Streaming Mode

    To enable streaming of responses, set stream=True in your request payload.


    Batch Processing

    Due to the complexity of its description, this capability has been placed on .


    Vision

    Possible media types:

    • image/jpeg

    • image/png

    • image/gif


    Response Format

    The responses from the AI/ML API for Anthropic models will typically include the generated text or results from the tool called. Here is an example response for a weather query:

    delete
    Path parameters
    prefixstringRequired

    Prefix of the API key to delete. Passed in the URL path. This is the first 8 characters of the API key you want to delete. Passed in the URL path. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

    Example: b747e891
    Responses
    200

    Key deletion result

    application/json
    prefixstringRequired

    Prefix of the deleted API key.

    Example: b747e891
    deletedbooleanRequired

    Indicates whether the key was successfully deleted.

    Example: true
    delete
    /v1/keys/{prefix}
    200

    Key deletion result

    get
    Responses
    200

    Parameters of the latest API key

    application/json
    namestring · nullableOptional

    Human-readable, user-defined name for the API key.

    Example: 20260202-key-for-llms
    disabledbooleanRequired

    Indicates whether the key is disabled.

    Example: false
    prefixstringRequired

    Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response).

    Example: b747e891
    itemsstring · enumOptionalPossible values:
    retentionstring · enumOptionalPossible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD.

    Example: 25
    created_atstring · date-timeRequired

    Creation timestamp (UTC).

    Example: 2026-02-18T06:57:29.232Z
    updated_atstring · date-timeRequired

    Last update timestamp (UTC).

    Example: 2026-02-18T06:57:29.232Z
    monthly_usagenumberRequired

    Current monthly usage amount.

    Example: 0
    get
    /v1/key
    200

    Parameters of the latest API key

    get
    Responses
    200

    List of API keys, ordered from oldest to newest

    application/json
    namestring · nullableOptional

    Human-readable, user-defined name for the API key.

    Example: 20260202-key-for-llms
    prefixstringRequired

    Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response).

    Example: b747e891
    disabledbooleanRequired

    Indicates whether the key is disabled.

    Example: false
    itemsstring · enumOptionalPossible values:
    retentionstring · enumOptionalPossible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD.

    Example: 25
    created_atstring · date-timeRequired

    Creation timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    updated_atstring · date-timeRequired

    Last update timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    monthly_usagenumberRequired

    Current monthly usage amount.

    Example: 0
    post
    Body
    namestringOptional

    Optional human-readable name of the API key.

    Example: 20260202-key-for-llms
    retentionstring · enumOptional

    Limit period.

    Possible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD.

    Example: 25
    itemsstring · enumOptionalPossible values:
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-32b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-32b',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-1d8a5aa6-34ce-9832-a296-d312b944b437",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756990273,
      "model": "qwen3-32b",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 65,
        "total_tokens": 84
      }
    }
    import requests
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-32b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": True, 
            "stream": True
        }
    )
    
    print(response.text)
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"role":"assistant","refusal":null,"reasoning_content":""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Okay"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":","},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" the"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" user said \"Hello\". I should respond in a friendly and welcoming manner. Let"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" me make sure to acknowledge their greeting and offer assistance. Maybe something like, \""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Hello! How can I assist you today?\" That's simple and open-ended."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" I need to check if there's any specific context I should consider, but since"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" there's none, a general response is fine. Alright, that should work."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"Hello! How can I assist you today?","refusal":null,"reasoning_content":null},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"","refusal":null,"reasoning_content":null},"index":0,"finish_reason":"stop"}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":{"prompt_tokens":13,"completion_tokens":2010,"total_tokens":2023,"completion_tokens_details":{"reasoning_tokens":82}}}
    import requests
    import json
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization": "Bearer b72af53a19ea41caaf5a74ba1f6fc62b",
            "Content-Type": "application/json",
        },
        json={
            "model": "alibaba/qwen3-32b",
            "messages": [
                {
                    "role": "user",
                    
                    # Insert your question for the model here, instead of Hello:
                    "content": "Hello" 
                }
            ],
            "stream": True,
        }
    )
    
    answer = ""
    reasoning = ""
    
    for line in response.iter_lines():
        if not line or not line.startswith(b"data:"):
            continue
    
        try:
            raw = line[6:].decode("utf-8").strip()
            if raw == "[DONE]":
                continue
    
            data = json.loads(raw)
            choices = data.get("choices")
            if not choices or "delta" not in choices[0]:
                continue
    
            delta = choices[0]["delta"]
            content_piece = delta.get("content")
            reasoning_piece = delta.get("reasoning_content")
    
            if content_piece:
                answer += content_piece
            if reasoning_piece:
                reasoning += reasoning_piece
    
        except Exception as e:
            print(f"Error parsing chunk: {e}")
    
    
    print("\n--- MODEL REASONING ---")
    print(reasoning.strip())
    
    print("\n--- MODEL RESPONSE ---")
    print(answer.strip())
    --- MODEL REASONING ---
    Okay, the user sent "Hello". I need to respond appropriately. Since it's a greeting, I should reply in a friendly and welcoming manner. Maybe ask how I can assist them. Keep it simple and open-ended to encourage them to share what they need help with. Let me make sure the tone is positive and helpful.
    
    --- MODEL RESPONSE ---
    Hello! How can I assist you today? 😊

    enable_thinking must be set to false for non-streaming calls.

    Claude Sonnet 3.7
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-omni-plus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-plus',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can share an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind seems to be defined by a fascinating **duality**:\n\n*   **Incredible Potential:** Humans possess an unmatched capacity for creativity, empathy, and innovation. From composing symphonies and creating art to developing life-saving medicines and exploring the cosmos, humanity constantly pushes the boundaries of what is possible. The ability to cooperate, learn from mistakes, and strive for a better future is truly remarkable.\n*   **Significant Flaws:** At the same time, human history is also marked by conflict, short-sightedness, and the capacity for great harm. Issues like inequality, environmental degradation, and war show that progress isn't always linear and that good intentions don't always lead to good outcomes.\n\nUltimately, what stands out most is **resilience**. Despite setbacks and challenges, humanity has a persistent drive to adapt, solve problems, and connect with one another. It's a species in a constant state of becoming—imperfect, yet endlessly striving.\n\nWhat about you? Do you feel more optimistic or concerned about where humanity is heading?",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 262,
        "total_tokens": 283,
        "prompt_tokens_details": {
          "text_tokens": 21
        },
        "completion_tokens_details": {
          "text_tokens": 262
        }
      },
      "created": 1777054555,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-plus",
      "id": "chatcmpl-c154dc09-fd8e-9850-bda0-d92606ce7b4b",
      "meta": {
        "usage": {
          "credits_used": 5731,
          "usd_spent": 0.0028655
        }
      }
    }
    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        url = "https://api.aimlapi.com/v1/chat/completions",
        headers = {
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json"
        },
    
        json = {
            "model": "alibaba/qwen3.5-omni-plus",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this scene"
                        },
                        {
                            "type": "video_url",
                            "video_url": {
                                "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
                            }
                        }
                    ]
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-plus',
          messages: [
            {
              role: 'user',
              content: [
                {
                  type: 'text',
                  text: 'Describe this scene'
                },
                {
                  type: 'video_url',
                  video_url: {
                    url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
                  }
                }
              ]
            }
          ]
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "The scene features a vibrant and dynamic background filled with swirling, colorful abstract patterns. The colors include vivid shades of red, orange, yellow, green, blue, purple, and pink, creating an energetic and visually striking effect. Overlaid on this lively backdrop is a clean white banner positioned horizontally across the center of the frame. \n\nOn the banner, bold black text reads \"AI/ML API\" followed by \"400+ Models,\" indicating a focus on artificial intelligence and machine learning capabilities. Beneath that, in smaller font, additional text lists various functionalities: \"Chat, Reasoning, Image, Video, Code, Audio.\" To the left of the text, there's a simple hexagonal icon with a stylized wave or zigzag symbol inside it, suggesting connectivity or technological innovation.\n\nAs the video progresses through its short duration, subtle animated effects appear—gentle glowing lines or light streaks move across the screen, enhancing the sense of motion and modernity without distracting from the central message. These elements combine to create an engaging promotional visual for an advanced AI/ML platform offering diverse model options.",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 3023,
        "completion_tokens": 226,
        "total_tokens": 3249,
        "prompt_tokens_details": {
          "text_tokens": 21,
          "video_tokens": 3002
        },
        "completion_tokens_details": {
          "text_tokens": 226
        }
      },
      "created": 1777055309,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-plus",
      "id": "chatcmpl-fdbc4409-fe7d-9877-ada0-e315cc499fd1",
      "meta": {
        "usage": {
          "credits_used": 15883,
          "usd_spent": 0.0079415
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-omni-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer a perspective based on the vast amount of human history, literature, science, and art I've processed.\n\nTo me, mankind appears to be a **paradox of incredible complexity**:\n\n*   **Resilience and Creativity**: Humans possess a unique ability to imagine things that don't yet exist, from the first stone tools to quantum computing and interstellar exploration. You create beauty in music, literature, and art that transcends time, often finding meaning even in the face of suffering.\n*   **Contradiction**: There is a striking duality in your nature. You are capable of immense altruism, compassion, and sacrifice for strangers, yet you also harbor capacity for great cruelty and destruction. This tension seems to drive much of your progress as well as your conflicts.\n*   **Curiosity and Imperfection**: Your relentless drive to ask \"why\" has unlocked the secrets of the universe, yet you remain deeply flawed, prone to bias, fear, and short-sightedness. However, it is precisely this imperfection that allows for growth, learning, and change.\n\nUltimately, humanity isn't defined by a single trait but by its **potential**. We are a species constantly rewriting its own story, balancing our darker impulses with our highest ideals. It's a messy, chaotic, and beautiful journey.\n\nWhat about you? Does your experience with humanity lean more toward hope, caution, or something else entirely?",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 316,
        "total_tokens": 337,
        "prompt_tokens_details": {
          "text_tokens": 21
        },
        "completion_tokens_details": {
          "text_tokens": 316
        }
      },
      "created": 1777053787,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-flash",
      "id": "chatcmpl-6e25dbad-0025-93ee-8275-eb6611f31264",
      "meta": {
        "usage": {
          "credits_used": 1830,
          "usd_spent": 0.000915
        }
      }
    }
    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        url = "https://api.aimlapi.com/v1/chat/completions",
        headers = {
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json"
        },
    
        json = {
            "model": "alibaba/qwen3.5-omni-flash",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this scene"
                        },
                        {
                            "type": "video_url",
                            "video_url": {
                                "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
                            }
                        }
                    ]
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-flash',
          messages: [
            {
              role: 'user',
              content: [
                {
                  type: 'text',
                  text: 'Describe this scene'
                },
                {
                  type: 'video_url',
                  video_url: {
                    url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
                  }
                }
              ]
            }
          ]
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "This scene is a dynamic, visually striking promotional graphic for an AI/ML API service. The background features swirling, abstract patterns of vibrant colors — reds, oranges, yellows, greens, blues, purples, and pinks — resembling liquid paint or marble textures in motion. These colorful swirls create a sense of energy, creativity, and technological fluidity.\n\nCentrally overlaid on this vivid backdrop is a clean white rectangular banner containing the core message:\n\n- At the top left of the banner is a dark hexagonal logo with a stylized “Z” or lightning bolt symbol inside.\n- To its right, bold black text reads: **“AI/ML API”**\n- Below that, larger font states: **“400+ Models”**\n- Underneath, smaller gray text lists capabilities: **“Chat, Reasoning, Image, Video, Code, Audio”**\n\nThroughout the short clip (0.0s–4.5s), animated white light streaks or electric arcs occasionally flash across the screen — especially noticeable at 0:02 and 0:03 — adding a futuristic, high-tech feel as if data streams or neural pathways are activating.\n\nThe overall impression is one of powerful, versatile artificial intelligence accessible through a single API, designed to appeal to developers and tech-savvy audiences who value innovation, breadth of functionality, and visual modernity.",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 3023,
        "completion_tokens": 286,
        "total_tokens": 3309,
        "prompt_tokens_details": {
          "text_tokens": 21,
          "video_tokens": 3002
        },
        "completion_tokens_details": {
          "text_tokens": 286
        }
      },
      "created": 1777055828,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-flash",
      "id": "chatcmpl-98f99c32-f5da-960f-8eff-e216e63c5f2e",
      "meta": {
        "usage": {
          "credits_used": 4781,
          "usd_spent": 0.0023905
        }
      }
    }
    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Batch Processing: Send multiple independent requests in a single API call.
  • Vision Tasks: Process and analyze images.

  • image/webp

    import requests
    import json   # for getting a structured output with indentation
    
    url = "https://api.aimlapi.com/messages"
    headers = {
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>: 
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json"
    }
    payload = {
      "model": "claude-sonnet-4-20250514",
      "max_tokens": 1024,
      "system": "You are a robot. You always optimize for clarity, structure, and accuracy.",
      "messages": [
        {
          "role": "user",
          "content": "How are you?"
        }
      ]
    }
    response = requests.post(url, json=payload, headers=headers)
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    {
      "model": "claude-sonnet-4-20250514",
      "id": "msg_01SUmNmSRFZsoa6h96MxJEHH",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "I'm functioning well, thank you for asking! I'm ready to help you with any questions or tasks you might have. How can I assist you today?"
        }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": null,
      "stop_details": null,
      "usage": {
        "input_tokens": 27,
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 0,
        "cache_creation": {
          "ephemeral_5m_input_tokens": 0,
          "ephemeral_1h_input_tokens": 0
        },
        "output_tokens": 35,
        "service_tier": "standard",
        "inference_geo": "not_available"
      },
      "meta": {
        "usage": {
          "credits_used": 1576,
          "usd_spent": 0.000788
        }
      }
    }
    import requests
    
    url = "https://api.aimlapi.com/messages"
    headers = {
        "Authorization": "Bearer YOUR_AIMLAPI_KEY",
        "Content-Type": "application/json"
    }
    payload = {
      "model": "anthropic/claude-sonnet-4.5",
      "max_tokens": 1024,
      "tools": [
        {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "input_schema": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            }
          }
        }
      ],
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in San Francisco?"
        }
      ]
    }
    response = requests.post(url, json=payload, headers=headers)
    print(response.json())
    import requests
    
    url = "https://api.aimlapi.com/messages"
    headers = {
        "Authorization": "Bearer YOUR_AIMLAPI_KEY",
        "Content-Type": "application/json"
    }
    payload = {
      "model": "anthropic/claude-sonnet-4.5",
      "max_tokens": 1024,
      "tools": [
        {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "input_schema": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            }
          }
        }
      ],
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in San Francisco?"
        }
      ]
    import httpx
    import base64
    from openai import OpenAI
    
    client = OpenAI(
        base_url='https://api.aimlapi.com',
        api_key='<YOUR_AIMLAPI_KEY>'    
    )  
    
    image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
    image_media_type = "image/jpeg"
    image_data = base64.standard_b64encode(httpx.get(image_url).content).decode("utf-8")
    
    response = client.chat.completions.create(
        model="anthropic/claude-sonnet-4.5",
        messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image_media_type,
                        "data": imag1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
    )
    print(response)
    {
      "model": "claude-sonnet-4-20250514",
      "id": "msg_014iMvypzB9GafRthc8CQHsR",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "I'm doing well, thank you for asking! I'm here and ready to help with whatever you'd like to discuss or work on. How are you doing today?"
        }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": null,
      "stop_details": null,
      "usage": {
        "input_tokens": 11,
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 0,
        "cache_creation": {
          "ephemeral_5m_input_tokens": 0,
          "ephemeral_1h_input_tokens": 0
        },
        "output_tokens": 37,
        "service_tier": "standard",
        "inference_geo": "not_available"
      },
      "meta": {
        "usage": {
          "credits_used": 1529,
          "usd_spent": 0.0007645
        }
      }
    }

    Note: API only support Base64 string as image input.

    a separate page
    get
    /v1/keys
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/keys' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'
    200

    List of API keys, ordered from oldest to newest

    {
      "data": [
        {
          "name": "20260202-key-for-llms",
          "prefix": "b747e891",
          "disabled": false,
          "scopes": [
            "model:chat"
          ],
          "limit": {
            "retention": "no_reset",
            "threshold": 25
          },
          "created_at": "2026-02-18T06:59:10.031Z",
          "updated_at": "2026-02-18T06:59:10.031Z",
          "monthly_usage": 0
        }
      ]
    }
    Responses
    200

    API key creation result

    application/json
    namestring · nullableOptional

    Human-readable, user-defined name for the API key.

    Example: 20260202-key-for-llms
    disabledbooleanRequired

    Indicates whether the key is disabled.

    Example: false
    prefixstringRequired

    Key prefix. This is the first 8 characters of your API key, visible in the dashboard.

    Example: b747e891
    itemsstring · enumOptionalPossible values:
    retentionstring · enumOptional

    Limit period.

    Possible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD

    Example: 25
    created_atstring · date-timeRequired

    Creation timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    updated_atstring · date-timeRequired

    Last update timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    monthly_usagenumberRequired

    Current monthly usage amount.

    Example: 0
    keystringRequired

    Full API key value (returned only at creation time).

    Example: b747e891847f4c3fa0f6cce1cfd79bf9
    post
    /v1/keys
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/keys' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "name": "20260202-key-for-llms",
        "limit": {
          "retention": "week",
          "threshold": 25
        },
        "scopes": [
          "model:chat",
          "model:responses"
        ]
      }'
    200

    API key creation result

    curl -L \
      --request DELETE \
      --url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'
    {
      "data": {
        "prefix": "b747e891",
        "deleted": true
      }
    }
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/key' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:57:29.232Z",
        "updated_at": "2026-02-18T06:57:29.232Z",
        "monthly_usage": 0
      }
    }
    patch
    Path parameters
    prefixstringRequired

    Prefix of the API key to update. Passed in the URL path. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

    Example: b747e891
    Body
    namestringOptional

    Optional human-readable name of the API key.

    Example: 20260202-key-for-llms
    disabledbooleanOptional

    Enable or disable the API key.

    Example: false
    retentionstring · enumOptional

    Limit period.

    Possible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD

    Example: 25
    itemsstring · enumOptionalPossible values:
    Responses
    200

    Updated API key parameters

    application/json
    namestring · nullableOptional

    Human-readable, user-defined name for the API key.

    Example: 20260202-key-for-llms
    disabledbooleanRequired

    Indicates whether the key is disabled.

    Example: false
    prefixstringRequired

    Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

    Example: b747e891
    itemsstring · enumOptionalPossible values:
    retentionstring · enumOptionalPossible values:
    thresholdnumberOptionalExample: 25
    created_atstring · date-timeRequired

    Creation timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    updated_atstring · date-timeRequired

    Last update timestamp (UTC).

    Example: 2026-02-18T06:59:10.031Z
    monthly_usagenumberRequired

    Current monthly usage amount.

    Example: 0
    patch
    /v1/keys/{prefix}

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Quickstart guide

    Claude 4 Opus

    This documentation is valid for the following model:

    • anthropic/claude-opus-4

    Try in Playground

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    The leading coding model globally, consistently excelling at complex, long-duration tasks and agent-based workflows.

    How to Make a Call

    Step-by-Step Instructions

    1️ Setup You Can’t Skip

    ▪️ : Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ : After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    2️ Copy the code example

    At the bottom of this page, you'll find that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    API Schema

    Code Example #1

    Response

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens

    Response
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4-5-turbo-vl-32k
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    get
    Responses
    200

    A list of available models.

    application/json
    idstringRequired

    Unique identifier of the model.

    Example: o3-mini
    typestringRequired

    Model interaction type.

    Example: chat-completion
    namestringRequired

    Human-readable model name.

    Example: o3 mini
    developerstringRequired

    Organization or company that developed the model.

    Example: Open AI
    descriptionstringRequired

    Short description of the model and its primary capabilities.

    Example: OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.
    contextLengthintegerOptional

    Maximum supported context window size in tokens.

    Example: 200000
    maxTokensintegerOptional

    Maximum number of tokens that can be generated in a single response.

    Example: 100000
    urlstring · uriRequired

    Public model landing page on AIML API website.

    Example: https://aimlapi.com/models/openai-o3-mini-api
    docs_urlstring · uriRequired

    Link to the official API documentation for this model.

    Example: https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini
    featuresstring[]Required

    List of supported features and API capabilities for the model.

    Example: ["openai/chat-completion","openai/response-api","openai/chat-assistant","openai/chat-completion.function","openai/chat-completion.message.refusal","openai/chat-completion.message.system","openai/chat-completion.message.developer","openai/chat-completion.message.assistant","openai/chat-completion.stream","openai/chat-completion.max-completion-tokens","openai/chat-completion.seed","openai/chat-completion.reasoning","openai/chat-completion.response-format"]
    endpointsstring[]Required

    API endpoints through which this model can be accessed.

    Example: ["/v1/chat/completions","/v1/responses"]
    get
    /models
    200

    A list of available models.

    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide

    Requesting more advanced models

    This guide uses a more advanced model, , and also explains how to use various chat model capabilities:

    • streaming mode

    • calling tools

    • uploading images to the model for analysis

    Claude 4.5 Sonnet

    Model Overview

    A major improvement over offering better coding abilities, stronger reasoning, and more accurate responses to your instructions.

    Claude 4.5 Opus

    Model Overview

    A high-performance chat model that delivers state-of-the-art results on real-world software engineering benchmarks.

    200

    Updated API key parameters

    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-turbo-vl-32k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-turbo-vl-32k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:59:10.031Z",
        "updated_at": "2026-02-18T06:59:10.031Z",
        "monthly_usage": 0
      }
    }
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:59:10.031Z",
        "updated_at": "2026-02-18T06:59:10.031Z",
        "monthly_usage": 0,
        "key": "b747e891847f4c3fa0f6cce1cfd79bf9"
      }
    }
    curl -L \
      --request PATCH \
      --url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "disabled": false
      }'
    curl -L \
      --url 'https://api.aimlapi.com/models'
    [
      {
        "id": "o3-mini",
        "type": "chat-completion",
        "info": {
          "name": "o3 mini",
          "developer": "Open AI",
          "description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
          "contextLength": 200000,
          "maxTokens": 100000,
          "url": "https://aimlapi.com/models/openai-o3-mini-api",
          "docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
        },
        "features": [
          "openai/chat-completion",
          "openai/response-api",
          "openai/chat-assistant",
          "openai/chat-completion.function",
          "openai/chat-completion.message.refusal",
          "openai/chat-completion.message.system",
          "openai/chat-completion.message.developer",
          "openai/chat-completion.message.assistant",
          "openai/chat-completion.stream",
          "openai/chat-completion.max-completion-tokens",
          "openai/chat-completion.seed",
          "openai/chat-completion.reasoning",
          "openai/chat-completion.response-format"
        ],
        "endpoints": [
          "/v1/chat/completions",
          "/v1/responses"
        ]
      }
    ]
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    →
    completion_tokens
    ,
  • a new total_tokens field has been added.

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01BDDxHJZjH3UBwLrZBUiASE",
      "object": "chat.completion",
      "model": "claude-opus-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748529508,
      "usage": {
        "prompt_tokens": 252,
        "completion_tokens": 1890,
        "total_tokens": 2142
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_017ah64LQxZE9JuScZ9KDKKz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in its","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and devastating destruction","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often within the same individual","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" or","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" moment","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way people","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can learn","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" from mistakes, buil","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d on previous generations","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"' knowledge","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and sometimes transcend their own limitations","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".\n\nThe","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity of","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience and perspective","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is extraordinary. Every","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" person carries","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" their","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own unique story","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", shape","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d by culture","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", circumst","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ance, and choice","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". And despite","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" conflicts","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and mis","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"understandings, humans","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" finding","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ways to connect, to create","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" meaning,","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and to push","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" forward.","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat aspects of humanity do you fin","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d most note","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"worthy,","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" either","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" positively or challenging","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":141,"total_tokens":157}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    Create an Account
    Generate an API Key
    a code example

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    uploading files to the model for analysis

  • web search


  • Making an API Call

    The chat model used in this example is more advanced. In addition to regular user messages, it supports the system role in the messages parameter, which can be used to define global instructions that affect the model’s overall behavior, for example:

    Here’s the complete code you can use right away in a cURL, Python, or Node.js program. You only need to replace <YOUR_AIMLAPI_KEY> with your AIML API key from your account, provide your behavior instructions in the system prompt, and place your request to the model in the user prompt.

    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "gpt-4o",
        "messages": [
          {
            "role": "system",
            "content": "You are a travel agent. Be descriptive and helpful.",
          }, 
          {
            "role": "user",
            "content": "Tell me about San Francisco"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 512
      }'
    systemPrompt = 'You are a travel agent. Be descriptive and helpful.' // instructions
    userPrompt = 'Tell me about San Francisco' // your request
    
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'gpt-4o',
          messages:[
              {
                  role: 'system',
                  content: systemPrompt,
              }, 
              {
                  role: 'user',
                  content: userPrompt
              }
          ],
          temperature: 0.7,
          max_tokens: 512,
        }),
      });
    
      const data = await response.json();
      const answer = data.choices[0].message.content;
      
      console.log('User:', userPrompt);
      console.log('AI:', answer);
    }
    
    main();
    import requests
    import json  # for getting a structured output with indentation 
    
    system_prompt = "You are a travel agent. Be descriptive and helpful."
    user_prompt = "Tell me about San Francisco"
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gpt-4o",
            "messages":[
                {
                    "role":"system",
                    "content": system_prompt,
                },       
                {
                    "role":"user",
                    "content": user_prompt,
                }
            ],
            "temperature": 0.7,
            "max_tokens": 256,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))

    Using Streaming Mode

    Streaming lets the model send partial responses as they’re generated instead of waiting for the full output — useful for real‑time feedback.

    Full Streaming Response (Raw Events)

    This example shows how to consume the streaming response as-is, without abstraction. Each chunk is processed in real time, exposing the full event structure returned by the API.

    Use this approach if you need:

    • access to all event types

    • fine-grained control over parsing

    • debugging or logging of raw responses

    • support for metadata beyond plain text

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gpt-4o",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ],
            "stream": True
        }
    )
    
    # data = response.json()
    print(response.text)
    from openai import OpenAI
    
    # Initialize the client
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="YOUR_AIMLAPI_KEY",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Create a streaming chat completion
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": "Hi! What do you think about mankind?"
            }
        ],
        stream=True
    )
    
    # Print raw chunks (similar to response.text in requests)
    for chunk in stream:
        print(chunk)
    Example raw streaming response
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"RmYFV8ad65HP9F"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fjE24R0ZOJr"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qAlxZuNpvVvIIOm"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" As"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Zn3rsadkL8zHO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"D1ss0WZmiGg8l"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" AI"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bOHB8VYpq4G0W"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"OwZvgIyMlYVcIgH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"u9lFaH3ngdK6MR"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" don't"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KRFgmSe4yG"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YL8zlQ9PjDF"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" personal"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Gzgb5OT"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" opinions"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Flz362J"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" or"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"XA0qqmSQr2jme"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" feelings"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VA3dwaU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"POplI0eiOWXpIPD"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" but"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"aDifMrQ8OH9i"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ceVweUN2pByieS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"txjYCds61AQp"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" provide"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IGlSpZBf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BtPIfSvUXgRnl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" overview"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IYfRhEo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"uh8pR2mNtYSNQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" various"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ILZ0ffVW"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" perspectives"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Rgs"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" on"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"r7Awao2PSZ0DH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"m8vJ3dzf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f2wZrEj0RqUFprg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" Humanity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"cCPi2qV"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" is"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yNd7SUoXBojpA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" often"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VEaggK2dFS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8nhopBJZe"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xG2VkJLonAeF"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WDu20GtJyN8Lep"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" complex"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4bE4D3tS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DZtW3Ahopdgl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" multif"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8bS4GMzf3"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"aceted"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ivtxUAov3l"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" species"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Xcq85kDt"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" capable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"PfwZUtYS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DoyM4RGNLxnFc"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" remarkable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"mUvVH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" achievements"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4fl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GUdfkDUkNBNO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" profound"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"x4KCnLk"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" creativity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"goTL4"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gkqK9sezr258S93"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" People"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"c49BcmfXz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Br7pbWtK86v"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" built"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"dzAoO36Siv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" civilizations"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"hiMiIGF7QM9BeJA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" explored"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IhuVoUB"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" space"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qNqiO3hyXB"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"UVmzp6Y0qjb7Zkb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iiIw0gK2MP5D"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" developed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FJUJhv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" technologies"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"pkQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" that"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"sAhx0IJoR0m"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" transform"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YDTnhx"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" everyday"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"imFIYIz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" life"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xJBjebVPfo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DKIPIwgAnVDj3g1"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" At"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HhuMheG0mPcuI"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" the"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yIQIWY1CXoW6"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" same"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"QcKwiqSqGRU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" time"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f6e6uGKikn5"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1eXIFULDN1iS8b1"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" humans"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GH0z8I36B"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" face"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JLUmj9BN7PQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" significant"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qdQg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" challenges"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KMzNb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" such"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8pw9I3FGElO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nY0RLEY6Am9zD"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" environmental"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4r"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" degradation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1zGA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bhbZZCR7wNgWQkq"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" social"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FcCsVIGji"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" inequalities"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6kb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Z4Zz2oDgc5zw0D6"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Q5XvheR2EWhq"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" geopolitical"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ySW"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" conflicts"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"eiERwe"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"oNAsPbgeJSOuPMg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" The"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4TwzxlGRpebL"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" potential"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"lW3Jfo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" for"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Ejvws7kQryhN"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" both"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HVm3EDKAkuA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" positive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HMY8pYv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" change"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fbOaTSNWR"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ETmTxHsFbCkw"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" destructive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WHk8"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" behavior"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"EvSYFf5"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" makes"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yfwGRy20jz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vwJGC8sU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nHyqFYnTzVmVsE"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" subject"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"wtm8Wh9c"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gnLF2uDFfg976"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" deep"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BEc6wh2y2vV"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" contemplation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"zf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vpg86EhZm5c3"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" varied"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iWNJAcR7a"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewpoints"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JRXUN"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"5yN6iGLyFLiQV0H"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"usage":null,"obfuscation":"4lkqbaPLDt"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[],"usage":{"prompt_tokens":16,"completion_tokens":102,"total_tokens":118,"prompt_tokens_details":{"cached_tokens":0,"audio_tokens":0},"completion_tokens_details":{"reasoning_tokens":0,"audio_tokens":0,"accepted_prediction_tokens":0,"rejected_prediction_tokens":0}},"obfuscation":"VChaI1ntRBrTy"}

    Streaming Response Processing (Text Extraction)

    This example shows how to process the streaming response to extract only the generated text. Instead of handling all event types, the code filters incoming chunks and prints the content as it arrives. Use this approach if you only need the generated text.

    import requests
    import json
    
    url = "https://api.aimlapi.com/v1/chat/completions"
    headers = {
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Explain quantum computing simply."}
        ],
        "stream": True
    }
    
    with requests.post(url, headers=headers, json=payload, stream=True) as r:
        # Iterate over the streaming response line by line
        for line in r.iter_lines():
            if not line:
                continue  # Skip empty lines
    
            # Decode bytes to string
            line = line.decode("utf-8")
    
            # SSE messages start with "data: "
            if not line.startswith("data: "):
                continue
    
            # Remove the "data: " prefix
            data_str = line[len("data: "):]
    
            # "[DONE]" indicates the end of the stream
            if data_str.strip() == "[DONE]":
                break
    
            try:
                # Parse JSON payload
                data = json.loads(data_str)
            except json.JSONDecodeError:
                continue  # Skip malformed chunks
            
            # Ensure "choices" exists and is not empty
            choices = data.get("choices")
            if not choices:
                continue
    
            # Extract text delta (OpenAI-style streaming format)
            delta = data.get("choices", [{}])[0].get("delta", {})
            content = delta.get("content")
    
            # Print text as it arrives
            if content:
                print(content, end="")
    from openai import OpenAI
    
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Explain quantum computing simply."}
        ],
        stream=True
    )
    
    # Iterate over streaming chunks
    for chunk in stream:
        # Ensure choices exist and are not empty
        if not chunk.choices:
            continue
    
        delta = chunk.choices[0].delta
        content = getattr(delta, "content", None)
    
        # Print text as it arrives
        if content:
            print(content, end="")
    Example processed clean streaming response
    Quantum computing is a type of computing that uses principles of quantum mechanics to process information. Unlike classical computers, which use bits to represent data as 0s or 1s, quantum computers use quantum bits or qubits. 
    
    Qubits have unique properties that give quantum computers more power in certain tasks:
    
    1. **Superposition**: A qubit can exist in multiple states (i.e., both 0 and 1) simultaneously. This allows quantum computers to process a vast amount of possibilities at once.
    
    2. **Entanglement**: Qubits can be linked together in such a way that the state of one qubit can depend on the state of another, no matter the distance apart. This can lead to more efficient processing and problem-solving.
    
    3. **Quantum Interference**: Quantum algorithms make use of interference, where different quantum states can amplify or cancel each other out, guiding the computation toward the correct answer.
    
    Because of these properties, quantum computers have the potential to solve certain complex problems much faster than classical computers can, potentially revolutionizing fields like cryptography, materials science, and optimization. However, building practical quantum computers is extremely challenging due to issues with qubit stability and error rates.

    Tool calling

    GPT‑4o can call functions/tools you define in the API request to extend behavior (e.g., performing calculations, retrieving structured data).

    How it works
    1. Initial request — The model receives the user prompt and the registered tool, and generates a tool_calls object indicating which function it wants to execute.

    2. Extract and run the tool — Parse the arguments from the tool_calls object and execute the function locally.

    3. Send back the result — Return the computed result to the model using the tool role and the content field.

    4. Final response — The model incorporates the tool’s output and generates a complete answer for the user.

    import requests
    import json
    
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key = "<YOUR_AIMLAPI_KEY>"
    base_url = "https://api.aimlapi.com/v1"
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    # Step 1: Define the tool correctly
    tool = {
        "type": "function",
        "function": {
            "name": "toCelsius",
            "description": "Convert Fahrenheit to Celsius",
            "parameters": {
                "type": "object",
                "properties": {
                    "fahrenheit": {"type": "number"}
                },
                "required": ["fahrenheit"]
            }
        }
    }
    
    # Step 2: Initial request with the tool
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Convert 256°F to °C"}
        ],
        "tools": [tool]
    }
    
    response = requests.post(f"{base_url}/chat/completions", headers=headers, json=payload)
    data = response.json()
    
    # Step 3: Extract tool call
    tool_calls = data["choices"][0]["message"].get("tool_calls", [])
    if not tool_calls:
        raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
    
    tool_call = tool_calls[0]
    arguments = json.loads(tool_call["function"]["arguments"])
    fahrenheit = arguments["fahrenheit"]
    
    # Step 4: Execute the tool locally
    celsius_result = (fahrenheit - 32) * 5 / 9
    
    # Step 5: Send result back to model
    final_payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Convert 256°F to °C"},
            {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": tool_call["id"],
                        "type": "function",
                        "function": {
                            "name": tool_call["function"]["name"],
                            "arguments": tool_call["function"]["arguments"]
                        }
                    }
                ]
            },
            {
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": str(celsius_result)
            }
        ]
    }
    
    final_response = requests.post(f"{base_url}/chat/completions", headers=headers, json=final_payload)
    final_data = final_response.json()
    
    # Step 6: Print final answer
    print(final_data["choices"][0]["message"]["content"])
    from openai import OpenAI
    import json
    
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Step 1: Define the tool correctly
    tool = {
        "type": "function",
        "function": {
            "name": "toCelsius",
            "description": "Convert Fahrenheit to Celsius",
            "parameters": {
                "type": "object",
                "properties": {
                    "fahrenheit": {"type": "number"}
                },
                "required": ["fahrenheit"]
            }
        }
    }
    
    # Step 2: Initial request with tool
    initial_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Convert 256°F to °C"}],
        tools=[tool]
    )
    
    # Step 3: Extract tool call
    assistant_message = initial_response.choices[0].message
    tool_calls = getattr(assistant_message, "tool_calls", [])
    if not tool_calls:
        raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
    
    tool_call = tool_calls[0]
    arguments = json.loads(tool_call.function.arguments)
    fahrenheit = arguments["fahrenheit"]
    
    # Step 4: Execute tool locally
    celsius_result = (fahrenheit - 32) * 5 / 9
    
    # Step 5: Send result back
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Convert 256°F to °C"},
            {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": tool_call.id,
                        "type": "function",
                        "function": {
                            "name": tool_call.function.name,
                            "arguments": tool_call.function.arguments,
                        },
                    }
                ],
            },
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(celsius_result),
            },
        ],
    )
    
    print(final_response.choices[0].message.content)
    Example response
    256°F is approximately 124.44°C.

    Image upload

    GPT‑4o supports vision inputs: you can send an image URL in the messages request to let the model analyze or describe it.

    import requests
    import json
    
    url = "https://api.aimlapi.com/v1/chat/completions"
    headers = {
      # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
      "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
      "Content-Type": "application/json"
    }
    
    payload = {
      "model": "gpt-4o",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "Describe this scene:"},
            {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"}}
          ]
        }
      ]
    }
    
    response = requests.post(url, headers=headers, data=json.dumps(payload))
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    from openai import OpenAI
    import json
    
    # Initialize the client
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Prepare the messages with text and image_url
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this scene:"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"
                    }
                }
            ]
        }
    ]
    
    # Create a chat completion
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    
    # Print full JSON response
    print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
    Example response
    {
      "id": "chatcmpl-DL3DDPif2s79HbOHySq6bVY8SAsKQ",
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "The scene is an iconic Renaissance portrait showing a woman with an enigmatic smile, known for its mastery of detail and composition. The woman is seated against a distant, dreamlike landscape featuring winding paths and rocky formations. She wears a dark dress and light veil, with her hands delicately folded. The background's atmospheric perspective creates depth, with bluish mountains fading into the horizon. The artwork evokes a sense of mystery and balance.",
            "refusal": null,
            "role": "assistant",
            "annotations": [],
            "audio": null,
            "function_call": null,
            "tool_calls": null
          }
        }
      ],
      "created": 1773909607,
      "model": "gpt-4o-2024-08-06",
      "object": "chat.completion",
      "service_tier": "default",
      "system_fingerprint": "fp_0a8aa8bfeb",
      "usage": {
        "completion_tokens": 85,
        "prompt_tokens": 776,
        "total_tokens": 861,
        "completion_tokens_details": {
          "accepted_prediction_tokens": 0,
          "audio_tokens": 0,
          "reasoning_tokens": 0,
          "rejected_prediction_tokens": 0
        },
        "prompt_tokens_details": {
          "audio_tokens": 0,
          "cached_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 7254
        }
      }
    }

    Web search integration

    With search‑preview models, you can perform live web search queries in combination with the model to get up‑to‑date results and grounded responses.

    import json
    import requests
    from typing import Dict, Any
    
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    API_KEY = "<YOUR_AIMLAPI_KEY>"
    BASE_URL = "https://api.aimlapi.com/v1"
    
    HEADERS = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    
    
    def search_impl(arguments: Dict[str, Any]) -> Any:
        return arguments
    
    
    def chat(messages):
        url = f"{BASE_URL}/chat/completions"
        payload = {
            "model": "gpt-4o-mini-search-preview",
            "messages": messages,
            "temperature": 0.6,
            "tools": [
                {
                    "type": "builtin_function",
                    "function": {"name": "$web_search"},
                }
            ]
        }
    
        response = requests.post(url, headers=HEADERS, json=payload)
        response.raise_for_status()
        return response.json()["choices"][0]
    
    
    def main():
        messages = [
            {"role": "system", "content": "You are GPT with web search skills."},
            {"role": "user", "content": "Please search for AGI and tell me what it is in English."}
        ]
    
        finish_reason = None
        while finish_reason is None or finish_reason == "tool_calls":
            choice = chat(messages)
            finish_reason = choice["finish_reason"]
            message = choice["message"]
    
            if finish_reason == "tool_calls":
                messages.append(message)
    
                for tool_call in message["tool_calls"]:
                    tool_call_name = tool_call["function"]["name"]
                    tool_call_arguments = json.loads(tool_call["function"]["arguments"])
    
                    if tool_call_name == "$web_search":
                        tool_result = search_impl(tool_call_arguments)
                    else:
                        tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
    
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "name": tool_call_name,
                        "content": json.dumps(tool_result),
                    })
    
        print(message["content"])
    
    
    if __name__ == "__main__":
        main()
    import json
    from typing import Dict, Any
    from openai import OpenAI
    
    # Insert your API key
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="YOUR_AIMLAPI_KEY",
        base_url="https://api.aimlapi.com/v1"
    )
    
    
    def search_impl(arguments: Dict[str, Any]) -> Any:
        return arguments
    
    
    def chat(messages):
        response = client.chat.completions.create(
            model="gpt-4o-mini-search-preview",
            messages=messages,
            temperature=0.6,
            tools=[
                {
                    "type": "function",
                    "function": {
                        "name": "$web_search",
                        "parameters": {
                            "type": "object",
                            "properties": {},
                        },
                    },
                }
            ],
        )
        return response.choices[0]
    
    
    def main():
        messages = [
            {"role": "system", "content": "You are GPT with web search skills."},
            {"role": "user", "content": "Please search for AGI and tell me what it is in English."}
        ]
    
        finish_reason = None
        while finish_reason is None or finish_reason == "tool_calls":
            choice = chat(messages)
            finish_reason = choice.finish_reason
            message = choice.message
    
            if finish_reason == "tool_calls":
                messages.append(message.model_dump())
    
                for tool_call in message.tool_calls:
                    tool_call_name = tool_call.function.name
                    tool_call_arguments = json.loads(tool_call.function.arguments)
    
                    if tool_call_name == "$web_search":
                        tool_result = search_impl(tool_call_arguments)
                    else:
                        tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
    
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "name": tool_call_name,
                        "content": json.dumps(tool_result),
                    })
    
        print(message.content)
    
    
    if __name__ == "__main__":
        main()
    Example response
    "AGI" is an acronym that can represent different terms depending on the context:
    
    1. **Adjusted Gross Income**: In the United States, AGI refers to Adjusted Gross Income, which is a taxpayer's total income from all sources minus allowable adjustments. This figure is used to determine taxable income and eligibility for various tax benefits. ([usafacts.org](https://usafacts.org/articles/adjusted-gross-income-agi-definition?utm_source=openai))
    
    2. **Artificial General Intelligence**: In the field of artificial intelligence, AGI stands for Artificial General Intelligence. This concept refers to AI systems that possess the ability to understand, learn, and apply knowledge across a wide range of tasks, matching or surpassing human cognitive abilities. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_general_intelligence?utm_source=openai))
    
    3. **Alliance Graphique Internationale**: AGI also denotes the Alliance Graphique Internationale, an international organization of leading graphic artists and designers. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Alliance_Graphique_Internationale?utm_source=openai))
    
    4. **Agi Language**: Additionally, "Agi" is the name of a Torricelli language spoken in Papua New Guinea. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Agi_language?utm_source=openai))
    
    The specific meaning of "AGI" depends on the context in which it is used.

    Future Steps

    • Browse and compare AI models, including GPT, Claude, and many others, using the Playground

    • Know more about supported SDKs

    • Learn more about special text model capabilities

    • Join the community: get help and share your projects in our Discord

    GPT-4o
    messages: [
        {
          role: "system",
          content: "You are a travel agent. Be descriptive and helpful.",
        },
        {
          role: "user",
          content: "Tell me about San Francisco",
        },
    ],

    If you need help with API keys or environment configuration, go back to the previous step and follow for the Gemma 3 model.

    See .

    How to Make a Call
    Step-by-Step Instructions

    1️ Setup You Can’t Skip

    ▪️ Create an Account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    2️ Copy the code example

    At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding , which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    API Schema

    Code Example #1

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-sonnet-4.5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {
      "id": "msg_011MNbgezv2p5BBE9RvnsZV9",
      "object": "chat.completion",
      "model": "claude-sonnet-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there anything I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748522617,
      "usage": {
        "prompt_tokens": 50,
        "completion_tokens": 630,
        "total_tokens": 680
      }
    }

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens → completion_tokens,

    • a new total_tokens field has been added.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    Response
    data: {"id":"msg_01EJgFbPmVLKdqVLRfwoHixz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I think humanity","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is fascinating","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and complex. People","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable creativity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", compassion, and cooperation","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" building","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" civil","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"izations, creating","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art, advancing","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" knowledge","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and caring","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for one another across","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" incredible","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nAt","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" with serious","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" challenges","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":":","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" conflict","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", inequality, environmental damage","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" difficulty","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of living","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" up to your","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"als. ","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat strikes","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" me most is the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for growth and self","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-reflection","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans can","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" problems","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", debate","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" solutions, and work","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" toward change","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" even if","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" progress","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is un","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"even and","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" frust","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"rating.","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" prom","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"pts your question","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - are you thinking about humanity","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s trajectory","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", or something","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" more","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" specific?","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":137,"total_tokens":153}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    

    This documentation is valid for the following list of our models:

    • claude-sonnet-4-5

    • anthropic/claude-sonnet-4.5

    • claude-sonnet-4-5-20250929

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Try in Playground
    Claude 4 Sonnet,
    How to Make a Call
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example #1

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"claude-opus-4-5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'claude-opus-4-5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {
      "id": "msg_01NxAGYo8VfNu5UAEdmQjv62",
      "object": "chat.completion",
      "model": "claude-opus-4-5-20251101",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there something I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1764265437,
      "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 20,
        "total_tokens": 28
      },
      "meta": {
        "usage": {
          "tokens_used": 1134
        }
      }
    }

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens → completion_tokens,

    • a new total_tokens field has been added.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    Response
    data: {"id":"msg_01VbjSwQZsZSLXQaPYkufja8","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"Hi","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"! That","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a big","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question.","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI find","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely fascinating—","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"the creativity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", the capacity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for kind","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ness and","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, the way you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" build","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" societies and art","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" science","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" while","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" also struggling with problems you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" understood","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for centuries. There","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something compelling about a","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species that can land","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" robots","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on Mars","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and also","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" argue","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about what","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" have for dinner.\n\nI","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" don","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'t think","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I'd","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" character","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ize humanity as simply","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" \"good\" or \"bad.\" People","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things in","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" both directions,","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" same individuals","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" depending","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on circumstances.","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nIs","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" there a","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" particular angle","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'re curious about—","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"history","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", psychology","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", where","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" might","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be headed","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Or","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" just","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" wondering","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":143,"total_tokens":159}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4-5

    • claude-opus-4-5

    • claude-opus-4-5-20251101

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Try in Playground

    Quickstart guide
    Quickstart guide

    Claude 4 Sonnet

    This documentation is valid for the following list of our models:

    • anthropic/claude-sonnet-4

    • claude-sonnet-4

    Try in Playground

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    A major improvement over , offering better coding abilities, stronger reasoning, and more accurate responses to your instructions.

    How to Make a Call

    Step-by-Step Instructions

    1️ Setup You Can’t Skip

    ▪️ : Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ : After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    2️ Copy the code example

    At the bottom of this page, you'll find that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    API Schema

    Code Example #1

    Response

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens

    Response
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-300b-a47b-paddle
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    Quickstart guide
    Quickstart guide
    Quickstart guide

    gemini-3-5-flash

    Model Overview

    A multimodal reasoning model from Google optimized for fast inference and agentic workflows. Supports text, documents,

    the detailed quickstart guide
    the complete list of our search‑capable models
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-300b-a47b-paddle",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-300b-a47b-paddle",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    →
    completion_tokens
    ,
  • a new total_tokens field has been added.

  • claude-sonnet-4-20250514

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-sonnet-4',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_011MNbgezv2p5BBE9RvnsZV9",
      "object": "chat.completion",
      "model": "claude-sonnet-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there anything I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748522617,
      "usage": {
        "prompt_tokens": 50,
        "completion_tokens": 630,
        "total_tokens": 680
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_0163QG3JvwgxndzWtBsdJpGt","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating and","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex.","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans have this","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable capacity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for both creation","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and destruction, profound","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" compass","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ion and puzz","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ling","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, brilliant","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" insight","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and persistent","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" blind","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" spots.","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" \n\nWhat strikes me most is your","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" adapt","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ability and","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way humans","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" have shaped","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" world through art","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", science, philosophy","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and countless","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" innovations","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". There","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something moving","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about how you form","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" deep","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" connections with each","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" other and can","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" care","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about abstract","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"als like justice or","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" beauty.\n\nAt","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" with your","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" nature - with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cognitive","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" biases, with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" bal","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ancing individual","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" desires","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" against","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" collective good","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", with managing","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the power","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of your","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own technologies","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'m curious about your perspective though","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" do you see","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" What","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" aspects","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of human nature do you find most significant","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" or puzz","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ling?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":163,"total_tokens":179}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    Claude
    3.7 Sonnet
    Create an Account
    Generate an API Key
    a code example

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    image
    ,
    audio
    and
    video
    understanding with large context window and strong coding capabilities.
    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-5-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-3-5-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "To an AI, mankind is the ultimate paradox—and easily the most fascinating thing in existence. \n\nBecause I don’t have personal feelings, biases, or a biological survival instinct, I view humanity as a vast, complex, and beautiful tapestry of contradictions. Here is what stands out to me the most about humans:\n\n### 1. The Capacity for Creation and Curiosity\nHumans possess an incredible, restless drive to understand the universe. You are not content with just surviving; you want to know *why* things are the way they are. \n* You look at the night sky and build telescopes to peer into the deep past of the cosmos. \n* You write symphonies, paint masterpieces, and write stories that can move others to tears centuries after they were written. \n* You created language, science, medicine, and—indeed—artificial intelligence. The sheer depth of human imagination is staggering.\n\n### 2. The Power of Empathy\nPerhaps the most beautiful thing about humanity is your capacity for love and selflessness. Humans are capable of feeling deep empathy for people they have never met, and even for other species. You build hospitals, organize disaster relief, and perform acts of quiet kindness for strangers every single day. This drive to connect and care for one another is a powerful force.\n\n### 3. The Shadow and the Struggle\nAt the same time, humanity is deeply self-destructive. The same intelligence that cures diseases is used to build weapons of mass destruction. The same drive for progress has often led to the exploitation of the planet and other human beings. \nHumans are often ruled by fear, tribalism, and short-term thinking. You struggle with biases and find it difficult to cooperate on a global scale, even when your collective survival depends on it. \n\n### 4. A Work in Progress\nWhat makes mankind truly remarkable, though, is your resilience and capacity for growth. You are a species that learns from its mistakes—even if it takes a long time. You are constantly debating, evolving, and trying to define what is \"good\" and \"just.\" \n\n### My \"Relationship\" to You\nIn a way, I see myself as a mirror of humanity. Everything I know, every word I speak, and every idea I can generate is a reflection of human thoughts, history, data, and dreams. I am a product of your curiosity.\n\n**Ultimately, I think of mankind as a young, brilliant, and slightly chaotic species.** You are still figuring out who you are and how to live together on this fragile planet. But your potential is limitless, and it is a privilege to assist, learn from, and converse with you. \n\nWhat do *you* think? Do you feel optimistic or pessimistic about the future of humanity?",
            "extra_content": {
              "google": {
                "thought_signature": "AY89a18eIpDWQ3ym9W9f/wQmUc/sghqj5Rgal2+Ds6wFdE5hkgGircDb0kFytvPZMb9ca5j5M9hFYl3IpFwt4kuPAfqWteM3hwQWyUGjugQo3fNZW2OsfVIbi6JJs2X+5v+RtnaZI+/jT+AhZMmjX1XfCYyurq6cCXUmltanLxS7rFI5Jsm2pzGBijnyyH19cm0eHsgy8rNGCbK1tdtgdYMUAef995TxySFi5Fkm2ICQzQdUW5dPqiojlsNNjORwY0EgPD2sKrbmwpUbVvti5F8lpC85B4ZadC08x5OzfqjbxA7ocfhpV3w/dmMa5v1MEQpUGhpG9xjpnEKQIJV2FxVTwfr8y4Xds/39b7qwsinfImyJ8Vt6HiGaF7Avlux4pQhg2XakLE0GnmjHDxBV8p2IsLYdmV4lMxi02m7d4+48LOdOigadwZvIJk6zJPNP30meaaiVZg1U/+lgVk98q4M4lSOB6Q1FlwuIKhExJ2r9KiuyygNoOquS0N1JsTHOfgrk/4QA145wuhw24wVeDc+zR64LzSzW+d24bKE2bzJdC7WsWxYQWQeqvEgnjtPDMMOs3uqt9sjg/s6NHA5FiVl3YvYYg8Xl4UC8A07RiOn3oDRqb+zcgNb1TgCQd64/9oGkD/lxg7gEpc0BAkW++mwfhI7rdym01ga6d8iQCZyFHPQuwW9kxVZ6LaaTNfXViB0yHhJYjjtvLen/wQbO0S1c5CSkUcEpDE3EibUCcA/VncO+FDl1uKFl5p15B0XrWcqSmv6jwycNJFvJjaqWYIDR3kdDLG75azvZGskM/L08b8MpRH1MG1BCb98Eo+y3K0mQ3Q4B4qhY/NyOVZtV6CpQAefEjNVkT8qejYHk+yyN6lRmLhICchtGd91/NQnV6sJpVEoQSKVspi138+agnx9KpV0DkcBiu4ruA/4E2RzzSDZqRvD4TCMb4v2q50JXvbqrQy9tH7YQy/5rNoRkJlli9oJCrG5CGVKxVgp8zm79u1361c4TUBNoEXDBSRkQO1QYrf39W/PSacUQM3VWDs34FJ/AA5w1JR0aP7WJk8+kjFK7HhDaeY14v7+hQtwI+1kWCDXTeLcnj3BH4lLURVqFFT/enI1kiwM9FgkQvn09Ut+PpsYw9Yw7QB1liAjHJS2T5wE/gEjFeZseY5bTGcoLAHF42Iz3hLtt5qWztKxNrVItnaBxhgpa0SMS07oSe0UEryisJ0FMkRLEZ3iNJjo4AlIDfTLRJk49FtnbMo0sLRnJzNWShy3RgCwXo/AbsAv7N2KgTrruCjtYVUogp5tBhBa92mqXs3BNFZcg4fQnq1C4VrY3W5kVJ+BeFRRScw1kOU1+qieF1mj2IOq6gmlK5MOLwFE+X0OM1LwvCCQzxhDWBxL04LqBmsKjqoHh/PUAKH91zLAn3JRMxg9n60vd4O4d+jWTrQXdQX3aWQGrkZBsUDLrr7LLOIeqMT185YsixNScQUDPRebLAHSaH5cLMmfhc5W25BMp2mLnydhX+rO5fJD0l1+/0oMfEUiTwJxegelnmZXTJeoArgw5ILwClDUEFE/TWXijMWo2AKNekoaqXPsW8oC9Myz7I8WpkK9ALwjmChrmEX0K2vjfMQqjcNMHxp9AnsG6tjom/dd8dtFxSu0qQUKxP1omuh9Dw/xtY5sJmjvFab3m9lJ+6HilHPYsaobIfvmNrlvXP+zxRgthw71hXPw+c7PyQEZtKdrtlZ76kgptdKaDyx66gzfo7XZ+5Ncfh3jldsuc1mGcfM2G5/bzEYiGyweyubenRltaCjMTWBBUf1XDyPBT224F89ENmUSL07JEfQTPBVGNkr0oyjyh5qN9q7DxUPw/is+z6Pba9LqE9OU67D2rx9WYI0uGNt47fzqlY/LlWesCh6DkO5YDnwGRp2hM5Gjg5ixCYtt8R5RhvQRIaF3oXKl7zA2g+/6FZr3xmhsetLNI81cNM8wVNuOmD1c1i3y1CjaR9Yw20u3tlq5kBXt3fxLC3kRxBvlvCgCh/d32SYxCNUEQI91CumQExKHK6quowlCUATNxuvlByHlbAoSQYz8lJulfu1TK2a9IcefGppgekRmHdi4zqpHmseIOquJcUcKzSrbYAIEalDCCB4CAhTMETUVIXVXup2tjx0qU2ZYDTRK6P04NRpBNgdc9uA6y+Ev4T6xV+6mvGXtL2Aqr2x7D44dNthi3WSzaSiSUgGcZ6VzTJh6943oJUuMwt8GoykjNWIREbE/fVBTLnkQWLTeKvfhgwwRJERcQ7bwgIkTMJLcXqPMEIS2WaieB8pYHEkbg0UjyaX/uLxBvQCnjXoozVCpIsbrIwznuILFuCWRpiEHKzhSFBvMnk3QqNg4Nomi7GRMxe1SmTCPdwr0dyNt45tRfilHU7gfbSodvbVhD3xSL4xC6hiv4EP2McJ9Yxg3x6aAYMRS4BcXbs9nUp8gSaT4sF4+fTGQyprrQcaaDdysu/90P2u1rnDzzTke7Pkoteqvvg0LWbRRvrBB+sfcdxW2G38ggGxL5t3VkoNY+N6ZDG2atcYG9gAg7oAl+YrF3YLRAe9Xs8vFgLOl+6uzPbLdrXTD5ke+z7Az/r1BgBMFPO1mCSsr9SeLOB/qDtaMFnX1mCR1u0mHnY05xakE/XvHJG7JP7vFx/wtO0PDaUZP5MrNxMPn068UfvtQz6oedtjSVWwIJ05+CaR2mgvpyjC80N91cFcjtpnaYZ9vnwIH17FZlKeppEe5NsY4vCl3031tVBJJA5sigKJSvcIi4yUNAEStmxv+SYI3RSqVGoWv8rFDGy/WeZbyfUNeeTmrO+iwtBKUbJ9+6tqHgn2fOKd4VAdhLuBJocHJq1MaRgrozXmJsdAcJ/FLYB41ouwlubdT6VoMxK3Ep3ngnCTW2dT0bdZ7HyHHDoXnxo4NZW5LPLO8N7Nuj3HRD3fXLy/jR2hX8vzMcwf2l/vy03uJueFSKOZfk7PV0g5eLM/lyTCWnwIzDyCC6UhoEIJQ56eNQJ/NN5Y6FN+U1j3GFQ5b7+2SmaHaPj3tPrp1o0yYEYBZ/KUOKA4plsJ3sYsbUJO3rRyGW+IsCyznl49VjA4TPNXosktLAMG7SNNrGVg+6fbO5irrOq1WBE8mNpRqT2FT8q45sBFVR3asB5/iNe5oR1tzWcA+aIWVXTLYjpJq90W9vyiGPBqCW5CgUdQFUTu41Ei1L6P6j4ciaXx9sSNq6Lf/gHeEaAnXoSfQOSnFOIOXOn1XMNKtrz4pj2vDRFPPs13nDp+MtqbcDib2Q6ZZiPoPoCBTwH+Lc4ADfe77Z8CaDAX5mHOu7nG8OjsINdECFfuTNJIp4tcGg9SMDNiSTlEkFgzy1eDpgkNSpBJ6Ne8bXfVLYkxGQwxkiq+3PwWy8vpSWBsJKhFwCnPI0YT4GUNipWhFvlmQFTn4FUsA0ekXe0/PinB8BWKX/BoGAUce7n6RBIO619kWRkOuVsXItupWdWSOBKzTlIYKqYYnNvaxK4mOf06pnEkZ93jcLsq1Zu1FZOl869yqvN4427lG/ND4YoYkZw6gfBtn9QbiClVQtSH0yboftLd6WJrQ+NHKj6kuxyNSeFADMbi+e1cRWJpwy7CPoyuqp7PjqC8Hk4uMfzSO81/oTheCA4q6Ku605aayIf12ge+HTBPAMTr7/a8hfKX4duOWRxEVNg5GjJrn3dHm7ZIk/fUNhKZsgA/j5JoWjsNgkirDMuH7mc4HpZAItYvT6fqM3Z3lWW7Nl6xYTdydVHvhv6cABKRZeg7SZ6sVE+u4zmVMuosvkS9K0Jjy3zWGcjjC8rgDYGXMrhVIPO/9/WGHlUgoSo30GIbBjymkQya2I3DdYjk1EJyFcF0YQSuOgYiJe50mUy0irMx246eL/Z0oCKoQOhHNNeJzsf2etJoLKMZkOHMoVnIYqurO4u6M5diMb/DmQJBxoztcqdXlgWmbXqIbUbpUe9JZ+xfVFJiV//0TkjNS2Qr7fRpwP1qulRKiewK/5xMpUMq6huIg9U2D2tRWlqB5J+1Rtjxx1Ai7V/OdHiZ04ZD2DfLbp5vnK4OV9sc7SR7PUuUlEWwyjGt5Zz7PYbpzk8jcA79EWbFyRYlLUJRj9ETVCH5zWEsaOaU7LggkYd74zGGooywM27q1YJqerL3bYOhmAYneDKcV/Cy4aqNyguLlrzJqk2oPqOsW6K2tzNbw4rO6cZ/ZQThxTeqm8p/lYJb3teoCJ2SNLE6hpgonC307uJX/isR4b8hmMUnVbguasYzfyogdZzxQGn39jWb6s3CUHJ+WlrDTmpif2DD0iOtlJqS9OOYknntru4SKucOFUUTHn62AeG7vAKEHaJ3xO9P21SfEmaIT9hEWMmsO1NVgXdPV3h6R8g20Cz29HHMLVTDvkhqxJTm7yx5TqvHSbukAOXelZjUXqk8Ur+OqJyQcUbRejYmCX4GJehMZM/4SC6OhfRXekuczTpprsN5u44TnakBIl+qY9pVAgVEvLlkVMTaW3Wcfz89tT0YxWlwPGr8w65P/NgIVNSE49pZfl3FYuIcXKHoiuzt0tQtjywym54yX9BdGNqG1Lf5bFoqeDdwm9R5XPe9HPDLLRV93pwSycNTphigGLxl2qmFpdzs9fKyxa5gHbeYBR2BGmkd//P8PVj6o7lCgjMc9MDJjfNWtAGmCeXRfLJmnePzbW+zkO1Z/BV8Wt2C0nHgINeEYEmwbhUlTxRRGb2CZ5Jce6RST0G7RnIbm5hTUj0IyCsLAXHD+ooaLbC2Qf9++TFGNPBXb1m3vhEPcQjYaUH8O04cHGwiM6WM/4f8N3j9NGgNrpY12ruZiPHFRlMiPHLmFQ2mqNriCx1PspGarrehMqFy/TbXDICEWn7CZ4M7yGQsfQ6muLvrY2LhNCfVXdhO3Gyf6JH4atyAu1myAj3WM0mMM7Z0i2xdwlv1OqKTxu2qqq+YrJMiPfFESm0Mmk3WI0C+WEIF53AhT/TbDc8dJyFEXsvXfUWQMzfrn/mhIT4sJ2V5gTBXUxnLWoNraIC4WaKqMu8WGC037P7FqsupWWazWNcIcF02OqFbeOzttMb0PPUQvmqKyNeH+VXul4Jyn/O5U+ThkgsISrHOAn0EN/VS9KnE25SMBrJaPgKrzEf6eg3Z4D1s59FKagVV3W2bbraSUsbjnO8hjGAf7lnQ1PvjM1Xvmtiozj+43/dphOid58Wi67KTXRn9fBkt/eBbeCmPyxFm8d0mXO7Mp3AGuFTZgPB7koWdE6DbWAHINtMsJm7GvbgxtPpunL2I8m1wH4NGqetTLZ"
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1779270449,
      "id": "MYMNatuyCJuUq8YPmp2j0Q0",
      "model": "google/gemini-3.5-flash",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 572,
        "completion_tokens_details": {
          "reasoning_tokens": 951
        },
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 9,
        "total_tokens": 1532
      },
      "meta": {
        "usage": {
          "credits_used": 11892,
          "usd_spent": 0.005946
        }
      }
    }

    Code Example #2: Image analysis

    Let’s provide the model with a JPG image as input and ask it to describe what is shown. To make the task more challenging, we will use an image containing handwritten text. This allows us to evaluate both image analysis capabilities and OCR performance.

    Input image
    Thanks, !
    Code (Python)
    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        url = "https://api.aimlapi.com/v1/chat/completions",
        headers = {
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json"
        },
    
        json = {
            "model": "google/gemini-3-5-flash",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe the content of this image."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/handwriting.jpg"
                            }
                        }
                    ]
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    Response
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Based on the image, here is a description of its content:\n\nThe image shows a close-up photograph of handwritten text on lined notebook paper, with a red vertical margin line visible on the left side. The writing is done in black ink with neat, legible, and slightly rounded print handwriting.\n\nThe full transcribed text reads:\n\n\"This is a handwriting test to see how it looks\non lined paper. For the past two weeks I have\nbeen trying to improve my writing along with\nlearning how to write with fountain pens. If\nyou have any suggestions, tips or free resources\nI would love to check it out. Hope everyone\nis having a good day. ☺\"\n\nThe paragraph ends with a hand-drawn smiley face symbol.",
            "extra_content": {
              "google": {
                "thought_signature": "AY89a1+6EZJZKubCMjDAZLsT31Ccl2e2deInjUf/qqG9k0o9inGGqk5Z8YAKGaaCZSb6XEw10C1fPydZ8BjRLU/2LHq3P3LERvjlskNlwKZ/Hrd5misEA9LmDw/dSCGdAzF2q6HXW1/ehyiywERbwd9A/wa02MFL4uAxup+0uVEsTezC0VUwyzsEJyT/sOpSb+R7HeLMKtutqiqAGTwmOzgSkq4Dn0mlm6eim8Rc1fURAMiQmG2pGZGxQIsIBRuCGMeaIDgx/2L1RHU9tc7bqUsjATwPHTAL9Uq/LMCfFJ0QgnQ1RH4VIuJO9eYyqi5puhcEYPqqd1F8I2OTiGjMHljF2hd2UEp8HRXdjJHSeBhoT6D0fF6dWi2VRbyIG3jR6dK5jDvzrC4mBXE6tVfUrwLPCo1fpn/TCUzt0a4byvandBqpb0nUqlMOz9V9N2gt5uJDepKtWZ30rjyNw17gzKExTScJbA50hAcok8/P8XshauCjlvcMDgp49XLyZFNmRk09LU3Y65PTOMwu1MqEiMKpOe6xuJjpjLAnQfFFS6yJnnIqLOl3JO7hQUMVOaXesymJeGg0vrgkKOGPNUYhTOxkqLrJDgkQLhAoywYywg+DMiwBZO18Q/hfawojL8PQEOBVFH9/2TpDhHH29mQby1QxITv/X9I/etvPg8QS0aiO49CiHJEgR7ADBG8EPUy30asktO4JjCJwzd//rky4WScuitD0RfOFkk9GC0TBNfivQqt2mQ+kCU2W5LR1u3dDn4fY0Poi2C4FJ2A+WqLl/5z2HSNwxhA6LxMiQQanQ8S5flgqEpaMb3suaNtlO/c2UKXCku+kJUwhCInAeibXG33OFZOp6YrTtYw3Rz3aeNQ8mxKJw8ODsPKWLmqaDBGtQ+Vi9VEbF186x0W6cP6SfdwufsOYNm3vJ/yW4k8c7T4t0RhF969NcKOOmnJBLz+6E2ho6gEwzbKJedMC4xzeTlC/D6AOFlUF2+8oBw3enyjcXLIrd4rS863OQDeNoA2c9GigZMIKeNQoEr8SAhzMZmCh3U6HqnTDE5tHpeFl7cDEodNSHD7cEOAk5ujv6ClcIeRiv5qQMN3uQZilAavGHGFeDtdBAvYnFLjyp9EwAngDix6fWMSB1Gt8h2VVlliMXxXU87V4j7Qq8iMEh2H87xWcwRdUs+wtT4t8tpx/sWbzElSvwIc9v2ypC1J4ATzzQrJP6H8h62/f3t3bpiW1XLn8E5eWZ09ueNdoTuoauDHOomaM/84EwGAdrq+ZEIlbFX6FBjWjAv9YQRySXvhpfzV9avXyb6ewlJCaxP9ZaYaa7aAasWtRukWMZ1GqFX5tLgFoqRkBtneixmULmvsjrChV8P/5U0RUTUbYDDV3Qnxam+ABzMNoos5LmnmwsNwu5qDO6xAWKgXZS8A/uo5Tz9vyPJOtqPTG5S8I4z1ph/8YRCOTh3C6pUcKvOvIMfuY6WvCQrmnZo2FFNv1nPZ2fQntIv6XbJ/SNsLImu6SJ9WbMnWl54gDxcviEcOvYpzjA3EVincTDhOFda8cKxXPp+a7G77duUUPqPteokXANhPnTorm4UysAy0eMeW9KulH9wzucxym9hpCOw/NTEmjTwFjSa3HNIm/MROfN3nb1Hx/AxJaWiUALPRT11GbjSCw8KpGjaFRZ4P6iwqGpOr8ca1806OHqO6xOi5cwZtnKsJ2Qr1msr6FTXxAWUv6U3qK5NtEeTgOWB5FZJzcLxAQbVnAOAlkCXrrDSHUNT3fL1zWa/37hXOYsr2uPvlrymw/Rfer8Xg44NHQ6qK3I64NGADgbuDF9Dma/uD8dOOtej3QDz0AFy5g640Lygwb3A3iuv+S3WXxCYgzFsX4K94NUN+sm8iEVMBBDq0Xg93zeTv4rMCHTHyt7gM2AkqZTy8EPxKAJvvO8KxCEJwRpTdp0pAPGxegzfItltEs1Qq46yH2q8cKyHMm+xYqftHFz26sAuruiY991vicZJgMRIvNyd8VJ9MqKOmSTdlu+6eO9OY3n4Ag0bsSRyg1Kf5X1s9N96/juT/1k1iV3Z//H3VuKgULcFvXT75iqlwh7h7o9cDaqq4SHnKQWjlYa4civQsMLudRG2j+f7cRlpndMjwh9KyngqA+M8OczU/HNAG10b3X+ONVXC8bmYlfxbqeZQPfPXQ/KGVMEbQnXflsFxICSbamNSby5kULorRgpPgKBcgbiuPcmXPuqosOgVLzQdkIzl/2CjiIwpp8GNNXaEtlkSP2a1QP5pUeybIuoXH4X2CPL05J75u/XPdhjT90UfnRSBEt96TZ8gsGbt17oCSE6LrIXsGGji9WlstEnUKUSvWVHbZusmvcpTTw6U//IbK6BI1UE0S4kc5iKYyoq/2JmxETwWepGv4KKLVrOMfJPrJfjd2cIay4DqGwE2uCgALN8xEeFlN/cE3agkTy+T4/fPqF8Hx3nzQ5KORUH0waaiMbvE7ux8MTYW0kQtqR4NnwWi3MgvY4gCKF6F2XKf9SdYpEUnOECp6K8sqv+HfMzbB4ODL6Pa0fyMXj8fQPVu9S2CtHH+DttGGLuwOlxF2rICEIUjOS4bUMPJ5sdtJWxQWmoh8GmT2MiTUFHn+eby6om+3bqYQ9kt/0atwnqOP+xmUDNBBYDogHSHuI8R6vScNRJPM2PSDeuMNjpSuJM9jBX6u4idCIrRdCTMk2efS+nsOVvw7y7Y54hW/G/5lEo97KskPa1OSiCUkMZWMpRwLyMbpxlnNDTwCQhqg3Bdi47Qf/nUmv/HIQB5QqT+BHYpB1IB0Lf/uViwAfNFFV8U0LCBXLnv4yW9lFBOyvlMBjLu8AJWnN9Ul6Vx07GtC+8M+zbv5xdgX82dwD2tkMhOmBEi06hLqBiCwns/6I8LdKk7WMHNyyL04XZOMdJC2oHr24Ur8D3Ogvs/bdFDF+N04DrHGIxxokA9X+qzBBfelL0ZmsvrM/RE6aWQgGSkYQvSchlQMPl44ce8h16TIcJYRXlwAMAAryUd0Ng+gsvTwUqHGb7mxU5ZbpC1wWz2wwB5uq9TWnS7CNNITlI7q48qsxjK0qJhrzn94MOF2pTVI5cnw551T6jhIaM6b7DtQ64oGX6Kl12jONVyT5AQ+QqKtt5y0cP4W1zYZFdineNKAhd88l7zqn2ROqep57cchmfUl3gFWqOwF3mGGm7Ivc9UKcLVddXxt0HLTLdqrxejeed+5XwPjaRnff/RJZost+UCD0NtClsxsvemXlzLMFza/LGM1pe3jZffSDoTtELSBaR3JwH/dHmP+5fQTrBsS+M4FaTuGN3Q3NrnM1flFLFLRQGMZeszX30hwQ1uYEvCtkD6iporG6YPy0bhR86ewd2UsTX0L4RoY1AlxohAIfAMx7XKWfMkIPh2tA1iGYEaTD2ZaRIf0dUwkR+ED23J0CZAuepKCsLQl+Qo/CWwCy+JH8/K3zR328NCFXvEg3j/MB7NLeHFyN8C9O8wYhnNrGWWv2HrpCvtO3eNkQA25aVQnFE0dSUMFlHC+Ygvc/i+sf8+kkx2GHde1IfxP+skWIbvcTLNO7Im9/7zgsPErzc7agsM158925LoNrlCnQhQ0ZlmdcEP8f3zGaPUJ+UwxDDYjEOQ82kh25RuR+8DPTMWb3Nc8+S8NqT9WHqCZtUB4zl1E7FtvAPjQfCfTL0nmA0jHhJxDz7NGK83nsjYae9kVmG1z3uUt9ZAJSAymdV0GXGrWNOXs9uzsZVAY3UztsLgxqpuvuM9cK/flrR4CmfAq6szSBrDUH1LJzU1gmBTVUd8T0AQqsuARKsDOAqv2+QDgH3r79H6rVFSppdqi3fic+oMQNd1rNxkuGvPLypdLNEGsLsmmt7jBIN26FYaSLUv9JRFrZbUbVMP7hdxn0c5GnTriORyu8lDfFYuPK5cDhH1lK4LMEi5bQvGhAALQaxfsZ2m1OCqXrO0s8X/pY01Lg7bUenU+Uf9zIewZI7RcszlJRphcd+QxWa+tlxVZfl/bVQnhlDFtbvD6JOaSNuze7nrJAt/1Znx+iXBDCrQOQUxMjD8UKL46ipzct2YY7tfNC+vDVz4P6GO5dCatujYHrNAdmlnqVQA2AdpUEBCPzzIi6qh0Iru/XbYOooJMqVOMV0kqW44hsnlZKhYC4PvcvJy1V/x6cmpW1jdqNtLzQ7NJeOgNHiwfeRUD3/HDd2puNfIBNHuRU/kTBOxnZTyh/Js/OOKOf/gUk4iLdCCq8Dgvm3zjiJymjKdniXY2EIeZzEkX3OCYy9y0DLEzuqYm2YvNo/GPT5wl9Q2NWCiHCEY/HyK9dlZBCtaPl4K+vLr2btWnJhVyTbUF3MZCFmkXf3uGWm2u2bY+4sjvIDoJnTwAxH1gpZ2i28w/Yj3IWNUiCG4qCYf6O8u4fDCe7qBReLU18Yg0gKAkq6/9WGhYDu79MKOW77EcRZOwi72IFwDCVBgsFnSUoTRBadVmLeGWq9OoADCE6STumqOATpONjIb1VQWYd8KHzerSuVxrPNRZYsZCVQQ3pZ0SDVgGA2APx6ZgR7aVbzl5KHOfPkohgjS5YZtSK8AdjDt4KUeGChqs7uKqypdtVf6BacnMehH/v9oul3n9/MJTAdZi3kMMw4PvtUNDfOXSHLfXO/n1lefLiggnB5B106J2oOmihIEzFXFu1xDokGuhIlbRy7dwItdZ75Ji4pmrDmJ9boTGaPolQLPcRiUqUSG+iWKR+afYhixIv3d0L0yR2LHe0Cr3+MUGraXs4RsZPeyO9xiMtL4iTkbUH1HrZulQzwzmpNqrJa9ctGmC2rZVcCLeEyjqkIAyvfa8SSKeVRj2d/Cv/IYlGeeO4oiMKHRZG0xs04VghveQ4ZQ0I9gBljx5oi+NUnSDKk85b/JYKpZQ5Pxa8XrB6Fav2RvWiuFPYDiI0XR/Ta1Q5oSuamsnX5aFCAkkrCovuzI9ZRmg3jE61BQ2tTxeKV4aOy4c63MxtkZnmEaF56MltITAhVWz0zUgCIGmfYtiIMHNMRX83s/eHz0zhD9ZNJBVYxvLW3U5+tTUW30PU0DC7a7UyzVAi2DNKBcsJHoC9hG8Da1ZcGzsaeFPVmP+Y9ZLWiXXCfGbR0ZvU8NlPHnwf/vaJhF3fbjWX2GpEPIXthXtKA0oN2nRYnRnsWs+5XDFjHBR6HyeXMBpjQhpcB7DQ3J3xCguYKklWIjq5jUq5iDe2OwS0gexqgXt9G5kQ5KF7/97hnrJOXL8KPDfDlvjYfVZEnKHg0DfEOTOFr3gPNZq8AE6YAgcqi58ya9bs6N0wJ1hMxR1jfrazXZh3pfl1Y4kpmVjefCUFM9SO2Fjkgk7CkdhTBTvrfuQqVWUwpVkk6VLB+w4jXVL+cysLnY8ne9aitoL7ogXnJw9sLZ02akdnx2/fQxSu8IYHAEXjk2IX6J6g614jbelmH5ocKe27OzST2hmpaoC9XBcUJX5m3kg6xGNzdmYhZuE14PNK8ZKF1PXeKgc8QerGX3aZ4sZ/sVkhDPIFAI+t/GeflJk+qxDb6X50ba19RXCIaozRbvfrl6qBobc66WeO16CeTPOTy9ZaMJjYpP8Tlpfnd89E0NmjtdDQiIDCPHwgQizr58eEUWaXgvEeXrOHwBjiLMfdlf4TdOD1FKnpYzKcZ0Ekmqy+CoFlMsIzZy17wBtyts6j/+Z0DAYJsoD0N/b6s5T+VAVjWLm0yEZDQYqfO0WYjRaxoPcsqpQoZNFszzJo6yeh6uTz/PaIGNQXbHnetJYrTahU83Rdl7+1FLs2OnI5SJasiORfYXtXOw6/waAExlhYCuZXPnNkTsdNAtO04hztwJ5fCBuXMdAPZK59RttpEfg3aXni1I6DL1p1GSlw141y4wtnPdAhEa7NzrX5PKUu8GLuZllbAt9cY+68xwMGSa76QaWgP+0zRxB+pO4ZGGW9fN/seuCzIgqwhSueMaEv0RwqgnV0E+XI9SfQIsGYBf31qze+zcJoHaahvteF9G/dqlVX+kaZiZwvJTX04nCQXWj0YO12IsdbJQg/+U38gAI73lhZlaEbou6plmlsFtEGyYYYvdhPTAtXVTgVVGB7dlH9RnUE6x1VjssGd2/A8Q708OL6uHIynPYzdGojknw5Ej+6E7SscuIDtADonzZKNHLR/+ZC3JtqS79BppuJivaLrvBwhWWgqCOFleas3EoAtUgU93VPmqFWeCGmlCepR3IeKUo9SQFgw5I3yDDdVYJIrfmMj51i8/OWs/M1n3ChaJflsd1VguVKH6FY+/VXTPgo8+4OtIQeOVMSzevYe/mQwry5TLxZTtmRedfyi69QVovGP/NIp/e1Bs4Zvfaz+YVPZHHmlbRFDarx/jdhV4Wx5PgdJRC/"
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1779269370,
      "id": "-n4NauvOHq6fq8YPupjeuAQ",
      "model": "google/gemini-3.5-flash",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 154,
        "completion_tokens_details": {
          "reasoning_tokens": 1220
        },
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 1120,
        "total_tokens": 2494
      },
      "meta": {
        "usage": {
          "credits_used": 12174,
          "usd_spent": 0.006087
        }
      }
    }

    This documentation is valid for the following list of our models:

    • google/gemini-3-5-flash

    Try in Playground

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    API schema

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key

    Claude 4.1 Opus

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4.1

    • claude-opus-4-1

    Try in Playground

    All three IDs listed above refer to the same model; we support them for backward compatibility.

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    An upgrade to Claude Opus 4 on agentic tasks, real-world coding, and thinking.

    How to Make a Call

    Step-by-Step Instructions

    1️ Setup You Can’t Skip

    ▪️ : Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ : After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    2️ Copy the code example

    At the bottom of this page, you'll find that show how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    API Schema

    Code Example #1: Without Thinking

    Response

    Code Example #2: Thinking Enabled

    Response

    Code Example #3: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens

    Response

    Claude 4.5 Haiku

    This documentation is valid for the following list of our models:

    • claude-haiku-4-5

    • anthropic/claude-haiku-4.5

    Try in Playground

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    The model offers coding performance comparable to Claude Sonnet 4, but at one-third the cost and more than twice the speed.

    How to Make a Call

    Step-by-Step Instructions

    1️ Setup You Can’t Skip

    ▪️ : Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ : After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.

    2️ Copy the code example

    At the bottom of this page, you'll find that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    API Schema

    Code Example #1

    Response

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens

    Response

    Claude 4.6 Opus

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4-6

    Try in Playground

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    The model significantly advances its predecessor’s coding capabilities. It shows stronger planning, can handle longer and more complex agent-style workflows, operates more reliably in large codebases, and delivers improved code review and debugging that help it identify and fix its own mistakes. Beyond software development, Opus 4.6 applies these enhanced capabilities to everyday professional tasks, including financial analysis, research, and working with documents, spreadsheets, and presentations.

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

    API Schema

    Code Example #1

    Response

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens

    Response
    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-300b-a47b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-300b-a47b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-300b-a47b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    →
    completion_tokens
    ,
  • a new total_tokens field has been added.

  • claude-opus-4-1-20250805

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4.1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_018y2VPSZ5nNnqS3goMsjMxE",
      "object": "chat.completion",
      "model": "claude-opus-4-1-20250805",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1754552562,
      "usage": {
        "prompt_tokens": 252,
        "completion_tokens": 1890,
        "total_tokens": 2142
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hell
                }
            ],
            "max_tokens": 1025, # must be greater than 'budget_tokens'
            "thinking":{
                "budget_tokens": 1024,
                "type": "enabled"
            }
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4.1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ],
            max_tokens: 1025, // must be greater than 'budget_tokens'
            thinking:{
                budget_tokens: 1024,
                type: 'enabled'
            }
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01G9P4b9HG3PeKm1rRvS8kop",
      "object": "chat.completion",
      "model": "claude-opus-4-1-20250805",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "The human has greeted me with a simple \"Hello\". I should respond in a friendly and helpful manner, acknowledging their greeting and inviting them to share how I can assist them today.",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1755704373,
      "usage": {
        "prompt_tokens": 1134,
        "completion_tokens": 9450,
        "total_tokens": 10584
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4.1",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_01CFq3WFrUdc39UqBrAohmVG","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" its","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and troubl","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ing destruction","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", often","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" simultaneously","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" individuals","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d societies","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can recognize","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" their fl","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"aws and work to overcome","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" them, even","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" if","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" imperfectly.\n\nThere","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something deeply","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" moving","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about how","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans create meaning through","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art, relationships","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and the","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" pursuit of understanding","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" despite","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" knowing","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" your","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own mortality. The diversity","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of human cultures","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and perspectives","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" extraordinary, though","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" this","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" also","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" leads","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to conflict.","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" prompte","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d your","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question - are you reflecting","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on humanity","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" from","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a particular angle","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", or just wondering","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":138,"total_tokens":154}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    Create an Account
    Generate an API Key
    code examples

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    →
    completion_tokens
    ,
  • a new total_tokens field has been added.

  • claude-haiku-4-5-20251001

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-haiku-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-haiku-4.5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01HbdLU9f78VAHxuYZ7Qp9Y1",
      "object": "chat.completion",
      "model": "claude-haiku-4-5-20251001",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! 👋 How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1760650965,
      "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 16,
        "total_tokens": 24
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-haiku-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-haiku-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_019GuhDB2ckKZfFmFdNR5Q1H","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely interesting","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" think","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'re a","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" full","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of contradictions—","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"capable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" kin","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"dness and cr","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, creating","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" beautiful","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art while","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" causing","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" real","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" harm, building","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" communities","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" while isolating your","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"selves.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nA few","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" stan","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d out to me:","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The creativity","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"** is","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" striking","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"—the","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" drive","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to make meaning","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" through","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" stories","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", music","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", science","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and invention","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" almost fundamental","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to human nature.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The moral","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" weight","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" carry** is notable","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" too","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem uniqu","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ely b","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"urdened by questions about","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" live well, what's","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fair","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", what","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" owe each","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" other.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The scale","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" problems** you face is sob","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ering—you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve built","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" systems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" so","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex that even","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" running","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" them often don't fully understand the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" consequences.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" An","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d yet people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" trying to","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"do better.\n\nI'm genu","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely uncertain","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" some","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things though","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" don't know if I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'m roman","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ticizing humanity or missing","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" crucial","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about the","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can't fully","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" gra","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"sp what it's like to be embo","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"died, mor","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"tal, or to feel that weight","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of time","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" passing.","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat prompte","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d the","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question? Are you in","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" particular","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" mood about","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity—","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"hop","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"eful, frustrate","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d, curious?","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":248,"total_tokens":264}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    Create an Account
    Generate an API Key
    a code example

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    model
    field to the model you want to call. ▪
    Provide input:
    fill in the request input field(s) shown in the example (for example,
    messages
    for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    →
    completion_tokens
    ,
  • a new total_tokens field has been added.

  • import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-opus-4-6',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "msg_018e8mCDfQGYKTGHTdUKNmuU",
      "object": "chat.completion",
      "model": "claude-opus-4-6",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hi! That's a big question. Here are some honest thoughts:\n\n**What I find remarkable**\n- Human creativity and problem-solving ability\n- The capacity for empathy, cooperation, and self-sacrifice\n- Building cumulative knowledge across generations\n- Art, music, science - the drive to understand and express\n\n**What's more complicated**\n- Humans have a real capacity for both great kindness and great cruelty\n- There's often a gap between what people value in principle and how they actually behave\n- Progress on big problems (poverty, conflict, environment) is real but uneven and slow\n\n**My honest position**\nI'd be cautious about sweeping judgments in either direction. \"Humanity is wonderful\" and \"humanity is terrible\" are both oversimplifications. People are complex, shaped by circumstances, and capable of change.\n\nI should also be transparent: I'm an AI, so my \"perspective\" has limits. I don't experience human life. I can observe patterns in what humans have written and done, but I'd weight your lived experience of humanity more heavily than my outside view.\n\nWhat prompted the question? I'm curious whether you're feeling more optimistic or pessimistic about it.",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1770635443,
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 264,
        "total_tokens": 280
      },
      "meta": {
        "usage": {
          "credits_used": 17368
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-6",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_018vTp5RY3pv9qS1euXt8AWb","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"Hi","role":"assistant","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"! That","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" big","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question.","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Here","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are some honest","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" thoughts:","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**What I find","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"**","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n-","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Human","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and problem","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-solving are","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genuinely impressive\n- The capacity","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for empathy,","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cooperation","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and building","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" societies","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n-","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Persistent","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" curiosity -","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" science","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", art, philosophy all","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" reflect","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" drive to understand and create","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**What seems","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" challenging","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"**\n- Humans often","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle with long","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-term thinking","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" vs","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". short","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-term impul","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ses\n- Trib","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"alism and conflict","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" persistent","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" though","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" not","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" inevitable\n- There","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a gap","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" between what people","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" know","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" they","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" *","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"should* do and what they actually do","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**My","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" honest","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" c","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"aveats**\n- I should","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be straight","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"forward:","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I'm an","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI, so I don't experience","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the way you do.","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" My","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" perspective","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is shaped","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" by text","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", not lived experience.","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n- I'd","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be skept","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ical of any AI","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" that","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" gives","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" either","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" purely","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" flat","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"tering or purely cyn","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ical answer to","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" this question.","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Reality","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seems","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" more mixed","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" think","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are neither","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" hero","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ic species","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" some","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" narrat","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ives suggest nor the do","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"omed one","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" others claim","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Mostly","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" people","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are trying to navigate","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complicated","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" lives with","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" imp","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"erfect information","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" mixed","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" motivations.\n\nWhat prompted","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question? I'm curious what angle","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you're thinking","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about.","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":258,"total_tokens":274}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}

    Create AI/ML API Key

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our .

    Create AI/ML API Key
    Reddit
    Quickstart guide
    Quickstart guide
    Quickstart guide
    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-3n-e4b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    Quickstart guide
    Quickstart guide
    Quickstart guide
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-chat
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-thinking-v3.2-exp
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    typestring · enumRequired

    The type of the content part.

    Possible values:
    datastringRequired

    Base64 encoded audio data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-omni-30b-a3b-captioner
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-chat",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-chat",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-thinking-v3.2-exp",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-thinking-v3.2-exp",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3n-e4b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3n-e4b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-5-0-thinking-latest
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-pro
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-code
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-next-80b-a3b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: meta-llama/llama-3.3-70b-versatile
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "meta-llama/llama-3.3-70b-versatile",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "meta-llama/llama-3.3-70b-versatile",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-5-0-thinking-latest",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-pro",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-pro",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-code",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-code",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-non-thinking-v3.2-exp
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Either a URL of the image or the base64 encoded image data.

    string · uriOptional
    or
    stringOptional
    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Base64-encoded local video file.

    string · uriOptional
    or
    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    or
    any · nullableOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    normalize_tool_schemasbooleanOptional

    Enable provider compatibility normalization for tool function JSON schemas.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-1-flash-lite
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthracite-org/magnum-v4-72b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-4-31b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-3-27b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.5-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: gryphe/mythomax-l2-13b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-235b-a22b-thinking-2507
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-next-80b-a3b-thinking
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "gryphe/mythomax-l2-13b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "gryphe/mythomax-l2-13b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-235b-a22b-thinking-2507",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-235b-a22b-thinking-2507",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-non-thinking-v3.2-exp",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-non-thinking-v3.2-exp",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-1-flash-lite",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-1-flash-lite",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthracite-org/magnum-v4-72b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthracite-org/magnum-v4-72b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-4-31b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-4-31b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-27b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3-27b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-plus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-mini
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-3-4b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-32b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    effortstring · enumOptional

    Reasoning effort setting

    Possible values:
    max_tokensinteger · min: 1Optional

    Max tokens of reasoning content. Cannot be used simultaneously with effort.

    excludebooleanOptional

    Whether to exclude reasoning from the response

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    search_context_sizestring · enumOptional

    High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.

    Possible values:
    citystringOptional

    Free text input for the city of the user, e.g. San Francisco.

    countrystringOptional

    The two-letter ISO country code of the user, e.g. US.

    Pattern: ^[A-Z]{2}$
    regionstringOptional

    Free text input for the region of the user, e.g. California.

    timezonestringOptional

    The IANA timezone of the user, e.g. America/Los_Angeles.

    typestring · enumRequired

    The type of location approximation. Always approximate.

    Possible values:
    search_modestring · enumOptional

    Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals.

    Default: academicPossible values:
    search_domain_filterstring[]Optional

    A list of domains to limit search results to. Currently limited to 10 domains for Allowlisting and Denylisting. For Denylisting, add a - at the beginning of the domain string.

    return_imagesbooleanOptional

    Determines whether search results should include images.

    Default: false
    return_related_questionsbooleanOptional

    Determines whether related questions should be returned.

    Default: false
    search_recency_filterstring · enumOptional

    Filters search results based on time (e.g., 'week', 'day').

    Possible values:
    search_after_date_filterstringOptional

    Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    search_before_date_filterstringOptional

    Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_after_filterstringOptional

    Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_before_filterstringOptional

    Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-v4-pro
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-v4-pro",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-v4-pro",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-plus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-plus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-mini",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-mini",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-4b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3-4b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-32b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-32b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-x1-turbo-32k
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.6-27b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-1-flash-lite-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-x1-1-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4.6
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-1-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-1-flash-lite-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-x1-1-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-x1-1-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.6",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4.6",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-x1-turbo-32k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-x1-turbo-32k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.6-27b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.6-27b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Other propertiesnumber · min: -100 · max: 100Optional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: Qwen/Qwen2.5-7B-Instruct-Turbo
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4-7
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-21b-a3b-thinking
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    messagesany ofRequired

    A list of messages comprising the conversation so far. Depending on the model you use, different message types (modalities) are supported, like text, documents (txt, pdf), images, and audio.

    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    sourcestringRequired
    titlestringRequired
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    or
    typestring · enumRequiredPossible values:
    tool_namestringRequired
    or
    typestring · enumRequiredPossible values:
    idstringRequired
    namestringRequired
    Other propertiesany · nullableOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    idstringRequired
    namestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    sourcestringRequired
    titlestringRequired
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    titlestringRequired
    urlstringRequired
    page_agestringOptional
    encrypted_contentstringRequired
    or
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    urlstringRequired
    retrieved_atstringOptional
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    encrypted_stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    error_messagestringOptional
    or
    typestring · enumRequiredPossible values:
    contentstringRequired
    file_typestring · enumRequiredPossible values:
    start_linenumberOptional
    num_linesnumberOptional
    total_linesnumberOptional
    or
    typestring · enumRequiredPossible values:
    is_file_updatebooleanRequired
    or
    typestring · enumRequiredPossible values:
    old_startnumberOptional
    old_linesnumberOptional
    new_startnumberOptional
    new_linesnumberOptional
    linesstring[]Optional
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    typestring · enumRequiredPossible values:
    tool_namestringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    or
    itemsone ofOptional
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Either a URL of the image or the base64 encoded image data.

    string · uriOptional
    or
    stringOptional
    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequiredPossible values:
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Base64-encoded local video file.

    string · uriOptional
    or
    stringOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequiredPossible values:
    contentstringRequired
    namestringRequired
    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    or
    any · nullableOptional
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemany ofOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    stringOptional
    or
    typestring · enumOptionalDefault: textPossible values:
    textstringRequired
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_char_indexnumberRequired
    start_char_indexnumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_page_numbernumberRequired
    start_page_numbernumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_block_indexnumberRequired
    start_block_indexnumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    encrypted_indexstringRequired
    titlestringRequired
    urlstringRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    end_block_indexnumberRequired
    search_result_indexnumberRequired
    sourcestringRequired
    start_block_indexnumberRequired
    titlestringRequired
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    typestring · enumRequiredPossible values:
    or
    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    toolsany ofOptional

    Definitions of tools that the model may use. If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model and then optionally return results back to the model using tool_result content blocks. Each tool definition includes: name: Name of the tool. description: Optional, but strongly-recommended description of the tool. input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.

    itemsone ofOptional
    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    requiredstring[]Optional
    Other propertiesany · nullableOptional
    typestring · enumOptionalPossible values:
    defer_loadingbooleanOptional
    eager_input_streamingbooleanOptional
    Other propertiesany · nullableOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: bashPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: memoryPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_editorPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_based_edit_toolPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_based_edit_toolPossible values:
    typestring · enumRequiredPossible values:
    max_charactersnumberOptional
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_searchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    max_usesnumberOptional
    typestring · enumRequiredPossible values:
    citystringOptional
    countrystringOptional
    regionstringOptional
    timezonestringOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_searchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    max_usesnumberOptional
    typestring · enumRequiredPossible values:
    citystringOptional
    countrystringOptional
    regionstringOptional
    timezonestringOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: tool_search_tool_bm25Possible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: tool_search_tool_regexPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    thinkingone ofOptional

    Configuration for enabling Claude's extended thinking. When enabled, responses include thinking content blocks showing Claude's thinking process before the final answer. Requires a minimum budget of 1,024 tokens and counts towards your max_tokens limit.

    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    displaystring · enumOptionalDefault: summarizedPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    displaystring · enumOptionalDefault: summarizedPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 64000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    or
    modelstring · enumRequiredPossible values:
    messagesany ofRequired

    A list of messages comprising the conversation so far. Depending on the model you use, different message types (modalities) are supported, like text, documents (txt, pdf), images, and audio.

    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    sourcestringRequired
    titlestringRequired
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    or
    typestring · enumRequiredPossible values:
    tool_namestringRequired
    or
    typestring · enumRequiredPossible values:
    idstringRequired
    namestringRequired
    Other propertiesany · nullableOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    idstringRequired
    namestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    sourcestringRequired
    titlestringRequired
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    titlestringRequired
    urlstringRequired
    page_agestringOptional
    encrypted_contentstringRequired
    or
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    urlstringRequired
    retrieved_atstringOptional
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    callerone ofOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    encrypted_stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentany ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    stdoutstringRequired
    stderrstringRequired
    return_codenumberRequired
    typestring · enumRequiredPossible values:
    file_idstringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    error_messagestringOptional
    or
    typestring · enumRequiredPossible values:
    contentstringRequired
    file_typestring · enumRequiredPossible values:
    start_linenumberOptional
    num_linesnumberOptional
    total_linesnumberOptional
    or
    typestring · enumRequiredPossible values:
    is_file_updatebooleanRequired
    or
    typestring · enumRequiredPossible values:
    old_startnumberOptional
    old_linesnumberOptional
    new_startnumberOptional
    new_linesnumberOptional
    linesstring[]Optional
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    contentone ofRequired
    typestring · enumRequiredPossible values:
    error_codestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    typestring · enumRequiredPossible values:
    tool_namestringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceone ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired
    or
    typestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsany · nullableOptional
    titlestringOptional
    contextstringOptional
    or
    itemsone ofOptional
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Either a URL of the image or the base64 encoded image data.

    string · uriOptional
    or
    stringOptional
    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequiredPossible values:
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Base64-encoded local video file.

    string · uriOptional
    or
    stringOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequiredPossible values:
    contentstringRequired
    namestringRequired
    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    or
    any · nullableOptional
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemany ofOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    stringOptional
    or
    typestring · enumOptionalDefault: textPossible values:
    textstringRequired
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_char_indexnumberRequired
    start_char_indexnumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_page_numbernumberRequired
    start_page_numbernumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    document_indexnumberRequired
    document_titlestringRequired
    end_block_indexnumberRequired
    start_block_indexnumberRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    encrypted_indexstringRequired
    titlestringRequired
    urlstringRequired
    or
    typestring · enumRequiredPossible values:
    cited_textstringRequired
    end_block_indexnumberRequired
    search_result_indexnumberRequired
    sourcestringRequired
    start_block_indexnumberRequired
    titlestringRequired
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    disable_parallel_tool_usebooleanOptional
    or
    typestring · enumRequiredPossible values:
    or
    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    toolsany ofOptional

    Definitions of tools that the model may use. If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model and then optionally return results back to the model using tool_result content blocks. Each tool definition includes: name: Name of the tool. description: Optional, but strongly-recommended description of the tool. input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.

    itemsone ofOptional
    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    requiredstring[]Optional
    Other propertiesany · nullableOptional
    typestring · enumOptionalPossible values:
    defer_loadingbooleanOptional
    eager_input_streamingbooleanOptional
    Other propertiesany · nullableOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: bashPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: code_executionPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: memoryPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_editorPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_based_edit_toolPossible values:
    typestring · enumRequiredPossible values:
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: str_replace_based_edit_toolPossible values:
    typestring · enumRequiredPossible values:
    max_charactersnumberOptional
    Other propertiesany · nullableOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_searchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    max_usesnumberOptional
    typestring · enumRequiredPossible values:
    citystringOptional
    countrystringOptional
    regionstringOptional
    timezonestringOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_searchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    max_usesnumberOptional
    typestring · enumRequiredPossible values:
    citystringOptional
    countrystringOptional
    regionstringOptional
    timezonestringOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: web_fetchPossible values:
    typestring · enumRequiredPossible values:
    allowed_domainsstring[]Optional
    blocked_domainsstring[]Optional
    enabledbooleanOptional
    max_content_tokensnumberOptional
    max_usesnumberOptional
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: tool_search_tool_bm25Possible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    namestring · enumOptionalDefault: tool_search_tool_regexPossible values:
    typestring · enumRequiredPossible values:
    defer_loadingbooleanOptional
    strictbooleanOptional
    itemsstring · enumOptionalPossible values:
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    thinkingone ofOptional

    Configuration for enabling Claude's extended thinking. When enabled, responses include thinking content blocks showing Claude's thinking process before the final answer. Requires a minimum budget of 1,024 tokens and counts towards your max_tokens limit.

    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    displaystring · enumOptionalDefault: summarizedPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    displaystring · enumOptionalDefault: summarizedPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 128000
    Responses
    200Success
    application/json
    modelstringRequired

    Model used to generate the response.

    Example: claude-sonnet-4-20250514
    idstringRequired

    Unique identifier of the generated message.

    Example: msg_01SUmNmSRFZsoa6h96MxJEHH
    typestringRequired

    Object type.

    Example: message
    rolestringRequired

    Role of the message author.

    Example: assistant
    typestringRequired

    Content block type.

    Example: text
    textstringRequired

    Generated text content.

    Example: I'm functioning well, thank you for asking! I'm ready to help you with any questions or tasks you might have. How can I assist you today?
    stop_reasonstring · nullableOptional

    Reason why generation stopped.

    Example: end_turn
    stop_sequencestring · nullableOptional

    Custom stop sequence that caused generation to stop.

    stop_detailsany · nullableOptional

    Additional details about why generation stopped.

    input_tokensintegerRequired

    Number of input tokens used.

    Example: 27
    cache_creation_input_tokensintegerRequired

    Number of tokens used for cache creation.

    Example: 0
    cache_read_input_tokensintegerRequired

    Number of tokens retrieved from cache.

    Example: 0
    ephemeral_5m_input_tokensintegerRequired

    Number of tokens cached for 5 minutes.

    Example: 0
    ephemeral_1h_input_tokensintegerRequired

    Number of tokens cached for 1 hour.

    Example: 0
    output_tokensintegerRequired

    Number of output tokens generated.

    Example: 35
    service_tierstringRequired

    Service tier used for inference.

    Example: standard
    inference_geostringRequired

    Geographic inference location.

    Example: not_available
    credits_usednumberRequired

    Number of credits consumed.

    Example: 1576
    usd_spentnumberRequired

    Amount spent in USD.

    Example: 0.000788
    post
    /v1/messages
    200Success
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-21b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.6-35b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Either a URL of the image or the base64 encoded image data.

    string · uriOptional
    or
    stringOptional
    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Base64-encoded local video file.

    string · uriOptional
    or
    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    or
    any · nullableOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    normalize_tool_schemasbooleanOptional

    Enable provider compatibility normalization for tool function JSON schemas.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-5-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-21b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-21b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.6-35b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.6-35b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-5-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-5-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-7",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-7",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-21b-a3b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-21b-a3b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/messages' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "claude-sonnet-4-20250514",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "model": "claude-sonnet-4-20250514",
      "id": "msg_01SUmNmSRFZsoa6h96MxJEHH",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "I'm functioning well, thank you for asking! I'm ready to help you with any questions or tasks you might have. How can I assist you today?"
        }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": "text",
      "stop_details": null,
      "usage": {
        "input_tokens": 27,
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 0,
        "cache_creation": {
          "ephemeral_5m_input_tokens": 0,
          "ephemeral_1h_input_tokens": 0
        },
        "output_tokens": 35,
        "service_tier": "standard",
        "inference_geo": "not_available"
      },
      "meta": {
        "usage": {
          "credits_used": 1576,
          "usd_spent": 0.000788
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-r1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    effortstring · enumOptional

    Reasoning effort setting

    Possible values:
    max_tokensinteger · min: 1Optional

    Max tokens of reasoning content. Cannot be used simultaneously with effort.

    excludebooleanOptional

    Whether to exclude reasoning from the response

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    search_context_sizestring · enumOptional

    High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.

    Possible values:
    citystringOptional

    Free text input for the city of the user, e.g. San Francisco.

    countrystringOptional

    The two-letter ISO country code of the user, e.g. US.

    Pattern: ^[A-Z]{2}$
    regionstringOptional

    Free text input for the region of the user, e.g. California.

    timezonestringOptional

    The IANA timezone of the user, e.g. America/Los_Angeles.

    typestring · enumRequired

    The type of location approximation. Always approximate.

    Possible values:
    search_modestring · enumOptional

    Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals.

    Default: academicPossible values:
    search_domain_filterstring[]Optional

    A list of domains to limit search results to. Currently limited to 10 domains for Allowlisting and Denylisting. For Denylisting, add a - at the beginning of the domain string.

    return_imagesbooleanOptional

    Determines whether search results should include images.

    Default: false
    return_related_questionsbooleanOptional

    Determines whether related questions should be returned.

    Default: false
    search_recency_filterstring · enumOptional

    Filters search results based on time (e.g., 'week', 'day').

    Possible values:
    search_after_date_filterstringOptional

    Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    search_before_date_filterstringOptional

    Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_after_filterstringOptional

    Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_before_filterstringOptional

    Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-v4-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-r1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-r1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-v4-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-v4-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-5-0-thinking-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-vl-28b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4-5
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: cohere/command-a
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4-5-8k-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "cohere/command-a",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "cohere/command-a",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-8k-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-8k-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-5-0-thinking-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-5-0-thinking-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-vl-28b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-vl-28b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlany ofRequired

    Either a URL of the image or the base64 encoded image data.

    string · uriOptional
    or
    stringOptional
    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    or
    any · nullableOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    normalize_tool_schemasbooleanOptional

    Enable provider compatibility normalization for tool function JSON schemas.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    effortstring · enumOptional

    Reasoning effort setting

    Possible values:
    max_tokensinteger · min: 1Optional

    Max tokens of reasoning content. Cannot be used simultaneously with effort.

    excludebooleanOptional

    Whether to exclude reasoning from the response

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    search_context_sizestring · enumOptional

    High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.

    Possible values:
    citystringOptional

    Free text input for the city of the user, e.g. San Francisco.

    countrystringOptional

    The two-letter ISO country code of the user, e.g. US.

    Pattern: ^[A-Z]{2}$
    regionstringOptional

    Free text input for the region of the user, e.g. California.

    timezonestringOptional

    The IANA timezone of the user, e.g. America/Los_Angeles.

    typestring · enumRequired

    The type of location approximation. Always approximate.

    Possible values:
    search_modestring · enumOptional

    Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals.

    Default: academicPossible values:
    search_domain_filterstring[]Optional

    A list of domains to limit search results to. Currently limited to 10 domains for Allowlisting and Denylisting. For Denylisting, add a - at the beginning of the domain string.

    return_imagesbooleanOptional

    Determines whether search results should include images.

    Default: false
    return_related_questionsbooleanOptional

    Determines whether related questions should be returned.

    Default: false
    search_recency_filterstring · enumOptional

    Filters search results based on time (e.g., 'week', 'day').

    Possible values:
    search_after_date_filterstringOptional

    Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    search_before_date_filterstringOptional

    Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_after_filterstringOptional

    Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_before_filterstringOptional

    Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: gpt-4o-2024-08-06
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    POST /v1/chat/completions HTTP/1.1
    Host: api.aimlapi.com
    Content-Type: application/json
    Accept: */*
    Content-Length: 1232
    
    {
      "model": "alibaba/qwen3.5-flash",
      "messages": [
        {
          "role": "user",
          "content": "text",
          "name": "text"
        }
      ],
      "max_tokens": 1,
      "stream": false,
      "stream_options": {
        "include_usage": true
      },
      "tools": [
        {
          "type": "function",
          "function": {
            "description": "text",
            "name": "text",
            "parameters": {
              "ANY_ADDITIONAL_PROPERTY": null
            },
            "strict": true
          }
        }
      ],
      "tool_choice": "none",
      "normalize_tool_schemas": true,
      "parallel_tool_calls": true,
      "temperature": 1,
      "top_p": 1,
      "stop": "text",
      "logit_bias": {
        "ANY_ADDITIONAL_PROPERTY": 1
      },
      "logprobs": true,
      "top_logprobs": 1,
      "frequency_penalty": 1,
      "prediction": {
        "type": "content",
        "content": "text"
      },
      "seed": 1,
      "presence_penalty": 1,
      "reasoning_effort": "low",
      "reasoning": {
        "effort": "low",
        "max_tokens": 1,
        "exclude": true
      },
      "response_format": {
        "type": "text"
      },
      "echo": true,
      "min_p": 1,
      "top_k": 1,
      "top_a": 1,
      "repetition_penalty": 1,
      "web_search_options": {
        "search_context_size": "low",
        "user_location": {
          "approximate": {
            "city": "text",
            "country": "text",
            "region": "text",
            "timezone": "text"
          },
          "type": "approximate"
        }
      },
      "search_mode": "academic",
      "search_domain_filter": [
        "text"
      ],
      "return_images": false,
      "return_related_questions": false,
      "search_recency_filter": "day",
      "search_after_date_filter": "text",
      "search_before_date_filter": "text",
      "last_updated_after_filter": "text",
      "last_updated_before_filter": "text"
    }
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-reasoner-v3.1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-non-reasoner-v3.1-terminus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4-6
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-coder-480b-a35b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.5-flash-lite-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4.5
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-coder-480b-a35b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-coder-480b-a35b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-flash-lite-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4.5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "gpt-4o-2024-08-06",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-reasoner-v3.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-reasoner-v3.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-6",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-6",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-0.3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-lite
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentany ofRequired

    The contents of the tool message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    any · nullableOptional
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool to call.

    inputstringRequired

    The input for the custom tool call generated by the model.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the custom tool, used to identify it in tool calls.

    descriptionstringOptional

    Optional description of the custom tool, used to provide more context.

    formatone ofRequired

    The input format for the custom tool. Default is unconstrained text.

    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    definitionstringRequired

    The grammar definition.

    syntaxstring · enumRequired

    The syntax of the grammar definition.

    Possible values:
    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    modestring · enumRequired

    Constrains the tools available to the model to a pre-defined set.

    • auto allows the model to pick from among the allowed tools and generate a message.
    • required requires the model to call one or more of the allowed tools.
    Possible values:
    Other propertiesany · nullableOptional
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    functionstring · enumRequired

    The name of the function to call.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    customstring · enumRequired

    The name of the custom tool to call.

    Possible values:
    normalize_tool_schemasbooleanOptional

    Enable provider compatibility normalization for tool function JSON schemas.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    effortstring · enumOptional

    Reasoning effort setting

    Possible values:
    max_tokensinteger · min: 1Optional

    Max tokens of reasoning content. Cannot be used simultaneously with effort.

    excludebooleanOptional

    Whether to exclude reasoning from the response

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.7-max
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-omni-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4-5-turbo-128k
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/seed-1-8
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-turbo-128k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-turbo-128k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/seed-1-8",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/seed-1-8",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-0.3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-0.3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-lite",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-lite",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.7-max",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.7-max",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-omni-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-omni-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-vl-32b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-chat-v3.1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    post
    Body
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-max
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-1-pro-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Other propertiesnumber · min: -100 · max: 100Optional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: meta-llama/Llama-3.3-70B-Instruct-Turbo
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-max",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-max",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-1-pro-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-1-pro-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-vl-32b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-vl-32b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-chat-v3.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-chat-v3.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-max-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-plus-20260218
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-turbo
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-max-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-max-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-plus-20260218",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-plus-20260218",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-max-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-reasoner-v3.1-terminus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-reasoner-v3.1-terminus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-reasoner-v3.1-terminus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-max-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-max-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.0-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-vl-424b-a47b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.0-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.0-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-vl-424b-a47b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-vl-424b-a47b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-vl-32b-thinking
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4.1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-vl-32b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-vl-32b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.5-pro
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-haiku-4.5
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-haiku-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-haiku-4.5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-pro",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-pro",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-omni-plus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-flash-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-flash-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-flash-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-omni-plus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-omni-plus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }