This guide uses a more advanced model, GPT-4o, and explains how to use several chat model capabilities:
- streaming mode
- calling tools
- uploading images to the model for analysis
- uploading files to the model for analysis
- web search
If you need help with API keys or environment configuration, go back to the previous step and follow the detailed quickstart guide for the Gemma 3 model.
Making an API Call
The chat model used in this example is more advanced. In addition to regular user messages, it supports the system role in the messages parameter, which can be used to define global instructions that affect the model’s overall behavior, for example:
messages: [
  {
    role: "system",
    content: "You are a travel agent. Be descriptive and helpful.",
  },
  {
    role: "user",
    content: "Tell me about San Francisco",
  },
],
Here’s the complete code you can use right away with cURL, Python, or Node.js. You only need to replace <YOUR_AIMLAPI_KEY> with the AIML API key from your account, provide your behavior instructions in the system prompt, and place your request to the model in the user prompt.
Using Streaming Mode
Streaming lets the model send partial responses as they’re generated instead of waiting for the full output — useful for real‑time feedback.
Full Streaming Response (Raw Events)
This example shows how to consume the streaming response as-is, without abstraction. Each chunk is processed in real time, exposing the full event structure returned by the API.
Use this approach if you need:
- access to all event types
- fine-grained control over parsing
- debugging or logging of raw responses
- support for metadata beyond plain text
Example raw streaming response
Streaming Response Processing (Text Extraction)
This example shows how to process the streaming response to extract only the generated text. Instead of handling all event types, the code filters incoming chunks and prints the content as it arrives. Use this approach if you only need the generated text.
Example processed clean streaming response
Tool calling
GPT‑4o can call functions (tools) you define in the API request to extend its capabilities, such as performing calculations or retrieving structured data.
How it works
Initial request — The model receives the user prompt and the registered tool, and generates a tool_calls object indicating which function it wants to execute.
Extract and run the tool — Parse the arguments from the tool_calls object and execute the function locally.
Send back the result — Return the computed result to the model using the tool role and the content field.
Final response — The model incorporates the tool’s output and generates a complete answer for the user.
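The local half of this round trip (steps 2 and 3) can be sketched as follows. Note that the tool name `get_weather`, its implementation, and the sample `tool_calls` object are all hypothetical illustrations: in a real program, `tool_calls` would come from `choices[0].message.tool_calls` in the model's first response.

```python
import json

# Hypothetical local tool -- stands in for any function you register with the model
def get_weather(city: str) -> dict:
    # In a real program this would call an actual weather service
    return {"city": city, "forecast": "sunny", "temp_c": 21}

# Hand-written sample shaped like the tool_calls object the model returns (step 1)
tool_calls = [
    {
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": '{"city": "San Francisco"}',
        },
    }
]

# Steps 2-3: parse the JSON-encoded arguments, run the function locally,
# and build the tool-role messages to send back to the model
tool_messages = []
for call in tool_calls:
    args = json.loads(call["function"]["arguments"])
    result = get_weather(**args)
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })

print(tool_messages[0]["content"])
# -> {"city": "San Francisco", "forecast": "sunny", "temp_c": 21}
```

Appending these tool-role messages to the conversation and calling the API again produces the final response in step 4.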
Example response
Image upload
GPT‑4o supports vision inputs: you can include an image URL in the messages array, and the model will analyze or describe the image.
Example response
Web search integration
With search‑preview models, you can perform live web search queries in combination with the model to get up‑to‑date results and grounded responses.
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a travel agent. Be descriptive and helpful.",
},
{
"role": "user",
"content": "Tell me about San Francisco"
}
],
"temperature": 0.7,
"max_tokens": 512
}'
systemPrompt = 'You are a travel agent. Be descriptive and helpful.' // instructions
userPrompt = 'Tell me about San Francisco' // your request
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-4o',
messages:[
{
role: 'system',
content: systemPrompt,
},
{
role: 'user',
content: userPrompt
}
],
temperature: 0.7,
max_tokens: 512,
}),
});
const data = await response.json();
const answer = data.choices[0].message.content;
console.log('User:', userPrompt);
console.log('AI:', answer);
}
main();
import requests
import json # for getting a structured output with indentation
system_prompt = "You are a travel agent. Be descriptive and helpful."
user_prompt = "Tell me about San Francisco"
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"gpt-4o",
"messages":[
{
"role":"system",
"content": system_prompt,
},
{
"role":"user",
"content": user_prompt,
}
],
"temperature": 0.7,
"max_tokens": 256,
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"gpt-4o",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
],
"stream": True
}
)
# data = response.json()
print(response.text)
from openai import OpenAI
# Initialize the client
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="YOUR_AIMLAPI_KEY",
base_url="https://api.aimlapi.com/v1"
)
# Create a streaming chat completion
stream = client.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
stream=True
)
# Print raw chunks (similar to response.text in requests)
for chunk in stream:
print(chunk)
import requests
import json
url = "https://api.aimlapi.com/v1/chat/completions"
headers = {
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
}
payload = {
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "Explain quantum computing simply."}
],
"stream": True
}
with requests.post(url, headers=headers, json=payload, stream=True) as r:
# Iterate over the streaming response line by line
for line in r.iter_lines():
if not line:
continue # Skip empty lines
# Decode bytes to string
line = line.decode("utf-8")
# SSE messages start with "data: "
if not line.startswith("data: "):
continue
# Remove the "data: " prefix
data_str = line[len("data: "):]
# "[DONE]" indicates the end of the stream
if data_str.strip() == "[DONE]":
break
try:
# Parse JSON payload
data = json.loads(data_str)
except json.JSONDecodeError:
continue # Skip malformed chunks
# Ensure "choices" exists and is not empty
choices = data.get("choices")
if not choices:
continue
# Extract text delta (OpenAI-style streaming format)
delta = data.get("choices", [{}])[0].get("delta", {})
content = delta.get("content")
# Print text as it arrives
if content:
print(content, end="")
from openai import OpenAI
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
stream = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Explain quantum computing simply."}
],
stream=True
)
# Iterate over streaming chunks
for chunk in stream:
# Ensure choices exist and are not empty
if not chunk.choices:
continue
delta = chunk.choices[0].delta
content = getattr(delta, "content", None)
# Print text as it arrives
if content:
print(content, end="")
Quantum computing is a type of computing that uses principles of quantum mechanics to process information. Unlike classical computers, which use bits to represent data as 0s or 1s, quantum computers use quantum bits or qubits.
Qubits have unique properties that give quantum computers more power in certain tasks:
1. **Superposition**: A qubit can exist in multiple states (i.e., both 0 and 1) simultaneously. This allows quantum computers to process a vast amount of possibilities at once.
2. **Entanglement**: Qubits can be linked together in such a way that the state of one qubit can depend on the state of another, no matter the distance apart. This can lead to more efficient processing and problem-solving.
3. **Quantum Interference**: Quantum algorithms make use of interference, where different quantum states can amplify or cancel each other out, guiding the computation toward the correct answer.
Because of these properties, quantum computers have the potential to solve certain complex problems much faster than classical computers can, potentially revolutionizing fields like cryptography, materials science, and optimization. However, building practical quantum computers is extremely challenging due to issues with qubit stability and error rates.
from openai import OpenAI
import json
# Initialize the client
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
# Prepare the messages with text and image_url
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this scene:"},
{
"type": "image_url",
"image_url": {
"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"
}
}
]
}
]
# Create a chat completion
response = client.chat.completions.create(
model="gpt-4o",
messages=messages
)
# Print full JSON response
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
{
"id": "chatcmpl-DL3DDPif2s79HbOHySq6bVY8SAsKQ",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "The scene is an iconic Renaissance portrait showing a woman with an enigmatic smile, known for its mastery of detail and composition. The woman is seated against a distant, dreamlike landscape featuring winding paths and rocky formations. She wears a dark dress and light veil, with her hands delicately folded. The background's atmospheric perspective creates depth, with bluish mountains fading into the horizon. The artwork evokes a sense of mystery and balance.",
"refusal": null,
"role": "assistant",
"annotations": [],
"audio": null,
"function_call": null,
"tool_calls": null
}
}
],
"created": 1773909607,
"model": "gpt-4o-2024-08-06",
"object": "chat.completion",
"service_tier": "default",
"system_fingerprint": "fp_0a8aa8bfeb",
"usage": {
"completion_tokens": 85,
"prompt_tokens": 776,
"total_tokens": 861,
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 7254
}
}
}
import json
import requests
from typing import Dict, Any
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
API_KEY = "<YOUR_AIMLAPI_KEY>"
BASE_URL = "https://api.aimlapi.com/v1"
HEADERS = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
}
def search_impl(arguments: Dict[str, Any]) -> Any:
return arguments
def chat(messages):
url = f"{BASE_URL}/chat/completions"
payload = {
"model": "gpt-4o-mini-search-preview",
"messages": messages,
"temperature": 0.6,
"tools": [
{
"type": "builtin_function",
"function": {"name": "$web_search"},
}
]
}
response = requests.post(url, headers=HEADERS, json=payload)
response.raise_for_status()
return response.json()["choices"][0]
def main():
messages = [
{"role": "system", "content": "You are GPT with web search skills."},
{"role": "user", "content": "Please search for AGI and tell me what it is in English."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
choice = chat(messages)
finish_reason = choice["finish_reason"]
message = choice["message"]
if finish_reason == "tool_calls":
messages.append(message)
for tool_call in message["tool_calls"]:
tool_call_name = tool_call["function"]["name"]
tool_call_arguments = json.loads(tool_call["function"]["arguments"])
if tool_call_name == "$web_search":
tool_result = search_impl(tool_call_arguments)
else:
tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
messages.append({
"role": "tool",
"tool_call_id": tool_call["id"],
"name": tool_call_name,
"content": json.dumps(tool_result),
})
print(message["content"])
if __name__ == "__main__":
main()
import json
from typing import Dict, Any
from openai import OpenAI
# Insert your API key
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="YOUR_AIMLAPI_KEY",
base_url="https://api.aimlapi.com/v1"
)
def search_impl(arguments: Dict[str, Any]) -> Any:
return arguments
def chat(messages):
response = client.chat.completions.create(
model="gpt-4o-mini-search-preview",
messages=messages,
temperature=0.6,
tools=[
{
"type": "function",
"function": {
"name": "$web_search",
"parameters": {
"type": "object",
"properties": {},
},
},
}
],
)
return response.choices[0]
def main():
messages = [
{"role": "system", "content": "You are GPT with web search skills."},
{"role": "user", "content": "Please search for AGI and tell me what it is in English."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
choice = chat(messages)
finish_reason = choice.finish_reason
message = choice.message
if finish_reason == "tool_calls":
messages.append(message.model_dump())
for tool_call in message.tool_calls:
tool_call_name = tool_call.function.name
tool_call_arguments = json.loads(tool_call.function.arguments)
if tool_call_name == "$web_search":
tool_result = search_impl(tool_call_arguments)
else:
tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"name": tool_call_name,
"content": json.dumps(tool_result),
})
print(message.content)
if __name__ == "__main__":
main()
"AGI" is an acronym that can represent different terms depending on the context:
1. **Adjusted Gross Income**: In the United States, AGI refers to Adjusted Gross Income, which is a taxpayer's total income from all sources minus allowable adjustments. This figure is used to determine taxable income and eligibility for various tax benefits. ([usafacts.org](https://usafacts.org/articles/adjusted-gross-income-agi-definition?utm_source=openai))
2. **Artificial General Intelligence**: In the field of artificial intelligence, AGI stands for Artificial General Intelligence. This concept refers to AI systems that possess the ability to understand, learn, and apply knowledge across a wide range of tasks, matching or surpassing human cognitive abilities. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_general_intelligence?utm_source=openai))
3. **Alliance Graphique Internationale**: AGI also denotes the Alliance Graphique Internationale, an international organization of leading graphic artists and designers. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Alliance_Graphique_Internationale?utm_source=openai))
4. **Agi Language**: Additionally, "Agi" is the name of a Torricelli language spoken in Papua New Guinea. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Agi_language?utm_source=openai))
The specific meaning of "AGI" depends on the context in which it is used.