AI/ML API Documentation
API KeyModelsPlaygroundGitHubGet Support
  • 📞Contact Sales
  • 🗯️Send Feedback
  • Quickstart
    • đź§­Documentation Map
    • Setting Up
    • Supported SDKs
  • API REFERENCES
    • đź“’All Model IDs
    • Text Models (LLM)
      • AI21 Labs
        • jamba-1-5-mini
      • Alibaba Cloud
        • qwen-max
        • qwen-plus
        • qwen-turbo
        • Qwen2-72B-Instruct
        • Qwen2.5-7B-Instruct-Turbo
        • Qwen2.5-72B-Instruct-Turbo
        • Qwen2.5-Coder-32B-Instruct
        • Qwen-QwQ-32B
        • Qwen3-235B-A22B
      • Anthracite
        • magnum-v4
      • Anthropic
        • Claude 3 Haiku
        • Claude 3.5 Haiku
        • Claude 3 Opus
        • Claude 3 Sonnet
        • Claude 3.5 Sonnet
        • Claude 3.7 Sonnet
      • Cohere
        • command-r-plus
      • DeepSeek
        • DeepSeek V3
        • DeepSeek R1
      • Google
        • gemini-1.5-flash
        • gemini-1.5-pro
        • gemini-2.0-flash-exp
        • gemini-2.0-flash-thinking-exp-01-21
        • gemini-2.0-flash
        • gemini-2.5-flash-preview
        • gemini-2.5-pro-exp
        • gemini-2.5-pro-preview
        • gemma-2
        • gemma-3
      • Gryphe
        • MythoMax-L2-13b-Lite
      • Meta
        • Llama-3-chat-hf
        • Llama-3-8B-Instruct-Lite
        • Llama-3.1-8B-Instruct-Turbo
        • Llama-3.1-70B-Instruct-Turbo
        • Llama-3.1-405B-Instruct-Turbo
        • Llama-3.2-11B-Vision-Instruct-Turbo
        • Llama-3.2-90B-Vision-Instruct-Turbo
        • Llama-Vision-Free
        • Llama-3.2-3B-Instruct-Turbo
        • Llama-3.3-70B-Instruct-Turbo
        • Llama-4-scout
        • Llama-4-maverick
      • MiniMax
        • text-01
        • abab6.5s-chat
      • Mistral AI
        • codestral-2501
        • mistral-nemo
        • mistral-tiny
        • Mistral-7B-Instruct
        • Mixtral-8x22B-Instruct
        • Mixtral-8x7B-Instruct
      • NVIDIA
        • Llama-3.1-Nemotron-70B-Instruct-HF
        • llama-3.1-nemotron-70b
      • NeverSleep
        • llama-3.1-lumimaid
      • NousResearch
        • Nous-Hermes-2-Mixtral-8x7B-DPO
      • OpenAI
        • gpt-3.5-turbo
        • gpt-4
        • gpt-4-preview
        • gpt-4-turbo
        • gpt-4o
        • gpt-4o-mini
        • gpt-4o-audio-preview
        • gpt-4o-mini-audio-preview
        • gpt-4o-search-preview
        • gpt-4o-mini-search-preview
        • o1
        • o1-mini
        • o1-preview
        • o3-mini
        • gpt-4.5-preview
        • gpt-4.1
        • gpt-4.1-mini
        • gpt-4.1-nano
        • o4-mini
      • xAI
        • grok-beta
        • grok-3-beta
        • grok-3-mini-beta
    • Image Models
      • Flux
        • flux-pro
        • flux-pro/v1.1
        • flux-pro/v1.1-ultra
        • flux-realism
        • flux/dev
        • flux/dev/image-to-image
        • flux/schnell
      • Google
        • Imagen 3.0
      • OpenAI
        • DALL·E 2
        • DALL·E 3
      • RecraftAI
        • Recraft v3
      • Stability AI
        • Stable Diffusion v3 Medium
        • Stable Diffusion v3.5 Large
    • Video Models
      • Alibaba Cloud
        • Wan 2.1 (Text-to-Video)
      • Google
        • Veo2 (Image-to-Video)
        • Veo2 (Text-to-Video)
      • Kling AI
        • v1-standard/image-to-video
        • v1-standard/text-to-video
        • v1-pro/image-to-video
        • v1-pro/text-to-video
        • v1.6-standard/text-to-video
        • v1.6-standard/image-to-video
        • v1.6-pro/image-to-video
        • v1.6-pro/text-to-video
        • v1.6-standard/effects
        • v1.6-pro/effects
        • v2-master/image-to-video
        • v2-master/text-to-video
      • Luma AI
        • Text-to-Video v2
        • Text-to-Video v1 (legacy)
      • MiniMax
        • video-01
        • video-01-live2d
      • Runway
        • gen3a_turbo
        • gen4_turbo
    • Music Models
      • MiniMax
        • minimax-music [legacy]
        • music-01
      • Stability AI
        • stable-audio
    • Voice/Speech Models
      • Speech-to-Text
        • stt [legacy]
        • Deepgram
          • nova-2
        • OpenAI
          • whisper-base
          • whisper-large
          • whisper-medium
          • whisper-small
          • whisper-tiny
      • Text-to-Speech
        • Deepgram
          • aura
    • Content Moderation Models
      • Meta
        • Llama-Guard-3-11B-Vision-Turbo
        • LlamaGuard-2-8b
        • Meta-Llama-Guard-3-8B
    • 3D-Generating Models
      • Stability AI
        • triposr
    • Vision Models
      • Image Analysis
      • OCR: Optical Character Recognition
        • Google
          • Google OCR
        • Mistral AI
          • mistral-ocr-latest
      • OFR: Optical Feature Recognition
    • Embedding Models
      • Anthropic
        • voyage-2
        • voyage-code-2
        • voyage-finance-2
        • voyage-large-2
        • voyage-large-2-instruct
        • voyage-law-2
        • voyage-multilingual-2
      • BAAI
        • bge-base-en
        • bge-large-en
      • Google
        • textembedding-gecko
        • text-multilingual-embedding-002
      • OpenAI
        • text-embedding-3-large
        • text-embedding-3-small
        • text-embedding-ada-002
      • Together AI
        • m2-bert-80M-retrieval
  • Solutions
    • Bagoodex
      • AI Search Engine
        • Find Links
        • Find Images
        • Find Videos
        • Find the Weather
        • Find a Local Map
        • Get a Knowledge Structure
    • OpenAI
      • Assistants
        • Assistant API
        • Thread API
        • Message API
        • Run and Run Step API
        • Events
  • Use Cases
    • Create Images: Illustrate an Article
    • Animate Images: A Children’s Encyclopedia
    • Create an Assistant to Discuss a Specific Document
    • Create a 3D Model from an Image
    • Create a Looped GIF for a Web Banner
    • Read Text Aloud and Describe Images: Support People with Visual Impairments
    • Summarize Websites with AI-Powered Chrome Extension
  • Capabilities
    • Completion and Chat Completion
    • Streaming Mode
    • Code Generation
    • Thinking / Reasoning
    • Function Calling
    • Vision in Text Models (Image-To-Text)
    • Web Search
    • Features of Anthropic Models
    • Model comparison
  • FAQ
    • Can I use API in Python?
    • Can I use API in NodeJS?
    • What are the Pro Models?
    • How to use the Free Tier?
    • Are my requests cropped?
    • Can I call API in the asynchronous mode?
    • OpenAI SDK doesn't work?
  • Errors and Messages
    • General Info
    • Errors with status code 4xx
    • Errors with status code 5xx
  • Glossary
    • Concepts
  • Integrations
    • đź§©Our Integration List
    • Langflow
    • LiteLLM
Powered by GitBook
On this page
  • Solution Overview
  • Main Entities in Assistants' Workflow
  • How to Use Assistant API
  • Example #1: Chat With Assistant (Without Streaming)
  • Example #2: Chat With Assistant (With Streaming)

Was this helpful?

  1. Solutions
  2. OpenAI

Assistants

PreviousOpenAINextAssistant API

Last updated 11 days ago

Was this helpful?

Solution Overview

OpenAI Assistants provide a powerful business solution for integrating AI-driven interactions into applications and workflows. By defining specific roles, instructions, and tools, businesses can create tailored AI assistants capable of handling customer support, data analysis, content generation, and more. With structured thread-based conversations and API-driven automation, Assistants streamline complex processes, enhance user engagement, and improve operational efficiency. Whether used for internal productivity or customer-facing interactions, they offer a scalable way to incorporate advanced AI capabilities into any business environment.


Main Entities in Assistants' Workflow

Here's a simple explanation of how Messages, Threads, Assistants, Runs, and Tools work in OpenAI's Assistants' workflow:

  • A Message is a single exchange in a conversation. It can be something a user sends to the Assistant (like a question or command) or something the Assistant replies with. Think of it as one text bubble in a chat.

  • A Thread is a collection of Messages that belong to the same conversation. It keeps all the back-and-forth exchanges together, so the Assistant can remember what was said earlier in that conversation. Each time a user starts a new discussion, it creates a new Thread. There is no limit to the number of Messages you can store in a Thread. When the size of the Messages exceeds the model's context window, the Thread will smartly truncate messages before fully dropping the least important ones.

  • An Assistant is the AI itself—its personality, skills, and behavior. When you set up an Assistant, you define what it knows, how it should respond, and whether it can use extra Tools (like APIs or file uploads). One Assistant can be used across multiple Threads and users.

  • A Run is the process that executes the Assistant’s response within a Thread. When a user sends a Message, a Run is created to generate the Assistant’s reply based on the conversation history and its predefined behavior. Runs manage the execution flow, including any tool calls the Assistant might need to make. A Run goes through different statuses—such as queued, in_progress, completed, or failed—indicating its current state. Each Thread can have multiple Runs, ensuring smooth and structured interactions between the user and the Assistant.

  • A Tool is an additional capability that an Assistant can use to enhance its responses. Tools allow the Assistant to perform external actions, such as calling APIs, retrieving files, or running custom functions. (You may have already encountered this concept by using when calling text models without creating Assistants.) When a Tool is enabled, the Assistant can decide when to use it based on the conversation context. If a Tool is required during a Run, the process pauses until the necessary output is provided. This makes Tools essential for handling complex tasks that go beyond simple text-based interactions. Currently, three Tool options are available:

    • Code Interpreter, which can write or debug code for you in different programming languages,

    • File Search, which can analyze the file you provided and is capable of discussing its contents,

    • Function Calling, which can call your custom functions.


How to Use Assistant API

1
from openai import OpenAI
client = OpenAI()

assistant = client.beta.assistants.create(
  name="Math Tutor",
  instructions="You are a personal math tutor. Write and run code to answer math questions.",
  tools=[{"type": "code_interpreter"}],
  model="gpt-4o",
)
2
thread = client.beta.threads.create()
3
message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="I need to solve the equation `3x + 11 = 14`. Can you help me?"
)
4

Note that runs expire ten minutes after creation. Be sure to submit your Tool outputs before the 10-minute mark.

run = client.beta.threads.runs.create_and_poll(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions="Please address the user as Jane Doe. The user has a premium account."
)

Example #1: Chat With Assistant (Without Streaming)

Full Step-by-Step Explanation

Unlike the streaming version, this version waits for the full response before displaying it. Below is a detailed breakdown of its execution.


1. Initialization Phase

Objects Created:

  1. OpenAI Client (client)

    • Connects to OpenAI's API using an AIMLAPI key and base URL.

  2. Assistant (my_assistant)

    • Created using client.beta.assistants.create().

    • Receives predefined instructions ("You are a helpful assistant.").

    • Assigned a model ("gpt-4o").

    • Generates an Assistant ID (assistant_id), which will be used to process Messages.

  3. Thread (thread)

    • Created using client.beta.threads.create().

    • The conversation takes place within this Thread, ensuring context persistence.

    • Generates a Thread ID (thread_id), which links all messages together.


2. Chat Loop Execution

The script enters a while loop, allowing continuous conversation:

  1. User Input (user_input)

    • The user types a Message (stored in user_input).

    • If the user types "exit" or "quit", the loop terminates.

  2. Message Sent to the Assistant (send_message(user_message))

    • If the Message is empty, a warning is displayed.

    • The Message is added to the Thread using:

      client.beta.threads.messages.create(
          thread_id=thread_id,
          role="user",
          content=user_message
      )
  3. Assistant Processes the Message (run = client.beta.threads.runs.create_and_poll())

    • This function:

      • Starts processing the Message.

      • Waits until the Assistant completes its response.

      • Returns the Run status.

  4. Fetching the Assistant’s Response (client.beta.threads.messages.list())

    • If the Run status is "completed", the script retrieves all Messages from the Thread history.

    • The last Assistant Message is extracted and displayed:

      for message in reversed(messages.data):
          if message.role == "assistant":
              print(f"assistant > {message.content[0].text.value}")
              return
  5. Error Handling

    • If something goes wrong, an error message is displayed:

      print("⚠️ Error: Failed to get a response from the assistant.")
  6. Loop Repeats Until User Exits


Summary of Identifiers Passed Between Functions

Identifier
Purpose
Where It's Used

assistant_id

Identifies the Assistant instance

Created during initialization, used in send_message()

thread_id

Links all Messages in the conversation

Created during initialization, used in send_message() and fetching responses

user_message

Stores user input

Passed from the chat loop to send_message()

run.status

Tracks response completion

Checked after create_and_poll()

message.content[0].text.value

Holds the Assistant's response

Extracted from messages.list()


Final Execution Flow Summary

  1. Initialize API client, Assistant, and Thread (IDs stored).

  2. Enter chat loop → user types a Message.

  3. Send Message to Assistant (send_message()).

  4. Wait for Assistant to generate the full response.

  5. Fetch and display the response.

  6. Loop repeats until user types "exit".

🚀 This version provides a more structured, response-driven approach to interacting with the assistant.

import openai
from openai import OpenAI

# Connect to OpenAI API
client = OpenAI(
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/"
)

# Create an Assistant
my_assistant = client.beta.assistants.create(
    instructions="You are a helpful assistant.",
    name="AI Assistant",
    model="gpt-4o",  # Specify the model
)

assistant_id = my_assistant.id  # Store Assistant ID
thread = client.beta.threads.create()  # Create a new Thread
thread_id = thread.id  # Store the Thread ID


def send_message(user_message):
    """Send a message to the Assistant and receive a full response"""
    if not user_message.strip():
        print("⚠️ Message cannot be empty!")
        return

    # Add the user's Message to the thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )

    # Start a new Run and wait for completion
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread_id,
        assistant_id=assistant_id,
        instructions="Keep responses concise and clear."
    )

    # Check if the Run was successful
    if run.status == "completed":
        # Retrieve messages from the thread
        messages = client.beta.threads.messages.list(thread_id=thread_id)
        
        # Find the last Assistant Message
        for message in reversed(messages.data):
            if message.role == "assistant":
                print()  # Add an empty line for spacing
                print(f"Assistant > {message.content[0].text.value}")
                return

    print("⚠️ Error: Failed to get a response from the Assistant.")


# Main chat loop
print("🤖 AI Assistant is ready! Type 'exit' to quit.")
while True:
    user_input = input("\nYou > ")
    if user_input.lower() in ["exit", "quit"]:
        print("đź‘‹ Chat session ended. See you next time!")
        break
    send_message(user_input)
Assistant Interaction Example
🤖 AI Assistant is ready! Type 'exit' to quit.

You > Hi! Could you write me a python function that squares the sum of two numbers?                     

Assistant > Certainly! Here's a simple Python function that squares the sum of two numbers:

```python
def square_of_sum(a, b):
    return (a + b) ** 2

# Example usage:
result = square_of_sum(3, 4)
print(result)  # Output: 49
```

This function takes two arguments, `a` and `b`, calculates their sum, and then returns the square of that sum.
Đ’Ń‹ > Wow! Thanks a lot!

Assistant > You're welcome! If you have any more questions or need further assistance, feel free to ask. 
Happy coding!

You > exit 

Assistant > đź‘‹ Chat session ended. See you next time!

Example #2: Chat With Assistant (With Streaming)

Full Step-by-Step Explanation

1. Initialization Phase

Objects Created:

  1. OpenAI Client (client)

    • Connects to OpenAI's API with an AIMLAPI key and base URL.

  2. Assistant (my_assistant)

    • Created using client.beta.assistants.create().

    • Receives predefined instructions ("You are a helpful assistant.").

    • Assigned a model ("gpt-4o").

    • Generates an Assistant ID (assistant_id), which will be used later for Message processing.

  3. Thread (thread)

    • Created using client.beta.threads.create().

    • Each chat session operates within a single Thread, meaning all Messages in this session will share context.

    • Generates a Thread ID (thread_id), which links Messages together.


2. Handling Events: AssistantEventHandler

To enable streaming responses, the script defines a custom event handler:

Event: on_text_created(self, text)

  • Triggered when the Assistant starts generating a response.

  • Displays assistant > to indicate that a response is coming.

Event: on_text_delta(self, delta, snapshot)

  • Triggered when the Assistant sends a new portion of text.

  • Prints the received delta.value in real time, making the response feel dynamic.

Event: on_tool_call_created(self, tool_call)

  • Triggered when the Assistant calls an external Tool (e.g., a Code Interpreter).

  • Displays assistant > {tool_call.type} to indicate the Tool being used.

Event: on_tool_call_delta(self, delta, snapshot)

  • Triggered when the Assistant sends updates related to Tool execution.

  • If the Assistant runs a Code Interpreter, it prints:

    • The input command being executed.

    • Any output logs produced.


3. Chat Loop Execution

The script enters a while loop, allowing continuous conversation:

  1. User Input (user_input)

    • The user types a Message (stored in user_input).

    • If the user types "exit" or "quit", the loop terminates.

  2. Message Sent to the Assistant (send_message(user_message))

    • If the Message is empty, a warning is displayed.

    • The Message is added to the Thread using:

      client.beta.threads.messages.create(
          thread_id=thread_id,
          role="user",
          content=user_message
      )
    • The Assistant processes the Message via:

      with client.beta.threads.runs.stream(
          thread_id=thread_id,
          assistant_id=assistant_id,
          instructions="Keep responses concise and clear.",
          event_handler=EventHandler(),
      ) as stream:
          stream.until_done()
    • The Thread ID and Assistant ID ensure the assistant maintains context.


4. Streaming Response Processing

  • The Assistant starts generating a response asynchronously.

  • The event handler (EventHandler) captures each update and displays it in real time.

  • The loop repeats until the user exits.


Summary of Identifiers Passed Between Functions

Identifier
Purpose
Where It's Used

assistant_id

Identifies the assistant instance

Created during initialization, used in send_message()

thread_id

Links all messages in the conversation

Created during initialization, used in send_message() and streaming

user_message

Stores user input

Passed from the chat loop to send_message()

delta.value

Holds streamed response text

Displayed by on_text_delta()


Final Execution Flow Summary

  1. Initialize API client, Assistant, and Thread (IDs stored).

  2. Enter chat loop → user types a Message.

  3. Send Message to Assistant (send_message()).

  4. Stream Assistant's response (processed event by event).

  5. Loop repeats until user types "exit".

🚀 This structure ensures a smooth, real-time chat experience with persistent context.

import openai
from openai import OpenAI
from typing_extensions import override
from openai import AssistantEventHandler

# Connect to OpenAI API
client = OpenAI(
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/"
)

# Create an assistant
my_assistant = client.beta.assistants.create(
    instructions="You are a helpful assistant.",
    name="AI Assistant",
    model="gpt-4o",  # Specify the model
)

assistant_id = my_assistant.id  # Store assistant ID
thread = client.beta.threads.create()  # Create a new thread
thread_id = thread.id  # Store the thread ID


# Event handler for streaming responses
class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        print("\nAssistant >", end=" ", flush=True)

    @override
    def on_text_delta(self, delta, snapshot):
        print(delta.value, end="", flush=True)

    def on_tool_call_created(self, tool_call):
        print(f"\nAssistant > {tool_call.type}\n", flush=True)

    def on_tool_call_delta(self, delta, snapshot):
        if delta.type == 'code_interpreter':
            if delta.code_interpreter.input:
                print(delta.code_interpreter.input, end="", flush=True)
            if delta.code_interpreter.outputs:
                print(f"\n\noutput >", flush=True)
                for output in delta.code_interpreter.outputs:
                    if output.type == "logs":
                        print(f"\n{output.logs}", flush=True)


def send_message(user_message):
    """Send a message to the Assistant and display the response"""
    if not user_message.strip():
        print("⚠️ Message cannot be empty!")
        return

    # Add the user's Message to the Thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )

    # Start processing the Message with streaming output
    with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id,
        instructions="Keep responses concise and clear.",
        event_handler=EventHandler(),
    ) as stream:
        stream.until_done()


# Main chat loop
print("🤖 AI Assistant is ready! Type 'exit' to quit.")
while True:
    user_input = input("\nYou > ")
    if user_input.lower() in ["exit", "quit"]:
        print("đź‘‹ Chat session ended. See you next time!")
        break
    send_message(user_input)
Assistant Interaction Example
🤖 AI Assistant is ready! Type 'exit' to quit.

You > Hi! Please write 3 good things about corn flakes for me 

Assistant > Hi! Here are three good things about corn flakes:

1. **Nutritional Value**: Corn flakes are typically fortified with essential vitamins and minerals such as iron, vitamin B, and folic acid, contributing to a balanced diet.

2. **Low in Fat**: Corn flakes are generally low in fat, making them a suitable option for those looking 
to maintain or reduce their fat intake.

3. **Convenience**: They are quick and easy to prepare, offering a convenient breakfast option or snack with minimal preparation.

You > Thank you! And now, after the flakes, it's a good day for Python, isn't it? :)

Assistant > Absolutely! A good bowl of ideas with a sprinkle of Python code can make any day better. Whether it's for data analysis, web development, automation, or just playing around with new concepts, Python is versatile and beginner-friendly, making it a great choice to dive into. Happy coding!

You > Bye! exit

Assistant > Goodbye! If you have more questions in the future, feel free to reach out. Have a great day!

. To set up an Assistant, you need to choose an AI model that will handle chat completion. The selected model determines the Assistant’s capabilities and response quality. Additionally, you should define the Assistant’s role and behavior by providing instructions—this will guide how it interacts with users. Additionally, you can add files to further train the Assistant on specific materials, enhancing its ability to provide more tailored responses. Enabling tools like Code Interpreter, File Search, and Function Calling can further improve its functionality. Once created, the Assistant will be assigned a unique ID, which you’ll use in further interactions. The method returns the ID of the created Assistant, which you can use later to link it with Threads and Runs.

where the interaction between the created Assistant and the user (a human) will take place. This method returns the ID of the created Thread.

either directly in the code or by passing it from a form. This message will be the first in your Thread. You must specify role: 'user' and the corresponding thread_id.

to initiate the Chat Completion process for messages within the specified Thread using the specified Assistant. This is very similar to calling a model without using the Assistants and Threads framework. If external Tool calls might be needed during processing, you must predefine the available Tools in the tools parameter and set tool_choice to 'auto'.

This Python script implements a console-based chat using OpenAI's Assistant API, with gpt-4o as the default model. However, you can replace it with any of your choice.

This Python script implements a console-based chat using OpenAI's Assistant API, with steaming mode and gpt-4o as the default model. However, you can replace it with any of your choice, except for o1, as it does not support streaming. Below is a detailed breakdown of what happens during execution.

Function Calling
OpenAI text model
OpenAI text model
Create the first user message
Create a Thread
Create a Run
Create an Assistant