Quickstart
Access leading AI models (GPT-4o, Gemini, and others) through a single unified API. Initial setup takes just a few minutes. New accounts can make up to 10 free requests per hour.
If you are a manager and simply want to test a model to evaluate its performance, for instance in content generation, the quickest approach is to use our Playground. It offers an intuitive, user-friendly interface—no coding required.
Programmatic API calls are best suited for developers who want to integrate a model into their own apps.
Here, you'll learn how to start using our API in your code. The following steps must be completed regardless of which of our models you plan to call:
Let's walk through an example of connecting to the free-tier Gemma 3 model via REST API. After completing the steps, you will be able to generate text with this model at no cost.
Generating an AIML API Key
What is an API Key?
You can find your AIML API key on the account page.
An AIML API key is a credential that grants you access to our API from your code. It is a sensitive string that is shown only at creation time and should be kept confidential. Do not share this key with anyone, as it could be misused without your knowledge. If you lose it, generate a new key from your dashboard.
⚠️ Note that API keys from third-party organizations cannot be used with our API: you need an AIML API Key.
To use the AIML API, you need to create an account and generate an AIML API key. Follow these steps:
Create an Account: Visit the AI/ML API website and create an account.
Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure that the key is enabled in the UI.

Choosing the Development Environment
Each language has recommended environments for running code samples.
cURL
Python
Jupyter Notebook is a popular online environment for running Python code and is the fastest option if you do not want to install anything locally.
Visual Studio Code (VS Code) is a lightweight and widely used code editor that supports both Python and Node.js. It is suitable for running and debugging local examples and for working on real projects.
JavaScript
Visual Studio Code (VS Code)
In the examples below for cURL, JavaScript and Python, we use the REST API. This approach works with all of our APIs, but it is not the only way to integrate. You can use other supported SDKs.
Making an API Call
Based on your environment, you will call our API differently. Below are three common ways to call it: via cURL (a command-line tool for making HTTP requests rather than a programming language) and via two popular languages, Python and JavaScript (Node.js).
If you want to get started really quickly, choose one of the four expandable sections below. Each one contains instructions for calling our model using different tools and environments. The first two options are especially simple and suitable even for beginners.
For completeness, the same example is explained in detail in the Code Step-by-Step section.
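As a concrete reference, a complete cURL request to the chat endpoint looks roughly like this (a sketch: the prompt, temperature, and max_tokens values are illustrative; replace <YOUR_AIMLAPI_KEY> with your key):

```shell
curl -L -X POST "https://api.aimlapi.com/v1/chat/completions" \
  -H "Authorization: Bearer <YOUR_AIMLAPI_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": "Describe San Francisco in 300-350 words, vividly and engagingly."}],
    "temperature": 0.7,
    "max_tokens": 512
  }'
```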
⭐ How to run a cURL example in a web-based REST client (REQBIN)
Calling the API via cURL through a web service like this is the simplest and fastest method, requiring no additional libraries. However, there is a downside: cURL is not a programming language, which means it has very limited capabilities for adding logic—only API calls, no loops or conditions. You can’t even extract just the specific field with the model’s text response—cURL returns the model’s full output, as you’ll see below.
1. Copy the cURL example above and paste it into a text editor, such as Notepad or Notepad++.
2. Replace the placeholder <YOUR_AIMLAPI_KEY> with your actual AIMLAPI Key.
3. If needed, modify the prompt (the content field).
4. Copy the modified example, go to the REQBIN website, paste it into the designated field and click Run:

5. After the model processes your request, the model’s full output will be shown directly below the input field.
Pro tip: try experimenting with the three different ways of displaying the model’s output. Some are more readable than others.

⭐ How to run a Python example in an online Jupyter Notebook
This is the second-fastest option and a much more convenient one, offering more flexibility in customizing how the output is displayed in code.
1. When you open Jupyter Notebook for the first time, select “Python 3.13 (XPython)” in the pop-up window to indicate the programming language kernel you will be working with:

In some browsers, the kernel selection may look different:

2. Enter the following command in the first cell to install the requests library:
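The command itself is a plain pip install; in a notebook cell, prefix it with `!` (i.e. `!pip install requests`) so Jupyter passes it to the shell:

```shell
pip install requests
```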
Click the Run button in the toolbar above the cell to execute it:

3. Paste our example into the second cell, replace the placeholder with your AIMLAPI Key, then click the Run button in the toolbar:
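If you don't have the example handy, here is a minimal sketch (the endpoint URL and the parameter values are illustrative; the final if/else is a small guard that warns you if the key placeholder has not been replaced):

```python
import requests

API_URL = "https://api.aimlapi.com/v1/chat/completions"
API_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key

user_prompt = "Describe San Francisco in 300-350 words, vividly and engagingly."

# Request body: model id, chat history, and generation parameters
payload = {
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": user_prompt}],
    "temperature": 0.7,
    "max_tokens": 512,
}

if API_KEY != "<YOUR_AIMLAPI_KEY>":
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    data = response.json()
    # Extract only the generated text from the full response
    print(f"USER: {user_prompt}")
    print(f"AI: {data['choices'][0]['message']['content']}")
else:
    print("Replace <YOUR_AIMLAPI_KEY> with your key, then run again.")
```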

4. After the model processes your request, the result will be shown directly below the cell:

How to run a Python example locally from the command line (without an IDE)
Let's start from the very beginning. We assume you have already installed Python (with venv); if not, here is a guide for beginners.
Create a new folder for the test project, name it aimlapi-welcome, and change into it:
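In a terminal (shown for a Unix-like shell; Windows equivalents work the same way):

```shell
mkdir aimlapi-welcome
cd aimlapi-welcome
```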
(Optional) If you use an IDE, we recommend opening the created folder as a workspace. For example, in Visual Studio Code you can do this with `code .`.
Open a terminal inside the created folder and create a virtual environment with the following command:
Activate the created virtual environment:
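For a Unix-like shell (on Windows, the activation script lives in venv\Scripts instead):

```shell
# Create the virtual environment
python3 -m venv venv
# Activate it (on Windows: venv\Scripts\activate)
. venv/bin/activate
```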
Install the required dependencies. Since we call the REST API directly, we only need the requests library:
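With the virtual environment active:

```shell
pip install requests
```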
Create a new file named travel.py, paste the following content into it, and replace <YOUR_AIMLAPI_KEY> with the API key you obtained in the first step:
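A sketch of travel.py (the endpoint URL and the parameter values are illustrative; the final if/else is a small guard so the script warns you instead of failing if the key placeholder was not replaced):

```python
import requests

API_URL = "https://api.aimlapi.com/v1/chat/completions"
API_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key

user_prompt = "Describe San Francisco in 300-350 words, vividly and engagingly."

# Request body: model id, chat history, and generation parameters
payload = {
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": user_prompt}],
    "temperature": 0.7,
    "max_tokens": 512,
}

if API_KEY != "<YOUR_AIMLAPI_KEY>":
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    data = response.json()
    # Extract only the generated text from the full response
    print(f"USER: {user_prompt}")
    print(f"AI: {data['choices'][0]['message']['content']}")
else:
    print("Replace <YOUR_AIMLAPI_KEY> with your key, then run again.")
```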
Run the application with `python travel.py`.
If you did everything correctly, you will see output like the following:
How to run a JavaScript example locally from the command line (without an IDE)
We assume you already have Node.js installed. If not, here is a guide for beginners.
Create a new folder for the example project:
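For example (the folder name is arbitrary; we use aimlapi-welcome-js here):

```shell
mkdir aimlapi-welcome-js
cd aimlapi-welcome-js
```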
Initialize a project file (package.json):
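npm can generate a default package.json for you:

```shell
npm init -y
```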
Create a file for the source code (for example, travel.js), paste the following content into it, and save:
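A sketch of the file contents (assumes Node.js 18+ for the built-in fetch; the endpoint URL and the parameter values are illustrative, and the final check warns you instead of failing if the key placeholder was not replaced):

```javascript
const API_URL = "https://api.aimlapi.com/v1/chat/completions";
const API_KEY = "<YOUR_AIMLAPI_KEY>"; // replace with your real key

const userPrompt =
  "Describe San Francisco in 300-350 words, vividly and engagingly.";

// Request body: model id, chat history, and generation parameters
const payload = {
  model: "google/gemma-3-4b-it",
  messages: [{ role: "user", content: userPrompt }],
  temperature: 0.7,
  max_tokens: 512,
};

async function main() {
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  const data = await response.json();
  // Extract only the generated text from the full response
  console.log(`USER: ${userPrompt}`);
  console.log(`AI: ${data.choices[0].message.content}`);
}

if (API_KEY !== "<YOUR_AIMLAPI_KEY>") {
  main();
} else {
  console.log("Replace <YOUR_AIMLAPI_KEY> with your key, then run again.");
}
```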
Run the file with `node travel.js`.
You will see a response that looks like this:
Code Step-by-Step
Below is a step-by-step explanation of the same API call in three variants: cURL, JavaScript, and Python. All three examples send an identical request to the google/gemma-3-4b-it chat model.
cURL
1. Command start
Runs the cURL HTTP client. The -L flag tells cURL to follow redirects (if any).
2. HTTP method
Specifies that the request uses the POST method.
3. Endpoint
The full endpoint URL used to call chat models.
4. Authorization header
Sends your AIMLAPI key in the Authorization header.
5. Content type
Indicates that the request body is JSON.
6. Request body
This is the payload sent to the API:
- `model` – the model identifier.
- `messages` – the chat history.
  - `role: "user"` – the user message.
  - `content` – the user prompt.
- `temperature` – controls output randomness.
- `max_tokens` – the maximum number of tokens in the response.
These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.
With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words, with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.
7. Response
In the cURL example, you receive the entire JSON response. No fields are extracted — cURL simply prints the raw output.
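Putting the pieces above together, the full command might look like this (a sketch; the prompt and the parameter values are illustrative):

```shell
curl -L -X POST "https://api.aimlapi.com/v1/chat/completions" \
  -H "Authorization: Bearer <YOUR_AIMLAPI_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": "Describe San Francisco in 300-350 words, vividly and engagingly."}],
    "temperature": 0.7,
    "max_tokens": 512
  }'
```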
JavaScript (Node.js)
1. Define the user prompt
Stores the text of the user request.
2. Call the API
Sends an HTTP request to the endpoint.
3. HTTP method
Specifies that the request uses the POST method.
4. Headers
- Sends your AIMLAPI key in the `Authorization` header.
- Indicates that the request body is JSON.
5. Request body
This is the payload sent to the API:
- `model` – the model identifier.
- `messages` – the chat history.
  - `role: "user"` – the user message.
  - `content` – the user prompt.
- `temperature` – controls output randomness.
- `max_tokens` – the maximum number of tokens in the response.
These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.
With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words, with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.
6. Parse the response
Converts the API response into a JavaScript object.
7. Extract the model’s text output
Reads the text of the first generated message.
8. Print the result
Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.
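Assembled from the steps above, the script might look like this (a sketch assuming Node.js 18+ for the built-in fetch; the parameter values are illustrative, and the closing guard warns if the key placeholder is still in place):

```javascript
const API_URL = "https://api.aimlapi.com/v1/chat/completions";
const API_KEY = "<YOUR_AIMLAPI_KEY>"; // replace with your real key

// Step 1: define the user prompt
const userPrompt =
  "Describe San Francisco in 300-350 words, vividly and engagingly.";

// Step 5: the request body sent to the API
const payload = {
  model: "google/gemma-3-4b-it",
  messages: [{ role: "user", content: userPrompt }],
  temperature: 0.7,
  max_tokens: 512,
};

async function main() {
  // Steps 2-4: POST request with authorization and content-type headers
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  // Step 6: parse the response into a JavaScript object
  const data = await response.json();
  // Steps 7-8: extract the generated text and print it in dialogue style
  console.log(`USER: ${userPrompt}`);
  console.log(`AI: ${data.choices[0].message.content}`);
}

if (API_KEY !== "<YOUR_AIMLAPI_KEY>") {
  main();
} else {
  console.log("Replace <YOUR_AIMLAPI_KEY> with your key, then run again.");
}
```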
Python
1. Import the HTTP library
The requests library is used to send HTTP requests.
2. Define the user prompt
Stores the text of the user query.
3. Call the API
Sends a POST request to the endpoint.
4. Headers
- Sends your AIMLAPI key in the `Authorization` header.
- Indicates that the request body is JSON.
5. Request body
This is the payload sent to the API:
- `model` – the model identifier.
- `messages` – the chat history.
  - `role: "user"` – the user message.
  - `content` – the user prompt.
- `temperature` – controls output randomness.
- `max_tokens` – the maximum number of tokens in the response.
These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.
With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words, with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.
6. Parse the response
Converts the JSON response into a Python dictionary.
7. Extract the model’s text output
Reads the text of the first generated message.
8. Print the result
Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.
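Assembled from the eight steps above, the whole script might look like this (a sketch; the parameter values are illustrative, and the closing guard simply warns if the key placeholder is still in place):

```python
# Step 1: import the HTTP library
import requests

API_URL = "https://api.aimlapi.com/v1/chat/completions"
API_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key

# Step 2: define the user prompt
user_prompt = "Describe San Francisco in 300-350 words, vividly and engagingly."

# Step 5: the request body sent to the API
payload = {
    "model": "google/gemma-3-4b-it",
    "messages": [{"role": "user", "content": user_prompt}],
    "temperature": 0.7,
    "max_tokens": 512,
}

if API_KEY != "<YOUR_AIMLAPI_KEY>":
    # Steps 3-4: POST request with authorization and content-type headers
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    # Step 6: parse the JSON response into a Python dictionary
    data = response.json()
    # Steps 7-8: extract the generated text and print it in dialogue style
    print(f"USER: {user_prompt}")
    print(f"AI: {data['choices'][0]['message']['content']}")
else:
    print("Replace <YOUR_AIMLAPI_KEY> with your key, then run again.")
```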
Future Steps
Learn more about supported SDKs.