AI Search Engine
Overview
The AI Web Search Engine is designed to retrieve real-time information from the internet. It processes user queries and returns relevant data from various online sources, making it useful for tasks that require up-to-date knowledge beyond static datasets. It supports two usage options:
Using six specialized API endpoints, each designed to search for one specific type of information. These endpoints return structured responses, making them better suited for integration into specialized services (e.g., a weather widget). See the API references and examples for each information type on the subpages.
As a general chat completion solution (but with internet search): enter a query in the prompt and receive an internet-sourced answer, similar to asking a question on a search engine through a browser. See the API Schema below, or check how this call is made in the Python example at the bottom of this page.
How to make a call
Check how this call is made in the examples below.
Note that queries can include advanced search syntax:
Search for an exact match: Enter a word or phrase with " before and after it. For example, "tallest building".
Search for a specific site: Enter site: in front of a site or domain. For example, site:youtube.com cat videos.
Exclude words from your search: Enter - in front of a word that you want to leave out. For example, jaguar speed -car.
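These operators can be combined in a single query string. Below is a minimal sketch using the OpenAI SDK and the model name from the examples at the bottom of this page; the specific query is illustrative:

from openai import OpenAI

client = OpenAI(
    base_url='https://api.aimlapi.com',
    api_key='<YOUR_AIMLAPI_KEY>',
)

# Combine the exact-match, site:, and exclusion operators in one query
response = client.chat.completions.create(
    model="bagoodex/bagoodex-search-v1",
    messages=[
        {"role": "user", "content": 'site:youtube.com "cat videos" -compilation'},
    ],
)
print(response.choices[0].message.content)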
You can also personalize the AI Search Engine output by passing the ip parameter. See Example #2 below.
API Schema
Creates a chat completion using a language model, allowing interactive conversation by predicting the next response based on the given chat history. This is useful for AI-driven dialogue systems and virtual assistants.
max_tokens: The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Default: 512.
stream: If set to true, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
logprobs: Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
top_logprobs: An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
n: How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
presence_penalty: Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
response_format: An object specifying the format that the model must output.
seed: This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
stop: Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
top_p: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
echo: If true, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
repetition_penalty: A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
top_k: Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.
min_p: A number between 0 and 1 that can be used as an alternative to top_p and top_k.
add_generation_prompt: If true, the generation prompt will be added to the chat template. This parameter is used by the chat template in the tokenizer config of the model.
add_special_tokens: If true, special tokens (e.g. BOS) will be added to the prompt on top of what is added by the chat template. For most models, the chat template takes care of adding the special tokens, so this should be set to false (as is the default).
chat_template: A Jinja template to use for this conversion. If this is not passed, the model's default chat template will be used instead.
include_stop_str_in_output: Whether to include the stop string in the output. This is only applied when stop or stop_token_ids is set.
guided_json: If specified, the output will follow the JSON schema.
guided_regex: If specified, the output will follow the regex pattern.
guided_choice: If specified, the output will be exactly one of the choices.
guided_grammar: If specified, the output will follow the context-free grammar.
guided_decoding_backend: If specified, will override the server's default guided decoding backend for this specific request. If set, must be either 'outlines' or 'lm-format-enforcer'.
guided_whitespace_pattern: If specified, will override the default whitespace pattern for guided JSON decoding.
ip: The IP address from which the request is executed. Used for location-based personalization (see Example #2 below).
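As an illustration of the guided decoding parameters, here is a hedged sketch that constrains the answer with guided_choice, passed through the OpenAI SDK's extra_body (the query and choice values are illustrative, and this assumes the server accepts guided_choice as documented above):

from openai import OpenAI

client = OpenAI(
    base_url='https://api.aimlapi.com',
    api_key='<YOUR_AIMLAPI_KEY>',
)

# Constrain the output to exactly one of the listed choices
response = client.chat.completions.create(
    model="bagoodex/bagoodex-search-v1",
    messages=[{"role": "user", "content": "Is it currently raining in Stockholm?"}],
    extra_body={'guided_choice': ['yes', 'no']},  # illustrative choices
)
print(response.choices[0].message.content)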
POST /v1/chat/completions HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer <YOUR_AIMLAPI_KEY>
Content-Type: application/json
Accept: */*
Content-Length: 1035
{
"model": "bagoodex/bagoodex-search-v1",
"messages": [
{
"role": "user",
"content": "text",
"name": "text"
}
],
"max_tokens": 512,
"stream": false,
"stream_options": {
"include_usage": true
},
"frequency_penalty": 1,
"logit_bias": {
"ANY_ADDITIONAL_PROPERTY": 1
},
"logprobs": true,
"top_logprobs": 1,
"n": 1,
"presence_penalty": 1,
"response_format": {
"type": "text"
},
"seed": 1,
"stop": "text",
"temperature": 1,
"top_p": 1,
"echo": true,
"repetition_penalty": 1,
"top_k": 1,
"min_p": 1,
"user": "text",
"best_of": 1,
"use_beam_search": true,
"length_penalty": 1,
"early_stopping": true,
"ignore_eos": true,
"min_tokens": 1,
"stop_token_ids": [
1
],
"skip_special_tokens": true,
"spaces_between_special_tokens": null,
"add_generation_prompt": true,
"add_special_tokens": true,
"documents": [
{
"ANY_ADDITIONAL_PROPERTY": "text"
}
],
"chat_template": "text",
"chat_template_kwargs": {
"ANY_ADDITIONAL_PROPERTY": null
},
"include_stop_str_in_output": true,
"guided_json": "text",
"guided_regex": "text",
"guided_choice": [
"text"
],
"guided_grammar": "text",
"guided_decoding_backend": "outlines",
"guided_whitespace_pattern": "text",
"ip": "text"
}
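If you are not using the OpenAI SDK, the same call can be made over plain HTTP. Below is a minimal sketch with Python's requests library, using only the endpoint, headers, and fields shown in the schema above, and assuming the standard chat completion response shape:

import requests

# POST the request body directly to the chat completions endpoint
response = requests.post(
    'https://api.aimlapi.com/v1/chat/completions',
    headers={
        'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'bagoodex/bagoodex-search-v1',
        'messages': [{'role': 'user', 'content': 'how to make a slingshot'}],
        'max_tokens': 512,
    },
)
print(response.json()['choices'][0]['message']['content'])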
Example #1
from openai import OpenAI

# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
API_KEY = '<YOUR_AIMLAPI_KEY>'
API_URL = 'https://api.aimlapi.com'

def complete_chat():
    client = OpenAI(
        base_url=API_URL,
        api_key=API_KEY,
    )

    response = client.chat.completions.create(
        model="bagoodex/bagoodex-search-v1",
        messages=[
            {
                "role": "user",
                # Enter your query here
                "content": 'how to make a slingshot',
            },
        ],
    )
    print(response.choices[0].message.content)

# Run the function
complete_chat()
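The schema above also documents a stream option. Below is a hedged sketch of the same call with streaming enabled, assuming the response is consumed through the OpenAI SDK's standard streaming iterator:

from openai import OpenAI

client = OpenAI(
    base_url='https://api.aimlapi.com',
    api_key='<YOUR_AIMLAPI_KEY>',
)

# Same query as above, but streamed chunk by chunk via server-sent events
stream = client.chat.completions.create(
    model="bagoodex/bagoodex-search-v1",
    messages=[{"role": "user", "content": "how to make a slingshot"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental piece of the answer
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")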
Example #2: Using the IP Parameter for Personalized Model Output
When using a regular search engine in a browser, you can simply ask 'Weather today' without specifying your location. In this case, the search engine automatically uses your IP address to determine your location and provide a more relevant response. The AI Search Engine also supports IP-based personalization.
In the example below, the query does not specify a city, but since the request includes an IP address registered in Stockholm, the system automatically adjusts, and the response will contain today's weather forecast for that city.
Note that when making a request via Python, the ip parameter should be included inside the extra_body parameter (see the example below). When using other languages, this is not required, and the ip parameter can be passed like any other parameter.
from openai import OpenAI

# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
API_KEY = '<YOUR_AIMLAPI_KEY>'
API_URL = 'https://api.aimlapi.com'

# Call the standard chat completion endpoint with an IP-based location hint
def complete_chat():
    client = OpenAI(
        base_url=API_URL,
        api_key=API_KEY,
    )

    response = client.chat.completions.create(
        model="bagoodex/bagoodex-search-v1",
        messages=[
            {
                "role": "user",
                "content": "Weather today",
            },
        ],
        # Insert your IP into this section
        extra_body={
            'ip': '192.44.242.19'  # we used a random IP address from Stockholm
        }
    )
    print(response.choices[0].message.content)
    return response

# Run the function
complete_chat()
Keep in mind that the system caches the IP address for a period of two weeks. This means that after specifying an IP address once, any queries without an explicit location will continue to return responses linked to Stockholm for the next two weeks, even if you don't include the IP address in subsequent requests. If you need to change the location, simply provide a new IP address in your next request.
If an IP address registered in one location is used while explicitly specifying a different location in the query, AI Search Engine will prioritize the location from the query:
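For example, with the Stockholm IP from Example #2 still attached, a query that names another city should be answered for that city (the city below is illustrative):

from openai import OpenAI

client = OpenAI(
    base_url='https://api.aimlapi.com',
    api_key='<YOUR_AIMLAPI_KEY>',
)

# The explicit city in the query takes priority over the Stockholm IP
response = client.chat.completions.create(
    model="bagoodex/bagoodex-search-v1",
    messages=[{"role": "user", "content": "Weather today in Berlin"}],
    extra_body={'ip': '192.44.242.19'},  # same Stockholm IP as in Example #2
)
print(response.choices[0].message.content)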