sonar
Model Overview
A model built on top of Llama 3.3 70B and optimized for Perplexity search. Fast and cost-effective for everyday search and Q&A. Ideal for simple queries, topic summaries, and fact-checking.
How to Make a Call
API Schema
Creates a chat completion using a language model, allowing interactive conversation by predicting the next response based on the given chat history. This is useful for AI-driven dialogue systems and virtual assistants.
max_tokens: The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Default: 512.
stream: If set to true, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false.
temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both.
top_p: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
logprobs: Whether to return log probabilities of the output tokens. If true, returns the log probability of each output token in the content of message.
top_logprobs: An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
seed: This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
response_format: An object specifying the format that the model must output.
top_k: Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.
search_mode: Controls the search mode used for the request. When set to "academic", results will prioritize scholarly sources such as peer-reviewed papers and academic journals. A request combining the search parameters is sketched after this list.
search_domain_filter: A list of domains to limit search results to. Currently limited to 10 domains for allowlisting and denylisting. For denylisting, add a - at the beginning of the domain string.
return_images: Determines whether search results should include images. Default: false.
return_related_questions: Determines whether related questions should be returned. Default: false.
search_recency_filter: Filters search results based on time (e.g., "week", "day").
search_after_date_filter: Filters search results to only include content published after this date. Format: %m/%d/%Y (e.g., 3/1/2025).
search_before_date_filter: Filters search results to only include content published before this date. Format: %m/%d/%Y (e.g., 3/1/2025).
last_updated_after_filter: Filters search results to only include content last updated after this date. Format: %m/%d/%Y (e.g., 3/1/2025).
last_updated_before_filter: Filters search results to only include content last updated before this date. Format: %m/%d/%Y (e.g., 3/1/2025).
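The search controls above can be combined in a single request. The following is a minimal sketch restricting a query to academic sources, one allowlisted and one denylisted domain, and a publication-date cutoff; the domains and the question are illustrative placeholders, not values from this page.

import requests

# A minimal sketch combining the search-control parameters described above.
# The domains and the question below are illustrative placeholders.
response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    },
    json={
        "model": "perplexity/sonar",
        "messages": [
            {"role": "user", "content": "Summarize recent findings on sleep and memory."}
        ],
        "search_mode": "academic",                # prefer scholarly sources
        "search_domain_filter": [
            "arxiv.org",                          # allowlist this domain
            "-reddit.com",                        # a leading "-" denylists a domain
        ],
        "search_after_date_filter": "3/1/2025",   # only content published after this date
        "max_tokens": 512,
    },
)
print(response.json())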
POST /v1/chat/completions HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer <YOUR_AIMLAPI_KEY>
Content-Type: application/json
Accept: */*
Content-Length: 812
{
  "model": "perplexity/sonar",
  "messages": [
    {
      "role": "user",
      "content": "text",
      "name": "text"
    }
  ],
  "max_tokens": 512,
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 1,
  "top_p": 1,
  "logprobs": true,
  "top_logprobs": 1,
  "logit_bias": {
    "ANY_ADDITIONAL_PROPERTY": 1
  },
  "frequency_penalty": 1,
  "presence_penalty": 1,
  "seed": 1,
  "response_format": {
    "type": "text"
  },
  "web_search_options": {
    "search_context_size": "low",
    "user_location": {
      "approximate": {
        "city": "text",
        "country": "text",
        "region": "text",
        "timezone": "text"
      },
      "type": "approximate"
    }
  },
  "top_k": 1,
  "search_mode": "academic",
  "search_domain_filter": [
    "text"
  ],
  "return_images": false,
  "return_related_questions": false,
  "search_recency_filter": "text",
  "search_after_date_filter": "text",
  "search_before_date_filter": "text",
  "last_updated_after_filter": "text",
  "last_updated_before_filter": "text"
}
Code Example
import requests
import json  # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    },
    json={
        "model": "perplexity/sonar",
        "messages": [
            {
                "role": "user",
                # Insert your question for the model here, instead of Hello:
                "content": "Hello",
            }
        ],
    },
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
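Setting "stream": true in the request body delivers the response as server-sent events instead of a single JSON object. Below is a minimal sketch of consuming such a stream; it assumes the endpoint emits OpenAI-style "data: {...}" lines terminated by "data: [DONE]", which is not shown on this page.

import json
import requests

# A minimal streaming sketch. It assumes OpenAI-style server-sent events
# ("data: {...}" lines, terminated by "data: [DONE]"); adjust if the
# endpoint's actual stream format differs.
response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    },
    json={
        "model": "perplexity/sonar",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,  # let requests yield the body incrementally
)
for line in response.iter_lines():
    if not line:
        continue
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break
    chunk = json.loads(payload)
    # Each chunk carries an incremental piece of the assistant message.
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)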