AI/ML API Documentation
Find Relevant Answers: Semantic Search with Text Embeddings

Last updated 10 hours ago

Idea and Step-by-Step Plan

Today, we are going to use text embeddings to transform a list of phrases into vectors. When a user asks a question, we will convert it into a vector as well and find the phrases from the list that are semantically closest. This approach is useful, for example, for immediately suggesting relevant FAQ sections to the user, reducing the need for full support requests.

So, here's a plan:

  1. Prepare the data: Create a numbered list of text phrases.

  2. Generate embeddings: Use a model to embed each phrase into a vector.

  3. Embed the question: When the user asks something, embed the question text.

  4. Find similar phrases: Calculate the similarity (e.g., cosine similarity) between the question vector and the list vectors. Show the top 1–3 most similar phrases as the answer.
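The four steps above can be sketched in pure Python. The `embed` function below is a hypothetical stand-in (a simple letter-frequency counter) so the sketch runs without an API key; in the walkthrough below it is replaced by a real embedding model call:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for an embedding model: a letter-frequency
    # vector, used only so this sketch runs without an API key.
    vec = np.zeros(26)
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1
    return vec

def top_matches(question: str, phrases: list[str], k: int = 3) -> list[str]:
    # Steps 2-4: embed every phrase, embed the question,
    # then rank the phrases by cosine similarity to the question.
    q = embed(question)
    def cos(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(phrases, key=lambda p: cos(embed(p)), reverse=True)[:k]

print(top_matches("how to train my dog", ["How to train a dog", "Tips for painting"], k=1))
# → ['How to train a dog']
```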

Full Walkthrough

1. Prepare the data

We have compiled the following list of FAQ headings:

"How to grow tomatoes at home",
"Learning about birds",
"Best practices for machine learning models",
"How to train a dog",
"Tips for painting landscapes",
"Learning Python for data analysis",
"Everyday Life of a Cynologist"

2. Generate embeddings

Now each of our headings has a corresponding embedding vector.

3. Embed the question

Similarly, we process the user's query. We save the embedding vector generated by the model into a separate variable.

4. Find similar phrases

We calculate the similarity between the question vector and the list vectors.

There are different metrics and functions you can use for this, such as cosine similarity, dot product, or Euclidean distance.

In this example, we use cosine similarity because it measures the angle between two vectors and is a popular choice for comparing text embeddings, especially when the magnitude of the vectors is less important than their direction.
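Since we go with cosine similarity, it is worth seeing what the metric actually computes. The function below reproduces it with plain NumPy on toy vectors (real embeddings have thousands of dimensions, but the formula is the same):

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity: the dot product of the two vectors divided by
    # the product of their magnitudes; 1.0 means identical direction.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Scaling a vector does not change its direction,
# so the similarity stays 1.0 regardless of magnitude.
print(cosine_sim([1, 2, 3], [2, 4, 6]))  # 1.0
print(cosine_sim([1, 0], [0, 1]))        # 0.0 (orthogonal vectors)
```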

The cosine similarity function we use comes from the scikit-learn library, which needs to be installed separately:

pip install scikit-learn

Full Code Example & Results

In this section, you will find the complete Python code for the described use case, along with an example of the program's output.

Python code
import numpy as np
from openai import OpenAI
from sklearn.metrics.pairwise import cosine_similarity

# Initialize the API client
client = OpenAI(
    base_url="https://api.aimlapi.com/v2",
    api_key="<YOUR_AIMLAPI_KEY>",
)

# Example list of headings
items = [
    "How to grow tomatoes at home",
    "Learning about birds",
    "Best practices for machine learning models",
    "How to train a dog",
    "Tips for painting landscapes",
    "Learning Python for data analysis",
    "Everyday Life of a Cynologist"
]

# Generate embeddings for each phrase in the list
response = client.embeddings.create(
    model="text-embedding-3-large",  # Choose your fighter :)
    input=items
)

item_embeddings = np.array([e.embedding for e in response.data])

# When a user submits a question
query = "How to teach pets new tricks?"

# Generate an embedding for the user's question
query_response = client.embeddings.create(
    model="text-embedding-3-large",
    input=[query]
)
query_embedding = np.array(query_response.data[0].embedding)

# Calculate cosine similarity between the user question and each phrase
similarities = cosine_similarity([query_embedding], item_embeddings)[0]

# Find the indices of the most similar phrases
top_indices = similarities.argsort()[::-1]  # Sort in descending order

print("Query:", query)
print("\nMost similar items:")

for idx in top_indices[:3]:  # Show the top 3 most similar phrases
    print(f"- {items[idx]} (similarity: {similarities[idx]:.3f})")
Response when using a large embedding model
Query: How to teach pets new tricks?

Most similar items:
- How to train a dog (similarity: 0.590)
- Everyday Life of a Cynologist (similarity: 0.281)
- Learning about birds (similarity: 0.255)
Response when using a small embedding model
Query: How to teach pets new tricks?

Most similar items:
- How to train a dog (similarity: 0.534)
- Learning about birds (similarity: 0.322)
- Tips for painting landscapes (similarity: 0.244)

Room for Improvement

Naturally, this is a simplified example. You can develop a more comprehensive implementation by adding features such as:

  • A minimum similarity threshold to filter out irrelevant results,

  • Cached embeddings, for faster lookups without recalculating them on every request,

  • Partial matches or fuzzy search for broader results,

  • Batch processing, to handle multiple user questions at once, and more.
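As an illustration of the first point, a minimum similarity threshold is a one-line filter on top of the scores computed earlier. The scores and the 0.4 cutoff below are hypothetical values for illustration; you would tune the threshold on real queries:

```python
import numpy as np

items = ["How to train a dog", "Tips for painting landscapes", "Learning about birds"]
# Hypothetical similarity scores, standing in for the cosine_similarity output
similarities = np.array([0.590, 0.244, 0.255])

MIN_SIMILARITY = 0.4  # arbitrary cutoff for this sketch; tune it on real data

# Same descending sort as in the full example, plus the threshold filter
top_indices = similarities.argsort()[::-1]
relevant = [(items[i], float(similarities[i])) for i in top_indices
            if similarities[i] >= MIN_SIMILARITY]

for title, score in relevant:
    print(f"- {title} (similarity: {score:.3f})")
# Only "How to train a dog" clears the 0.4 cutoff
```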

Finally, a few notes on the code above.

We saved our headings as a Python list and passed them to the model. We chose text-embedding-3-large: it has been trained on a large dataset and is powerful enough to build complex semantic connections.

Please note that the cosine_similarity function requires the scikit-learn library, installed separately (see the pip command in step 4 above).

Do not forget to replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account on our platform.

The second output above was produced after we switched to the small version of the model, text-embedding-3-small. Apparently, it was trained a bit less thoroughly and doesn't recognize who cynologists are 🤷. We didn't notice much difference in speed, but the larger version is somewhat more expensive.
