# Google

Google develops multiple families of AI models focused on reasoning, coding, multimodal understanding, and long-context interaction across text, images, audio, and video. As part of the broader Google ecosystem, the company integrates frontier AI capabilities into developer platforms, productivity tools, search systems, and enterprise infrastructure.

The currently supported model families include:

* **Gemini models** — Google’s flagship multimodal and reasoning-oriented models designed for production AI workloads. The Gemini lineup includes:
  * **Flash models** optimized for low latency and cost efficiency.
  * **Pro models** intended for balanced general-purpose production use.
  * Advanced reasoning-capable variants focused on coding, analytical tasks, tool usage, and agentic workflows.\
    Gemini models place strong emphasis on multimodal processing, long context windows, native tool integration, and real-time interaction capabilities.
* **Gemma models** — lightweight open-weight models designed for efficient deployment, customization, and research workflows. Gemma models are optimized for smaller-scale inference environments while still supporting strong reasoning, coding, and conversational capabilities across a wide range of applications.

All Google models in this developer are accessed through the standard `/v1/chat/completions` endpoint, providing a unified OpenAI-compatible integration layer across the entire model catalog.

***

Supported capabilities vary depending on the specific model, with different models offering different combinations of the features listed below.

* **Text completions**: Build conversational systems and advanced text-processing pipelines.
* **Function Calling**: Utilize tools, APIs, and structured workflows.
* **Stream mode**: Receive partial responses incrementally as tokens are generated.
* **Batch Processing**: Execute multiple independent requests within a single API call.
* **Vision Tasks**: Process and analyze images and visual inputs.
* **Audio Tasks**: Transcribe, generate, and process audio content.
* **Video Tasks**: Analyze and reason over video inputs.

***

Other model categories from this provider are available as well.

* [Image](/api-references/image-models/google.md)&#x20;
* [Video](/api-references/video-models/google.md)&#x20;
* [Music](/api-references/vision-models/ocr-optical-character-recognition/google.md)&#x20;
* [Vision(OCR)](/api-references/music-models/google.md)&#x20;
* [Embedding](/api-references/embedding-models/google.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.aimlapi.com/api-references/text-models-llm/google.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
