# Nova 3 Medical

{% columns %}
{% column width="66.66666666666666%" %}
{% hint style="info" %}
This documentation is valid for the following list of our models:

* `nova-3-medical`
  {% endhint %}
  {% endcolumn %}

{% column width="33.33333333333334%" %} <a href="https://aimlapi.com/app/deepgram/nova-3-medical" class="button primary">Try in Playground</a>
{% endcolumn %}
{% endcolumns %}

Nova-3 Medical is a speech-to-text model from Deepgram, fine-tuned for clinical terminology, healthcare audio, and medical transcription workflows. It is designed to recognize drug names, diagnoses, procedures, and medical jargon in both real-time and batch transcription.

{% hint style="success" %}
This model uses per-minute billing. The cost of audio transcription is based on the duration of the input audio file.
{% endhint %}
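
Because billing is per minute of input audio, a rough cost estimate is duration multiplied by rate. The sketch below assumes simple proportional billing with no rounding and no minimum charge, which this page does not confirm. The helper name `estimate_cost` and the rate used in the example are hypothetical placeholders, so take the real rate from the pricing page.

```python
def estimate_cost(duration_seconds: float, rate_per_minute: float) -> float:
    """Estimate transcription cost from audio duration.

    Assumes cost scales linearly with duration (no rounding, no minimum),
    which is an illustrative assumption rather than documented behavior.
    """
    if duration_seconds < 0 or rate_per_minute < 0:
        raise ValueError("duration and rate must be non-negative")
    return duration_seconds / 60.0 * rate_per_minute


# 11.4 seconds of audio at a made-up rate of 0.01 per minute (placeholder value)
print(round(estimate_cost(11.4, 0.01), 6))
```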

## Set Up Your API Key

If you don't have an API key for the AI/ML API yet, follow our [Quickstart guide](https://docs.aimlapi.com/quickstart/setting-up) to create one.
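
Rather than hard-coding the key in a script, one common pattern is to read it from an environment variable. The sketch below assumes a variable named `AIMLAPI_KEY` (a hypothetical name chosen for illustration) and builds the `Authorization` header in the Bearer format that the example on this page uses. The helper `build_auth_headers` is not part of any SDK.

```python
import os


def build_auth_headers(env_var: str = "AIMLAPI_KEY") -> dict:
    """Return request headers carrying the API key as a Bearer token.

    The environment variable name is an assumption for illustration only.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your AI/ML API key")
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the server.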

## API Schemas

## POST /v1/stt/create

```json
{"openapi":"3.0.0","info":{"title":"AIML API","version":"1.0.0"},"servers":[{"url":"https://api.aimlapi.com"}],"paths":{"/v1/stt/create":{"post":{"operationId":"_v1_stt_create","requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","properties":{"model":{"type":"string","enum":["nova-3-medical"]},"url":{"type":"string","format":"uri","description":"URL of the input audio file. Provide either url or audio; exactly one is required, not both."},"audio":{"type":"string","format":"binary","description":"The audio file to transcribe. Provide either url or audio; exactly one is required, not both."},"language":{"type":"string","description":"The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available."},"punctuate":{"type":"boolean","nullable":true,"description":"Adds punctuation and capitalization to the transcript."},"smart_format":{"type":"boolean","nullable":true,"description":"Applies formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability."},"diarize":{"type":"boolean","nullable":true,"description":"Recognizes speaker changes. Each word in the transcript will be assigned a speaker number starting at 0."},"multichannel":{"type":"boolean","nullable":true,"description":"Enable Multichannel transcription, can be true or false."},"paragraphs":{"type":"boolean","nullable":true,"description":"Splits audio into paragraphs to improve transcript readability."},"utterances":{"type":"boolean","nullable":true,"description":"Segments speech into meaningful semantic units."},"utt_split":{"type":"number","description":"Seconds to wait before detecting a pause between words in submitted audio."},"detect_language":{"anyOf":[{"type":"array","items":{"type":"string"}},{"type":"boolean","nullable":true}],"description":"Enables language detection to identify the dominant language spoken in the submitted audio."},"detect_entities":{"type":"boolean","nullable":true,"description":"When Entity Detection is enabled, the Punctuation feature will be enabled by default."},"intents":{"type":"boolean","nullable":true,"description":"Recognizes speaker intent throughout a transcript or text."},"sentiment":{"type":"boolean","nullable":true,"description":"Recognizes the sentiment throughout a transcript or text."},"topics":{"type":"boolean","nullable":true,"description":"Detects topics throughout a transcript or text."},"summarize":{"anyOf":[{"type":"string"},{"type":"boolean"}],"description":"Summarizes content. For Listen API, supports string version option. For Read API, accepts boolean only."},"custom_intent":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"A custom intent you want the model to detect within your input audio if present. Submit up to 100."},"custom_intent_mode":{"type":"string","enum":["strict","extended"],"description":"Sets how the model will interpret strings submitted to the custom_intent param. When strict, the model will only return intents submitted using the custom_intent param. When extended, the model will return its own detected intents in addition to those submitted."},"custom_topic":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"A custom topic you want the model to detect within your input audio if present. Submit up to 100."},"custom_topic_mode":{"type":"string","enum":["strict","extended"],"description":"Sets how the model will interpret strings submitted to the custom_topic param. When strict, the model will only return topics submitted using the custom_topic param. When extended, the model will return its own detected topics in addition to those submitted."},"keyterm":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Keyterm Prompting (Nova-3 Medical only): boost recognition of specialized and clinical terms, drug names, and medical proper nouns. Pass a single term or an array of terms."},"filler_words":{"type":"boolean","nullable":true,"description":"Filler Words can help transcribe interruptions in your audio, like \"uh\" and \"um\"."},"profanity_filter":{"type":"boolean","nullable":true,"description":"Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely."},"redact":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Redact sensitive information from the transcript. Common values: pii, pci, numbers."},"replace":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Search for terms and replace them in the transcript. Provide \"find:replace\" pairs."},"search":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Search for terms or phrases in submitted audio."},"dictation":{"type":"boolean","nullable":true,"description":"Identifies and extracts key entities from content in submitted audio."},"measurements":{"type":"boolean","nullable":true,"description":"Spoken measurements will be converted to their corresponding abbreviations."},"numerals":{"type":"boolean","nullable":true,"description":"Numerals converts numbers from written format to numerical format."},"encoding":{"type":"string","enum":["linear16","flac","mulaw","amr-nb","amr-wb","opus","speex","g729"],"description":"Expected encoding of the submitted audio (used mainly for raw audio)."},"tag":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Labels your requests for the purpose of identification during usage reporting."},"extra":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}],"description":"Arbitrary key-value pairs that are attached to the API response for usage in downstream processing."},"mip_opt_out":{"type":"boolean","nullable":true,"description":"Opt out of the Deepgram Model Improvement Program for this request."},"version":{"type":"string","description":"Model version to use (e.g. \"latest\" or a specific version string). Defaults to latest."}},"required":["model"],"title":"nova-3-medical"}}}},"responses":{"200":{"content":{"application/json":{"schema":{"type":"object","properties":{"generation_id":{"type":"string","description":"The unique identifier of the created transcription task. Use this ID to retrieve the result via GET /v1/stt/{generation_id}."}},"required":["generation_id"]}}}}}}}}}
```
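
To make the schema more concrete, here is a minimal sketch of assembling a request body for the URL case with a few optional fields. The helper `build_stt_payload` and the `BOOLEAN_OPTIONS` set are hypothetical names introduced for illustration, and the key terms in the usage lines are arbitrary sample values. The field names themselves come from the schema above. The helper covers only `url` input, since this page does not show how the binary `audio` field is transmitted.

```python
import json
from typing import Optional, Sequence, Union

# Boolean fields copied from the schema above (illustrative subset check, not an official list)
BOOLEAN_OPTIONS = {
    "punctuate", "smart_format", "diarize", "multichannel", "paragraphs",
    "utterances", "detect_entities", "intents", "sentiment", "topics",
    "filler_words", "profanity_filter", "dictation", "measurements",
    "numerals", "mip_opt_out",
}


def build_stt_payload(
    url: str,
    keyterm: Optional[Union[str, Sequence[str]]] = None,
    language: Optional[str] = None,
    **flags: bool,
) -> dict:
    """Build a JSON-serializable body for POST /v1/stt/create (URL input only)."""
    if not url:
        raise ValueError("url is required by this helper")
    payload = {"model": "nova-3-medical", "url": url}
    if language is not None:
        payload["language"] = language
    if keyterm is not None:
        # The schema accepts either a single string or an array of strings
        payload["keyterm"] = keyterm if isinstance(keyterm, str) else list(keyterm)
    for name, value in flags.items():
        if name not in BOOLEAN_OPTIONS:
            raise ValueError(f"Unknown boolean option: {name}")
        payload[name] = bool(value)
    return payload


# Sample usage with arbitrary key terms
payload = build_stt_payload(
    "https://audio-samples.github.io/samples/mp3/blizzard_primed/sample-0.mp3",
    keyterm=["metoprolol", "atrial fibrillation"],
    smart_format=True,
    diarize=True,
)
print(json.dumps(payload, indent=2))
```

Rejecting unknown option names locally catches typos such as `smart_formt` before a request is sent.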

## Quick Example: Processing a Speech Audio File via URL

Let's transcribe the following audio fragment:

{% embed url="<https://drive.google.com/file/d/1ZN-28NUbK1TXHt6oEPj42zUJCv82e9L4/view?usp=sharing>" %}

{% code overflow="wrap" %}

```python
import time
import requests
import json

base_url = "https://api.aimlapi.com/v1"
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key = "<YOUR_AIMLAPI_KEY>"

# Creating and sending a speech-to-text conversion task to the server
def create_stt():
    url = f"{base_url}/stt/create"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }

    data = {
        "model": "nova-3-medical",
        "url": "https://audio-samples.github.io/samples/mp3/blizzard_primed/sample-0.mp3"
    }

    response = requests.post(url, json=data, headers=headers)

    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data

# Requesting the result of the task from the server using the generation_id
def get_stt(gen_id):
    url = f"{base_url}/stt/{gen_id}"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }
    response = requests.get(url, headers=headers)
    return response.json()

# First, start the generation, then repeatedly request the result from the server every 10 seconds.
def main():
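    stt_response = create_stt()
    # create_stt() returns None when the request fails, so guard before reading the ID
    gen_id = stt_response.get("generation_id") if stt_response else None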

    if gen_id:
        start_time = time.time()
        timeout = 600
        while time.time() - start_time < timeout:
            response_data = get_stt(gen_id)

            if response_data is None:
                print("Error: No response from API")
                return None

            status = response_data.get("status")

            if status in ("waiting", "active"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            elif status == "completed":
                print("Processing complete:\n", response_data["result"]["results"]["channels"][0]["alternatives"][0]["transcript"])

                # Uncomment the line below to print the entire result object with all service data
                # print("Processing complete:\n", json.dumps(response_data["result"], indent=2, ensure_ascii=False))
                return response_data
            else:
                # Any other status is treated as a failure; print the raw payload for debugging
                print(f"Unexpected status {status!r}: {response_data}")
                return None

        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
```

{% endcode %}

<details>

<summary>Response</summary>

{% code overflow="wrap" %}

```json5
{'generation_id': 'c3d4e5f6-a7b8-9012-cdef-123456789012'}
Still waiting... Checking again in 10 seconds.
Processing complete:
{
  "status": "completed",
  "result": {
    "metadata": {
      "request_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
      "created": "2025-06-04T12:00:00.000Z",
      "duration": 11.4,
      "channels": 1,
      "models": ["nova-3-medical"],
      "model_info": {
        "nova-3-medical": {
          "name": "nova-3-medical",
          "version": "2024-12-18.0",
          "arch": "nova-3"
        }
      }
    },
    "results": {
      "channels": [
        {
          "alternatives": [
            {
              "transcript": "He doesn't belong to you, and I don't see how you have anything to do with what is be his power, if he possess only that from this stage to you.",
              "confidence": 0.9882813,
              "words": [
                {
                  "word": "he",
                  "start": 0.32,
                  "end": 0.4,
                  "confidence": 0.9882813,
                  "speaker": 0,
                  "punctuated_word": "He"
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
```

{% endcode %}

</details>
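
The nested response can be unwrapped with a couple of small helpers. The sketch below assumes the structure shown in the sample response (`result.results.channels[].alternatives[].words[]`) and that word entries carry a `speaker` field as they do there. The function names and the trimmed `sample` dictionary are illustrative, not part of the API.

```python
from collections import defaultdict


def extract_transcript(response_data: dict, channel: int = 0) -> str:
    """Return the top alternative's transcript for the given channel."""
    alternatives = response_data["result"]["results"]["channels"][channel]["alternatives"]
    return alternatives[0]["transcript"]


def words_by_speaker(response_data: dict, channel: int = 0) -> dict:
    """Group punctuated words by speaker number.

    Words without a "speaker" field are grouped under the key None.
    """
    alternative = response_data["result"]["results"]["channels"][channel]["alternatives"][0]
    grouped = defaultdict(list)
    for word in alternative.get("words", []):
        text = word.get("punctuated_word", word["word"])
        grouped[word.get("speaker")].append(text)
    return {speaker: " ".join(tokens) for speaker, tokens in grouped.items()}


# Trimmed copy of the sample response, kept only for demonstration
sample = {
    "status": "completed",
    "result": {
        "results": {
            "channels": [
                {
                    "alternatives": [
                        {
                            "transcript": "He doesn't belong to you",
                            "words": [
                                {"word": "he", "start": 0.32, "end": 0.4,
                                 "speaker": 0, "punctuated_word": "He"},
                            ],
                        }
                    ]
                }
            ]
        }
    },
}

print(extract_transcript(sample))
print(words_by_speaker(sample))
```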

