qwen3-tts-flash
Qwen Speech Synthesis offers a range of natural, human-like voices with support for multiple languages and dialects. It can produce multilingual speech in a consistent voice, adapting tone and intonation to deliver smooth, expressive narration even for complex text.
Set Up Your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
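Rather than pasting the key into source code, one option is to read it from an environment variable. The sketch below assumes a variable named `AIMLAPI_API_KEY`, which is a hypothetical name chosen for illustration and not something the API prescribes. The `Bearer` header format matches the code example on this page.

```python
import os


def build_auth_headers(env_var: str = "AIMLAPI_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as `headers=` to `requests.post` in place of the hard-coded value.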
Code Example
import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "alibaba/qwen3-tts-flash",
        "text": "Qwen3 Speech Synthesis offers a range of natural, human-like voices with support for multiple languages and dialects. It can produce multilingual speech in a consistent voice, adapting tone and intonation to deliver smooth, expressive narration even for complex text.",
        "voice": "Cherry"
    }

    try:
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        response_data = response.json()

        audio_url = response_data["audio"]["url"]
        file_name = response_data["audio"]["file_name"]

        audio_response = requests.get(audio_url, stream=True)
        audio_response.raise_for_status()

        # Save with the original file extension from the API
        # dist = os.path.join(os.path.dirname(__file__), file_name)  # if you run this code as a .py file
        dist = "audio.wav"  # if you run this code in Jupyter Notebook

        with open(dist, "wb") as write_stream:
            for chunk in audio_response.iter_content(chunk_size=8192):
                if chunk:
                    write_stream.write(chunk)

        print("Audio saved to:", dist)
        print(f"Duration: {response_data['duration']} seconds")
        print(f"Sample rate: {response_data['sample_rate']} Hz")

    except requests.exceptions.RequestException as e:
        print(f"Error making request: {e}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
API Schema
POST /v1/tts
Authorizations
Bearer token, sent in the Authorization header.
Body
model · enum · Required
The examples on this page use "alibaba/qwen3-tts-flash".
text · string · min: 1 · max: 600 · Required
The text content to be converted to speech.
voice · string · enum · Required
Name of the voice to be used. The examples on this page use "Cherry".
Responses
201 Success
application/json
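The body constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_tts_payload` is a hypothetical helper, and it assumes the 1 to 600 limit on `text` is counted in characters, which the schema does not state explicitly.

```python
MIN_TEXT_LENGTH = 1    # from the schema: text min
MAX_TEXT_LENGTH = 600  # from the schema: text max


def build_tts_payload(text: str, voice: str = "Cherry",
                      model: str = "alibaba/qwen3-tts-flash") -> dict:
    """Return a request body after checking the documented text limits.

    Assumption: the limit is measured in characters.
    """
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if not MIN_TEXT_LENGTH <= len(text) <= MAX_TEXT_LENGTH:
        raise ValueError(
            f"text must be {MIN_TEXT_LENGTH} to {MAX_TEXT_LENGTH} characters, "
            f"got {len(text)}"
        )
    if not voice:
        raise ValueError("voice is required")
    return {"model": model, "text": text, "voice": voice}
```

Longer input would need to be split into several requests; how best to split it is outside what this page documents.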
POST /v1/tts HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 66
{
  "model": "alibaba/qwen3-tts-flash",
  "text": "text",
  "voice": "Cherry"
}
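The `Content-Length: 66` header in the request above corresponds to this body serialized as compact JSON, with no whitespace between tokens. A short sketch using only the standard library shows the arithmetic; a client that serializes with extra spaces would report a different length.

```python
import json

payload = {"model": "alibaba/qwen3-tts-flash", "text": "text", "voice": "Cherry"}

# Compact separators drop the spaces that json.dumps adds by default.
body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
content_length = len(body)

print(content_length)  # prints 66
```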
201 Success
{
  "metadata": {
    "transaction_key": "text",
    "request_id": "text",
    "sha256": "text",
    "created": "2025-10-09T08:54:35.285Z",
    "duration": 1,
    "channels": 1,
    "models": [
      "text"
    ],
    "model_info": {
      "ANY_ADDITIONAL_PROPERTY": {
        "name": "text",
        "version": "text",
        "arch": "text"
      }
    }
  }
}
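The sample response above contains only a `metadata` object, while the code example earlier on this page also reads an `audio` object, so it may be safer to treat every field as optional. The sketch below is one way to do that: `summarize_metadata` is a hypothetical helper, and the field names are taken from the sample response, not from a complete schema.

```python
from datetime import datetime


def summarize_metadata(response_data: dict) -> dict:
    """Pick a few fields out of the response without assuming they exist."""
    metadata = response_data.get("metadata") or {}

    created = None
    created_raw = metadata.get("created")
    if isinstance(created_raw, str):
        # Replace the trailing "Z" so older Python versions can parse it.
        created = datetime.fromisoformat(created_raw.replace("Z", "+00:00"))

    return {
        "request_id": metadata.get("request_id"),
        "created": created,
        "duration": metadata.get("duration"),
        "channels": metadata.get("channels"),
        "models": list(metadata.get("models") or []),
    }
```

Missing keys come back as `None` (or an empty list for `models`) instead of raising `KeyError`.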