import os
import requests


def main():
    url = "https://api.aimlapi.com/v1/tts"
    headers = {
        # Insert your AI/ML API key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    }
    payload = {
        "model": "elevenlabs/eleven_multilingual_v2",
        "text": '''
        Cities of the future promise to radically transform how people live, work, and move.
        Instead of sprawling layouts, we’ll see vertical structures that integrate residential, work, and public spaces into single, self-sustaining ecosystems.
        Architecture will adapt to climate conditions, and buildings will be energy-efficient—generating power through solar panels, wind turbines, and even foot traffic.
        ''',
        "voice": "Alice"
    }

    response = requests.post(url, headers=headers, json=payload, stream=True)
    # Stop on an HTTP error so that an error body is not saved as audio:
    response.raise_for_status()

    # result = os.path.join(os.path.dirname(__file__), "audio.wav")  # if you run this code as a .py file
    result = "audio.wav"  # if you run this code in Jupyter Notebook

    # Write the streamed response to disk chunk by chunk:
    with open(result, "wb") as write_stream:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                write_stream.write(chunk)
    print("Audio saved to:", result)

main()

const https = require("https");
const fs = require("fs");

// Insert your AI/ML API key instead of <YOUR_AIMLAPI_KEY>:
const apiKey = "<YOUR_AIMLAPI_KEY>";

const data = JSON.stringify({
  model: "elevenlabs/eleven_multilingual_v2",
  text: `
  Cities of the future promise to radically transform how people live, work, and move.
  Instead of sprawling layouts, we’ll see vertical structures that integrate residential, work, and public spaces into single, self-sustaining ecosystems.
  Architecture will adapt to climate conditions, and buildings will be energy-efficient—generating power through solar panels, wind turbines, and even foot traffic.
  `,
  voice: "Giovanni",
});

const options = {
  hostname: "api.aimlapi.com",
  path: "/v1/tts",
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(data),
  }
};

const req = https.request(options, (res) => {
  // On an HTTP error, collect the error body and report it instead of saving audio:
  if (res.statusCode >= 400) {
    let error = "";
    res.on("data", chunk => error += chunk);
    res.on("end", () => {
      console.error(`Error ${res.statusCode}:`, error);
    });
    return;
  }

  // Stream the audio response straight into a file:
  const file = fs.createWriteStream("audio.wav");
  res.pipe(file);
  file.on("finish", () => {
    file.close();
    console.log("Audio saved to audio.wav");
  });
});

req.on("error", (e) => {
  console.error("Request error:", e);
});

req.write(data);
req.end();

Audio saved to: audio.wav

The text content to be converted to speech.

Name of the voice to be used.
Rachel
Possible values:

This parameter controls text normalization with three modes: 'auto', 'on', and 'off'. When set to 'auto', the system will automatically decide whether to apply text normalization (e.g., spelling out numbers). With 'on', text normalization will always be applied, while with 'off', it will be skipped.
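
To make the three modes concrete, here is a minimal local sketch of what text normalization does to input text. It is not the service's implementation: the function names `spell_number` and `normalize_text` are hypothetical, the sketch only handles integers from 0 to 999, and the rule used for 'auto' (normalize only if the text contains a digit) is an assumption, since the heuristics the system actually uses are not described here.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell_number(n: int) -> str:
    """Spell out an integer in the range 0..999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only supports 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + spell_number(rest) if rest else "")


def normalize_text(text: str, mode: str = "auto") -> str:
    """Apply the sketch normalization according to 'auto', 'on', or 'off'."""
    if mode not in ("auto", "on", "off"):
        raise ValueError("mode must be 'auto', 'on', or 'off'")
    if mode == "off":
        return text  # normalization is skipped
    if mode == "auto" and not re.search(r"\d", text):
        return text  # assumed heuristic: nothing to normalize
    # Replace standalone runs of 1 to 3 digits with their spelled-out form.
    return re.sub(r"\b\d{1,3}\b", lambda m: spell_number(int(m.group())), text)


if __name__ == "__main__":
    print(normalize_text("Room 42 has 3 doors", mode="on"))
    print(normalize_text("Room 42 has 3 doors", mode="off"))
```

A sketch like this can also be useful the other way around: with normalization set to 'off', the text is sent exactly as written, so any spelling-out has to happen on the client side.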

Format of the output content for non-streaming requests. Controls how the generated audio data is encoded in the response.

Determines how stable the voice is and the randomness between each generation. Lower values introduce broader emotional range for the voice. Higher values can result in a monotonous voice with limited emotion.
0.5

This setting boosts the similarity to the original speaker. Using this setting requires a slightly higher computational load, which in turn increases latency.
true

Determines how closely the AI should adhere to the original voice when attempting to replicate it.
0.75

Determines the style exaggeration of the voice. This setting attempts to amplify the style of the original speaker. It does consume additional computational resources and might increase latency if set to anything other than 0.
0

Adjusts the speed of the voice. A value of 1.0 is the default speed, while values less than 1.0 slow down the speech, and values greater than 1.0 speed it up.
1

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.
true

The text that comes after the text of the current request. Can be used to improve the speech's continuity when concatenating together multiple generations or to influence the speech's continuity in the current generation.

The text that came before the text of the current request. Can be used to improve the speech's continuity when concatenating together multiple generations or to influence the speech's continuity in the current generation.
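
The two context parameters above matter most when a long text is split into several requests and the resulting audio is concatenated. The following is a minimal sketch of how such requests could be prepared. The field names `previous_text` and `next_text` are assumptions (the parameter names are not shown in this section), and `split_sentences` and `build_payloads` are hypothetical helpers; check the request schema before relying on them.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split text into sentences after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def build_payloads(text: str, model: str, voice: str) -> list[dict]:
    """Build one request payload per sentence, with neighboring context."""
    sentences = split_sentences(text)
    payloads = []
    for i, sentence in enumerate(sentences):
        payload = {"model": model, "text": sentence, "voice": voice}
        # Assumed field names: pass the neighboring sentences as context
        # so each generation can blend with the ones around it.
        if i > 0:
            payload["previous_text"] = sentences[i - 1]
        if i < len(sentences) - 1:
            payload["next_text"] = sentences[i + 1]
        payloads.append(payload)
    return payloads


if __name__ == "__main__":
    demo = "Cities will change. Buildings will adapt! Will power be local?"
    for p in build_payloads(demo, "elevenlabs/eleven_multilingual_v2", "Alice"):
        print(p)
```

Each payload can then be posted to the endpoint in the same way as in the request example, and the audio pieces joined in order.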

The number of tokens consumed during generation.
120000

The total amount of money spent by the user in USD.
0.06
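
As a worked example of how the two usage values relate, the sketch below divides the sample cost by the sample token count. It only illustrates arithmetic on the sample values shown above (120000 tokens, 0.06 USD); it assumes cost scales linearly with tokens, which this section does not confirm, and `cost_per_token` is a hypothetical helper rather than part of the API.

```python
from decimal import Decimal


def cost_per_token(total_cost_usd: str, tokens_used: int) -> Decimal:
    """Return the average cost of one token in USD, using exact decimals."""
    if tokens_used <= 0:
        raise ValueError("tokens_used must be positive")
    return Decimal(total_cost_usd) / Decimal(tokens_used)


# Sample values from the usage fields above.
sample_rate = cost_per_token("0.06", 120000)
per_million = sample_rate * 1_000_000

if __name__ == "__main__":
    print(f"{sample_rate} USD per token")
    print(f"{per_million:.2f} USD per million tokens")
```

`Decimal` is used instead of `float` so that small currency amounts are not distorted by binary rounding.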