USO (Image-to-Image)

This documentation applies to the following model:

  • bytedance/uso

Model Overview

USO (Unified Style-Subject Optimized) is a single model that combines style-based and subject-based image generation.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

API Schema

POST https://api.aimlapi.com/v1/images/generations
Authorizations
Body
model · enum · Required · Possible values: bytedance/uso
image_urls · string · uri[] · min: 1 · max: 3 · Required

An array of up to 3 image URLs. The first image is always treated as the primary input for image-to-image generation, while the remaining images (if provided) serve as visual style references for the output.
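
The ordering rule above can be made explicit in client code. The following is a minimal sketch, not part of the API: the helper name `build_image_urls` and its arguments are hypothetical, and the only facts it relies on are the documented limits (1 to 3 URLs, primary image first).

```python
from typing import Iterable, List

MAX_IMAGE_URLS = 3  # documented upper bound for image_urls


def build_image_urls(primary_url: str, style_urls: Iterable[str] = ()) -> List[str]:
    """Return an image_urls list with the primary input first,
    followed by optional style reference images."""
    if not primary_url:
        raise ValueError("a primary image URL is required")
    urls = [primary_url, *style_urls]
    if len(urls) > MAX_IMAGE_URLS:
        raise ValueError(
            f"image_urls accepts at most {MAX_IMAGE_URLS} URLs, got {len(urls)}"
        )
    return urls
```

The returned list can be passed as the image_urls value of the request body. Keeping the primary image as a separate argument makes it harder to swap it with a style reference by accident.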

image_size · any of · Optional

The size of the generated image.

Default: square_hd
string · enum · Optional
or
negative_prompt · string · Optional

The description of elements to avoid in the generated image.

Default: ""
num_inference_steps · integer · min: 1 · max: 50 · Optional

The number of inference steps to perform.

Default: 28
guidance_scale · number · min: 1 · max: 20 · Optional

The CFG (classifier-free guidance) scale controls how closely the model follows your prompt when generating the image.

Default: 4
keep_size · boolean · Optional
num_images · number · min: 1 · max: 4 · Optional

The number of images to generate.

Default: 1
seed · integer · min: 1 · Optional

The same seed and the same prompt given to the same version of the model will output the same image every time.

sync_mode · boolean · Optional

If set to true, the function will wait for the image to be generated and uploaded before returning the response. This will increase the latency of the function but it allows you to get the image directly in the response without going through the CDN.

Default: false
enable_safety_checker · boolean · Optional

If set to true, the safety checker is enabled.

Default: true
output_format · string · enum · Optional

The format of the generated image.

Default: png
prompt · string · max: 4000 · Required

The text prompt describing the content, style, or composition of the image to be generated.
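
The documented ranges and defaults can be checked on the client before a request is sent. The sketch below is one possible way to do that: the function name `build_uso_payload` is hypothetical, it covers only a subset of the parameters, and rejecting an empty prompt is an assumption (the schema only states that prompt is required and at most 4000 characters).

```python
from typing import Any, Dict, List, Optional


def build_uso_payload(
    prompt: str,
    image_urls: List[str],
    *,
    negative_prompt: str = "",
    num_inference_steps: int = 28,
    guidance_scale: float = 4,
    num_images: int = 1,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate arguments against the documented limits and build a request body."""
    if not prompt or len(prompt) > 4000:  # non-empty check is an assumption
        raise ValueError("prompt must contain 1 to 4000 characters")
    if not 1 <= len(image_urls) <= 3:
        raise ValueError("image_urls must contain 1 to 3 URLs")
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("num_inference_steps must be between 1 and 50")
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 1 and 20")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")
    if seed is not None and seed < 1:
        raise ValueError("seed must be at least 1")

    payload: Dict[str, Any] = {
        "model": "bytedance/uso",
        "prompt": prompt,
        "image_urls": list(image_urls),
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "num_images": num_images,
    }
    if seed is not None:
        payload["seed"] = seed  # only sent when the caller wants a fixed seed
    return payload
```

The resulting dictionary can be passed as the JSON body of the POST request. Failing early on an out-of-range value avoids spending a round trip on a request the API would presumably reject.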

Responses
201

Successfully generated image

application/json
async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/images/generations', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'bytedance/uso',
      prompt: "Add a crown to the T-rex's head.",
      image_urls: ['https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/t-rex.png'],
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();
201

Successfully generated image

{
  "data": [
    {
      "url": "text",
      "b64_json": "text"
    }
  ]
}
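
Each entry of the data array in the schema above carries a url field and a b64_json field. The sketch below shows one way to read such an entry. The helper name `extract_image` is hypothetical, and it assumes that b64_json, when populated, holds plain base64-encoded image bytes; this page does not confirm the exact encoding.

```python
import base64
from typing import Any, Dict, Optional, Tuple


def extract_image(item: Dict[str, Any]) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (image_bytes, url) for one entry of the response's data array.

    image_bytes is decoded from b64_json when that field is present and
    non-empty (assumed to be plain base64); otherwise it is None.
    """
    b64 = item.get("b64_json")
    image_bytes = base64.b64decode(b64) if b64 else None
    return image_bytes, item.get("url")
```

A caller can write image_bytes to a file when it is available and fall back to downloading from url otherwise.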

Quick Example

Let's generate an image from a reference image and a simple prompt. The request does not set image_size, so the output uses the default size (square_hd).

import requests
import json  # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/images/generations",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "bytedance/uso",
        "prompt": "The T-Rex is wearing a business suit, sitting in a cozy small café, drinking from a mug. Blur the background slightly to create a bokeh effect.",
        "image_urls": [
            "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/t-rex.png"
        ],
    },
)

data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
Response
{
  "images": [
    {
      "url": "https://cdn.aimlapi.com/eagle/files/penguin/sMMWnB7wyBK8o_XiAohle.png",
      "content_type": "image/png",
      "file_name": null,
      "file_size": null,
      "width": 1024,
      "height": 1024
    }
  ],
  "seed": 351168504,
  "has_nsfw_concepts": [
    false
  ],
  "prompt": "The T-Rex is wearing a business suit, sitting in a cozy small café, drinking from a mug. Blur the background slightly to create a bokeh effect.",
  "timings": {
    "inference": 10.547778039996047
  },
  "data": [
    {
      "url": "https://cdn.aimlapi.com/eagle/files/penguin/sMMWnB7wyBK8o_XiAohle.png",
      "content_type": "image/png",
      "file_name": null,
      "file_size": null,
      "width": 1024,
      "height": 1024
    }
  ],
  "meta": {
    "usage": {
      "tokens_used": 420000
    }
  }
}
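
The fields of the response above can be pulled into a compact summary. This is a minimal sketch based only on the sample response shown here: the function name `summarize_response` is hypothetical, and reading has_nsfw_concepts as one flag per generated image is an assumption drawn from the field name.

```python
from typing import Any, Dict


def summarize_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the commonly used fields from a generation response."""
    # The sample response lists the same images under both "data" and "images".
    images = data.get("data") or data.get("images") or []
    return {
        "urls": [image.get("url") for image in images],
        "seed": data.get("seed"),
        # Assumption: one boolean per image, true when the image was flagged.
        "flagged": any(data.get("has_nsfw_concepts", [])),
        "inference_seconds": data.get("timings", {}).get("inference"),
        "tokens_used": data.get("meta", {}).get("usage", {}).get("tokens_used"),
    }
```

Using `.get` with defaults keeps the helper from raising if a field is absent, which is useful because this page does not state which response fields are guaranteed.
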
Reference Image: (original)
Generated Image: "The T-Rex is wearing a business suit, sitting in a cozy small café, drinking from a mug. Blur the background slightly to create a bokeh effect."
