Structured image generation

Learn how to use Inpainting and Control Net techniques to reimagine a photo, and use ComputeText to describe the resulting images.

[image: original] The original image – Substrate HQ.

[image: edge tokyo]

"The living room boasts a spacious design with a large window allowing natural light and a cozy couch. Unique is the open-concept office area, separated by a partial wall, offering a functional workspace while maintaining unity, enhanced by plants, lamps, and a rug in the modern, minimalist decor."

The four parts of the guide cover:

  1. Inpainting: Use StableDiffusionXLInpaint to generate variations of a photo of a room in different styles.
  2. Control Net – Edge Detection: Use StableDiffusionXLControlNet with the edge method to generate variations structured by the edges of the original image.
  3. Control Net – Depth Detection: Use StableDiffusionXLControlNet with the depth method to generate variations structured by a depth map of the original image.
  4. Describing images: Use ComputeText to describe the generated images.

First, initialize Substrate:

Python

from substrate import (
    Substrate,
    StableDiffusionXLControlNet,
    StableDiffusionXLInpaint,
    ComputeText,
    sb,
)

s = Substrate(api_key=YOUR_API_KEY)

1. Inpainting

Let's try generating variations of the room using StableDiffusionXLInpaint.

  • This node can also inpaint only the masked part of an image when a mask_image_uri is provided, as in the sketch below. Here, we edit the entire image (image-to-image).
  • The strength parameter controls the strength of the generation process over the original image. Higher strength values produce images that are further from the original.
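As a minimal sketch of the masked variant, reusing the sample image and mask URIs from the reference example above (treat them as placeholders):

# Inpaint only the masked region; pixels outside the mask are preserved.
masked = StableDiffusionXLInpaint(
    image_uri="https://media.substrate.run/docs-klimt-park.jpg",
    mask_image_uri="https://media.substrate.run/spiral-logo.jpeg",
    prompt="large tropical colorful bright birds in a jungle, high resolution oil painting",
    strength=0.8,
    num_images=1,
)
res = s.run(masked)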
Python

styles = ["sunlit onsen style tokyo office", "80s disco style berlin office at night"]
images = [
    StableDiffusionXLInpaint(
        image_uri="https://media.substrate.run/office.jpg",
        strength=0.75,
        prompt=style,
        num_images=1,
    )
    for style in styles
]
res = s.run(*images)
[image: inpaint tokyo] sunlit onsen style tokyo office

[image: inpaint berlin] 80s disco style berlin office at night

At this strength value (0.75), the variations preserve some of the original's character, but they're quite different.
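One way to see the effect is to sweep strength while holding the prompt fixed. A sketch (the strength values are arbitrary choices, not recommendations):

# Higher strength values drift further from the original photo.
strengths = [0.4, 0.6, 0.8]
variations = [
    StableDiffusionXLInpaint(
        image_uri="https://media.substrate.run/office.jpg",
        strength=st,
        prompt="sunlit onsen style tokyo office",
        num_images=1,
    )
    for st in strengths
]
res = s.run(*variations)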


InpaintImage is a high-level alternative to StableDiffusionXLInpaint: it edits an image inside a masked region or across the full image, without pinning you to a specific model. You should use high-level nodes if you want your node to automatically update to the latest, best model.
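For example, a minimal sketch of the full-image case with InpaintImage (assuming it is imported from substrate alongside the other nodes, and that mask_image_uri is optional here as it is for StableDiffusionXLInpaint):

from substrate import InpaintImage

# Edit the full image; no mask_image_uri is provided.
room = InpaintImage(
    image_uri="https://media.substrate.run/office.jpg",
    prompt="sunlit onsen style tokyo office",
    store="hosted",
)
res = s.run(room)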

2. Control Net – Edge Detection

Let's try using StableDiffusionXLControlNet with the edge method, which processes the original image with an edge detection algorithm and uses the detected edges to structure generation.

Python

styles = ["sunlit onsen style tokyo office", "80s disco style berlin office at night"]
images = [
    StableDiffusionXLControlNet(
        image_uri="https://media.substrate.run/office.jpg",
        control_method="edge",
        prompt=style,
        num_images=1,
    )
    for style in styles
]
res = s.run(*images)
[image: edge tokyo] sunlit onsen style tokyo office

[image: edge berlin] 80s disco style berlin office at night
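The node also accepts a conditioning_scale parameter (1.0 in the reference example above). Presumably, as with ControlNet generally, lower values weaken the control signal's influence; a sketch under that assumption:

# Hypothetical: a lower conditioning_scale should follow the detected edges more loosely.
loose = StableDiffusionXLControlNet(
    image_uri="https://media.substrate.run/office.jpg",
    control_method="edge",
    conditioning_scale=0.5,
    prompt="sunlit onsen style tokyo office",
    num_images=1,
)
res = s.run(loose)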

3. Control Net – Depth Detection

Let's try using StableDiffusionXLControlNet with the depth method, which processes the original image with a depth estimation algorithm and uses the resulting depth map to structure generation.

Python

styles = ["sunlit onsen style tokyo office", "80s disco style berlin office at night"]
# These nodes aren't run here; they run implicitly in part 4,
# when we run the summaries that depend on them.
images = [
    StableDiffusionXLControlNet(
        image_uri="https://media.substrate.run/office.jpg",
        control_method="depth",
        prompt=style,
        num_images=1,
    )
    for style in styles
]

4. Describing images

We can describe the content of the images using ComputeText, and then summarize the generated descriptions with a second round of ComputeText nodes.

We run the entire pipeline by calling substrate.run with the terminal nodes, summaries; Substrate automatically runs the upstream image and description nodes they depend on.

Python

descriptions = [
    ComputeText(
        prompt="Describe the interesting interior decor touches in this image",
        image_uris=[i.future.outputs[0].image_uri],
    )
    for i in images
]
summaries = [
    ComputeText(
        prompt=sb.concat(
            "Summarize the 2 most interesting details in one sentence, be concise: ",
            d.future.text,
        ),
    )
    for d in descriptions
]
res = s.run(*summaries)
[image: edge tokyo]

"The living room boasts a spacious design with a large window allowing natural light and a cozy couch. Unique is the open-concept office area, separated by a partial wall, offering a functional workspace while maintaining unity, enhanced by plants, lamps, and a rug in the modern, minimalist decor."

[image: edge berlin]

"The image features a contemporary office with a captivating pink and purple color scheme: vibrant pink walls instill energy, while elegant purple furniture adds sophistication."
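Assuming the SDK's response accessor works as in other Substrate examples (res.get(node) returns a node's typed output), the summaries above can be read back like this:

# Read each summary's text from the response.
for summary in summaries:
    print(res.get(summary).text)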
