In the upscaling article, I talked about some of my LoRAs producing too many figures in a piece, and I used that as an excuse to discuss how to make the image bigger. But what about when you don’t want to change the figure’s size and just want to add more background? Well, doesn’t this whole application draw stuff? Can’t we just ask the machine to draw more image around the base image? The answer is yes, we can. It’s not as straightforward as resizing, because we want the new background to fit with the existing background. Adding more image around the main image is known as “Outpainting”, a term coined by the makers of DALL-E and picked up by other image manipulation software. It’s a play on the older term “Inpainting”, where you have the system redraw a portion of the existing image. Since the elements used for both techniques overlap heavily, I’ll go into them both.

The workflow for this article is going to be more complicated than any so far except the multi-combo upscaling test, and this time the disconnected nodes off the bottom will get drawn into the flow later on. To start with, I’ve set up the basic upscaling setup from article 3, but I’ve decided to run the model line from the original checkpoint and its LoRAs. This is mainly to avoid clutter: with multiple LoRAs, the graph starts to take up more space, and possibly memory. I don’t intend to change the checkpoint or LoRAs from the set I’m applying. So what am I using? One designed to add a “Battlepriest” aesthetic which also produces high detail in a digital painting style, and one designed to give armor a black and gold marbled material.
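For readers who would rather see this as a script than a node graph, here’s a minimal sketch of the same idea using the diffusers library in Python. To be clear, this isn’t my workflow (ComfyUI handles all of it through nodes), and the file paths, adapter names, and weights below are placeholders rather than the actual checkpoint and LoRAs I’m using.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical file paths; swap in your own checkpoint and LoRA files.
pipe = StableDiffusionPipeline.from_single_file(
    "checkpoints/base_model.safetensors",
    torch_dtype=torch.float16,
).to("cuda")

# Stack two LoRAs on top of the base model, each with its own strength
# (using multiple adapters like this requires the peft package).
pipe.load_lora_weights("loras/battlepriest_style.safetensors", adapter_name="battlepriest")
pipe.load_lora_weights("loras/black_gold_armor.safetensors", adapter_name="black_gold")
pipe.set_adapters(["battlepriest", "black_gold"], adapter_weights=[0.8, 0.6])
```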

Do these even help the reader?

The prompt for this step is “space marine in desertpunk power armor holding banner, space marine, 1man, solo, red hair, banner:1.5, desertpunk power armor, 1man, solo, loincloth,”. Initially, when I took the picture for the workflow, I’d included the “Full body” keyword, then realized I wanted to try to draw more of the central subject, so I needed him to not be fully in the frame. I do note that neither the original nor the upscaled version is actually holding the banner; it’s just sitting in the background. That makes it a perfect inpainting target for changing some details. So we’re going to outpaint by expanding the smaller image with more background, and inpaint on the big one to try to get the guy to hold the banner instead of just standing in front of it.

Inpainting is simpler, so we’ll start there. So what do we need to do? First off, we have to detach the upscaling section of the workflow. We do this by deleting the latent line from the first sampler to the Upscale Latent node, which removes everything on the right-hand side from the loop. We also need an input that is an existing image file; that’s the “Load Image” node in the lower left. Load Image produces two outputs, “Image” and “Mask”. If all we do is load, the mask output is basically empty. So what are these? The “Image” is simple: it’s whatever was loaded from the file, which we haven’t selected yet. So we’ll load the upscaled image. We now have to create an image mask so the software knows what we want it to redraw. This is also done from the Load Image node: right-clicking on it brings up a menu which includes the option “Open in Mask Editor”.

The screenshot didn’t pick up the mouse pointer, but over the image the cursor shows as a dashed circle around the arrow point. There are four controls on the mask editor screen: “Clear” removes whatever mask you’ve painted; “Thickness” controls the size of the brush used to paint the mask over the image; “Cancel” aborts whatever we’ve done; “Save to Node” applies the mask to the Load Image node so it gets passed along as output. So I’m going to paint the flag and the guy’s left arm, since I want him to be holding a banner.
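As an aside, the mask editor is really just producing a grayscale image the same size as the input, with the painted region marking what gets redrawn. If you wanted to build the same thing outside the GUI, a rough Pillow sketch might look like this; the file name and the coordinates for the banner and arm regions are made up for illustration.

```python
from PIL import Image, ImageDraw

# Load the upscaled image we want to inpaint over (hypothetical file name).
image = Image.open("upscaled_marine.png").convert("RGB")

# The mask is a single-channel image the same size as the input:
# black = keep as-is, white = let the sampler redraw.
mask = Image.new("L", image.size, 0)
draw = ImageDraw.Draw(mask)

# Rough strokes over the banner and the left arm; the coordinates here are
# placeholders, since in the GUI you'd just paint these areas by hand.
draw.rectangle((620, 80, 860, 540), fill=255)   # banner area
draw.ellipse((430, 300, 600, 640), fill=255)    # left arm area

mask.save("banner_arm_mask.png")
```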

Before and after

Now that we’ve got an image and a mask, what do we do with them? We need to get something into the latent channel, so we feed them into a new node, “VAE Encode (for Inpainting)”. There is another VAE Encode node that takes an image and makes a latent, but the inpainting version also takes a mask input and merges the two, making it the ideal node for this step. The only input it takes that doesn’t come off the Load Image node is a VAE line. We’ll use the same VAE Loader that we used for the decode, giving it the old reliable orangemix. Lastly, we’ve got a single output from the encode node: a latent. We connect that to our sampler, replacing the one from the Empty Latent Image node.
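If you squint, this whole chain (image plus mask encoded into a latent, then denoised by the sampler against the prompt) is roughly what a single inpainting pipeline call does in script form. Here’s a hedged sketch with diffusers: the model ID is just a stock inpainting checkpoint for illustration, not the checkpoint, LoRAs, or VAE from my workflow, and I’ve dropped the weighted prompt syntax because plain diffusers doesn’t parse it.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Stock inpainting checkpoint used purely as an example; my actual setup is a
# regular checkpoint plus LoRAs and the orangemix VAE inside ComfyUI.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("upscaled_marine.png").convert("RGB")   # hypothetical file
mask = Image.open("banner_arm_mask.png").convert("L")      # mask from the sketch above

prompt = "space marine in desertpunk power armor holding banner, red hair, banner, loincloth"

result = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask,
    strength=0.75,              # roughly the same knob as ComfyUI's denoise
    num_inference_steps=30,
).images[0]
result.save("inpainted_marine.png")
```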

Before I kick anything off, the workflow now looks like this:

Can anyone even read those?

Let’s kick it off…

It’s… a painting.

That doesn’t look like what I wanted. I did not change either the prompt or the seed, but that big bunker-like rocky thingy and weird armor extensions were not what I was hoping for. Time to start randomizing the seed and trying again. If I don’t see anything good coming out, I’ll have to refine the prompt or fiddle with the denoising setting. I got a bunch of wonky results, so I decided to start by adjusting down the denoise value. This erased my banner outright, but was otherwise sane. So I fiddled with denoise a bit more, eventually getting some sort of mace-shaped tower in place of the banner.
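If you were scripting this instead of clicking through the GUI, the same trial and error is easy to batch: sweep a few random seeds across a handful of denoise values and compare the results side by side. Here’s a sketch continuing the hypothetical pipeline from above, where diffusers’ strength parameter plays the role of the denoise setting.

```python
import random
import torch

# Reuses pipe, prompt, image, and mask from the inpainting sketch earlier.
seeds = [random.randrange(2**32) for _ in range(4)]

for strength in (0.55, 0.70, 0.85):
    for seed in seeds:
        gen = torch.Generator(device="cuda").manual_seed(seed)
        out = pipe(
            prompt=prompt,
            image=image,
            mask_image=mask,
            strength=strength,
            generator=gen,
        ).images[0]
        out.save(f"inpaint_strength{strength:.2f}_seed{seed}.png")
```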

So, is he going to hit someone with that tower?

While an interesting image, I don’t have my held banner. Let’s adjust the prompt to increase the weight of banner and add something to emphasize a held banner. After a few rounds of prompt adjustments, running random seeds, and tweaking the denoise setting, I can’t seem to get the arm to hold a darn thing. Sounds like a job for a LoRA.

To the Internet!

And the internet has failed me. For the sake of getting this article done, I’m going to concede defeat on the objective. I want to move on to outpainting, but I’m already at article length, and outpainting is still a messy process. So I’m going to push that off to the next article.

I’ll leave you with the last image that came out of the engine. At least it’s got a lot of flags.

I declare failure.