How to install Stable Diffusion on Ubuntu and use it from the CLI

What is Stable Diffusion?

Stable Diffusion is a popular text-to-image model: it generates images from a text prompt. It’s widely used for creating art, producing visuals for content, and experimenting with creative prompts.

Let’s see how to install and use this model from a developer’s perspective.

How to install Stable Diffusion on Ubuntu 24.04

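Installing Stable Diffusion on Ubuntu is straightforward. First, make sure you have an NVIDIA GPU and CUDA installed. I recommend using Miniconda to keep Python environments clean. Note that the generation code below also imports torch, so PyTorch has to be available in the same environment. With a conda environment active, install the packages: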

conda install -c conda-forge cudatoolkit diffusers transformers
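Before moving on, it can help to confirm that the environment is usable. The following is a minimal sketch, not part of the original setup: the helper names `missing_packages` and `cuda_ready` are hypothetical, and it assumes the packages are importable under the names `torch`, `diffusers`, and `transformers`.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_ready():
    """Return True if PyTorch is installed and can see a CUDA device."""
    if missing_packages(["torch"]):
        return False
    import torch  # imported lazily so the check also works without PyTorch
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Missing packages:", missing_packages(["torch", "diffusers", "transformers"]))
    print("CUDA available:", cuda_ready())
```

If the first line lists any package, or the second line prints False, the generation code in the next section will most likely fail.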

Once the installation finishes, we are ready to write image generation code using Stable Diffusion.

Generating images with Python and Stable Diffusion

The simplest way to generate an image is to pick a model and a prompt:

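from diffusers import DiffusionPipeline
import torch

# Load the Stable Diffusion 1.5 weights in half precision to reduce GPU memory use
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
# Move the pipeline to the GPU
pipeline.to("cuda")
# Generate one image for the prompt and save it to disk
pipeline("kyiv city", height=400, width=800).images[0].save("kyiv-city-1.jpg")

# Console output:
# 100%|█████████.....█████████| 75/75 [00:04<00:00, 16.99it/s]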

This code has generated the following image:

Kyiv city, generated with Stable Diffusion
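To run the same generation from the command line, the call can be wrapped in a small argument parser. This is a minimal sketch under stated assumptions: the flag names (`--model`, `--steps`, `--height`, `--width`, `--output`) and the default of 50 steps are choices made for this example, not something the library prescribes. The heavy imports sit inside `generate()` so that argument parsing works even before the model is loaded.

```python
import argparse


def build_parser():
    """Describe the command-line options for a single image generation run."""
    parser = argparse.ArgumentParser(description="Generate an image with Stable Diffusion")
    parser.add_argument("prompt", help="text prompt describing the image")
    parser.add_argument("--model", default="runwayml/stable-diffusion-v1-5")
    parser.add_argument("--steps", type=int, default=50)  # assumed default for this sketch
    parser.add_argument("--height", type=int, default=400)
    parser.add_argument("--width", type=int, default=800)
    parser.add_argument("--output", default="output.jpg")
    return parser


def generate(args):
    """Load the pipeline on the GPU, render one image and save it."""
    import torch
    from diffusers import DiffusionPipeline

    pipeline = DiffusionPipeline.from_pretrained(args.model, torch_dtype=torch.float16)
    pipeline.to("cuda")
    image = pipeline(
        args.prompt,
        num_inference_steps=args.steps,
        height=args.height,
        width=args.width,
    ).images[0]
    image.save(args.output)


def main(argv=None):
    """Parse the arguments (sys.argv by default) and run the generation."""
    generate(build_parser().parse_args(argv))

# To use this as a script, call main() at the bottom of the file.
```

Saved under a name of your choice (for example a hypothetical generate.py) with `main()` called at the end, it could be invoked as `python generate.py "kyiv city" --output kyiv-city-1.jpg`.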

Tweaking the model for better results

With default settings and a bare prompt, the model will most likely generate poor-quality images. We have two tools to improve the results. The first is the prompt itself, and it’s the most effective one: make it clear and detailed, and describe not only the content of the image but also its quality and style. The second is to let the model spend more compute, for example by raising the number of inference steps.
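One way to keep the subject separate from the quality and style keywords is to assemble the prompt from parts. The helper below is a hypothetical convenience function for illustration, not something diffusers provides; the pipeline itself only ever sees the final string.

```python
def build_prompt(subject, modifiers):
    """Join a subject with quality/style keywords into one comma-separated prompt."""
    parts = [subject.strip()]
    for modifier in modifiers:
        cleaned = modifier.strip()
        # Skip empty entries and duplicates so the prompt stays tidy
        if cleaned and cleaned not in parts:
            parts.append(cleaned)
    return ", ".join(parts)


quality = ["dslr", "sharp focus", "film grain"]
print(build_prompt("kyiv city", quality))
# prints: kyiv city, dslr, sharp focus, film grain
```

Keeping the keyword list in one place makes it easy to reuse the same style across different subjects.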

Let’s take a look at this attempt:

from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")
pipeline(
  # Subject first, then quality and style keywords
  "kyiv city, dslr, ultra quality, sharp focus, tack sharp, dof, film grain, Fujifilm XT3, crystal clear, 8K UHD",
  # More denoising steps: slower, but the result may be cleaner
  num_inference_steps=150,
  height=400, width=800
).images[0].save("kyiv-city-2.jpg")

# Console output:
# 100%|█████████.....█████████| 150/150 [00:08<00:00, 17.38it/s]

Now our image looks like this:

Kyiv city, second version by Stable Diffusion

This looks better, but there is still a lot to improve. Experiment with the prompt to get the best results.
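When experimenting with prompts, a fixed random seed makes runs comparable, because the remaining differences then come from the prompt rather than from the initial noise. The following is a rough sketch, assuming the pipeline call accepts a `generator` argument as it does in current diffusers releases. The function names, the seed value, and the file naming scheme are hypothetical.

```python
def output_name(prefix, index):
    """Build a numbered file name such as 'kyiv-city-try-1.jpg'."""
    return f"{prefix}-{index}.jpg"


def compare_prompts(pipeline, prompts, prefix="kyiv-city-try", seed=42):
    """Render each prompt with the same seed and save the results as numbered files."""
    import torch  # imported lazily; only needed when images are generated

    saved = []
    for index, prompt in enumerate(prompts, start=1):
        # Re-create the generator for every prompt so each run starts from the same noise
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipeline(prompt, generator=generator, height=400, width=800).images[0]
        path = output_name(prefix, index)
        image.save(path)
        saved.append(path)
    return saved
```

It would be called with an already loaded pipeline, for example `compare_prompts(pipeline, ["kyiv city", "kyiv city, dslr, sharp focus"])`.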

Using different models

There’s a huge number of trained and fine-tuned Stable Diffusion models out there. To use one of them, download its checkpoint file and load the pipeline with the from_single_file() method. Let’s use the HassanBlend 1.4 model to generate a beautiful portrait:

from diffusers import StableDiffusionPipeline
import torch

# Load a locally downloaded checkpoint; safety_checker=None disables the built-in content filter
pipeline = StableDiffusionPipeline.from_single_file(
  "./HassanBlend1.4_Safe.safetensors",
  torch_dtype=torch.float16,
  safety_checker=None
)
pipeline.to("cuda")
pipeline(
  "photo of a beautiful person woman",
  # Traits the model should steer away from
  negative_prompt="ugly, blurry, bad, photoshop, 3d",
  height=400, width=800
).images[0].save("beautiful-person-1.jpg")

# Console output:
# 100%|█████████.....█████████| 75/75 [00:04<00:00, 17.00it/s]

This is what HassanBlend has generated:

Beautiful person, by HassanBlend
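Since from_single_file() depends on a file you downloaded yourself, a quick sanity check on the path can give a clearer error than a failed load. This is a minimal sketch: `checkpoint_problem` and `load_single_file` are hypothetical helpers, and the accepted extensions (.safetensors and .ckpt) are an assumption about the checkpoint formats you are likely to download.

```python
from pathlib import Path
from typing import Optional


def checkpoint_problem(path: str) -> Optional[str]:
    """Return a description of what is wrong with a checkpoint path, or None if it looks usable."""
    candidate = Path(path)
    if candidate.suffix not in {".safetensors", ".ckpt"}:
        return f"unexpected file extension: {candidate.suffix or '(none)'}"
    if not candidate.is_file():
        return f"file not found: {candidate}"
    return None


def load_single_file(path: str):
    """Load a Stable Diffusion pipeline from one checkpoint file after a basic check."""
    problem = checkpoint_problem(path)
    if problem is not None:
        raise ValueError(problem)

    import torch
    from diffusers import StableDiffusionPipeline

    pipeline = StableDiffusionPipeline.from_single_file(path, torch_dtype=torch.float16)
    pipeline.to("cuda")
    return pipeline
```

The check only looks at the file name and its presence on disk; it does not verify that the file actually contains Stable Diffusion weights.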

Published 2 months ago in #machinelearning about #stable diffusion, #ubuntu and #images by Denys Golotiuk
