Flux text-to-image AI model

An awesome new AI model that produces images from text

Black Forest Labs recently published a set of text-to-image AI models that are said to have much higher output quality. Let’s try them out.

This is the kind of image Flux can generate in less than a minute (image: Dolores).

Installing

To run it on your own PC you will need a GPU with 16 GB of VRAM for FLUX.1-dev or 8 GB of VRAM for FLUX.1-schnell.
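The choice between the two variants can be sketched as a small helper. This is only an illustration: the thresholds are the ones quoted above, and the function names `pick_flux_variant` and `detect_vram_gb` are hypothetical, not part of any library.

```python
# VRAM thresholds in GB, taken from the text above (not measured here).
VRAM_REQUIREMENTS_GB = {"FLUX.1-dev": 16, "FLUX.1-schnell": 8}


def pick_flux_variant(vram_gb):
    """Return the most demanding variant that fits in vram_gb, or None."""
    candidates = [
        (needed, name)
        for name, needed in VRAM_REQUIREMENTS_GB.items()
        if vram_gb >= needed
    ]
    if not candidates:
        return None
    # The tuple with the highest VRAM requirement wins.
    return max(candidates)[1]


def detect_vram_gb(device_index=0):
    """Best-effort VRAM detection; returns None without torch or a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    total_bytes = torch.cuda.get_device_properties(device_index).total_memory
    # Cards often report slightly less than their nominal size, so round.
    return round(total_bytes / 1024**3)
```

For example, `pick_flux_variant(12)` selects FLUX.1-schnell, while a 4 GB card gets `None`.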

Create an account on huggingface.co if you don’t have one yet.
Have a look at the models’ announcement and description: https://blackforestlabs.ai/announcing-black-forest-labs/
Go to the page https://huggingface.co/black-forest-labs/FLUX.1-dev for dev, or to the page https://huggingface.co/black-forest-labs/FLUX.1-schnell for schnell.
Accept the license agreement if you agree with it.
Create an access token on the page https://huggingface.co/settings/tokens (read access is enough to pull the model). You will need it to pull the model.
Pull the model. I’m pulling dev:

git clone https://huggingface.co/black-forest-labs/FLUX.1-dev

Wait for the download to finish; the model files are large.
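As an alternative to git clone, the download can be sketched in Python with `snapshot_download` from the `huggingface_hub` package. This is a minimal sketch, not the method used in this post: the helper names `repo_id_for` and `download_flux` are hypothetical, and it assumes the access token is stored in the `HF_TOKEN` environment variable.

```python
import os

ORG = "black-forest-labs"


def repo_id_for(variant):
    """Build the Hugging Face repository ID for 'dev' or 'schnell'."""
    if variant not in ("dev", "schnell"):
        raise ValueError(f"unknown FLUX.1 variant: {variant!r}")
    return f"{ORG}/FLUX.1-{variant}"


def download_flux(variant="dev", local_dir=None):
    """Download the model files and return the local folder path."""
    # Imported lazily so the helper above works without the package.
    from huggingface_hub import snapshot_download

    # Assumption: the access token lives in the HF_TOKEN environment variable.
    token = os.environ.get("HF_TOKEN")
    return snapshot_download(
        repo_id=repo_id_for(variant),
        token=token,
        local_dir=local_dir,  # None means the default Hugging Face cache
    )
```

Calling `download_flux("dev")` only works after the license has been accepted on the model page.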

Run it

Install diffusers, torch and the other Python libraries it needs.

pip install -U diffusers torch transformers protobuf accelerate sentencepiece

Create a Python file and paste in the following:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
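# Save VRAM by offloading to the CPU. Use ONE of the two calls, not both.
# Sequential offload is the slowest option but needs the least VRAM.
pipe.enable_sequential_cpu_offload()
# With more VRAM available, model-level offload is faster:
# pipe.enable_model_cpu_offload()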

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")

To learn more, check out the diffusers documentation.
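The script above uses settings for FLUX.1-dev. A rough sketch of how the call arguments might differ per variant is shown here. The dev values are copied from the script; the schnell values (4 steps, no guidance, shorter sequence length) are what I recall from the FLUX.1-schnell model card, so treat them as an assumption and verify them there. The helper name `generation_kwargs` is hypothetical.

```python
def generation_kwargs(variant):
    """Return keyword arguments for the pipeline call for a given variant."""
    common = {"height": 1024, "width": 1024}
    if variant == "dev":
        # Values from the script above.
        specific = {
            "guidance_scale": 3.5,
            "num_inference_steps": 50,
            "max_sequence_length": 512,
        }
    elif variant == "schnell":
        # Assumed values; check the FLUX.1-schnell model card.
        specific = {
            "guidance_scale": 0.0,
            "num_inference_steps": 4,
            "max_sequence_length": 256,
        }
    else:
        raise ValueError(f"unknown FLUX.1 variant: {variant!r}")
    return {**common, **specific}
```

With a pipeline loaded from "black-forest-labs/FLUX.1-schnell", the call would then look like `pipe(prompt, **generation_kwargs("schnell")).images[0]`.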

The result

When I was testing LLMs for Perplexica, one of the questions I gave it was “What was that tradies protest in Australia on 27th of August 2024 about?” Let’s see what image Flux generates for this very vague prompt:

A group of tradie protesters are supporting their trade union in Melbourne

[Generated image]

And this one:

Human rights are getting impacted by COVID-19 pandemic

[Generated image]

And my favorite test:

A tram runs through the Melbourne City at night

[Generated image]
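The three prompts above could be run in one go and saved under readable file names. This is a minimal sketch: `slugify` and `generate_all` are hypothetical helpers, and `pipe` is assumed to be the FluxPipeline object from the script in the previous section.

```python
import re


def slugify(prompt, max_len=60):
    """Turn a prompt into a short, file-system-friendly name."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def generate_all(pipe, prompts, **kwargs):
    """Generate one image per prompt and return the saved file names."""
    saved = []
    for index, prompt in enumerate(prompts):
        image = pipe(prompt, **kwargs).images[0]
        name = f"{index:02d}-{slugify(prompt)}.png"
        image.save(name)
        saved.append(name)
    return saved


PROMPTS = [
    "A group of tradie protesters are supporting their trade union in Melbourne",
    "Human rights are getting impacted by COVID-19 pandemic",
    "A tram runs through the Melbourne City at night",
]
```

Usage would be `generate_all(pipe, PROMPTS, height=1024, width=1024)`, passing the same keyword arguments as in the single-image call.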

All of these images look very good. Let’s find faults in the last one:

It’s Melbourne, so trams and cars must run on the left side.
The tram colour is not right. OK, that might be too picky.
The front lights of the tram are red?
The tram doesn’t have a driver.
The route is very strange.

Overall I like this model!
