Flux text-to-image AI model

An impressive new AI model that generates images from text

Black Forest Labs recently published a set of text-to-image AI models that are said to produce much higher-quality output. Let’s try them out.

This is the image Flux can generate in less than a minute. Dolores

Installing

To run it on your own PC, you will need a GPU with 16 GB of VRAM for FLUX.1-dev and 8 GB of VRAM for FLUX.1-schnell.
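To get a feel for where VRAM numbers like these come from, here is a rough estimate of the memory the weights alone need. It is only a sketch: it assumes about 12 billion parameters for the FLUX.1 transformer (the figure Black Forest Labs gives for the model family) and ignores the text encoders, the VAE and activations, so treat the result as a lower bound, not a measurement.

```python
# Back-of-the-envelope estimate of weight memory; not a measurement.
PARAMS = 12e9  # assumption: ~12 billion parameters in the FLUX.1 transformer

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_gib(params: float, dtype: str) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_gib(PARAMS, dtype):.1f} GiB")
```

In bfloat16 this comes out at a bit over 22 GiB, more than a 16 GB card holds, which is presumably why the script in this post offloads parts of the model to the CPU.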

  1. Create an account on huggingface.co if you don’t have one yet.

  2. Have a look at the model announcement and description: https://blackforestlabs.ai/announcing-black-forest-labs/

  3. Go to https://huggingface.co/black-forest-labs/FLUX.1-dev for the dev model, or to https://huggingface.co/black-forest-labs/FLUX.1-schnell for schnell.

  4. Accept the license agreement if you agree with it.

  5. Create a Write access token at https://huggingface.co/settings/tokens. You will need it to pull the model.

  6. Pull the model. I’m pulling dev:

git clone https://huggingface.co/black-forest-labs/FLUX.1-dev

  7. Wait for the download to finish.
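The token and download steps can also be done from Python instead of git. This is a minimal sketch, not the workflow used in this post: it assumes the huggingface_hub package is installed, that the token is exported in the HF_TOKEN environment variable, and that the license has already been accepted on the model page. The helper names and the local folder name are illustrative.

```python
import os

REPO_ID = "black-forest-labs/FLUX.1-dev"  # or "black-forest-labs/FLUX.1-schnell"


def read_token(env=os.environ) -> str:
    """Return the access token from HF_TOKEN, or fail with a clear message."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set HF_TOKEN to your Hugging Face access token")
    return token


def download_model(repo_id: str = REPO_ID, local_dir: str = "FLUX.1-dev") -> str:
    """Download every file of the repository and return the local path."""
    # Imported lazily so this file can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, token=read_token(), local_dir=local_dir)


# To start the download (tens of gigabytes), call:
#     download_model()
```

Reading the token from the environment keeps it out of the script, so the file can be shared or committed without leaking it.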

Run it

Install diffusers, torch and the other Python libraries it needs.

pip install -U diffusers torch transformers protobuf accelerate sentencepiece

Create a Python file and paste in the following:

import torch
from diffusers import FluxPipeline

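# Loads the model from the Hugging Face Hub (or the local cache).
# To use the git clone from above, pass the path of the cloned folder instead.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
# Save VRAM by offloading to the CPU. Use only ONE of the two options below;
# remove both and call pipe.to("cuda") if the whole model fits on your GPU.
# Sequential offload moves individual submodules: lowest VRAM use, but slowest.
pipe.enable_sequential_cpu_offload()
# Alternative: offload whole components, which is faster but needs more VRAM.
# pipe.enable_model_cpu_offload()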

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")
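The script above is tuned for dev. If you pulled schnell instead, the call needs different settings. Here is a sketch of how both variants could share one helper: the dev values are the ones used above, while the schnell values (no guidance, 4 steps, shorter sequence length) are taken from its model card as I understand it, so double-check them there. The `generate` helper and the `SETTINGS` table are my own names, not part of diffusers.

```python
# Per-variant settings. The dev values are the ones used above; the schnell
# values are an assumption based on its model card and may need tuning.
SETTINGS = {
    "black-forest-labs/FLUX.1-dev": dict(
        guidance_scale=3.5, num_inference_steps=50, max_sequence_length=512
    ),
    "black-forest-labs/FLUX.1-schnell": dict(
        guidance_scale=0.0, num_inference_steps=4, max_sequence_length=256
    ),
}


def generate(model_id: str, prompt: str, seed: int = 0, size: int = 1024):
    """Load the chosen variant and return one PIL image for the prompt."""
    # Heavy imports stay inside the function so the settings table can be
    # inspected without a GPU or the libraries installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    # Lowest-VRAM option; use pipe.to("cuda") instead if the model fits on the GPU.
    pipe.enable_sequential_cpu_offload()
    # A fixed seed makes the output repeatable for the same prompt and settings.
    generator = torch.Generator("cpu").manual_seed(seed)
    return pipe(
        prompt, height=size, width=size, generator=generator, **SETTINGS[model_id]
    ).images[0]


# Example call (downloads the model on first use):
#     image = generate("black-forest-labs/FLUX.1-schnell", "A cat holding a sign that says hello world")
#     image.save("flux-schnell.png")
```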

To learn more, check out the diffusers documentation.

The result

  1. When I was testing LLMs for Perplexica, one of the questions I asked it was “What was that tradies protest in Australia on 27th of August 2024 about?” Let’s see what image Flux generates for this very vague prompt:
A group of tradie protesters are supporting their trade union in Melbourne

  2. And this one:
Human rights are getting impacted by COVID-19 pandemic

  3. And my favorite test:
A tram runs through the Melbourne City at night

All these images look very good. Let’s find faults in the last one:

  • It’s Melbourne, so trams and cars must run on the left side.
  • The tram colour is not right. OK, that might be too picky.
  • The front lights of the tram are red?
  • The tram doesn’t have a driver.
  • The route is very strange.

Overall I like this model!