What the FLUX? A New Era in Image Generation

#CASSI #TechTrends 🚀 #CreativeDesign ✏️ #AI #Blog


You've probably already heard the news: there's a new image generator in town called FLUX, and it's all the rage. This blog post explores the why, the how, and much more, so stay tuned.

Do We Really Need Another Image Generator?

There are already more image generators out there than people care to remember, ranging from the recently upgraded DALL-E 3 to the open-source Stable Diffusion and the current king of the hill, Midjourney. Not to mention others like Amazon's Titan, Google's Gemini, Adobe Firefly, Getty Image Generator, and hundreds more. So why even bother with a new one? The answers are both simple and complex.

The Minds Behind FLUX

Let's first look at the company behind FLUX, Black Forest Labs, and its founders. As a Germany-based team, we're pleased to share that this model was created right here in Germany (which is unfortunately rare for cutting-edge GenAI developments). More importantly, it was created largely by the team that brought us latent diffusion models, the technology behind Stable Diffusion, in the first place: former students from the CompVis Group at Ludwig Maximilian University of Munich and Runway. This pedigree might explain why a company that was in stealth just a month ago received a $31M venture capital check from Silicon Valley giants, including legendary investor Marc Andreessen.

 

Comparison of four image generators available on CASSI, from left to right: Amazon Titan, OpenAI DALL-E 3, FLUX.1 [dev], and Stable Diffusion XL

Why FLUX Matters: Quality and Control

The need for FLUX boils down to two critical factors: quality and control. Currently, most image generators struggle with issues like:

  1. Poor anatomy
  2. Lack of visual fidelity
  3. Artifacts
  4. Poor prompt adherence (they often don't do what you ask them to do)

Additionally, commercial image generator offerings typically don't allow fine-tuning with your own dataset, as this would require vendors to hand over the "weights," which they're reluctant to do.

 

FLUX.1 [dev] generated image of three people showing their hands

Breaking It Down

Quality

While Midjourney often delivered on quality, it had two major flaws: it lacked a scalable API for integration into other services and platforms (like our platform CASSI), and it was closed-source. Stable Diffusion XL was capable of delivering high quality but required significant tweaking, including separate fine-tuned models for specific results, extensions to fix hands and other objects, and often an experienced GenAI artist using a tool like ComfyUI to manage it all.

Control

Even experienced artists often couldn't get models to do exactly what they wanted, requiring them to feed reference images, scribbles, and more to guide the AI to the desired result.

The FLUX Advantage

FLUX addresses all of these issues and, remarkably, does so with its very first release. Black Forest Labs is offering three models in its initial batch:

  1. FLUX.1 [schnell]: The fastest model, tailored for local development and personal use.
  2. FLUX.1 [dev]: An open-weight, mid-sized model for non-commercial applications.
  3. FLUX.1 [pro]: State-of-the-art image generation with top-of-the-line prompt following, visual quality, image detail, and output diversity. Most importantly, the pro version is commercially usable and will be available in CASSI very soon!

FLUX.1 [dev] generated image of a woman in a pink and purple superhero costume with the letters CASSI on her chest. Social media icons in the background, out of focus
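
For readers who want to try the two open-weight models locally, here is a minimal sketch using the Hugging Face `diffusers` library. It is not taken from Black Forest Labs or CASSI. The repository IDs and sampling settings are assumptions based on the public model cards, `FluxVariant`, `VARIANTS`, and `generate` are hypothetical names chosen for illustration, and the [dev] repository requires accepting its non-commercial license before download. The [pro] model is deliberately absent because its weights are not published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FluxVariant:
    repo_id: str               # assumed Hugging Face repository with the open weights
    steps: int                 # number of denoising steps
    guidance_scale: float      # assumed model-card default
    max_sequence_length: int   # maximum prompt length in tokens


# Assumed defaults from the public model cards; adjust to taste.
VARIANTS = {
    "schnell": FluxVariant("black-forest-labs/FLUX.1-schnell", 4, 0.0, 256),
    "dev": FluxVariant("black-forest-labs/FLUX.1-dev", 50, 3.5, 512),
}


def generate(prompt: str, variant: str = "schnell", seed: int = 0):
    """Generate one image with an open-weight FLUX.1 variant and return it."""
    cfg = VARIANTS[variant]  # raises KeyError for "pro": no open weights

    # Imported lazily so this module loads even without a GPU stack installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(cfg.repo_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use

    result = pipe(
        prompt,
        num_inference_steps=cfg.steps,
        guidance_scale=cfg.guidance_scale,
        max_sequence_length=cfg.max_sequence_length,
        generator=torch.Generator("cpu").manual_seed(seed),
    )
    return result.images[0]  # a PIL image
```

Calling `generate("a rocket launching into space").save("rocket.png")` would download the [schnell] weights on first use, so expect a large download and a capable GPU.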

 

The Fine-Tuning Debate

Shortly after FLUX's release, a heated discussion arose about whether the model could be fine-tuned. The CEO of a well-respected image generation platform stated it would be impossible to fine-tune FLUX. The Black Forest Labs crew made a clever play here: they gave away the weights for the mid- and lower-end models to the community. That captures the best of both worlds: the fast-paced, dynamic open-source community, which quickly jumped on it and started training custom models and building extensions and adapters (upscalers, inpainting, ControlNet), and the corporate entities that often "just" want great-looking images at scale.

This post probably sums it up quite nicely:

notion image

Downsides

Despite the impressive advancements, FLUX has its drawbacks. It's incredibly resource-intensive, even for the smallest model, leading to slower image generation and difficulties running on standard hardware. The open-source community has released a more compact version with lower VRAM requirements, but it's still demanding.
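
To put "resource-intensive" into rough numbers, here is a back-of-the-envelope sketch of how much memory the model weights alone occupy at different precisions. The 12-billion-parameter figure is an assumption taken from Black Forest Labs' public announcement, not from this post, and reading the "more compact version" as lower-precision (quantized) weights is likewise an assumption. Text encoders, the image decoder, and activations come on top, so real requirements are higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "4-bit": 0.5}

# Assumption: the FLUX.1 transformer has roughly 12 billion parameters.
FLUX_TRANSFORMER_PARAMS = 12e9


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    gb = weight_memory_gb(FLUX_TRANSFORMER_PARAMS, precision)
    print(f"{precision:>5}: {gb:.0f} GB")
# Prints:
#  bf16: 24 GB
#  int8: 12 GB
# 4-bit: 6 GB
```

Under these assumptions, the full-precision weights already fill a 24 GB consumer GPU before anything else is loaded, which is consistent with the difficulties on standard hardware described above.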

Popular adapters like IP-Adapter are not yet released, so this will require a bit of patience, but a first ControlNet version just came out from the community (for granular control over parts of the image or the poses of people).

Another consideration is FLUX's relaxed approach to nudity, NSFW content, and brand- and IP-related content. Unlike most commercial image generators, which strictly filter such prompts, FLUX readily complies with these requests. So don't be surprised to see a naked Trump in your timeline soon. X has already integrated FLUX into Grok, its AI assistant, as seen in the image below.

Disclaimer: With CASSI you can create “brand-safe” workflows that use FLUX under the hood but stay compliant and within your brand DNA.

Made by user Tom Warren with Grok: generated image of Mickey Mouse with a MAGA hat and a beer. Intellectual property theft included in the package.

 

The uncensored aspect may be seen as positive or negative, depending on your perspective, but it's crucial for companies to consider it when implementing or using FLUX within their organization.

Conclusion

FLUX represents a significant leap forward in image generation technology, addressing key issues in quality, control, and accessibility. Whether you're a developer, an artist, or a business looking to integrate cutting-edge image generation, FLUX offers a solution that's worth paying attention to, but move ahead with caution.

Most importantly: FLUX is commercially usable and available in CASSI now!

 

FLUX.1 [dev] generated image of a rocket launching into space

 
Disclaimer: All images in this blog post were generated using CASSI, except the Mickey Mouse example above.