ComfyUI FLUX IMAGE TO IMAGE FLORENCE 2 workflow #comfyui #flux #img2img #florence

Introduction

Today, we're excited to introduce the FLUX model, which follows prompts closely and produces images that rival the quality of Midjourney and DALL-E. FLUX comes in three variants, each with its own license: Flux Pro, Flux Dev, and Flux Schnell. In this article, we'll focus on Flux Schnell, which is released under the Apache 2.0 license, a permissive license that allows both personal and commercial use of the model.

For installation, make sure you have all the necessary components. The flux examples in the ComfyUI documentation outline the requirements. Specifically, you'll need either the T5 XXL fp16 or the T5 XXL fp8 weights, as well as the CLIP-L model, placed in the comfyui/models/clip directory. You also need the full Flux Schnell model, a substantial 23.8 GB file, placed under the comfyui/models folder.
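Before launching ComfyUI, it can help to confirm that the text encoder files are where they are expected. The following is a minimal sketch, not an official tool: the file names in `EXPECTED_FILES` are assumptions based on commonly distributed names, and `find_missing` is a hypothetical helper, so adjust both to match what you actually downloaded.

```python
from pathlib import Path

# Assumed folder layout and file names; change them to match your downloads.
EXPECTED_FILES = {
    "models/clip": [
        "t5xxl_fp8_e4m3fn.safetensors",  # or the fp16 variant of the T5 XXL weights
        "clip_l.safetensors",
    ],
}


def find_missing(comfy_root, expected=EXPECTED_FILES):
    """Return the relative paths of expected files not found under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for folder, names in expected.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing
```

Calling `find_missing("comfyui")` returns an empty list when every listed file is present, and otherwise lists the paths that still need to be downloaded or moved.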

Once everything is installed, we can dive into the text-to-image workflow. Load the example workflow into ComfyUI and you can start generating outputs. The setup consists of selecting the T5 XXL weights (opt for fp8) and the CLIP-L model downloaded earlier. The basic parameters, such as the seed and sampler, will be familiar to ComfyUI users.
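For readers who prefer to see the text encoder selection as data rather than as clicks in the UI, here is a rough sketch of how that one node might look in ComfyUI's exported API-format workflow. The node class `DualCLIPLoader` and its input keys reflect my understanding of that format and should be checked against a workflow exported from your own install; the file names are assumptions.

```python
import json

# One node of a workflow in API format (a sketch, not a complete graph).
# File names are assumed; use the names that appear in your models/clip folder.
loader_nodes = {
    "1": {
        "class_type": "DualCLIPLoader",
        "inputs": {
            "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",  # T5 XXL, fp8 variant
            "clip_name2": "clip_l.safetensors",            # CLIP-L
            "type": "flux",
        },
    },
}

print(json.dumps(loader_nodes, indent=2))
```

Swapping the first file name for the fp16 weights is the only change needed to trade memory for precision in this sketch.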

Prompt Examples and Results

Let's explore some prompts utilized in the FLUX model to observe its capabilities:

Prompt: "Food vendor serving customers at a brightly lit food cart with a neon sign reading 'Pixel Easel' enhancing the urban landscape."
- Result: Produces a clear image that accurately reflects the details in the prompt, highlighting the text clearly amidst vibrant surroundings.
Prompt: "Photo of rolling sand dunes illuminated by the soft light of dusk capturing curves and patterns."
- Result: Contains beautiful lighting and depicts the intricate textures described in the prompt.
Prompt: "Humorous chase scene with a cat running after a dog past colorful market stalls depicting a busy market day."
- Result: Although the chase isn't perfectly represented, it successfully captures the vibrancy and atmosphere of the market.
Prompt: "Male and female models in an eclectic bookstore, focusing on warm lighting and quirky book titles."
- Result: The image closely matches the intended scene, showcasing the models’ attire and background well.
Challenging Prompt: "Woman holding a body cream jar against a scenic ocean backdrop."
- Result: The hands were rendered poorly, a common weakness even in models as advanced as FLUX.

In another attempt, the prompt was revised to be clearer and to avoid the hand problems. The results are still not perfect, but they show the model's potential as it continues to improve.

Image-to-Image Workflow

In addition to text-to-image generation, we can also explore the image-to-image capabilities of FLUX. This workflow starts from an existing image and uses a detailed caption of it, generated by the Florence 2 model, as the prompt. The process includes the following steps:

1. Upload the base image to ComfyUI.
2. Generate a descriptive caption using Florence 2, and modify phrases as necessary.
3. Encode the image into latent space, resizing it if needed while preserving its original aspect ratio.
4. Use a denoising factor close to 0.80 for refined results, which enhances quality without straying too far from the original image.
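The resizing step can be sketched as a small calculation. This is an illustration under stated assumptions rather than what any particular ComfyUI node does internally: `fit_dimensions` is a hypothetical helper, and both the 1024-pixel long side and the rounding to multiples of 16 are assumed values chosen for the example.

```python
def fit_dimensions(width, height, long_side=1024, multiple=16):
    """Scale (width, height) so the longer side is about long_side pixels.

    The aspect ratio is kept as closely as rounding allows, and each side is
    snapped to the nearest multiple of `multiple` (an assumed constraint).
    """
    scale = long_side / max(width, height)

    def snap(value):
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


# Settings for one image-to-image run; 0.80 is the denoising factor from the steps above.
settings = {
    "size": fit_dimensions(1920, 1080),
    "denoise": 0.80,
}
print(settings)
```

Lower denoise values keep the output closer to the source image, while values near 1.0 let the model depart from it more freely.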

This method yields a noticeable improvement: the results preserve the character of the original image while improving its detail and overall appeal.

Overall, FLUX is a powerful model with exceptional capabilities. Looking ahead, tools such as ControlNet and IP-Adapter are expected to give us finer control over the final results.

I hope you found this overview informative. Feel free to subscribe to our channel, ask questions, and like this article if you enjoyed it. Most importantly, have fun exploring the FLUX model!

Keywords

Flux, ComfyUI, Image-to-Image, Florence 2, Model Installation, Text-to-Image Workflow, Prompt Results, Denoising Factor, Apache License, Visual Quality Improvement

FAQ

1. What licenses are available for the FLUX model?

FLUX comes in three variants, each with its own license: Flux Pro, Flux Dev, and Flux Schnell.

2. Where do I need to place the CLIP-L model?

The CLIP-L model should be placed in the comfyui/models/clip directory.

3. How can I improve the quality of images generated using image-to-image workflows?

A denoising factor close to 0.80 improves quality while keeping the result recognizably close to the original image.

4. What is the file size of the Flux Schnell model?

The Flux Schnell model file is approximately 23.8 GB.

5. Can I use FLUX for commercial purposes?

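The Flux Schnell model is released under the Apache 2.0 license, which permits commercial as well as personal use. The other variants have different terms, so review the specific license of the variant you use before relying on it commercially.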