update about multi-resolution training in FLUX.1

Kohya S
2024-08-20 08:21:01 +09:00
parent 9e72be0a13
commit c62c95e862

@@ -9,6 +9,13 @@ __Please update PyTorch to 2.4.0. We have tested with `torch==2.4.0` and `torchv
The command to install PyTorch is as follows:
`pip3 install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124`
Aug 20, 2024:
FLUX.1 supports multi-resolution inference, so training at multiple resolutions (for example 1024x1024, 768x768 and 512x512; any resolution can be used) may be possible and may improve the results.
The current version of the script already seems to support multi-resolution training, __as long as `--cache_latents_to_disk` is not specified__. Please try it if you are interested. See [FLUX.1 Multi-resolution training](#flux1-multi-resolution-training) for details.
We will support multi-resolution caching to disk in the near future.
Aug 19, 2024:
In `flux_train.py`, the memory consumption during model saving is reduced when `--save_precision` is set to the same value as `--mixed_precision` (about 22GB). Please set both options to the same value unless you have a reason not to.
@@ -159,6 +166,51 @@ In the case of LoRA models are trained with `bf16`, we are not sure which is bet
The script can merge multiple LoRA models. If you want to merge multiple LoRA models, specify the `--concat` option so that the merged LoRA model works properly.
### FLUX.1 Multi-resolution training
You can define multiple resolutions in the dataset configuration file. __Caching latents to disk is not supported yet.__
The dataset configuration file looks like the example below. You can define multiple resolutions with different batch sizes. Each resolution is defined in its own `[[datasets]]` section, and the `[[datasets.subsets]]` section specifies the dataset directory. Please specify the same directory for each resolution.
```
[general]
# define common settings here
flip_aug = true
color_aug = false
keep_tokens_separator= "|||"
shuffle_caption = false
caption_tag_dropout_rate = 0
caption_extension = ".txt"
[[datasets]]
# define the first resolution here
batch_size = 2
enable_bucket = true
resolution = [1024, 1024]
[[datasets.subsets]]
image_dir = "path/to/image/dir"
num_repeats = 1
[[datasets]]
# define the second resolution here
batch_size = 3
enable_bucket = true
resolution = [768, 768]
[[datasets.subsets]]
image_dir = "path/to/image/dir"
num_repeats = 1
[[datasets]]
# define the third resolution here
batch_size = 4
enable_bucket = true
resolution = [512, 512]
[[datasets.subsets]]
image_dir = "path/to/image/dir"
num_repeats = 1
```
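If you want to try many resolutions, writing the repeated `[[datasets]]` sections by hand can be tedious. The following is a minimal sketch that generates them from a list of resolutions and batch sizes. The helper name `build_datasets_toml` is hypothetical and is not part of the scripts in this repository; it only builds text in the same layout as the example above, and it assumes that the image directory path contains no quotes or backslashes.

```python
# Hypothetical helper (not part of this repository): builds the repeated
# [[datasets]] sections of a multi-resolution dataset configuration file.
def build_datasets_toml(image_dir, resolutions_and_batch_sizes, num_repeats=1):
    """Return TOML text with one [[datasets]] section per resolution.

    resolutions_and_batch_sizes: list of ((width, height), batch_size) pairs.
    Assumption: image_dir needs no TOML escaping (no quotes or backslashes).
    """
    blocks = []
    for (width, height), batch_size in resolutions_and_batch_sizes:
        lines = [
            "[[datasets]]",
            f"batch_size = {batch_size}",
            "enable_bucket = true",
            f"resolution = [{width}, {height}]",
            "[[datasets.subsets]]",
            # The same directory is used for every resolution.
            f'image_dir = "{image_dir}"',
            f"num_repeats = {num_repeats}",
        ]
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    settings = [((1024, 1024), 2), ((768, 768), 3), ((512, 512), 4)]
    print(build_datasets_toml("path/to/image/dir", settings))
```

The output can be appended after your own `[general]` section. The batch sizes here simply mirror the example above; adjust them to your VRAM.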
## SD3 training