feat: support block_to_swap for FLUX.1 ControlNet training

Kohya S
2024-12-03 08:43:26 +09:00
parent e369b9a252
commit 8b36d907d8
2 changed files with 45 additions and 14 deletions

@@ -14,6 +14,11 @@ The command to install PyTorch is as follows:
### Recent Updates
Dec 3, 2024:
- `--blocks_to_swap` now works in FLUX.1 ControlNet training. Sample commands for 24GB VRAM and 16GB VRAM are added [here](#flux1-controlnet-training).
Dec 2, 2024:
- FLUX.1 ControlNet training is supported. PR [#1813](https://github.com/kohya-ss/sd-scripts/pull/1813). Thanks to minux302! See PR and [here](#flux1-controlnet-training) for details.
@@ -276,6 +281,14 @@ accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 flux_tr
--timestep_sampling shift --discrete_flow_shift 3.1582 --model_prediction_type raw --guidance_scale 1.0 --deepspeed
```
For 24GB VRAM GPUs, you can train with 16 blocks swapped, caching latents and text encoder outputs to disk, with a batch size of 1. Remove `--deepspeed`. A sample set of options is below. Not fully tested.
```
--blocks_to_swap 16 --cache_latents_to_disk --cache_text_encoder_outputs_to_disk
```
Training is also possible on 16GB VRAM GPUs with around 30 blocks swapped.
`--gradient_accumulation_steps` is also available. The default value is 1 (no accumulation), but the original PR uses 8.
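Following the same pattern as the 24GB example above, a sketch of the extra options for a ~16GB VRAM run might look like the following (the value 30 for swapped blocks and 8 for gradient accumulation are taken from the text above; this combination is an assumption and has not been verified):

```
--blocks_to_swap 30 --cache_latents_to_disk --cache_text_encoder_outputs_to_disk --gradient_accumulation_steps 8
```

As with the 24GB case, these options are appended to the base training command shown earlier.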
### FLUX.1 OFT training