Commit Graph

404 Commits

Author SHA1 Message Date
rockerBOO
0522070d19 Fix training, validation split, revert to using upstream implemenation 2025-01-03 15:20:25 -05:00
rockerBOO
58bfa36d02 Add seed help clarifying info 2025-01-03 02:00:28 -05:00
rockerBOO
534059dea5 Typos and lingering is_train 2025-01-03 01:18:15 -05:00
rockerBOO
d23c7322ee Merge remote-tracking branch 'hina/feature/val-loss' into validation-loss-upstream
Modified implementation for process_batch and cleanup validation
recording
2025-01-03 00:48:08 -05:00
rockerBOO
7f6e124c7c Merge branch 'gesen2egee/val' into validation-loss-upstream
Modified various implementations to restore original behavior
2025-01-02 23:04:38 -05:00
gesen2egee
8743532963 val 2025-01-02 15:57:12 -05:00
Hina Chen
cb89e0284e Change val latent loss compare 2024-12-28 11:57:04 +08:00
Hina Chen
64bd5317dc Split val latents/batch and pick up val latents shape size which equal to training batch. 2024-12-28 11:42:15 +08:00
Hina Chen
62164e5792 Change val loss calculate method 2024-12-27 17:28:05 +08:00
Hina Chen
05bb9183fa Add Validation loss for LoRA training 2024-12-27 16:47:59 +08:00
Kohya S.
14c9ba925f Merge pull request #1811 from rockerBOO/schedule-free-prodigy
Allow unknown schedule-free optimizers to continue to module loader
2024-12-01 21:51:25 +09:00
Kohya S
cc11989755 fix: refactor huber-loss calculation in multiple training scripts 2024-12-01 21:20:28 +09:00
rockerBOO
6593cfbec1 Fix d * lr step log 2024-11-29 14:16:24 -05:00
rockerBOO
87f5224e2d Support d*lr for ProdigyPlus optimizer 2024-11-29 14:16:00 -05:00
recris
420a180d93 Implement pseudo Huber loss for Flux and SD3 2024-11-27 18:37:09 +00:00
Kohya S
2cb7a6db02 feat: add block swap for FLUX.1/SD3 LoRA training 2024-11-12 21:39:13 +09:00
Kohya S
cde90b8903 feat: implement block swapping for FLUX.1 LoRA (WIP) 2024-11-12 08:49:05 +09:00
kohya-ss
1065dd1b56 Fix to work dropout_rate for TEs 2024-10-27 19:36:36 +09:00
kohya-ss
a1255d637f Fix SD3 LoRA training to work (WIP) 2024-10-27 17:03:36 +09:00
Kohya S
db2b4d41b9 Add dropout rate arguments for CLIP-L, CLIP-G, and T5, fix Text Encoders LoRA not trained 2024-10-27 16:42:58 +09:00
kohya-ss
d2c549d7b2 support SD3 LoRA 2024-10-25 21:58:31 +09:00
Kohya S
5fba6f514a Merge branch 'dev' into sd3 2024-10-25 19:03:27 +09:00
catboxanon
e1b63c2249 Only add warning for deprecated scaling vpred loss function 2024-10-21 08:12:53 -04:00
catboxanon
8fc30f8205 Fix training for V-pred and ztSNR
1) Updates debiased estimation loss function for V-pred.
2) Prevents now-deprecated scaling of loss if ztSNR is enabled.
2024-10-21 07:34:33 -04:00
Kohya S
3cc5b8db99 Diff Output Preserv loss for SDXL 2024-10-18 20:57:13 +09:00
kohya-ss
c80c304779 Refactor caching in train scripts 2024-10-12 20:18:41 +09:00
kohya-ss
ff4083b910 Merge branch 'sd3' into multi-gpu-caching 2024-10-12 16:39:36 +09:00
Kohya S
886f75345c support weighted captions for sdxl LoRA and fine tuning 2024-10-10 08:27:15 +09:00
Kohya S
ba08a89894 call optimizer eval/train for sample_at_first, also set train after resuming closes #1667 2024-10-04 20:35:16 +09:00
gesen2egee
3028027e07 Update train_network.py 2024-10-04 16:41:41 +08:00
Kohya S
56a63f01ae Merge branch 'sd3' into multi-gpu-caching 2024-09-29 10:12:18 +09:00
Kohya S
d050638571 Merge branch 'dev' into sd3 2024-09-29 10:00:01 +09:00
Kohya S
fe2aa32484 adjust min/max bucket reso divisible by reso steps #1632 2024-09-29 09:49:25 +09:00
kohya-ss
9249d00311 experimental support for multi-gpus latents caching 2024-09-26 22:19:56 +09:00
Kohya S
583d4a436c add compatibility for int LR (D-Adaptation etc.) #1620 2024-09-20 22:22:24 +09:00
Akegarasu
0535cd29b9 fix: backward compatibility for text_encoder_lr 2024-09-20 10:05:22 +08:00
Kohya S
1286e00bb0 fix to call train/eval in schedulefree #1605 2024-09-18 21:31:54 +09:00
Plat
a823fd9fb8 Improve wandb logging (#1576)
* fix: wrong training steps were recorded to wandb, and no log was sent when logging_dir was not specified

* fix: checking of whether wandb is enabled

* feat: log images to wandb with their positive prompt as captions

* feat: logging sample images' caption for sd3 and flux

* fix: import wandb before use
2024-09-11 22:21:16 +09:00
Kohya S
d10ff62a78 support individual LR for CLIP-L/T5XXL 2024-09-10 20:32:09 +09:00
Kohya S
2889108d85 feat: Add --cpu_offload_checkpointing option to LoRA training 2024-09-05 20:58:33 +09:00
Kohya S
b65ae9b439 T5XXL LoRA training, fp8 T5XXL support 2024-09-04 21:33:17 +09:00
Akegarasu
35882f8d5b fix 2024-08-29 23:03:43 +08:00
Akegarasu
34f2315047 fix: text_encoder_conds referenced before assignment 2024-08-29 22:33:37 +08:00
Kohya S
0087a46e14 FLUX.1 LoRA supports CLIP-L 2024-08-27 19:59:40 +09:00
Kohya S
9e72be0a13 Fix debug_dataset to work 2024-08-20 08:19:00 +09:00
Kohya S.
e2d822cad7 Merge pull request #1452 from fireicewolf/sd3-devel
Fix AttributeError: 'T5EncoderModel' object has no attribute 'text_model', while loading T5 model in GPU.
2024-08-15 21:12:19 +09:00
Kohya S
7db4222119 add sample image generation during training 2024-08-14 22:15:26 +09:00
DukeG
9760d097b0 Fix AttributeError: 'T5EncoderModel' object has no attribute 'text_model'
While loading T5 model in GPU.
2024-08-14 19:58:54 +08:00
Kohya S
8a0f12dde8 update FLUX LoRA training 2024-08-10 23:42:05 +09:00
Kohya S
36b2e6fc28 add FLUX.1 LoRA training 2024-08-09 22:56:48 +09:00