Commit Graph

404 Commits

Author SHA1 Message Date
rockerBOO
b4a89c3cdf Fix None 2025-05-01 02:03:22 -04:00
rockerBOO
f62c68df3c Make grad_norm and combined_grad_norm None is not recording 2025-05-01 01:37:57 -04:00
Kohya S
5a18a03ffc Merge branch 'dev' into sd3 2025-04-07 21:55:17 +09:00
Kohya S
d0b5c0e5cf chore: formatting, add TODO comment 2025-03-30 21:15:37 +09:00
Kohya S.
59d98e45a9 Merge pull request #1974 from rockerBOO/lora-ggpo
Add LoRA-GGPO for Flux
2025-03-30 21:07:31 +09:00
Kohya S.
93a4efabb5 Merge branch 'sd3' into resize-interpolation 2025-03-30 19:30:56 +09:00
DKnight54
381303d64f Update train_network.py 2025-03-29 02:26:18 +08:00
rockerBOO
0181b7a042 Remove progress bar avg norms 2025-03-27 03:28:33 -04:00
rockerBOO
3647d065b5 Cache weight norms estimate on initialization. Move to update norms every step 2025-03-18 14:25:09 -04:00
rockerBOO
ea53290f62 Add LoRA-GGPO for Flux 2025-03-06 00:00:38 -05:00
Kohya S
1fcac98280 Merge branch 'sd3' into val-loss-improvement 2025-02-26 21:09:10 +09:00
Kohya S.
6e90c0f86c Merge pull request #1909 from rockerBOO/progress_bar
Move progress bar to account for sampling image first
2025-02-24 18:57:44 +09:00
Kohya S
efb2a128cd fix wandb val logging 2025-02-21 22:07:35 +09:00
rockerBOO
ca1c129ffd Fix metadata 2025-02-19 14:20:40 -05:00
rockerBOO
7729c4c8f9 Add metadata 2025-02-19 14:20:40 -05:00
Kohya S
4a36996134 modify log step calculation 2025-02-18 22:05:08 +09:00
Kohya S
dc7d5fb459 Merge branch 'sd3' into val-loss-improvement 2025-02-18 21:34:30 +09:00
rockerBOO
4671e23778 Fix validation epoch loss to check epoch average 2025-02-16 01:42:44 -05:00
Kohya S
63337d9fe4 Merge branch 'sd3' into val-loss-improvement 2025-02-15 21:41:07 +09:00
rockerBOO
ab88b431b0 Fix validation epoch divergence 2025-02-14 11:14:38 -05:00
Kohya S
76b761943b fix: simplify validation step condition in NetworkTrainer 2025-02-11 21:53:57 +09:00
Kohya S
177203818a fix: unpause training progress bar after vaidation 2025-02-11 21:42:46 +09:00
Kohya S
344845b429 fix: validation with block swap 2025-02-09 21:25:40 +09:00
Kohya S
0911683717 set python random state 2025-02-09 20:53:49 +09:00
Kohya S
c5b803ce94 rng state management: Implement functions to get and set RNG states for consistent validation 2025-02-04 21:59:09 +09:00
rockerBOO
de830b8941 Move progress bar to account for sampling image first 2025-01-29 00:02:45 -05:00
Kohya S
45ec02b2a8 use same noise for every validation 2025-01-27 22:10:38 +09:00
Kohya S
0778dd9b1d fix Text Encoder only LoRA training 2025-01-27 22:03:42 +09:00
Kohya S
0750859133 validation: Implement timestep-based validation processing 2025-01-27 21:56:59 +09:00
Kohya S
29f31d005f add network.train()/eval() for validation 2025-01-27 21:35:43 +09:00
Kohya S
b6a3093216 call optimizer eval/train fn before/after validation 2025-01-27 21:22:11 +09:00
Kohya S
86a2f3fd26 Fix gradient handling when Text Encoders are trained 2025-01-27 21:10:52 +09:00
Kohya S
532f5c58a6 formatting 2025-01-27 20:50:42 +09:00
rockerBOO
c04e5dfe92 Fix loss recorder on 0. Fix validation for cached runs. Assert on validation dataset 2025-01-23 09:57:24 -05:00
rockerBOO
25929dd0d7 Remove Validating... print to fix output layout 2025-01-12 15:38:57 -05:00
rockerBOO
ee9265cf26 Fix validate_every_n_steps for gradient accumulation 2025-01-12 14:56:35 -05:00
rockerBOO
0456858992 Fix validate_every_n_steps always running first step 2025-01-12 14:47:49 -05:00
rockerBOO
2bbb40ce51 Fix regularization images with validation
Adding metadata recording for validation arguments
Add comments about the validation split for clarity of intention
2025-01-12 14:29:50 -05:00
rockerBOO
4c61adc996 Add divergence to logs
Divergence is the difference between training and validation to
allow a clear value to indicate the difference between the two
in the logs.
2025-01-12 13:18:26 -05:00
rockerBOO
1e61392cf2 Revert bucket_reso_steps to correct 64 2025-01-08 18:43:26 -05:00
rockerBOO
556f3f1696 Fix documentation, remove unused function, fix bucket reso for sd1.5, fix multiple datasets 2025-01-08 13:41:15 -05:00
rockerBOO
1231f5114c Remove unused train_util code, fix accelerate.log for wandb, add init_trackers library code 2025-01-07 22:31:41 -05:00
rockerBOO
742bee9738 Set validation steps in multiple lines for readability 2025-01-06 17:34:23 -05:00
rockerBOO
fcb2ff010c Clean up some validation help documentation 2025-01-06 11:39:32 -05:00
rockerBOO
f8850296c8 Fix validate epoch, cleanup imports 2025-01-06 11:34:10 -05:00
rockerBOO
c64d1a22fc Add validate_every_n_epochs, change name validate_every_n_steps 2025-01-06 11:30:21 -05:00
rockerBOO
1c63e7cc49 Cleanup unused code and formatting 2025-01-06 11:07:47 -05:00
rockerBOO
bbf6bbd5ea Use self.get_noise_pred_and_target and drop fixed timesteps 2025-01-06 10:48:38 -05:00
rockerBOO
1c0ae306e5 Add missing functions for training batch 2025-01-03 15:43:02 -05:00
rockerBOO
1f9ba40b8b Add step break for validation epoch. Remove unused variable 2025-01-03 15:32:07 -05:00