Commit Graph

389 Commits

Author SHA1 Message Date
Kohya S
efb2a128cd fix wandb val logging 2025-02-21 22:07:35 +09:00
Kohya S
4a36996134 modify log step calculation 2025-02-18 22:05:08 +09:00
Kohya S
dc7d5fb459 Merge branch 'sd3' into val-loss-improvement 2025-02-18 21:34:30 +09:00
rockerBOO
4671e23778 Fix validation epoch loss to check epoch average 2025-02-16 01:42:44 -05:00
Kohya S
63337d9fe4 Merge branch 'sd3' into val-loss-improvement 2025-02-15 21:41:07 +09:00
rockerBOO
ab88b431b0 Fix validation epoch divergence 2025-02-14 11:14:38 -05:00
Kohya S
76b761943b fix: simplify validation step condition in NetworkTrainer 2025-02-11 21:53:57 +09:00
Kohya S
177203818a fix: unpause training progress bar after validation 2025-02-11 21:42:46 +09:00
Kohya S
344845b429 fix: validation with block swap 2025-02-09 21:25:40 +09:00
Kohya S
0911683717 set python random state 2025-02-09 20:53:49 +09:00
Kohya S
c5b803ce94 rng state management: Implement functions to get and set RNG states for consistent validation 2025-02-04 21:59:09 +09:00
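The two commits above make validation deterministic by snapshotting and restoring RNG state around the validation pass, so validation always sees the same noise without perturbing the training RNG stream. A minimal sketch of the idea, using only Python's `random` module (the actual implementation would also need to cover NumPy and torch CPU/CUDA states; the function names here are illustrative, not the repository's):

```python
import random

def get_rng_state():
    # Snapshot the current Python RNG state. A full implementation would also
    # capture numpy.random and torch CPU/CUDA generator states.
    return random.getstate()

def set_rng_state(state):
    # Restore a previously captured Python RNG state.
    random.setstate(state)

# Snapshot training RNG, run validation with a fixed seed, then restore,
# so validation noise is identical every time and training is unaffected.
train_state = get_rng_state()

random.seed(42)                                   # fixed validation seed
val_noise_a = [random.random() for _ in range(3)]
random.seed(42)                                   # same seed -> same "noise"
val_noise_b = [random.random() for _ in range(3)]

set_rng_state(train_state)                        # training RNG resumes as if
                                                  # validation never ran
```

Because the training state is restored afterward, repeated validations neither diverge from each other nor shift the training noise schedule.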
Kohya S
45ec02b2a8 use same noise for every validation 2025-01-27 22:10:38 +09:00
Kohya S
0778dd9b1d fix Text Encoder only LoRA training 2025-01-27 22:03:42 +09:00
Kohya S
0750859133 validation: Implement timestep-based validation processing 2025-01-27 21:56:59 +09:00
Kohya S
29f31d005f add network.train()/eval() for validation 2025-01-27 21:35:43 +09:00
Kohya S
b6a3093216 call optimizer eval/train fn before/after validation 2025-01-27 21:22:11 +09:00
Kohya S
86a2f3fd26 Fix gradient handling when Text Encoders are trained 2025-01-27 21:10:52 +09:00
Kohya S
532f5c58a6 formatting 2025-01-27 20:50:42 +09:00
rockerBOO
c04e5dfe92 Fix loss recorder on 0. Fix validation for cached runs. Assert on validation dataset 2025-01-23 09:57:24 -05:00
rockerBOO
25929dd0d7 Remove Validating... print to fix output layout 2025-01-12 15:38:57 -05:00
rockerBOO
ee9265cf26 Fix validate_every_n_steps for gradient accumulation 2025-01-12 14:56:35 -05:00
rockerBOO
0456858992 Fix validate_every_n_steps always running first step 2025-01-12 14:47:49 -05:00
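The two `validate_every_n_steps` fixes above address when the step-based validation trigger fires: it must not run on the very first step (before any optimizer update) and must count optimizer steps rather than micro-batches under gradient accumulation. A hedged sketch of such a condition (the helper name is hypothetical, not the repository's API):

```python
def should_validate(global_step: int, validate_every_n_steps: int) -> bool:
    # global_step counts completed optimizer steps (already divided by the
    # gradient-accumulation factor), so accumulation micro-batches cannot
    # trigger extra validations. Step 0 is excluded so validation never
    # fires before the first optimizer update.
    return global_step != 0 and global_step % validate_every_n_steps == 0
```

For example, with `validate_every_n_steps=100`, validation runs at steps 100, 200, 300, … but not at step 0.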
rockerBOO
2bbb40ce51 Fix regularization images with validation
Adding metadata recording for validation arguments
Add comments about the validation split for clarity of intention
2025-01-12 14:29:50 -05:00
rockerBOO
4c61adc996 Add divergence to logs
Divergence is the difference between training and validation to
allow a clear value to indicate the difference between the two
in the logs.
2025-01-12 13:18:26 -05:00
rockerBOO
1e61392cf2 Revert bucket_reso_steps to correct 64 2025-01-08 18:43:26 -05:00
rockerBOO
556f3f1696 Fix documentation, remove unused function, fix bucket reso for sd1.5, fix multiple datasets 2025-01-08 13:41:15 -05:00
rockerBOO
1231f5114c Remove unused train_util code, fix accelerate.log for wandb, add init_trackers library code 2025-01-07 22:31:41 -05:00
rockerBOO
742bee9738 Set validation steps in multiple lines for readability 2025-01-06 17:34:23 -05:00
rockerBOO
fcb2ff010c Clean up some validation help documentation 2025-01-06 11:39:32 -05:00
rockerBOO
f8850296c8 Fix validate epoch, cleanup imports 2025-01-06 11:34:10 -05:00
rockerBOO
c64d1a22fc Add validate_every_n_epochs, change name validate_every_n_steps 2025-01-06 11:30:21 -05:00
rockerBOO
1c63e7cc49 Cleanup unused code and formatting 2025-01-06 11:07:47 -05:00
rockerBOO
bbf6bbd5ea Use self.get_noise_pred_and_target and drop fixed timesteps 2025-01-06 10:48:38 -05:00
rockerBOO
1c0ae306e5 Add missing functions for training batch 2025-01-03 15:43:02 -05:00
rockerBOO
1f9ba40b8b Add step break for validation epoch. Remove unused variable 2025-01-03 15:32:07 -05:00
rockerBOO
0522070d19 Fix training, validation split, revert to using upstream implementation 2025-01-03 15:20:25 -05:00
rockerBOO
58bfa36d02 Add seed help clarifying info 2025-01-03 02:00:28 -05:00
rockerBOO
534059dea5 Typos and lingering is_train 2025-01-03 01:18:15 -05:00
rockerBOO
d23c7322ee Merge remote-tracking branch 'hina/feature/val-loss' into validation-loss-upstream
Modified implementation for process_batch and cleanup validation
recording
2025-01-03 00:48:08 -05:00
rockerBOO
7f6e124c7c Merge branch 'gesen2egee/val' into validation-loss-upstream
Modified various implementations to restore original behavior
2025-01-02 23:04:38 -05:00
gesen2egee
8743532963 val 2025-01-02 15:57:12 -05:00
Hina Chen
cb89e0284e Change val latent loss compare 2024-12-28 11:57:04 +08:00
Hina Chen
64bd5317dc Split val latents/batch and pick up val latents whose shape equals the training batch. 2024-12-28 11:42:15 +08:00
Hina Chen
62164e5792 Change val loss calculate method 2024-12-27 17:28:05 +08:00
Hina Chen
05bb9183fa Add Validation loss for LoRA training 2024-12-27 16:47:59 +08:00
Kohya S.
14c9ba925f Merge pull request #1811 from rockerBOO/schedule-free-prodigy
Allow unknown schedule-free optimizers to continue to module loader
2024-12-01 21:51:25 +09:00
Kohya S
cc11989755 fix: refactor huber-loss calculation in multiple training scripts 2024-12-01 21:20:28 +09:00
rockerBOO
6593cfbec1 Fix d * lr step log 2024-11-29 14:16:24 -05:00
rockerBOO
87f5224e2d Support d*lr for ProdigyPlus optimizer 2024-11-29 14:16:00 -05:00
recris
420a180d93 Implement pseudo Huber loss for Flux and SD3 2024-11-27 18:37:09 +00:00