Commit Graph

2230 Commits

Author SHA1 Message Date
rockerBOO
415233993a Spelling 2025-06-03 15:17:00 -04:00
rockerBOO
429b2abaf3 Merge branch 'sd3' into po 2025-06-03 15:15:23 -04:00
rockerBOO
4f27c6a0c9 Add BPO, CPO, DDO, SDPO, SimPO
Refactor Preference Optimization
Refactor preference dataset
Add iterator support for ImageInfo and ImageSetInfo
- Supporting iterating through either ImageInfo or ImageSetInfo to
  clean up preference dataset implementation and support 2 or more
  images more cleanly without needing to duplicate code
Add tests for all PO functions
Add metrics for process_batch
Add losses for gradient manipulation of loss parts
Add normalizing gradient for stabilizing gradients

Args added:

mapo_beta = 0.05
cpo_beta = 0.1
bpo_beta = 0.1
bpo_lambda = 0.2
sdpo_beta = 0.02
simpo_gamma_beta_ratio = 0.25
simpo_beta = 2.0
simpo_smoothing = 0.0
simpo_loss_type = "sigmoid"
ddo_alpha = 4.0
ddo_beta = 0.05
2025-06-03 15:09:48 -04:00
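The arguments listed in commit 4f27c6a0c9 could be registered on a command-line parser roughly as follows. This is a minimal sketch: only the names and default values come from the commit message. The `--` flag spelling, the inferred types, and the helper name `add_preference_optimization_args` are illustrative assumptions, not the repository's actual code.

```python
import argparse

# Names and defaults copied from the "Args added" list in commit 4f27c6a0c9.
# Everything else here (flag spelling, types, grouping) is an assumption.
PO_DEFAULTS = {
    "mapo_beta": 0.05,
    "cpo_beta": 0.1,
    "bpo_beta": 0.1,
    "bpo_lambda": 0.2,
    "sdpo_beta": 0.02,
    "simpo_gamma_beta_ratio": 0.25,
    "simpo_beta": 2.0,
    "simpo_smoothing": 0.0,
    "simpo_loss_type": "sigmoid",
    "ddo_alpha": 4.0,
    "ddo_beta": 0.05,
}


def add_preference_optimization_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Hypothetical helper: register one option per listed argument."""
    group = parser.add_argument_group("preference optimization (sketch)")
    for name, default in PO_DEFAULTS.items():
        # Infer the option type from the default value (float or str).
        group.add_argument(f"--{name}", type=type(default), default=default)
    return parser


parser = add_preference_optimization_args(argparse.ArgumentParser())
args = parser.parse_args(["--ddo_alpha", "2.0"])  # override one value
print(args.ddo_alpha, args.simpo_beta, args.simpo_loss_type)
```

Keeping the defaults in a single mapping makes it easy to compare them against the commit message when the values change.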
Kohya S.
5b38d07f03 Merge pull request #2073 from rockerBOO/fix-mean-grad-norms
Fix mean grad norms
2025-05-11 21:32:34 +09:00
rockerBOO
971387ea8c Fix DDO arguments 2025-05-04 22:19:39 -04:00
rockerBOO
fe497291b5 Fix names 2025-05-04 21:27:51 -04:00
rockerBOO
e4bdffd128 Update diffusion_dpo, MaPO tests. Fix diffusion_dpo/MaPO 2025-05-04 21:19:45 -04:00
rockerBOO
d8716a9cb9 Rework DDO loss 2025-05-02 02:07:53 -04:00
Kohya S.
e2ed265104 Merge pull request #2072 from rockerBOO/pytest-pythonpath
Add pythonpath to pytest.ini
2025-05-01 23:38:29 +09:00
Kohya S.
e85813200a Merge pull request #2074 from kohya-ss/deepspeed-readme
Deepspeed readme
2025-05-01 23:34:41 +09:00
Kohya S
a27ace74d9 doc: add DeepSpeed installation in header section 2025-05-01 23:31:23 +09:00
Kohya S
865c8d55e2 README.md: Update recent updates and add DeepSpeed installation instructions 2025-05-01 23:29:19 +09:00
Kohya S.
7c075a9c8d Merge pull request #2060 from saibit-tech/sd3
Fix: try aligning dtype of matrixes when training with deepspeed and mixed-precision is set to bf16 or fp16
2025-05-01 23:20:17 +09:00
rockerBOO
b4a89c3cdf Fix None 2025-05-01 02:03:22 -04:00
rockerBOO
f62c68df3c Make grad_norm and combined_grad_norm None if not recording 2025-05-01 01:37:57 -04:00
rockerBOO
a4fae93dce Add pythonpath to pytest.ini 2025-05-01 00:55:10 -04:00
rockerBOO
e61dd14203 Formatting 2025-04-30 19:58:05 -04:00
rockerBOO
22447ebc76 Use mean, use ddo_loss 2025-04-30 19:46:44 -04:00
sharlynxy
1684ababcd remove deepspeed from requirements.txt 2025-04-30 19:51:09 +08:00
rockerBOO
9a2101a040 Add DDO loss 2025-04-30 03:34:19 -04:00
Kohya S
64430eb9b2 Merge branch 'dev' into sd3 2025-04-29 21:30:57 +09:00
Kohya S
d8717a3d1c Merge branch 'main' into dev 2025-04-29 21:30:33 +09:00
Kohya S.
a21b6a917e Merge pull request #2070 from kohya-ss/fix-mean-ar-error-nan
Fix mean image aspect ratio error calculation to avoid NaN values
2025-04-29 21:29:42 +09:00
Kohya S
4625b34f4e Fix mean image aspect ratio error calculation to avoid NaN values 2025-04-29 21:27:04 +09:00
rockerBOO
8e8243a423 Add DDO preference optimization 2025-04-28 22:37:44 -04:00
rockerBOO
d23e15ac5c Fix remaining test 2025-04-28 16:14:10 -04:00
rockerBOO
10ce29f4fe Fix timestep/timestep refactor 2025-04-28 16:11:12 -04:00
rockerBOO
61e3083945 Typo 2025-04-28 16:05:48 -04:00
rockerBOO
78a29467f0 Merge branch 'sd3' into po 2025-04-27 17:41:03 -04:00
rockerBOO
d22c827544 Update PO cached latents, move out functions, update calls 2025-04-27 17:38:50 -04:00
Kohya S.
80320d21fe Merge pull request #2066 from kohya-ss/quick-fix-flux-sampling-scales
Quick fix flux sampling scales
2025-04-27 23:39:47 +09:00
Kohya S
29523c9b68 docs: add note for user feedback on CFG scale in FLUX.1 training 2025-04-27 23:34:37 +09:00
Kohya S
fd3a445769 fix: revert default emb guidance scale and CFG scale for FLUX.1 sampling 2025-04-27 22:50:27 +09:00
Kohya S
13296ae93b Merge branch 'sd3' of https://github.com/kohya-ss/sd-scripts into sd3 2025-04-27 21:48:03 +09:00
Kohya S
0e8ac43760 Merge branch 'dev' into sd3 2025-04-27 21:47:58 +09:00
Kohya S
bc9252cc1b Merge branch 'main' into dev 2025-04-27 21:47:39 +09:00
Kohya S.
3b25de1f17 Merge pull request #2065 from kohya-ss/kohya-ss-funding-yml
Create FUNDING.yml
2025-04-27 21:29:44 +09:00
Kohya S.
f0b07c52ab Create FUNDING.yml 2025-04-27 21:28:38 +09:00
Kohya S.
309c44bdf2 Merge pull request #2064 from kohya-ss/flux-sample-cfg
Add CFG for sampling in training with FLUX.1
2025-04-27 18:35:45 +09:00
Kohya S
8387e0b95c docs: update README to include CFG scale support in FLUX.1 training 2025-04-27 18:25:59 +09:00
Kohya S
5c50cdbb44 Merge branch 'sd3' into flux-sample-cfg 2025-04-27 17:59:26 +09:00
saibit
46ad3be059 update deepspeed wrapper 2025-04-24 11:26:36 +08:00
sharlynxy
abf2c44bc5 Dynamically set device in deepspeed wrapper (#2)
* get device type from model

* add logger warning

* format

* format

* format
2025-04-23 18:57:19 +08:00
saibit
adb775c616 Update: requirement diffusers[torch]==0.25.0 2025-04-23 17:05:20 +08:00
Kohya S
b11c053b8f Merge branch 'dev' into sd3 2025-04-22 21:48:24 +09:00
Kohya S.
c46f08a87a Merge pull request #2053 from GlenCarpenter/main
fix: update hf_hub_download parameters to fix wd14 tagger regression
2025-04-22 21:47:29 +09:00
sharlynxy
0d9da0ea71 Merge pull request #1 from saibit-tech/dev/xy/align_dtype_using_mixed_precision
Fix: try aligning dtype of matrixes when training with deepspeed and mixed-precision is set to bf16 or fp16
2025-04-22 16:37:33 +08:00
Robert
f501209c37 Merge branch 'dev/xy/align_dtype_using_mixed_precision' of github.com:saibit-tech/sd-scripts into dev/xy/align_dtype_using_mixed_precision 2025-04-22 16:19:52 +08:00
Robert
c8af252a44 refactor 2025-04-22 16:19:14 +08:00
saibit
7f984f4775 # 2025-04-22 16:15:12 +08:00