rockerBOO
4f27c6a0c9
Add BPO, CPO, DDO, SDPO, SimPO
...
Refactor Preference Optimization
Refactor preference dataset
Add iterator support for ImageInfo and ImageSetInfo
- Support iterating over either ImageInfo or ImageSetInfo to clean up
the preference dataset implementation and handle two or more images
cleanly without duplicating code
Add tests for all PO functions
Add metrics for process_batch
Add losses that manipulate the gradients of individual loss components
Add gradient normalization to stabilize gradients
Args added:
mapo_beta = 0.05
cpo_beta = 0.1
bpo_beta = 0.1
bpo_lambda = 0.2
sdpo_beta = 0.02
simpo_gamma_beta_ratio = 0.25
simpo_beta = 2.0
simpo_smoothing = 0.0
simpo_loss_type = "sigmoid"
ddo_alpha = 4.0
ddo_beta = 0.05
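For reference, the listed simpo_* args appear to map onto the SimPO formulation (a reward margin over length-normalized log-probabilities). A minimal scalar sketch, assuming that mapping; the argument names below mirror the args above but the exact wiring is an assumption:

```python
import math

def simpo_loss(avg_logp_chosen, avg_logp_rejected,
               beta=2.0, gamma_beta_ratio=0.25, smoothing=0.0):
    """Sigmoid-type SimPO loss on length-normalized log-probs.

    gamma is expressed as a fraction of beta, matching how
    simpo_gamma_beta_ratio / simpo_beta are listed above
    (hypothetical mapping, not confirmed by the commit).
    """
    gamma = gamma_beta_ratio * beta
    # reward margin: chosen must beat rejected by at least gamma
    margin = beta * (avg_logp_chosen - avg_logp_rejected) - gamma
    log_sig = math.log(1.0 / (1.0 + math.exp(-margin)))
    log_sig_neg = math.log(1.0 / (1.0 + math.exp(margin)))
    # smoothing interpolates between the two sigmoid terms
    return -(1.0 - smoothing) * log_sig - smoothing * log_sig_neg
```

A preferred sample with a higher average log-prob yields a smaller loss than the reversed pair, which is the property the preference losses above rely on.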
2025-06-03 15:09:48 -04:00
rockerBOO
d8716a9cb9
Rework DDO loss
2025-05-02 02:07:53 -04:00
rockerBOO
9a2101a040
Add DDO loss
2025-04-30 03:34:19 -04:00
rockerBOO
d22c827544
Update PO cached latents, move out functions, update calls
2025-04-27 17:38:50 -04:00
Kohya S
cd80752175
fix: remove unused parameter 'accelerator' from encode_images_to_latents method
2025-02-11 21:42:58 +09:00
Kohya S
344845b429
fix: validation with block swap
2025-02-09 21:25:40 +09:00
Kohya S
45ec02b2a8
use same noise for every validation
2025-01-27 22:10:38 +09:00
Kohya S
42c0a9e1fc
Merge branch 'sd3' into val-loss-improvement
2025-01-27 22:06:18 +09:00
Kohya S
0778dd9b1d
fix Text Encoder only LoRA training
2025-01-27 22:03:42 +09:00
Kohya S
86a2f3fd26
Fix gradient handling when Text Encoders are trained
2025-01-27 21:10:52 +09:00
rockerBOO
c04e5dfe92
Fix loss recorder on 0. Fix validation for cached runs. Assert on validation dataset
2025-01-23 09:57:24 -05:00
rockerBOO
bbf6bbd5ea
Use self.get_noise_pred_and_target and drop fixed timesteps
2025-01-06 10:48:38 -05:00
Kohya S.
09a3740f6c
Merge pull request #1813 from minux302/flux-controlnet
...
Add Flux ControlNet
2024-12-02 23:32:16 +09:00
minux302
f40632bac6
rm redundant arg
2024-11-30 00:15:47 +09:00
minux302
9dff44d785
fix device
2024-11-29 14:40:38 +00:00
recris
420a180d93
Implement pseudo Huber loss for Flux and SD3
2024-11-27 18:37:09 +00:00
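The pseudo-Huber loss referenced above is quadratic near zero and linear for large residuals, which damps the influence of outlier noise predictions. A minimal scalar sketch; the transition-scale parameter `c` and the 2c normalization are assumptions for illustration, not taken from the commit:

```python
import math

def pseudo_huber_loss(pred, target, c=0.1):
    """Pseudo-Huber loss: ~ diff**2 for |diff| << c, ~ 2*c*|diff| for |diff| >> c.

    `c` sets the transition scale between the quadratic and linear
    regimes (hypothetical default).
    """
    diff = pred - target
    return 2.0 * c * (math.sqrt(diff * diff + c * c) - c)
```

For a small residual this tracks the squared error, while a large residual grows only linearly instead of quadratically.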
Kohya S
2a188f07e6
Fix DOP to work with block swap
2024-11-17 16:12:10 +09:00
minux302
42f6edf3a8
fix for adding controlnet
2024-11-15 23:48:51 +09:00
minux302
ccfaa001e7
add flux controlnet base module
2024-11-15 20:21:28 +09:00
Kohya S
5c5b544b91
refactor: remove unused prepare_split_model method from FluxNetworkTrainer
2024-11-14 19:35:43 +09:00
Kohya S
2cb7a6db02
feat: add block swap for FLUX.1/SD3 LoRA training
2024-11-12 21:39:13 +09:00
Kohya S
cde90b8903
feat: implement block swapping for FLUX.1 LoRA (WIP)
2024-11-12 08:49:05 +09:00
kohya-ss
1065dd1b56
Fix dropout_rate to work for TEs
2024-10-27 19:36:36 +09:00
Kohya S
623017f716
refactor SD3 CLIP to transformers etc.
2024-10-24 19:49:28 +09:00
kohya-ss
2c45d979e6
update README, remove unnecessary autocast
2024-10-19 19:21:12 +09:00
kohya-ss
ef70aa7b42
add FLUX.1 support
2024-10-18 23:39:48 +09:00
kohya-ss
5bb9f7fb1a
Merge branch 'sd3' into multi-gpu-caching
2024-10-13 11:52:42 +09:00
Kohya S
e277b5789e
Update FLUX.1 support for compact models
2024-10-12 21:49:07 +09:00
kohya-ss
c80c304779
Refactor caching in train scripts
2024-10-12 20:18:41 +09:00
Kohya S
83e3048cb0
load Diffusers format, check schnell/dev
2024-10-06 21:32:21 +09:00
Kohya S
1a0f5b0c38
re-fix: sample generation not working in FLUX.1 split mode #1647
2024-09-29 00:35:29 +09:00
Kohya S
2889108d85
feat: Add --cpu_offload_checkpointing option to LoRA training
2024-09-05 20:58:33 +09:00
Kohya S
b65ae9b439
T5XXL LoRA training, fp8 T5XXL support
2024-09-04 21:33:17 +09:00
Akegarasu
6c0e8a5a17
keep guidance_scale as float in args
2024-08-29 14:50:29 +08:00
Kohya S
3be712e3e0
feat: Update direct loading fp8 ckpt for LoRA training
2024-08-27 21:40:02 +09:00
Kohya S
0087a46e14
FLUX.1 LoRA supports CLIP-L
2024-08-27 19:59:40 +09:00
Kohya S
2e89cd2cc6
Fix issue with attention mask not being applied in single blocks
2024-08-24 12:39:54 +09:00
kohya-ss
98c91a7625
Fix bug in FLUX multi GPU training
2024-08-22 12:37:41 +09:00
Kohya S
7e459c00b2
Update T5 attention mask handling in FLUX
2024-08-21 08:02:33 +09:00
Kohya S
400955d3ea
add fine tuning FLUX.1 (WIP)
2024-08-17 15:36:18 +09:00
Kohya S
3921a4efda
add t5xxl max token length, support schnell
2024-08-16 17:06:05 +09:00
DukeG
08ef886bfe
Fix AttributeError: 'FluxNetworkTrainer' object has no attribute 'sample_prompts_te_outputs'
...
Move "self.sample_prompts_te_outputs = None" from Line 150 to Line 26.
2024-08-16 11:00:08 +08:00
Kohya S
8aaa1967bd
fix latent encoding, closes #1456
2024-08-15 22:07:23 +09:00
Kohya S
7db4222119
add sample image generation during training
2024-08-14 22:15:26 +09:00
Kohya S
56d7651f08
add experimental split mode for FLUX
2024-08-13 22:28:39 +09:00
Kohya S
d25ae361d0
fix apply_t5_attn_mask to work
2024-08-11 19:07:07 +09:00
Kohya S
8a0f12dde8
update FLUX LoRA training
2024-08-10 23:42:05 +09:00
Kohya S
808d2d1f48
fix typos
2024-08-09 23:02:51 +09:00
Kohya S
36b2e6fc28
add FLUX.1 LoRA training
2024-08-09 22:56:48 +09:00