Commit Graph

379 Commits

Author SHA1 Message Date
kohya-ss
c80c304779 Refactor caching in train scripts 2024-10-12 20:18:41 +09:00
kohya-ss
ff4083b910 Merge branch 'sd3' into multi-gpu-caching 2024-10-12 16:39:36 +09:00
Kohya S
886f75345c support weighted captions for sdxl LoRA and fine tuning 2024-10-10 08:27:15 +09:00
Kohya S
ba08a89894 call optimizer eval/train for sample_at_first; also set train after resuming (closes #1667) 2024-10-04 20:35:16 +09:00
gesen2egee
3028027e07 Update train_network.py 2024-10-04 16:41:41 +08:00
Kohya S
56a63f01ae Merge branch 'sd3' into multi-gpu-caching 2024-09-29 10:12:18 +09:00
Kohya S
d050638571 Merge branch 'dev' into sd3 2024-09-29 10:00:01 +09:00
Kohya S
fe2aa32484 adjust min/max bucket reso to be divisible by reso steps #1632 2024-09-29 09:49:25 +09:00
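A minimal sketch of the divisibility adjustment this commit describes, under the assumption that the minimum is rounded up and the maximum rounded down (the actual rounding direction in sd-scripts is not asserted here); all names are illustrative.

```python
def adjust_bucket_range(min_reso: int, max_reso: int, steps: int) -> tuple[int, int]:
    # Round min up and max down to multiples of the bucket step, so every
    # generated bucket resolution stays divisible by `steps`.
    adjusted_min = ((min_reso + steps - 1) // steps) * steps
    adjusted_max = (max_reso // steps) * steps
    return adjusted_min, adjusted_max

# e.g. with a 64-pixel bucket step, a 300-1000 range becomes 320-960
print(adjust_bucket_range(300, 1000, 64))  # (320, 960)
```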
kohya-ss
9249d00311 experimental support for multi-GPU latents caching 2024-09-26 22:19:56 +09:00
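A guess at how multi-GPU latents caching could be sharded so each process encodes a disjoint subset, sketched with Hugging Face accelerate; the strided sharding scheme and names are assumptions, not the repository's actual implementation.

```python
from accelerate import Accelerator

accelerator = Accelerator()
image_paths = [f"img_{i:04d}.png" for i in range(1000)]  # stand-in file list

# each rank takes a strided shard, so no two processes cache the same image
shard = image_paths[accelerator.process_index :: accelerator.num_processes]
for path in shard:
    pass  # VAE-encode the image and save its latents to disk here

accelerator.wait_for_everyone()  # all shards cached before training begins
```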
Kohya S
583d4a436c add compatibility for int LR (D-Adaptation etc.) #1620 2024-09-20 22:22:24 +09:00
Akegarasu
0535cd29b9 fix: backward compatibility for text_encoder_lr 2024-09-20 10:05:22 +08:00
Kohya S
1286e00bb0 fix to call train/eval in schedulefree #1605 2024-09-18 21:31:54 +09:00
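This fix, like the sample_at_first entry above, concerns schedule-free optimizers, which keep separate training and evaluation parameter states. A minimal sketch of the required pattern, assuming the schedulefree package is installed; the surrounding training loop is illustrative.

```python
import torch
import schedulefree  # pip install schedulefree (assumed available)

model = torch.nn.Linear(8, 8)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)

optimizer.eval()   # switch to eval weights before sampling or validation
# ... generate sample images here (e.g. for sample_at_first) ...
optimizer.train()  # switch back before training steps, also after resuming

for _ in range(10):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 8)).pow(2).mean()
    loss.backward()
    optimizer.step()
```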
Plat
a823fd9fb8 Improve wandb logging (#1576)
* fix: wrong training steps were recorded to wandb, and no log was sent when logging_dir was not specified

* fix: checking of whether wandb is enabled

* feat: log images to wandb with their positive prompt as captions (see the sketch after this entry)

* feat: logging sample images' caption for sd3 and flux

* fix: import wandb before use
2024-09-11 22:21:16 +09:00
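A minimal sketch of the captioned-image logging described in this PR, using the public wandb API; the project name and stand-in image are placeholders, not the script's actual values.

```python
import numpy as np
import wandb  # imported before use, per the last fix in this PR

wandb.init(project="sd-scripts-demo", mode="offline")  # offline: no account needed
image = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)  # stand-in sample
prompt = "a photo of an astronaut riding a horse"  # the positive prompt
wandb.log({"sample": wandb.Image(image, caption=prompt)})
wandb.finish()
```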
Kohya S
d10ff62a78 support individual LR for CLIP-L/T5XXL 2024-09-10 20:32:09 +09:00
Kohya S
2889108d85 feat: Add --cpu_offload_checkpointing option to LoRA training 2024-09-05 20:58:33 +09:00
Kohya S
b65ae9b439 T5XXL LoRA training, fp8 T5XXL support 2024-09-04 21:33:17 +09:00
Akegarasu
35882f8d5b fix 2024-08-29 23:03:43 +08:00
Akegarasu
34f2315047 fix: text_encoder_conds referenced before assignment 2024-08-29 22:33:37 +08:00
Kohya S
0087a46e14 FLUX.1 LoRA supports CLIP-L 2024-08-27 19:59:40 +09:00
Kohya S
9e72be0a13 Fix debug_dataset to work 2024-08-20 08:19:00 +09:00
Kohya S.
e2d822cad7 Merge pull request #1452 from fireicewolf/sd3-devel
Fix AttributeError: 'T5EncoderModel' object has no attribute 'text_model' while loading the T5 model on GPU.
2024-08-15 21:12:19 +09:00
Kohya S
7db4222119 add sample image generation during training 2024-08-14 22:15:26 +09:00
DukeG
9760d097b0 Fix AttributeError: 'T5EncoderModel' object has no attribute 'text_model'
while loading the T5 model on GPU.
2024-08-14 19:58:54 +08:00
Kohya S
8a0f12dde8 update FLUX LoRA training 2024-08-10 23:42:05 +09:00
Kohya S
36b2e6fc28 add FLUX.1 LoRA training 2024-08-09 22:56:48 +09:00
gesen2egee
cdb2d9c516 Update train_network.py 2024-08-04 17:36:34 +08:00
gesen2egee
aa850aa531 Update train_network.py 2024-08-04 17:34:20 +08:00
gesen2egee
f6dbf7c419 Update train_network.py 2024-08-04 15:18:53 +08:00
gesen2egee
a593e837f3 Update train_network.py 2024-08-04 15:17:30 +08:00
gesen2egee
b9bdd10129 Update train_network.py 2024-08-04 15:11:26 +08:00
gesen2egee
31507b9901 Remove unnecessary is_train changes and use apply_debiased_estimation to calculate validation loss. This balances the influence of different timesteps on the reported validation loss (without affecting actual training results) 2024-08-02 13:15:21 +08:00
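A standalone sketch of debiased-estimation weighting as it is commonly formulated (per-sample loss scaled by 1/sqrt(SNR_t), with the SNR clamped); treat this as an illustration of the idea, not the repository's exact apply_debiased_estimation.

```python
import torch

def debiased_estimation_weight(alphas_cumprod: torch.Tensor,
                               timesteps: torch.Tensor,
                               max_snr: float = 1000.0) -> torch.Tensor:
    # SNR(t) = alpha_bar_t / (1 - alpha_bar_t) for a DDPM-style schedule
    alpha_bar = alphas_cumprod[timesteps]
    snr = alpha_bar / (1.0 - alpha_bar)
    return 1.0 / torch.sqrt(snr.clamp(max=max_snr))

# usage: re-weight per-sample validation losses before averaging
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
t = torch.randint(0, 1000, (4,))
per_sample_loss = torch.rand(4)
weighted = (per_sample_loss * debiased_estimation_weight(alphas_cumprod, t)).mean()
```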
Kohya S
41dee60383 Refactor caching mechanism for latents and text encoder outputs, etc. 2024-07-27 13:50:05 +09:00
Kohya S
4dbcef429b update for corner cases 2024-06-04 21:26:55 +09:00
Kohaku-Blueleaf
3eb27ced52 Skip the final 1 step 2024-05-31 12:24:15 +08:00
Kohaku-Blueleaf
b2363f1021 Final implementation 2024-05-31 12:20:20 +08:00
Kohya S
da6fea3d97 simplify and update alpha mask to work with various cases 2024-05-19 21:26:18 +09:00
u-haru
db6752901f Add an option to use the image's alpha channel as a mask for the loss (#1223) (see the sketch after this entry)
* Add alpha_mask parameter and apply masked loss

* Fix type hint in trim_and_resize_if_required function

* Refactor code to use keyword arguments in train_util.py

* Fix alpha mask flipping logic

* Fix alpha mask initialization

* Fix alpha_mask transformation

* Cache alpha_mask

* Update alpha_masks to be on CPU

* Set flipped_alpha_masks to None if the option is disabled

* Check if alpha_mask is None

* Set alpha_mask to None if option disabled

* Add description of alpha_mask option to docs
2024-05-19 19:07:25 +09:00
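A minimal sketch of the alpha-channel-as-loss-mask idea from this PR: the per-pixel loss is multiplied by a mask derived from the image's alpha channel. Shapes and names are illustrative, not the repository's actual code.

```python
import torch
import torch.nn.functional as F

def masked_loss(pred: torch.Tensor, target: torch.Tensor,
                alpha_mask: torch.Tensor) -> torch.Tensor:
    # Per-element MSE weighted by alpha (0 = ignore pixel, 1 = full weight);
    # the single-channel mask broadcasts across the channel dimension.
    loss = F.mse_loss(pred, target, reduction="none")
    return (loss * alpha_mask).mean()

pred = torch.randn(1, 4, 64, 64)    # e.g. predicted latents
target = torch.randn(1, 4, 64, 64)
alpha_mask = (torch.rand(1, 1, 64, 64) > 0.2).float()  # from the alpha channel
print(masked_loss(pred, target, alpha_mask).item())
```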
Kohya S
c68baae480 add --log_config option to enable/disable outputting the training config 2024-05-19 17:21:04 +09:00
Kohya S
47187f7079 Merge pull request #1285 from ccharest93/main
Hyperparameter tracking
2024-05-19 16:31:33 +09:00
Kohya S
52e64c69cf add debug log 2024-05-04 18:43:52 +09:00
Kohya S
58c2d856ae support block dim/lr for sdxl 2024-05-03 22:18:20 +09:00
Kohya S
969f82ab47 move loraplus args from args to network_args, simplify log lr desc 2024-04-29 20:04:25 +09:00
Kohya S
834445a1d6 Merge pull request #1233 from rockerBOO/lora-plus
Add LoRA+ support
2024-04-29 18:05:12 +09:00
Kohya S
0540c33aca pop weights if available #1247 2024-04-21 17:45:29 +09:00
Kohya S
52652cba1a disable main process check for deepspeed #1247 2024-04-21 17:41:32 +09:00
Maatra
2c9db5d9f2 passing filtered hyperparameters to accelerate 2024-04-20 14:11:43 +01:00
gesen2egee
086f6000f2 Merge branch 'main' into val 2024-04-11 01:14:46 +08:00
rockerBOO
75833e84a1 Fix default LR, add overall LoRA+ ratio, add logging
`--loraplus_ratio` added for both TE and UNet; logging added for LoRA+ (see the sketch after this entry)
2024-04-08 19:23:02 -04:00
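A sketch of what a LoRA+ ratio does in principle: the "up" (B) matrices get base_lr * ratio while the "down" (A) matrices keep base_lr. The parameter naming and module below are illustrative, not sd-scripts' actual network code.

```python
import torch

def build_param_groups(model: torch.nn.Module, base_lr: float, loraplus_ratio: float):
    up, down = [], []
    for name, param in model.named_parameters():
        (up if "lora_up" in name else down).append(param)
    return [
        {"params": down, "lr": base_lr},                 # A matrices
        {"params": up, "lr": base_lr * loraplus_ratio},  # B matrices, scaled LR
    ]

class TinyLoRA(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lora_down = torch.nn.Linear(16, 4, bias=False)
        self.lora_up = torch.nn.Linear(4, 16, bias=False)

optimizer = torch.optim.AdamW(
    build_param_groups(TinyLoRA(), base_lr=1e-4, loraplus_ratio=16.0))
```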
Kohya S
d30ebb205c update readme, add metadata for network module 2024-04-07 14:58:17 +09:00
kabachuha
90b18795fc Add option to use Scheduled Huber Loss in all training pipelines to improve resilience to data corruption (#1228)
* add huber loss and huber_c compute to train_util

* add reduction modes

* add huber_c retrieval from timestep getter

* move get timesteps and huber to own function

* add conditional loss to all training scripts

* add cond loss to train network

* add (scheduled) huber_loss to args

* fix timesteps being fetched twice

* PHL-schedule should depend on noise scheduler's num timesteps

* *2 multiplier on Huber loss because of the 1/2 a^2 convention

The Taylor expansion of the pseudo-Huber sqrt term near zero gives (1/2)a^2, which differs from the a^2 of the standard MSE loss. This change scales the two consistently against one another (see the sketch after this entry)

* add option for smooth l1 (huber / delta)

* unify huber scheduling

* add snr huber scheduler

---------

Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-04-07 13:54:21 +09:00
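The sketch referenced in the Huber-loss entry above: a *2-scaled pseudo-Huber loss. For small residuals, sqrt(a^2 + c^2) - c ≈ a^2/(2c), so multiplying by 2c recovers the a^2 behaviour of plain MSE while large residuals still grow only linearly. The scheduling of huber_c over timesteps is omitted here, and the names are illustrative rather than the repository's exact code.

```python
import torch

def pseudo_huber_loss(pred: torch.Tensor, target: torch.Tensor,
                      huber_c: float) -> torch.Tensor:
    # 2c * (sqrt(a^2 + c^2) - c): ~a^2 near zero, ~2c*|a| for large residuals
    diff = pred - target
    return 2.0 * huber_c * (torch.sqrt(diff**2 + huber_c**2) - huber_c)

a = torch.tensor([0.01, 0.1, 1.0, 10.0])
print(pseudo_huber_loss(a, torch.zeros_like(a), huber_c=0.1))  # robust loss
print(a**2)  # MSE values for comparison; they match for small residuals
```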