Commit Graph

563 Commits

Author SHA1 Message Date
Kohya S.
5cdad10de5 Fix/leco cleanup (#2294)
* feat: add LECO training scripts for SD1.x/2.x and SDXL (#2285)

* Add LECO training script and associated tests

- Implemented `sdxl_train_leco.py` for training with LECO prompts, including argument parsing, model setup, training loop, and weight saving functionality.
- Created unit tests for `load_prompt_settings` in `test_leco_train_util.py` to validate loading of prompt configurations in both original and slider formats.
- Added basic syntax tests for `train_leco.py` and `sdxl_train_leco.py` to ensure modules are importable.

* fix: use getattr for safe attribute access in argument verification
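
A minimal sketch of the pattern, assuming an argparse.Namespace (the flag name is hypothetical):

```python
import argparse

# A namespace that may or may not define the attribute being verified.
args = argparse.Namespace()

# getattr with a default avoids AttributeError when the parser
# did not define the flag (the flag name here is hypothetical).
value = getattr(args, "optional_leco_flag", None)
if value is not None:
    print(f"flag set to {value}")
```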

* feat: add CUDA device compatibility validation and corresponding tests

* Revert "feat: add CUDA device compatibility validation and corresponding tests"

This reverts commit 6d3e51431b.

* feat: update predict_noise_xl to use vector embedding from add_time_ids

* feat: implement checkpointing in predict_noise and predict_noise_xl functions

* feat: remove unused submodules and update .gitignore to exclude .codex-tmp

---------

Co-authored-by: Kohya S. <52813779+kohya-ss@users.noreply.github.com>

* fix: format

* fix: address review feedback on LECO PR #2285

- Revert the getattr changes in train_util.py/deepspeed_utils.py and add dummy arguments to the LECO parser
- Change sdxl_train_util from a module-level import to a local import
- Make PromptEmbedsCache.__getitem__ raise KeyError on a cache miss (see the sketch after this list)
- Change the config file format from YAML to TOML (matching the repository convention)
- Consolidate duplicated code (build_network_kwargs, get_save_extension, save_weights) into leco_train_util.py
- Simplify the redundant PromptSettings construction in _expand_slider_target
- Add a dedicated batch_add_time_ids function for add_time_ids
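
A minimal sketch of the cache-miss contract described above; only the class name, __getitem__, and the KeyError behavior come from the PR, the internal dict is an assumption:

```python
class PromptEmbedsCache:
    """Caches prompt embeddings keyed by prompt text (illustrative sketch)."""

    def __init__(self):
        self._cache = {}

    def __setitem__(self, prompt, embeds):
        self._cache[prompt] = embeds

    def __getitem__(self, prompt):
        # Raise KeyError on a cache miss instead of silently returning None,
        # so callers fail fast on prompts that were never encoded.
        if prompt not in self._cache:
            raise KeyError(f"prompt embeddings not cached: {prompt!r}")
        return self._cache[prompt]
```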

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: greatly expand the LECO training guide

Added a per-category walkthrough of all command-line arguments, descriptions of
every field in the prompt TOML, the difference between the two guidance_scale
values, a table of recommended settings, a YAML conversion guide, and more.
Bilingual layout: English body with collapsible Japanese sections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: fix dtype mismatch in apply_noise_offset

Fixes latents being implicitly upcast because torch.randn defaults to float32.
Adopts the safe pattern of generating the noise in float32 on CPU and then
converting it to the latents' dtype/device, as sketched below.
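
A minimal sketch of the safe pattern, assuming a simplified signature (the real function lives in the training utilities):

```python
import torch

def apply_noise_offset(latents: torch.Tensor, noise: torch.Tensor, noise_offset: float) -> torch.Tensor:
    # Generate the per-sample, per-channel offset in float32 on CPU, then
    # cast to the latents' dtype/device so float32 arithmetic never
    # implicitly upcasts bf16/fp16 latents.
    offset = torch.randn(latents.shape[0], latents.shape[1], 1, 1, device="cpu", dtype=torch.float32)
    offset = offset.to(device=latents.device, dtype=latents.dtype)
    return noise + noise_offset * offset
```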

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Umisetokikaze <52318966+umisetokikaze@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 20:41:43 +09:00
Kohya S.
5f793fb0f4 Log d*lr for ProdigyPlusScheduleFree (#2289) 2026-03-29 18:47:09 +09:00
woctordho
343c929e39 Log d*lr for ProdigyPlusScheduleFree 2026-03-21 11:09:56 +08:00
woctordho
1cd95b2d8b Add skip_image_resolution to deduplicate multi-resolution dataset (#2273)
* Add min_orig_resolution and max_orig_resolution

* Rename min_orig_resolution to skip_image_resolution; remove max_orig_resolution

* Change skip_image_resolution to tuple

* Move filtering to __init__

* Minor fix
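
A minimal sketch of the filtering idea from this PR, assuming a (width, height) tuple and a hypothetical image-info structure; only the option name and the "filter in __init__" placement come from the commits:

```python
def filter_by_resolution(image_infos, skip_image_resolution):
    """Drop images smaller than the given (width, height) threshold."""
    if skip_image_resolution is None:
        return image_infos
    min_w, min_h = skip_image_resolution
    # Keep only images at or above the threshold; lower-resolution
    # duplicates of the same image are skipped.
    return [info for info in image_infos if info["width"] >= min_w and info["height"] >= min_h]
```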
2026-03-19 08:43:39 +09:00
Kohya S.
34e7138b6a Add/modify some implementation for anima (#2261)
* fix: update extend-exclude list in _typos.toml to include configs

* fix: exclude anima tests from pytest

* feat: add entry for 'temperal' in extend-words section of _typos.toml for Qwen-Image VAE

* fix: update default value for --discrete_flow_shift in anima training guide

* feat: add Qwen-Image VAE

* feat: simplify encode_tokens

* feat: use unified attention module, add wrapper for state dict compatibility

* feat: loading with dynamic fp8 optimization and LoRA support

* feat: add anima minimal inference script (WIP)

* format: format

* feat: simplify target module selection by regular expression patterns

* feat: keep caption dropout rate in cache and handle it in training script

* feat: update train_llm_adapter and verbose default values to string type

* fix: use strategy instead of using tokenizers directly

* feat: add dtype property and all-zero mask handling in cross-attention in LLMAdapterTransformerBlock

* feat: support 5d tensor in get_noisy_model_input_and_timesteps

* feat: update loss calculation to support 5d tensor
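
A minimal sketch of the broadcasting step such 5D support typically needs; the helper is hypothetical:

```python
import torch

def expand_like(per_sample: torch.Tensor, latents: torch.Tensor) -> torch.Tensor:
    # Reshape per-sample scalars of shape [B] to [B, 1, 1, 1] for 4D
    # (B, C, H, W) latents or [B, 1, 1, 1, 1] for 5D (B, C, F, H, W)
    # latents, so they broadcast in the noising and loss arithmetic.
    return per_sample.reshape(per_sample.shape[0], *([1] * (latents.dim() - 1)))
```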

* fix: update argument names in anima_train_utils to align with other architectures

* feat: simplify Anima training script and update empty caption handling

* feat: support LoRA format without `net.` prefix

* fix: make the fp8_scaled option work

* feat: add regex-based learning rates and dimensions handling in create_network

* fix: improve regex matching for module selection and learning rates in LoRANetwork

* fix: update logging message for regex match in LoRANetwork
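
A minimal sketch of the regex-based per-module learning-rate idea from the commits above; the pattern table and helper are hypothetical:

```python
import re

# (pattern, learning rate) pairs checked in order; first match wins.
PATTERN_LRS = [
    (re.compile(r"blocks\.(0|1|2)\."), 5e-5),  # lower LR for early blocks
    (re.compile(r"cross_attn"), 2e-4),         # higher LR for cross-attention
]

def lr_for_module(module_name: str, default_lr: float) -> float:
    for pattern, lr in PATTERN_LRS:
        if pattern.search(module_name):
            return lr
    return default_lr
```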

* fix: keep latents 4D except DiT call

* feat: enhance block swap functionality for inference and training in Anima model

* feat: refactor Anima training script

* feat: optimize VAE processing by adjusting tensor dimensions and data types

* fix: wait for all block transfers before switching offloader mode

* feat: update Anima training guide with new argument specifications and regex-based module selection. Thank you Claude!

* feat: support LoRA for Qwen3

* feat: update Anima SAI model spec metadata handling

* fix: remove unused code

* feat: split CFG processing in do_sample function to reduce memory usage
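
A minimal sketch of the memory-saving idea, assuming a generic model call; running the two branches sequentially roughly halves peak activation memory versus one batched call:

```python
import torch

def cfg_predict(model, latents, t, cond, uncond, guidance_scale):
    # Two sequential forward passes instead of one batch of size 2B.
    with torch.no_grad():
        noise_cond = model(latents, t, cond)
        noise_uncond = model(latents, t, uncond)
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)
```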

* feat: add VAE chunking and caching options to reduce memory usage
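
A minimal sketch of the chunking idea, assuming a decode call that maps latents to images; only a small slice of the batch is resident in VAE activations at once:

```python
import torch

def decode_in_chunks(vae, latents: torch.Tensor, chunk_size: int = 1) -> torch.Tensor:
    images = []
    for chunk in torch.split(latents, chunk_size, dim=0):
        images.append(vae.decode(chunk))  # placeholder for the actual VAE API
    return torch.cat(images, dim=0)
```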

* feat: optimize RMSNorm forward method and remove unused torch_attention_op

* Update library/strategy_anima.py

Use torch.all instead of all.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update library/safetensors_utils.py

Fix duplicated new_key for concat_hook.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update anima_minimal_inference.py

Remove unused code.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update anima_train.py

Remove unused import.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update library/anima_train_utils.py

Remove unused import.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: review with Copilot

* feat: add script to convert LoRA format to ComfyUI compatible format (WIP, not tested yet)

* feat: add process_escape function to handle escape sequences in prompts

* feat: enhance LoRA weight handling in model loading and add text encoder loading function

* feat: improve ComfyUI conversion script with prefix constants and module name adjustments

* feat: update caption dropout documentation to clarify cache regeneration requirement

* feat: add clarification on learning rate adjustments

* feat: add note on PyTorch version requirement to prevent NaN loss

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-13 08:15:06 +09:00
duongve13112002
e21a7736f8 Support Anima model (#2260)
* Support Anima model

* Update document and fix bug

* Fix latent normalization

* Fix typo

* Fix cache embedding

* fix typo in tests/test_anima_cache.py

* Remove redundant argument apply_t5_attn_mask

* Improve caching with the caption_dropout_rate argument

* Fix W&B logging bugs

* Fix discrete_flow_shift default value
2026-02-08 10:18:55 +09:00
Kohya S.
c6bc632ec6 fix: metadata dataset regression and make it work (#2186)
* fix: support dataset with metadata

* feat: support another tagger model

* fix: improve handling of image size and caption/tag processing in FineTuningDataset

* fix: enhance metadata loading to support JSONL format in FineTuningDataset

* feat: enhance image loading and processing in ImageLoadingPrepDataset with batch support and output options

* fix: improve image path handling and memory management in dataset classes

* Update finetune/tag_images_by_wd14_tagger.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: add return type annotation for process_tag_replacement function and ensure tags are returned

* feat: add artist category threshold for tagging

* doc: add comment for clarification

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-18 15:17:07 +09:00
Kohya S
209c02dbb6 feat: HunyuanImage LoRA training 2025-09-12 21:40:42 +09:00
Kohya S
7f983c558d feat: block swap for inference and initial impl for HunyuanImage LoRA (not working) 2025-09-11 22:15:22 +09:00
Kohya S
6edbe00547 feat: update libraries, remove warnings 2025-08-16 20:07:03 +09:00
rockerBOO
d24d733892 Update model spec to 1.0.1. Refactor model spec 2025-08-02 21:14:27 -04:00
Kohya S
9eda938876 Merge branch 'sd3' into feature-chroma-support 2025-07-21 13:32:22 +09:00
Kohya S.
d98400b06e Merge pull request #2138 from kohya-ss/feature-lumina-image
Feature lumina image
2025-07-21 13:21:26 +09:00
Kohya S
b4e862626a feat: add LoRA training support for Chroma 2025-07-20 19:00:09 +09:00
Dave Lage
3adbbb6e33 Add note about why we are moving it 2025-07-16 16:09:20 -04:00
rockerBOO
a7b33f3204 Fix alphas cumprod after add_noise for DDIMScheduler 2025-07-15 22:36:46 -04:00
Kohya S
30295c9668 fix: update parameter names for CFG truncate and Renorm CFG in documentation and code 2025-07-13 21:00:27 +09:00
rockerBOO
0e929f97b9 Revert system_prompt for dataset config 2025-06-16 16:50:18 -04:00
rockerBOO
0145efc2f2 Merge branch 'sd3' into lumina 2025-06-09 18:13:06 -04:00
Kohya S.
7c075a9c8d Merge pull request #2060 from saibit-tech/sd3
Fix: try aligning dtype of matrices when training with DeepSpeed and mixed precision is set to bf16 or fp16
2025-05-01 23:20:17 +09:00
Kohya S
64430eb9b2 Merge branch 'dev' into sd3 2025-04-29 21:30:57 +09:00
Kohya S
d8717a3d1c Merge branch 'main' into dev 2025-04-29 21:30:33 +09:00
Kohya S
4625b34f4e Fix mean image aspect ratio error calculation to avoid NaN values 2025-04-29 21:27:04 +09:00
sdbds
4fc917821a fix bugs 2025-04-23 16:16:36 +08:00
sdbds
899f3454b6 update for init problem 2025-04-23 15:47:12 +08:00
saibit
7c61c0dfe0 Add autocast wrapper for forward functions in deepspeed_utils.py to try aligning precision when using mixed precision during training 2025-04-22 16:06:55 +08:00
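
A minimal sketch of the wrapper idea, assuming the mixed-precision dtype is known; the exact wrapping point inside deepspeed_utils.py is not shown here:

```python
import functools
import torch

def autocast_forward(forward, dtype=torch.bfloat16):
    # Wrap a module's forward so its computation runs under torch.autocast,
    # aligning matmul dtypes when DeepSpeed trains with bf16/fp16.
    @functools.wraps(forward)
    def wrapped(*args, **kwargs):
        with torch.autocast(device_type="cuda", dtype=dtype):
            return forward(*args, **kwargs)
    return wrapped

# Usage (illustrative): model.forward = autocast_forward(model.forward, torch.float16)
```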
Kohya S
629073cd9d Add guidance scale for prompt param and flux sampling 2025-04-16 21:50:36 +09:00
sdbds
7f93e21f30 fix typo 2025-04-06 16:21:48 +08:00
青龍聖者@bdsqlsz
9f1892cc8e Merge branch 'sd3' into lumina 2025-04-06 16:13:43 +08:00
Kohya S
f1423a7229 fix: add resize_interpolation parameter to FineTuningDataset constructor 2025-04-03 21:48:51 +09:00
Kohya S
b3c56b22bd Merge branch 'dev' into sd3 2025-03-31 22:05:40 +09:00
Kohya S
1f432e2c0e use PIL for lanczos and box 2025-03-30 20:40:29 +09:00
Kohya S.
93a4efabb5 Merge branch 'sd3' into resize-interpolation 2025-03-30 19:30:56 +09:00
Disty0
620a06f517 Check for uppercase file extension too 2025-03-17 17:44:29 +03:00
rockerBOO
1f22a94cfe Update embedder_dims, add more flexible caption extension 2025-03-04 02:25:50 -05:00
rockerBOO
ce2610d29b Change system prompt to inject Prompt Start special token 2025-02-27 02:47:04 -05:00
rockerBOO
7b83d50dc0 Merge branch 'sd3' into lumina 2025-02-26 22:13:56 -05:00
Disty0
9a415ba965 JPEG XL support 2025-02-27 00:21:57 +03:00
sdbds
fc772affbe 1. Implement the cfg_trunc calculation directly from timesteps, without intermediate steps.

2. Deprecate and remove the guidance_scale parameter because it is used in inference, not training.

3. Add inference command-line arguments --ct for cfg_trunc_ratio and --rc for renorm_cfg to control CFG truncation and renormalization during inference (see the sketch below).
2025-02-24 14:10:24 +08:00
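
A minimal sketch of the two mechanisms named above; the threshold comparison direction and norm dimensions are assumptions:

```python
import torch

def guided_noise(noise_cond, noise_uncond, timestep_ratio, scale, cfg_trunc_ratio, renorm_cfg):
    # Truncation: past the ratio threshold, skip guidance and use the
    # conditional prediction directly.
    if timestep_ratio < cfg_trunc_ratio:
        return noise_cond
    guided = noise_uncond + scale * (noise_cond - noise_uncond)
    if renorm_cfg:
        # Renormalization: cap the guided prediction's norm at the
        # conditional prediction's norm, per sample.
        dims = tuple(range(1, guided.dim()))
        cond_norm = torch.linalg.vector_norm(noise_cond, dim=dims, keepdim=True)
        guided_norm = torch.linalg.vector_norm(guided, dim=dims, keepdim=True)
        guided = guided * torch.clamp(cond_norm / guided_norm, max=1.0)
    return guided
```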
rockerBOO
42a801514c Fix system prompt in datasets 2025-02-23 13:48:37 -05:00
rockerBOO
025cca699b Fix samples, LoRA training. Add system prompt, use_flash_attn 2025-02-23 01:29:18 -05:00
Kohya S
efb2a128cd fix wandb val logging 2025-02-21 22:07:35 +09:00
rockerBOO
7f2747176b Use resize_image where resizing is required 2025-02-19 14:20:40 -05:00
rockerBOO
545425c13e Typo 2025-02-19 14:20:40 -05:00
rockerBOO
d0128d18be Add resize interpolation CLI option 2025-02-19 14:20:40 -05:00
rockerBOO
58e9e146a3 Add resize interpolation configuration 2025-02-19 14:20:40 -05:00
Kohya S
dc7d5fb459 Merge branch 'sd3' into val-loss-improvement 2025-02-18 21:34:30 +09:00
rockerBOO
9436b41061 Fix validation split and add test 2025-02-17 14:28:41 -05:00
rockerBOO
3ed7606f88 Clear sizes for validation reg images to be consistent 2025-02-17 12:07:23 -05:00
rockerBOO
3365cfadd7 Fix sizes for validation split 2025-02-17 12:07:23 -05:00