Kohya-ss-sd-scripts

mirror of https://github.com/kohya-ss/sd-scripts.git synced 2026-04-16 00:49:40 +00:00

Author	SHA1	Message	Date
Dave Lage	862e5e8ef1	Merge `4888327caa` into `e21a7736f8`	2026-02-11 01:09:24 +01:00
duongve13112002	e21a7736f8	Support Anima model (#2260) * Support Anima model * Update document and fix bug * Fix latent normlization * Fix typo * Fix cache embedding * fix typo in tests/test_anima_cache.py * Remove redundant argument apply_t5_attn_mask * Improving caching with argument caption_dropout_rate * Fix W&B logging bugs * Fix discrete_flow_shift default value	2026-02-08 10:18:55 +09:00
Kohya S.	b996440c5f	Doc update sd3 branch documentation (#2253) * doc: move sample prompt file documentation, and remove history for branch * doc: remove outdated FLUX.1 and SD3 training information from README * doc: update README and training documentation for clarity and structure	2026-01-19 21:38:46 +09:00
Kohya S.	a9af52692a	feat: add pyramid noise and noise offset options to generation script (#2252) * feat: add pyramid noise and noise offset options to generation script * fix: fix to work with SD1.5 models * doc: update to match with latest gen_img.py * doc: update README to clarify script capabilities and remove deprecated sections	2026-01-18 16:56:48 +09:00
Kohya S.	c6bc632ec6	fix: metadata dataset degradation and make it work (#2186) * fix: support dataset with metadata * feat: support another tagger model * fix: improve handling of image size and caption/tag processing in FineTuningDataset * fix: enhance metadata loading to support JSONL format in FineTuningDataset * feat: enhance image loading and processing in ImageLoadingPrepDataset with batch support and output options * fix: improve image path handling and memory management in dataset classes * Update finetune/tag_images_by_wd14_tagger.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: add return type annotation for process_tag_replacement function and ensure tags are returned * feat: add artist category threshold for tagging * doc: add comment for clarification --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-01-18 15:17:07 +09:00
Kohya S.	f7f971f50d	Merge pull request #2251 from kohya-ss/fix-pytest-for-lumina fix(tests): add ip_noise_gamma args for MockArgs in pytest	2026-01-18 15:09:47 +09:00
Kohya S	c4be615f69	fix(tests): add ip_noise_gamma args for MockArgs in pytest	2026-01-18 15:05:57 +09:00
Kohya S.	e06e063970	Merge pull request #2225 from urlesistiana/sd3_lumina2_ts_fix fix: lumina 2 timesteps handling	2026-01-18 14:39:04 +09:00
Kohya S.	94e3dbebea	Merge pull request #2246 from kozistr/deps/pytorch-optimizer Bump `pytorch-optimizer` version to v3.9.0	2025-12-21 22:51:32 +09:00
kozistr	95a65b89a5	build(deps): bump pytorch-optimizer to v3.9.0	2025-12-21 15:53:47 +09:00
rockerBOO	4888327caa	Fix tests	2025-11-17 11:34:09 -05:00
rockerBOO	cc0e4acf1b	Remove timestep_index	2025-11-17 11:26:38 -05:00
rockerBOO	7a08c52aa4	Add error if with CDC if cache_latents or cache_latents_to_disk is not set	2025-11-03 21:47:15 -05:00
rockerBOO	377299851a	Fix cdc cache file validation	2025-11-02 23:22:10 -05:00
rockerBOO	03947ca465	Add multi-resolution test	2025-10-30 23:27:43 -04:00
rockerBOO	b4e5d09871	Fix multi-resolution support in cached files	2025-10-30 23:27:13 -04:00
rockerBOO	0dfafb4fff	Remove deprecated cdc cache path	2025-10-18 17:59:12 -04:00
rockerBOO	c820acee58	Fix CDC tests to new format and deprecate old tests	2025-10-18 14:36:13 -04:00
rockerBOO	83c17de61f	Remove faiss, save per image cdc file	2025-10-18 14:07:55 -04:00
Kohya S.	a5a162044c	Merge pull request #2226 from kohya-ss/fix-hunyuan-image-batch-gen-error fix: error on batch generation closes #2209	2025-10-15 21:57:45 +09:00
Kohya S	a33cad714e	fix: error on batch generation closes #2209	2025-10-15 21:57:11 +09:00
urlesistiana	f7fc7ddda2	fix #2201: lumina 2 timesteps handling	2025-10-13 16:08:28 +08:00
rockerBOO	1f79115c6c	Consolidate and simplify CDC test files - Merged redundant test files - Removed 'comprehensive' from file and docstring names - Improved test organization and clarity - Ensured all tests continue to pass - Simplified test documentation	2025-10-11 17:48:08 -04:00
rockerBOO	8089cb6925	Improve dimension mismatch warning for CDC Flow Matching - Add explicit warning and tracking for multiple unique latent shapes - Simplify test imports by removing unused modules - Minor formatting improvements in print statements - Ensure log messages provide clear context about dimension mismatches	2025-10-11 17:17:09 -04:00
rockerBOO	aa3a216106	Slight cleanup	2025-10-11 16:15:35 -04:00
rockerBOO	8458a5696e	Add graceful fallback when FAISS is not installed - Make FAISS import optional with try/except - CDCPreprocessor raises helpful ImportError if FAISS unavailable - train_util.py catches ImportError and returns None - train_network.py checks for None and warns user - Training continues without CDC-FM if FAISS not installed - Remove benchmark file (not needed in repo) This allows users to run training without FAISS dependency. CDC-FM will be automatically disabled with a warning if FAISS is missing.	2025-10-09 23:50:07 -04:00
rockerBOO	7ca799ca26	Add adaptive k_neighbors support for CDC-FM - Add --cdc_adaptive_k flag to enable adaptive k based on bucket size - Add --cdc_min_bucket_size to set minimum bucket threshold (default: 16) - Fixed mode (default): Skip buckets with < k_neighbors samples - Adaptive mode: Use k=min(k_neighbors, bucket_size-1) for buckets >= min_bucket_size - Update CDCPreprocessor to support adaptive k per bucket - Add metadata tracking for adaptive_k and min_bucket_size - Add comprehensive pytest tests for adaptive k behavior This allows CDC-FM to work effectively with multi-resolution bucketing where bucket sizes may vary widely. Users can choose between strict paper methodology (fixed k) or pragmatic approach (adaptive k).	2025-10-09 23:16:44 -04:00
rockerBOO	f450443fe4	Add CDC-FM parameters to model metadata - Add ss_use_cdc_fm, ss_cdc_k_neighbors, ss_cdc_k_bandwidth, ss_cdc_d_cdc, ss_cdc_gamma - Ensures CDC-FM training parameters are tracked in model metadata - Enables reproducibility and model provenance tracking	2025-10-09 22:51:47 -04:00
rockerBOO	20c6ae5a9a	Add faiss to github action	2025-10-09 18:34:37 -04:00
rockerBOO	f128f5a645	Formatting cleanup	2025-10-09 18:28:51 -04:00
rockerBOO	c8a4e99074	Add --cdc_debug flag and tqdm progress for CDC preprocessing - Add --cdc_debug flag to enable verbose bucket-by-bucket output - When debug=False (default): Show tqdm progress bar, concise logging - When debug=True: Show detailed bucket information, no progress bar - Improves user experience during CDC cache generation	2025-10-09 18:28:51 -04:00
rockerBOO	7a7110cdc6	Use logger instead of print for CDC loading messages	2025-10-09 18:28:51 -04:00
rockerBOO	1d4c4d4cb2	Fix: Replace CDC integer index lookup with image_key strings Fixes shape mismatch bug in multi-subset training where CDC preprocessing and training used different index calculations, causing wrong CDC data to be loaded for samples. Changes: - CDC cache now stores/loads data using image_key strings instead of integer indices - Training passes image_key list instead of computed integer indices - All CDC lookups use stable image_key identifiers - Improved device compatibility check (handles "cuda" vs "cuda:0") - Updated all 30 CDC tests to use image_key-based access Root cause: Preprocessing used cumulative dataset indices while training used sorted keys, resulting in mismatched lookups during shuffled multi-subset training.	2025-10-09 18:28:51 -04:00
rockerBOO	4bea582601	Fix: Prevent false device mismatch warnings for cuda vs cuda:0 - Treat cuda and cuda:0 as compatible devices - Only warn on actual device mismatches (cuda vs cpu) - Eliminates warning spam during multi-subset training	2025-10-09 18:28:51 -04:00
rockerBOO	ee8ceee178	Add device consistency validation for CDC transformation - Check that noise and CDC matrices are on same device - Automatically transfer noise if device mismatch detected - Warn user when device transfer occurs - Add tests to verify device handling	2025-10-09 18:28:51 -04:00
rockerBOO	ce17007e1a	Add warning throttling for CDC shape mismatches - Track warned samples in global set to prevent log spam - Each sample only warned once per training session - Prevents thousands of duplicate warnings during training - Add tests to verify throttling behavior	2025-10-09 18:28:50 -04:00
rockerBOO	88af20881d	Fix: Enable gradient flow through CDC noise transformation - Remove @torch.no_grad() decorator from compute_sigma_t_x() - Gradients now properly flow through CDC transformation during training - Add comprehensive gradient flow tests for fast/slow paths and fallback - All 25 CDC tests passing	2025-10-09 18:28:50 -04:00
rockerBOO	0d822b2f74	Refactor: Extract CDC noise transformation to separate function - Create apply_cdc_noise_transformation() for better modularity - Implement fast path for batch processing when all shapes match - Implement slow path for per-sample processing on shape mismatch - Clone noise tensors in fallback path for gradient consistency	2025-10-09 18:28:50 -04:00
rockerBOO	e03200bdba	Optimize: Cache CDC shapes in memory to eliminate I/O bottleneck - Cache all shapes during GammaBDataset initialization - Eliminates file I/O on every training step (9.5M accesses/sec) - Reduces get_shape() from file operation to dict lookup - Memory overhead: ~126 bytes/sample (~12.6 MB per 100k images)	2025-10-09 18:28:50 -04:00
rockerBOO	f552f9a3bd	Add CDC-FM (Carré du Champ Flow Matching) support Implements geometry-aware noise generation for FLUX training based on arXiv:2510.05930v1.	2025-10-09 18:28:47 -04:00
Kohya S.	5e366acda4	Merge pull request #2003 from laolongboy/sd3-dev Fix missing parameters in model conversion script	2025-10-01 21:03:12 +09:00
Kohya S	5462a6bb24	Merge branch 'dev' into sd3	2025-09-29 21:02:02 +09:00
Kohya S	63711390a0	Merge branch 'main' into dev	2025-09-29 20:56:07 +09:00
Kohya S.	206adb6438	Merge pull request #2216 from kohya-ss/fix-sdxl-textual-inversion-training-disable-mmap fix: disable_mmap_safetensors not defined in SDXL TI training	2025-09-29 20:55:02 +09:00
Kohya S	60bfa97b19	fix: disable_mmap_safetensors not defined in SDXL TI training	2025-09-29 20:52:48 +09:00
Kohya S.	f0c767e0f2	Merge pull request #2213 from kohya-ss/doc-hunyuan-image-training-text-encoder-cpu-note docs: enhance text encoder CPU usage instructions for HunyuanImage-2.…	2025-09-28 18:32:11 +09:00
kohya-ss	a0c26a0efa	docs: enhance text encoder CPU usage instructions for HunyuanImage-2.1 training	2025-09-28 18:21:25 +09:00
Kohya S.	67d0621313	Merge pull request #2212 from kohya-ss/fix-hunyuan-image-sample-generation fix: HunyuanImage-2.1 sample generation fails	2025-09-28 18:12:04 +09:00
Kohya S	6a826d21b1	feat: add new parameters for sample image inference configuration	2025-09-28 18:06:17 +09:00
Kohya S.	4c197a538b	Merge pull request #2207 from kohya-ss/fix-flux-extract-lora-metadata-failed fix: update metadata construction to include model_config for flux	2025-09-24 21:19:27 +09:00
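
The adaptive k_neighbors commit (7ca799ca26) describes two ways of choosing k per bucket: fixed mode skips buckets with fewer than k_neighbors samples, and adaptive mode uses k=min(k_neighbors, bucket_size-1) for buckets at or above the minimum bucket size. The following is a minimal sketch of that rule only, not the repository's code. The function name `select_k` and its signature are hypothetical, and skipping buckets below `min_bucket_size` in adaptive mode is an assumption, since the commit message does not say what happens to them.

```python
from typing import Optional


def select_k(
    bucket_size: int,
    k_neighbors: int,
    adaptive: bool = False,
    min_bucket_size: int = 16,  # default taken from the commit message
) -> Optional[int]:
    """Return the k to use for a bucket, or None if the bucket is skipped.

    Hypothetical helper illustrating the rule in commit 7ca799ca26.
    """
    if adaptive:
        # Assumption: buckets smaller than min_bucket_size are skipped.
        if bucket_size < min_bucket_size:
            return None
        # A sample cannot be its own neighbor, hence bucket_size - 1.
        return min(k_neighbors, bucket_size - 1)

    # Fixed mode: skip buckets with fewer than k_neighbors samples.
    if bucket_size < k_neighbors:
        return None
    return k_neighbors


if __name__ == "__main__":
    for size in (10, 100, 300):
        fixed = select_k(size, k_neighbors=256)
        adapt = select_k(size, k_neighbors=256, adaptive=True)
        print(f"bucket_size={size}: fixed={fixed}, adaptive={adapt}")
```

With k_neighbors=256, a 100-image bucket is skipped in fixed mode but gets k=99 in adaptive mode, which is the multi-resolution bucketing case the commit message mentions.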

2502 Commits