46 Commits

Author SHA1 Message Date
Hina Chen
05bb9183fa Add Validation loss for LoRA training 2024-12-27 16:47:59 +08:00
Kohya S
5e86323f12 Update README and clean-up the code for SD3 timesteps 2024-11-07 21:27:12 +09:00
Kohya S
3cc5b8db99 Diff Output Preserv loss for SDXL 2024-10-18 20:57:13 +09:00
Kohya S
41dee60383 Refactor caching mechanism for latents and text encoder outputs, etc. 2024-07-27 13:50:05 +09:00
Kohya S
e8cfd4ba1d fix to work cond mask and alpha mask 2024-05-26 22:01:37 +09:00
Kohya S
da6fea3d97 simplify and update alpha mask to work with various cases 2024-05-19 21:26:18 +09:00
u-haru
db6752901f Add option to use the image alpha channel as a loss mask (#1223)
* Add alpha_mask parameter and apply masked loss

* Fix type hint in trim_and_resize_if_required function

* Refactor code to use keyword arguments in train_util.py

* Fix alpha mask flipping logic

* Fix alpha mask initialization

* Fix alpha_mask transformation

* Cache alpha_mask

* Update alpha_masks to be on CPU

* Set flipped_alpha_masks to Null if option disabled

* Check if alpha_mask is None

* Set alpha_mask to None if option disabled

* Add description of alpha_mask option to docs
2024-05-19 19:07:25 +09:00
Kohya S
a384bf2187 Merge pull request #1313 from rockerBOO/patch-3
Add caption_separator to output for subset
2024-05-12 21:36:56 +09:00
Dave Lage
8db0cadcee Add caption_separator to output for subset 2024-05-02 18:08:28 -04:00
Dave Lage
dbb7bb288e Fix caption_separator missing in subset schema 2024-05-02 17:39:35 -04:00
Kohya S
c86e356013 Merge branch 'dev' into dataset-cache 2024-03-26 19:43:40 +09:00
Kohya S
025347214d refactor metadata caching for DreamBooth dataset 2024-03-24 18:09:32 +09:00
Kohaku-Blueleaf
ae97c8bfd1 [Experimental] Add cache mechanism for dataset groups to avoid long waiting time for initialization (#1178)
* support meta cached dataset

* add cache meta scripts

* random ip_noise_gamma strength

* random noise_offset strength

* use correct settings for parser

* cache path/caption/size only

* revert mess up commit

* revert mess up commit

* Update requirements.txt

* Add arguments for meta cache.

* remove pickle implementation

* Return sizes when enable cache

---------

Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-03-24 15:40:18 +09:00
Kohya S
3419c3de0d common masked loss func, apply to all training script 2024-03-17 19:30:20 +09:00
Kohya S
f2c727fc8c add minimal impl for masked loss 2024-02-26 23:19:58 +09:00
Kohya S
577e9913ca add some new dataset settings 2024-02-26 20:01:25 +09:00
Yuta Hayashibe
5f6bf29e52 Replace print with logger if they are logs (#905)
* Add get_my_logger()

* Use logger instead of print

* Fix log level

* Removed line-breaks for readability

* Use setup_logging()

* Add rich to requirements.txt

* Make simple

* Use logger instead of print

---------

Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-02-04 18:14:34 +09:00
Kohya S
fef172966f Add network_multiplier for dataset and train LoRA 2024-01-20 16:24:43 +09:00
Kohya S
5a1ebc4c7c format by black 2024-01-20 13:10:45 +09:00
Furqanil Taqwa
4a913ce61e initialize keep_tokens_separator to dataset config 2023-11-28 17:22:35 +07:00
Kohaku-Blueleaf
489b728dbc Fix typo again 2023-10-30 20:19:51 +08:00
Kohaku-Blueleaf
5dc2a0d3fd Add custom separator 2023-10-30 19:55:30 +08:00
青龍聖者@bdsqlsz
d5be8125b0 update bitsandbytes for 0.41.1 and fixed bugs with generate_controlnet_subsets_config for training (#823)
* update for bnb 0.41.1

* fixed generate_controlnet_subsets_config for training

* Revert "update for bnb 0.41.1"

This reverts commit 70bd3612d8.
2023-09-24 10:51:47 +09:00
Kohya S
948cf17499 add caption_prefix/suffix to dataset 2023-09-02 16:17:12 +09:00
Kohya S
ce46aa0c3b remove debug print 2023-07-04 21:34:18 +09:00
Kohya S
71a6d49d06 fix to work train_network with fine-tuning dataset 2023-06-28 07:50:53 +09:00
Kohya S
9e9df2b501 update dataset to return size, refactor ctrlnet ds 2023-06-24 17:56:02 +09:00
Kohya S
5114e8daf1 fix training scripts except controlnet not working 2023-06-22 08:46:53 +09:00
ddPn08
62d00b4520 add controlnet training 2023-06-01 20:48:25 +09:00
Kohya S
f407f5a686 Merge pull request #352 from rockerBOO/dataset-config
Open dataset_config json file before load
2023-04-03 21:31:55 +09:00
u-haru
94441fa746 Fix name display for directories without a repeat count 2023-03-31 02:26:54 +09:00
rockerBOO
313f3e8286 Open dataset_config json file before load 2023-03-30 12:08:04 -04:00
Kohya S
14891523ce fix seed for each dataset to make shuffling same 2023-03-26 22:17:03 +09:00
u-haru
a4b34a9c3c Remove blueprint_args_conflict as it is unnecessary; fix bug where shuffle was performed every time 2023-03-26 03:26:55 +09:00
u-haru
292cdb8379 Fix bug where epoch and step were not passed to the dataset 2023-03-26 01:44:25 +09:00
u-haru
143c26e552 Change to disable persistant_data_loader when there is a conflict 2023-03-24 13:08:56 +09:00
u-haru
dbadc40ec2 Fix bug where captions stopped changing when persistent_workers was enabled 2023-03-23 12:33:03 +09:00
u-haru
447c56bf50 Fix typo, change step to global_step, fix bugs 2023-03-23 09:53:14 +09:00
u-haru
a9b26b73e0 implement token warmup 2023-03-23 07:37:14 +09:00
Kohya S
46aee85d2a re2-fix to support python 3.8/3.9 2023-03-05 23:27:16 +09:00
Kohya S
2ae33db83f re-fix to support python 3.8/3.9 2023-03-05 22:35:32 +09:00
Kohya S
dd39e5d944 hope to support python 3.8/3.9 2023-03-05 20:04:18 +09:00
Kohya S
5602e0e5fc change dataset config option to dataset_config 2023-03-02 21:51:58 +09:00
Kohya S
83bfb54f20 fix num_repeats not working in DB classic dataset 2023-03-02 19:01:22 +09:00
Kohya S
d1d7d432e9 print dataset index in making buckets 2023-03-01 21:30:12 +09:00
fur0ut0
8abb8645ae add detail dataset config feature by extra config file (#227)
* add config file schema

* change config file specification

* refactor config utility

* unify batch_size to train_batch_size

* fix indent size

* use batch_size instead of train_batch_size

* make cache_latents configurable on subset

* rename options
* bucket_repo_range
* shuffle_keep_tokens

* update readme

* revert to min_bucket_reso & max_bucket_reso

* use subset structure in dataset

* format import lines

* split mode specific options

* use only valid subset

* change valid subsets name

* manage multiple datasets by dataset group

* update config file sanitizer

* prune redundant validation

* add comments

* update type annotation

* rename json_file_name to metadata_file

* ignore when image dir is invalid

* fix tag shuffle and dropout

* ignore duplicated subset

* add method to check latent cachability

* fix format

* fix bug

* update caption dropout default values

* update annotation

* fix bug

* add option to enable bucket shuffle across dataset

* update blueprint generate function

* use blueprint generator for dataset initialization

* delete duplicated function

* update config readme

* delete debug print

* print dataset and subset info as info

* enable bucket_shuffle_across_dataset option

* update config readme for clarification

* compensate quotes for string option example

* fix bug of bad usage of join

* conserve trained metadata backward compatibility

* enable shuffle in data loader by default

* delete resolved TODO

* add comment for image data handling

* fix reference bug

* fix undefined variable bug

* prevent raise overwriting

* assert image_dir and metadata_file validity

* add debug message for ignoring subset

* fix inconsistent import statement

* loosen too strict validation on float value

* sanitize argument parser separately

* make image_dir optional for fine tuning dataset

* fix import

* fix trailing characters in print

* parse flexible dataset config deterministically

* use relative import

* print supplementary message for parsing error

* add note about different methods

* add note of benefit of separate dataset

* add error example

* add note for english readme plan

---------

Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2023-03-01 20:58:08 +09:00