Kohya S
fbb98f144e
Merge branch 'dev' into deep-speed
2024-03-20 18:15:26 +09:00
Kohya S
9b6b39f204
Merge branch 'dev' into masked-loss
2024-03-20 18:14:36 +09:00
Kohya S
855add067b
update option help and readme
2024-03-20 18:14:05 +09:00
Kohya S
bf6cd4b9da
Merge pull request #1168 from gesen2egee/save_state_on_train_end
Save state on train end
2024-03-20 18:02:13 +09:00
Kohya S
119cc99fb0
Merge pull request #1167 from Horizon1704/patch-1
Add "encoding='utf-8'" for --config_file
2024-03-20 17:39:08 +09:00
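The change above likely amounts to passing an explicit encoding when the training script opens the file given by `--config_file`. A minimal sketch (hypothetical helper name `load_config_text`; the repo's actual reader differs):

```python
# Sketch: read a --config_file with an explicit encoding, so non-ASCII
# captions/paths parse identically regardless of the OS locale default
# (e.g. cp932 on Japanese Windows).
import argparse

def load_config_text(path):
    # encoding="utf-8" avoids locale-dependent open() defaults
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

parser = argparse.ArgumentParser()
parser.add_argument("--config_file", type=str, default=None)
```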
Kohya S
3419c3de0d
common masked loss func, apply to all training scripts
2024-03-17 19:30:20 +09:00
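The masked-loss commits boil down to weighting the per-pixel reconstruction error by a mask so only masked-in regions contribute. An illustrative pure-Python sketch of that arithmetic (not the repo's code, which operates on torch tensors):

```python
# Illustrative masked MSE: only elements with mask == 1 contribute,
# normalized by the number of active mask elements.
def masked_mse(pred, target, mask):
    num = sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask))
    den = sum(mask)
    return num / den if den else 0.0
```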
Kohya S
7081a0cf0f
extension of src image could be different from target image
2024-03-17 18:09:15 +09:00
gesen2egee
b5e8045df4
fix control net
2024-03-16 11:51:41 +08:00
kblueleaf
53954a1e2e
use correct settings for parser
2024-03-13 18:21:49 +08:00
kblueleaf
86399407b2
random noise_offset strength
2024-03-13 18:21:49 +08:00
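The "random noise_offset strength" (and the similar ip_noise_gamma commit below) suggest sampling the strength uniformly up to the configured value rather than using it as a fixed constant. A hedged sketch with hypothetical names:

```python
# Sketch, assuming the randomization draws uniformly in [0, base];
# the actual flag names in the repo differ.
import random

def sample_strength(base, randomize=True, rng=random):
    if randomize:
        return rng.uniform(0.0, base)
    return base
```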
kblueleaf
948029fe61
random ip_noise_gamma strength
2024-03-13 18:21:49 +08:00
gesen2egee
5d7ed0dff0
Merge remote-tracking branch 'kohya-ss/dev' into val
2024-03-13 18:00:49 +08:00
gesen2egee
095b8035e6
save state on train end
2024-03-10 23:33:38 +08:00
Horizon1704
124ec45876
Add "encoding='utf-8'"
2024-03-10 22:53:05 +08:00
gesen2egee
b558a5b73d
val
2024-03-10 04:37:16 +08:00
Kohya S
e3ccf8fbf7
make deepspeed_utils
2024-02-27 21:30:46 +09:00
Kohya S
eefb3cc1e7
Merge branch 'deep-speed' into deepspeed
2024-02-27 18:57:42 +09:00
Kohya S
f2c727fc8c
add minimal impl for masked loss
2024-02-26 23:19:58 +09:00
Kohya S
577e9913ca
add some new dataset settings
2024-02-26 20:01:25 +09:00
BootsofLagrangian
4d5186d1cf
refactored code, some functions moved into train_utils.py
2024-02-22 16:20:53 +09:00
Kohya S
d1fb480887
format by black
2024-02-18 09:13:24 +09:00
Kohya S
358ca205a3
Merge branch 'dev' into dev_device_support
2024-02-12 13:01:54 +09:00
Kohya S
672851e805
Merge branch 'dev' into dev_improve_log
2024-02-12 11:24:33 +09:00
Kohya S
e579648ce9
fix help for highvram arg
2024-02-12 11:12:41 +09:00
Kohya S
e24d9606a2
add clean_memory_on_device and use it from training
2024-02-12 11:10:52 +09:00
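`clean_memory_on_device` presumably wraps the usual pattern of collecting Python garbage and then releasing the accelerator's cache for whichever backend is in use. A minimal sketch under that assumption (the torch import is guarded here only so the snippet runs anywhere; sd-scripts always has torch):

```python
import gc

def clean_memory_on_device(device_type="cuda"):
    # Collect Python garbage first, then free the backend cache if present.
    gc.collect()
    try:
        import torch
        if device_type == "cuda" and torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass
```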
Kohya S
75ecb047e2
Merge branch 'dev' into dev_device_support
2024-02-11 19:51:28 +09:00
BootsofLagrangian
03f0816f86
found the reason grad accum steps were not working: it was because of my accelerate settings
2024-02-09 17:47:49 +09:00
BootsofLagrangian
a98fecaeb1
forgot to set mixed_precision for deepspeed, sorry
2024-02-07 17:19:46 +09:00
BootsofLagrangian
62556619bd
fix full_fp16 compatible and train_step
2024-02-07 16:42:05 +09:00
BootsofLagrangian
3970bf4080
maybe fix branch to run offloading
2024-02-05 22:40:43 +09:00
BootsofLagrangian
2824312d5e
fix vae type error during training sdxl
2024-02-05 20:13:28 +09:00
BootsofLagrangian
64873c1b43
fix offload_optimizer_device typo
2024-02-05 17:11:50 +09:00
Yuta Hayashibe
5f6bf29e52
Replace print with logger if they are logs (#905)
* Add get_my_logger()
* Use logger instead of print
* Fix log level
* Removed line-breaks for readability
* Use setup_logging()
* Add rich to requirements.txt
* Make simple
* Use logger instead of print
---------
Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-02-04 18:14:34 +09:00
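The PR body above names the pattern: a `setup_logging()` helper, `logger` calls instead of bare `print`, and `rich` as an optional nicety. A minimal sketch of that shape, with the fallback handler as an assumption:

```python
import logging

def setup_logging(level=logging.INFO):
    # Use a rich handler when available, plain stderr otherwise.
    try:
        from rich.logging import RichHandler  # optional dependency
        handler = RichHandler()
    except ImportError:
        handler = logging.StreamHandler()
    logging.basicConfig(level=level, handlers=[handler], force=True)
    return logging.getLogger(__name__)

logger = setup_logging()
logger.info("sample message")  # replaces a bare print(...)
```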
Kohya S
e793d7780d
reduce peak VRAM in sample gen
2024-02-04 17:31:01 +09:00
BootsofLagrangian
dfe08f395f
support deepspeed
2024-02-04 03:12:42 +09:00
Kohya S
2f9a344297
fix typo
2024-02-03 23:26:57 +09:00
Kohya S
11aced3500
simplify multi-GPU sample generation
2024-02-03 22:25:29 +09:00
DKnight54
1567ce1e17
Enable distributed sample image generation on multi-GPU environment (#1061)
* Update train_util.py
modifying to attempt enabling multi-GPU inference
* Update train_util.py
additional VRAM checking; refactor check_vram_usage to return a string for use with accelerator.print
* Update train_network.py
* Update train_util.py
remove sample image debug outputs
* Cleanup of debugging outputs
* adopt more elegant coding
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update train_util.py
fix leftover debugging code
attempt to refactor inference into a separate function
* refactor generation of the distributed prompt list into generate_per_device_prompt_list()
* Clean up missing variables
* fix syntax error
* true random sample image generation
reinitialize the random seed to true random if a seed was set
* simplify per-process prompt
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
2024-02-03 21:46:31 +09:00
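The PR body mentions `generate_per_device_prompt_list()` for splitting sample prompts across processes. A plausible core of that idea, sketched in pure Python (the real helper works with accelerate's process state):

```python
def generate_per_device_prompt_list(prompts, num_processes):
    # Round-robin split so each process renders a disjoint subset of prompts.
    return [prompts[i::num_processes] for i in range(num_processes)]
```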
Kohya S
5cca1fdc40
add highvram option and do not clear cache in caching latents
2024-02-01 21:55:55 +09:00
Disty0
a6a2b5a867
Fix IPEX support and add XPU device to device_utils
2024-01-31 17:32:37 +03:00
Kohya S
2ca4d0c831
Merge pull request #1054 from akx/mps
Device support improvements (MPS)
2024-01-31 21:30:12 +09:00
Kohya S
c576f80639
Fix ControlNetLLLite training issue #1069
2024-01-25 18:43:07 +09:00
Aarni Koskela
afc38707d5
Refactor memory cleaning into a single function
2024-01-23 14:28:50 +02:00
Aarni Koskela
2e4bee6f24
Log accelerator device
2024-01-23 14:20:40 +02:00
Kohya S
fef172966f
Add network_multiplier for dataset and train LoRA
2024-01-20 16:24:43 +09:00
Kohaku-Blueleaf
9cfa68c92f
[Experimental Feature] FP8 weight dtype for base model when running train_network (or sdxl_train_network) (#1057)
* Add fp8 support
* remove some debug prints
* Better implementation for TE
* Fix some misunderstanding
* same as unet, add explicit convert
* better impl for converting TE to fp8
* fp8 for not only unet
* Better cache TE and TE lr
* match arg name
* Fix with list
* Add timeout settings
* Fix arg style
* Add custom separator
* Fix typo
* Fix typo again
* Fix dtype error
* Fix gradient problem
* Fix req grad
* fix merge
* Fix merge
* Resolve merge
* arrangement and document
* Resolve merge error
* Add assert for mixed precision
2024-01-20 09:46:53 +09:00
Kohya S
09ef3ffa8b
Merge branch 'main' into dev
2024-01-14 21:49:25 +09:00
Nir Weingarten
ab716302e4
Add CLI argument for wandb session name
2024-01-03 11:52:38 +02:00
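Exposing a wandb session name on the command line is likely a single argparse entry forwarded to `wandb.init(name=...)`. A sketch with a hypothetical argument name (the repo's actual flag may differ):

```python
import argparse

# Hypothetical CLI wiring for naming the wandb run/session.
parser = argparse.ArgumentParser()
parser.add_argument("--wandb_run_name", type=str, default=None,
                    help="name passed to wandb.init(name=...) for this session")
args = parser.parse_args(["--wandb_run_name", "sdxl-lora-test"])
```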
Plat
62e7516537
feat: support torch.compile
2023-12-27 02:17:24 +09:00
Kohya S
3efd90b2ad
fix sampling in training with multiple GPUs, ref #989
2023-12-15 22:35:54 +09:00