Kohya S
baa0e97ced
Merge branch 'dev' into dev_device_support
2024-02-17 11:54:07 +09:00
Kohya S
93bed60762
fix --console_log_xxx options to work
2024-02-12 14:49:29 +09:00
Kohya S
358ca205a3
Merge branch 'dev' into dev_device_support
2024-02-12 13:01:54 +09:00
Kohya S
98f42d3a0b
Merge branch 'dev' into gradual_latent_hires_fix
2024-02-12 12:59:25 +09:00
Kohya S
20ae603221
Merge branch 'dev' into gradual_latent_hires_fix
2024-02-12 11:26:36 +09:00
Kohya S
672851e805
Merge branch 'dev' into dev_improve_log
2024-02-12 11:24:33 +09:00
Kohya S
e579648ce9
fix help for highvram arg
2024-02-12 11:12:41 +09:00
Kohya S
e24d9606a2
add clean_memory_on_device and use it from training
2024-02-12 11:10:52 +09:00
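The commit above introduces a `clean_memory_on_device` helper. A minimal sketch of what such a helper might look like, with the torch calls guarded so it is a no-op when torch is unavailable (the exact body in the repository may differ):

```python
import gc


def clean_memory_on_device(device_type: str) -> None:
    """Release cached allocator memory for the given device type.

    Illustrative sketch only; the real helper in train_util may take a
    torch.device and cover more backends.
    """
    gc.collect()  # drop unreachable Python objects first
    try:
        import torch
    except ImportError:
        return  # nothing device-specific to clean without torch
    if device_type == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif device_type == "mps" and torch.backends.mps.is_available():
        torch.mps.empty_cache()
```

Calling this between training phases (e.g. after caching latents) returns cached allocator blocks to the device without touching live tensors.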
Kohya S
75ecb047e2
Merge branch 'dev' into dev_device_support
2024-02-11 19:51:28 +09:00
BootsofLagrangian
03f0816f86
found the reason grad accum steps were not working: it was because of my accelerate settings
2024-02-09 17:47:49 +09:00
Kohya S
5d9e2873f6
make rich output to stderr instead of stdout
2024-02-08 21:38:02 +09:00
Kohya S
9b8ea12d34
update log initialization without rich
2024-02-08 21:06:39 +09:00
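The two commits above (rich to stderr, and log initialization without rich) can be sketched together: prefer a rich handler writing to stderr when rich is installed, otherwise fall back to plain stdlib logging. The handler wiring below is an assumption about the shape of the real `setup_logging`, not a copy of it:

```python
import logging
import sys


def setup_logging(log_level: int = logging.INFO) -> None:
    """Configure root logging, preferring rich when it is installed.

    Sketch: rich writes to stderr so stdout stays clean for piped
    output; without rich, a plain StreamHandler on stderr is used.
    """
    if logging.root.handlers:  # already configured, do nothing
        return
    try:
        from rich.console import Console
        from rich.logging import RichHandler

        handler = RichHandler(console=Console(stderr=True))
    except ImportError:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(levelname)s %(name)s: %(message)s")
        )
    logging.root.setLevel(log_level)
    logging.root.addHandler(handler)
```

Routing logs to stderr matters for scripts whose stdout is consumed by other tools.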
Kohya S
74fe0453b2
add comment for get_preferred_device
2024-02-08 20:58:54 +09:00
BootsofLagrangian
a98fecaeb1
forgot to set mixed_precision for deepspeed. sorry
2024-02-07 17:19:46 +09:00
BootsofLagrangian
62556619bd
fix full_fp16 compatibility and train_step
2024-02-07 16:42:05 +09:00
BootsofLagrangian
3970bf4080
maybe fix branch to run offloading
2024-02-05 22:40:43 +09:00
BootsofLagrangian
2824312d5e
fix vae type error during training sdxl
2024-02-05 20:13:28 +09:00
BootsofLagrangian
64873c1b43
fix offload_optimizer_device typo
2024-02-05 17:11:50 +09:00
Kohya S
efd3b58973
Add logging arguments and update logging setup
2024-02-04 20:44:10 +09:00
Kohya S
6279b33736
fallback to basic logging if rich is not installed
2024-02-04 18:28:54 +09:00
Yuta Hayashibe
5f6bf29e52
Replace print with logger if they are logs (#905)
...
* Add get_my_logger()
* Use logger instead of print
* Fix log level
* Removed line-breaks for readability
* Use setup_logging()
* Add rich to requirements.txt
* Make simple
* Use logger instead of print
---------
Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-02-04 18:14:34 +09:00
Kohya S
e793d7780d
reduce peak VRAM in sample gen
2024-02-04 17:31:01 +09:00
BootsofLagrangian
dfe08f395f
support deepspeed
2024-02-04 03:12:42 +09:00
Kohya S
2f9a344297
fix typo
2024-02-03 23:26:57 +09:00
Kohya S
11aced3500
simplify multi-GPU sample generation
2024-02-03 22:25:29 +09:00
DKnight54
1567ce1e17
Enable distributed sample image generation in a multi-GPU environment (#1061)
...
* Update train_util.py
Modifying to attempt to enable multi-GPU inference
* Update train_util.py
additional VRAM checking, refactor check_vram_usage to return string for use with accelerator.print
* Update train_network.py
* Update train_util.py
remove sample image debug outputs
* Cleanup of debugging outputs
* adopt more elegant coding
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update train_util.py
Fix leftover debugging code
attempt to refactor inference into separate function
* refactor in function generate_per_device_prompt_list() generation of distributed prompt list
* Clean up missing variables
* fix syntax error
* true random sample image generation
update code to reinitialize random seed to true random if seed was set
* true random sample image generation
* simplify per process prompt
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
2024-02-03 21:46:31 +09:00
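PR #1061 above mentions a `generate_per_device_prompt_list()` refactor for distributing sample prompts. A round-robin split is one plausible sketch of that distribution step; the real function in train_util.py may differ in shape:

```python
def generate_per_device_prompt_list(prompts, num_processes):
    """Split sample prompts across processes round-robin.

    Illustrative sketch of the distribution idea from PR #1061, not the
    exact implementation: prompt i goes to process i % num_processes,
    so each device generates a disjoint subset of the sample images.
    """
    per_device = [[] for _ in range(num_processes)]
    for i, prompt in enumerate(prompts):
        per_device[i % num_processes].append(prompt)
    return per_device
```

Each process then renders only its own sublist, which is what makes multi-GPU sample generation faster than generating every image on rank 0.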
Kohya S
5cca1fdc40
add highvram option and do not clear cache in caching latents
2024-02-01 21:55:55 +09:00
Disty0
a6a2b5a867
Fix IPEX support and add XPU device to device_utils
2024-01-31 17:32:37 +03:00
Kohya S
2ca4d0c831
Merge pull request #1054 from akx/mps
...
Device support improvements (MPS)
2024-01-31 21:30:12 +09:00
Disty0
988dee02b9
IPEX torch.tensor FP64 workaround
2024-01-30 01:52:32 +03:00
Disty0
ccc3a481e7
Update IPEX Libs
2024-01-28 14:14:31 +03:00
Kohya S
8f6f734a6f
Merge branch 'dev' into gradual_latent_hires_fix
2024-01-28 08:21:15 +09:00
Kohya S
c576f80639
Fix ControlNetLLLite training issue #1069
2024-01-25 18:43:07 +09:00
Aarni Koskela
478156b4f7
Refactor device determination to function; add MPS fallback
2024-01-23 14:29:03 +02:00
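The device-determination refactor above centralizes the CUDA > MPS > CPU preference order. A stripped-down sketch of that selection logic; the real helper (`get_preferred_device`) queries torch directly, while this hypothetical variant takes availability flags so the ordering is easy to see and test:

```python
def preferred_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Return a device name in CUDA > MPS > CPU preference order.

    Hypothetical, torch-free variant of the refactored helper: CUDA
    wins if present, Apple-silicon MPS is the fallback, CPU is last.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

In the repository the flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.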
Aarni Koskela
afc38707d5
Refactor memory cleaning into a single function
2024-01-23 14:28:50 +02:00
Aarni Koskela
2e4bee6f24
Log accelerator device
2024-01-23 14:20:40 +02:00
Kohya S
bea4362e21
Merge pull request #1060 from akx/refactor-xpu-init
...
Deduplicate ipex initialization code
2024-01-23 20:25:37 +09:00
Kohya S
696dd7f668
Fix dtype issue in PyTorch 2.0 for generating samples in training sdxl network
2024-01-22 12:43:37 +09:00
Kohya S
fef172966f
Add network_multiplier for dataset and train LoRA
2024-01-20 16:24:43 +09:00
Kohya S
5a1ebc4c7c
format by black
2024-01-20 13:10:45 +09:00
Kohya S
1f77bb6e73
fix sample generation to work in fp8, ref #1057
2024-01-20 10:57:42 +09:00
Kohaku-Blueleaf
9cfa68c92f
[Experimental Feature] FP8 weight dtype for base model when running train_network (or sdxl_train_network) (#1057)
...
* Add fp8 support
* remove some debug prints
* Better implementation for te
* Fix some misunderstanding
* as same as unet, add explicit convert
* better impl for convert TE to fp8
* fp8 for not only unet
* Better cache TE and TE lr
* match arg name
* Fix with list
* Add timeout settings
* Fix arg style
* Add custom separator
* Fix typo
* Fix typo again
* Fix dtype error
* Fix gradient problem
* Fix req grad
* fix merge
* Fix merge
* Resolve merge
* arrangement and document
* Resolve merge error
* Add assert for mixed precision
2024-01-20 09:46:53 +09:00
Aarni Koskela
6f3f701d3d
Deduplicate ipex initialization code
2024-01-19 18:07:36 +02:00
Aarni Koskela
ef50436464
Fix typo --spda (it's --sdpa)
2024-01-16 14:32:48 +02:00
Kohya S
09ef3ffa8b
Merge branch 'main' into dev
2024-01-14 21:49:25 +09:00
Kohya S
aab265e431
Fix an issue with saving as diffusers sd1/2 model, closes #1033
2024-01-04 21:43:50 +09:00
Kohya S
da9b34fa26
Merge branch 'dev' into gradual_latent_hires_fix
2024-01-04 19:53:46 +09:00
Kohya S
716bad188b
Update dependencies ref #1024
2024-01-04 19:53:25 +09:00
Kohya S
07bf2a21ac
Merge pull request #1024 from p1atdev/main
...
Add support for `torch.compile`
2024-01-04 10:49:52 +09:00
Nir Weingarten
ab716302e4
Added cli argument for wandb session name
2024-01-03 11:52:38 +02:00
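The commit above adds a CLI argument for the wandb session name. A hedged argparse sketch of how such a flag might be wired up; the flag name and help text here are illustrative, not necessarily the exact ones merged:

```python
import argparse


def add_wandb_run_name_argument(parser: argparse.ArgumentParser) -> None:
    """Attach a flag for naming the wandb run (illustrative sketch)."""
    parser.add_argument(
        "--wandb_run_name",
        type=str,
        default=None,
        help="name for the wandb run/session (default: auto-generated)",
    )


# usage: parse the flag from a hypothetical command line
parser = argparse.ArgumentParser()
add_wandb_run_name_argument(parser)
args = parser.parse_args(["--wandb_run_name", "my-experiment"])
```

The parsed value would then be forwarded to `wandb.init` (or the tracker setup) so runs are identifiable in the wandb UI.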