Kohya S
3419c3de0d
common masked loss func, apply to all training script
2024-03-17 19:30:20 +09:00
gesen2egee
d05965dbad
Update train_network.py
2024-03-13 18:33:51 +08:00
gesen2egee
5d7ed0dff0
Merge remote-tracking branch 'kohya-ss/dev' into val
2024-03-13 18:00:49 +08:00
gesen2egee
bd7e2295b7
fix
2024-03-13 17:54:21 +08:00
gesen2egee
d282c45002
Update train_network.py
2024-03-11 23:56:09 +08:00
gesen2egee
a6c41c6bea
Update train_network.py
2024-03-11 19:23:48 +08:00
gesen2egee
63e58f78e3
Update train_network.py
2024-03-11 19:15:55 +08:00
gesen2egee
befbec5335
Update train_network.py
2024-03-11 18:47:04 +08:00
gesen2egee
a51723cc2a
fix timesteps
2024-03-11 09:42:58 +08:00
gesen2egee
095b8035e6
save state on train end
2024-03-10 23:33:38 +08:00
gesen2egee
47359b8fac
Update train_network.py
2024-03-10 20:17:40 +08:00
gesen2egee
923b761ce3
Update train_network.py
2024-03-10 20:01:40 +08:00
gesen2egee
78cfb01922
improve
2024-03-10 18:55:48 +08:00
gesen2egee
b558a5b73d
val
2024-03-10 04:37:16 +08:00
Kohya S
e3ccf8fbf7
make deepspeed_utils
2024-02-27 21:30:46 +09:00
Kohya S
eefb3cc1e7
Merge branch 'deep-speed' into deepspeed
2024-02-27 18:57:42 +09:00
Kohya S
4a5546d40e
fix typo
2024-02-26 23:39:56 +09:00
Kohya S
f2c727fc8c
add minimal impl for masked loss
2024-02-26 23:19:58 +09:00
Kohya S
577e9913ca
add some new dataset settings
2024-02-26 20:01:25 +09:00
Kohya S
f4132018c5
fix to work with cpu_count() == 1 closes #1134
2024-02-24 19:25:31 +09:00
BootsofLagrangian
4d5186d1cf
refactored codes, some function moved into train_utils.py
2024-02-22 16:20:53 +09:00
Kohya S
baa0e97ced
Merge branch 'dev' into dev_device_support
2024-02-17 11:54:07 +09:00
Kohya S
93bed60762
fix to work --console_log_xxx options
2024-02-12 14:49:29 +09:00
Kohya S
358ca205a3
Merge branch 'dev' into dev_device_support
2024-02-12 13:01:54 +09:00
Kohya S
e24d9606a2
add clean_memory_on_device and use it from training
2024-02-12 11:10:52 +09:00
Kohya S
055f02e1e1
add logging args for training scripts
2024-02-08 21:16:42 +09:00
BootsofLagrangian
62556619bd
fix full_fp16 compatible and train_step
2024-02-07 16:42:05 +09:00
BootsofLagrangian
7d2a9268b9
apply offloading method runnable for all trainer
2024-02-05 22:42:06 +09:00
BootsofLagrangian
4295f91dcd
fix all trainer about vae
2024-02-05 20:19:56 +09:00
Yuta Hayashibe
5f6bf29e52
Replace print with logger if they are logs (#905)
* Add get_my_logger()
* Use logger instead of print
* Fix log level
* Removed line-breaks for readability
* Use setup_logging()
* Add rich to requirements.txt
* Make simple
* Use logger instead of print
---------
Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2024-02-04 18:14:34 +09:00
BootsofLagrangian
dfe08f395f
support deepspeed
2024-02-04 03:12:42 +09:00
Disty0
a6a2b5a867
Fix IPEX support and add XPU device to device_utils
2024-01-31 17:32:37 +03:00
Kohya S
2ca4d0c831
Merge pull request #1054 from akx/mps
Device support improvements (MPS)
2024-01-31 21:30:12 +09:00
DukeG
4e67fb8444
test
2024-01-26 20:22:49 +08:00
DukeG
50f631c768
test
2024-01-26 20:02:48 +08:00
DukeG
85bc371ebc
test
2024-01-26 18:58:47 +08:00
Aarni Koskela
afc38707d5
Refactor memory cleaning into a single function
2024-01-23 14:28:50 +02:00
Kohya S
7a20df5ad5
Merge pull request #1064 from KohakuBlueleaf/fix-grad-sync
Avoid grad sync on each step even when doing accumulation
2024-01-23 20:33:55 +09:00
Kohya S
bea4362e21
Merge pull request #1060 from akx/refactor-xpu-init
Deduplicate ipex initialization code
2024-01-23 20:25:37 +09:00
Kohaku-Blueleaf
711b40ccda
Avoid always sync
2024-01-23 11:49:03 +08:00
Kohya S
fef172966f
Add network_multiplier for dataset and train LoRA
2024-01-20 16:24:43 +09:00
Kohya S
a7ef6422b6
fix to work with torch 2.0
2024-01-20 10:00:30 +09:00
Kohaku-Blueleaf
9cfa68c92f
[Experimental Feature] FP8 weight dtype for base model when running train_network (or sdxl_train_network) (#1057)
* Add fp8 support
* remove some debug prints
* Better implementation for te
* Fix some misunderstanding
* as same as unet, add explicit convert
* better impl for convert TE to fp8
* fp8 for not only unet
* Better cache TE and TE lr
* match arg name
* Fix with list
* Add timeout settings
* Fix arg style
* Add custom separator
* Fix typo
* Fix typo again
* Fix dtype error
* Fix gradient problem
* Fix req grad
* fix merge
* Fix merge
* Resolve merge
* arrangement and document
* Resolve merge error
* Add assert for mixed precision
2024-01-20 09:46:53 +09:00
Aarni Koskela
6f3f701d3d
Deduplicate ipex initialization code
2024-01-19 18:07:36 +02:00
Kohya S
976d092c68
fix text encodes are on gpu even when not trained
2024-01-17 21:31:50 +09:00
Nir Weingarten
ab716302e4
Added cli argument for wandb session name
2024-01-03 11:52:38 +02:00
Kohya S
0676f1a86f
Merge pull request #1009 from liubo0902/main
speed up latents nan replace
2023-12-21 21:37:16 +09:00
liubo0902
8c7d05afd2
speed up latents nan replace
2023-12-20 09:35:17 +08:00
Kohya S
912dca8f65
fix duplicated sample gen for every epoch ref #907
2023-12-07 22:13:38 +09:00
Isotr0py
db84530074
Fix gradients synchronization for multi-GPUs training (#989)
* delete DDP wrapper
* fix train_db vae and train_network
* fix train_db vae and train_network unwrap
* network grad sync
---------
Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
2023-12-07 22:01:42 +09:00