Merge branch 'kohya-ss:main' into main

2026-04-16 00:49:40 +00:00 · 2023-01-13 23:04:37 -08:00
parent 4735b21318 bf691aef69
commit 29c9008e07
10 changed files with 1834 additions and 2997 deletions
--- a/README-ja.md
+++ b/README-ja.md
@@ -114,9 +114,13 @@ pip install --upgrade -r <requirement file name>

 コマンドが成功すれば新しいバージョンが使用できます。

+## 謝意
+
+LoRAの実装は[cloneofsimo氏のリポジトリ](https://github.com/cloneofsimo/lora)を基にしたものです。感謝申し上げます。
+
 ## ライセンス

-スクリプトのライセンスはASL 2.0ですが、一部他のライセンスのコードを含みます。
+スクリプトのライセンスはASL 2.0ですが（Diffusersおよびcloneofsimo氏のリポジトリ由来のものも同様）、一部他のライセンスのコードを含みます。

 [Memory Efficient Attention Pytorch](https://github.com/lucidrains/memory-efficient-attention-pytorch): MIT

--- a/README.md
+++ b/README.md
@@ -1,13 +1,26 @@
 This repository contains training, generation and utility scripts for Stable Diffusion.

+## Updates
+
+- January 12, 2023, 2023/1/12
+  - Metadata is now saved in the model (.safetensors only): model name, VAE name, training steps, learning rate, etc. The metadata will be viewable with the sd-webui-additional-networks extension in the near future. If you do not want to save it, specify the ``no_metadata`` option.
+  - メタデータが保存されるようになりました（ .safetensors 形式の場合のみ）（モデル名、VAE 名、ステップ数、学習率など）。近日中に拡張から確認できるようになる予定です。メタデータを保存したくない場合は ``no_metadata`` オプションを指定してください。
+  
+**January 9, 2023: Important information about the update can be found at [the end of the page](#updates-jan-9-2023).**
+
+**2023/1/9: 更新情報が[ページ末尾](#更新情報-202319)にありますのでご覧ください。**
+
 [日本語版README](./README-ja.md)

+##
+
 For easier use (GUI and PowerShell scripts etc...), please visit [the repository maintained by bmaltais](https://github.com/bmaltais/kohya_ss). Thanks to @bmaltais!

 This repository contains the scripts for:

 * DreamBooth training, including U-Net and Text Encoder
 * fine-tuning (native training), including U-Net and Text Encoder
+* LoRA training
 * image generation
 * model conversion (supports 1.x and 2.x, Stable Diffusion ckpt/safetensors and Diffusers)

@@ -94,9 +107,13 @@ pip install --upgrade -r requirements.txt

 Once the commands have completed successfully you should be ready to use the new version.

+## Credits
+
+The LoRA implementation is based on [cloneofsimo's repo](https://github.com/cloneofsimo/lora). Thank you for the great work!
+
 ## License

-The majority of scripts is licensed under ASL 2.0 (including codes from Diffusers), however portions of the project are available under separate license terms:
+The majority of the scripts are licensed under ASL 2.0 (including code from Diffusers and cloneofsimo's repo); however, portions of the project are available under separate license terms:

 [Memory Efficient Attention Pytorch](https://github.com/lucidrains/memory-efficient-attention-pytorch): MIT

@@ -104,3 +121,78 @@ The majority of scripts is licensed under ASL 2.0 (including codes from Diffuser

 [BLIP](https://github.com/salesforce/BLIP): BSD-3-Clause

+
+# Updates: Jan 9, 2023
+
+All training scripts are updated. 
+
+## Breaking Changes
+
+- The ``fine_tuning`` option in ``train_db.py`` has been removed. Please use DreamBooth with captions or ``fine_tune.py`` instead.
+- The Hypernet feature in ``fine_tune.py`` has been removed; it will be implemented in ``train_network.py`` in the future.
+
+## Features, Improvements and Bug Fixes
+
+### For all scripts: train_db.py, fine_tune.py and train_network.py
+
+- Added the ``output_name`` option, which sets the name of the output file.
+    - With ``--output_name style1``, the output files are named like ``style1_000001.ckpt`` (or ``.safetensors``) for each epoch and ``style1.ckpt`` for the last one.
+    - If omitted (default), the names are the same as before: ``epoch-000001.ckpt`` and ``last.ckpt``.
+- Added the ``save_last_n_epochs`` option. Only the latest n checkpoint and state files are kept; older files are removed. (Thanks to shirayu!)
+    - With ``--save_every_n_epochs=2 --save_last_n_epochs=3``, at the end of epoch 8, ``epoch-000008.ckpt`` is created and ``epoch-000002.ckpt`` is removed.
+
+### train_db.py
+
+- Added ``max_token_length`` option. Captions can have more than 75 tokens.
+
+### fine_tune.py
+
+- The script now works without .npz files. If no .npz file is found, the script computes the latents with the VAE.
+    - You can skip ``prepare_buckets_latents.py`` in preprocessing. However, running it is still recommended if you train for more than 1 or 2 epochs, because precomputed latents make training faster overall.
+    - In this case, the ``--resolution`` option is required to specify the training resolution.
+- Added ``cache_latents`` and ``color_aug`` options.
+
+### train_network.py
+
+- ``--gradient_checkpointing`` now takes effect for both the U-Net and the Text Encoder.
+    - Memory usage is reduced, so a larger batch size is available, but training will be slower.
+    - Training a LoRA with dimension=4 and batch size=1 might be possible with 6GB of VRAM.
+
+The documentation has not been updated yet; I will update it piece by piece.
+
+# 更新情報 (2023/1/9)
+
+学習スクリプトを更新しました。
+
+## 削除された機能
+- ``train_db.py`` の ``fine_tuning`` は削除されました。キャプション付きの DreamBooth または ``fine_tune.py`` を使ってください。
+- ``fine_tune.py`` の Hypernet学習の機能は削除されました。将来的に``train_network.py``に追加される予定です。
+
+## その他の機能追加、バグ修正など
+
+### 学習スクリプトに共通: train_db.py, fine_tune.py and train_network.py
+
+- ``output_name``オプションを追加しました。保存されるモデルファイルの名前を指定できます。
+    - ``--output_name style1``と指定すると、エポックごとに保存されるファイル名は``style1_000001.ckpt`` (または ``.safetensors``) に、最後に保存されるファイル名は``style1.ckpt``になります。
+    - 省略時は今までと同じです（``epoch-000001.ckpt``および``last.ckpt``）。
+- ``save_last_n_epochs``オプションを追加しました。最新の n ファイル、stateだけ保存し、古いものは削除します。（shirayu氏に感謝します。)
+    - たとえば``--save_every_n_epochs=2 --save_last_n_epochs=3``と指定した時、8エポック目の終了時には、``epoch-000008.ckpt``が保存され``epoch-000002.ckpt``が削除されます。
+
+### train_db.py
+
+- ``max_token_length``オプションを追加しました。75トークンを超えるキャプションが使えるようになります。
+
+### fine_tune.py
+
+- .npzファイルがなくても動作するようになりました。.npzファイルがない場合、VAEからlatentsを取得して動作します。
+    -  ``prepare_buckets_latents.py``を前処理で実行しなくても良くなります。ただし事前取得をしておいたほうが、2エポック以上学習する場合にはトータルで高速です。
+    - この場合、解像度を指定するために``--resolution``オプションが必要です。
+- ``cache_latents``と``color_aug``オプションを追加しました。
+
+### train_network.py
+
+- ``--gradient_checkpointing``がU-NetとText Encoderにも有効になりました。
+    - メモリ消費が減ります。バッチサイズを大きくできますが、トータルでの学習時間は長くなるかもしれません。
+    - dimension=4のLoRAはバッチサイズ1で6GB VRAMで学習できるかもしれません。
+
+ドキュメントは未更新ですが少しずつ更新の予定です。
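
Note on the README text above: the ``save_last_n_epochs`` example (epoch 8 saved, epoch 2 removed) fits a simple retention window. The sketch below is inferred only from that README example, not taken from ``library/train_util.py`` (whose hunks are not shown here), and the function name ``epoch_to_remove`` is hypothetical.

```python
def epoch_to_remove(epoch, save_every_n_epochs, save_last_n_epochs):
    """Return the epoch whose checkpoint drops out of the retention window, or None.

    Assumption: a checkpoint is written every `save_every_n_epochs` epochs and
    only the newest `save_last_n_epochs` of them are kept.
    """
    if epoch % save_every_n_epochs != 0:
        return None  # no checkpoint is written at this epoch, so nothing is pruned
    old_epoch = epoch - save_every_n_epochs * save_last_n_epochs
    return old_epoch if old_epoch > 0 else None


# README example: --save_every_n_epochs=2 --save_last_n_epochs=3, end of epoch 8
print(epoch_to_remove(8, 2, 3))
```

Under this assumption, epochs 4, 6 and 8 remain on disk after epoch 8, which matches "keep only the latest n files" with n=3.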
--- a/fine_tune.py
+++ b/fine_tune.py
--- a/gen_img_diffusers.py
+++ b/gen_img_diffusers.py
@@ -46,11 +46,13 @@ VGG(
 )
 """

+import json
 from typing import List, Optional, Union
 import glob
 import importlib
 import inspect
 import time
+import zipfile
 from diffusers.utils import deprecate
 from diffusers.configuration_utils import FrozenDict
 import argparse
@@ -1972,6 +1974,14 @@ def main(args):
      if args.network_weights and i < len(args.network_weights):
        network_weight = args.network_weights[i]
        print("load network weights from:", network_weight)
+
+        if os.path.splitext(network_weight)[1] == '.safetensors':
+          from safetensors.torch import safe_open
+          with safe_open(network_weight, framework="pt") as f:
+            metadata = f.metadata()
+          if metadata is not None:
+            print(f"metadata for: {network_weight}: {metadata}")
+
        network.load_weights(network_weight)

      network.apply_to(text_encoder, unet)
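
Note on the hunk above: the metadata printed via ``safe_open(...).metadata()`` is stored in the JSON header of the .safetensors file under the ``__metadata__`` key. As a minimal sketch, assuming the documented safetensors layout (an 8-byte little-endian header length followed by a UTF-8 JSON header), the same dictionary can be read with the standard library alone. The helper name ``read_safetensors_metadata`` is hypothetical and is not part of this commit.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the ``__metadata__`` dict from a .safetensors header, or None if absent."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next `header_len` bytes: JSON describing the tensors and optional metadata.
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__")
```

This can be handy for inspecting a LoRA file on a machine where neither ``torch`` nor ``safetensors`` is installed.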
--- a/library/model_util.py
+++ b/library/model_util.py
@@ -1133,14 +1133,6 @@ def load_vae(vae_id, dtype):
  return vae


-def get_epoch_ckpt_name(use_safetensors, epoch):
-  return f"epoch-{epoch:06d}" + (".safetensors" if use_safetensors else ".ckpt")
-
-
-def get_last_ckpt_name(use_safetensors):
-  return f"last" + (".safetensors" if use_safetensors else ".ckpt")
-
-
 # endregion


--- a/library/train_util.py
+++ b/library/train_util.py
--- a/networks/extract_lora_from_models.py
+++ b/networks/extract_lora_from_models.py
@@ -135,7 +135,7 @@ def svd(args):
  if dir_name and not os.path.exists(dir_name):
    os.makedirs(dir_name, exist_ok=True)

-  lora_network_o.save_weights(args.save_to, save_dtype)
+  lora_network_o.save_weights(args.save_to, save_dtype, {})
  print(f"LoRA weights are saved to: {args.save_to}")


--- a/networks/lora.py
+++ b/networks/lora.py
@@ -92,7 +92,7 @@ class LoRANetwork(torch.nn.Module):

  def load_weights(self, file):
    if os.path.splitext(file)[1] == '.safetensors':
-      from safetensors.torch import load_file
+      from safetensors.torch import load_file, safe_open
      self.weights_sd = load_file(file)
    else:
      self.weights_sd = torch.load(file, map_location='cpu')
@@ -174,7 +174,10 @@ class LoRANetwork(torch.nn.Module):
  def get_trainable_params(self):
    return self.parameters()

-  def save_weights(self, file, dtype):
+  def save_weights(self, file, dtype, metadata):
+    if metadata is not None and len(metadata) == 0:
+      metadata = None
+
    state_dict = self.state_dict()

    if dtype is not None:
@@ -185,6 +188,6 @@ class LoRANetwork(torch.nn.Module):

    if os.path.splitext(file)[1] == '.safetensors':
      from safetensors.torch import save_file
-      save_file(state_dict, file)
+      save_file(state_dict, file, metadata)
    else:
      torch.save(state_dict, file)
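
Note on the ``save_weights`` change above: an empty metadata dict is mapped to ``None`` before calling ``save_file``, and metadata is only written on the .safetensors path. Since safetensors expects metadata as a string-to-string mapping, a caller may want to normalize values first. The helper below is a hypothetical sketch, not part of this commit; the string conversion in particular is an added assumption about what a caller would want.

```python
def normalize_metadata(metadata):
    """Return a str-to-str dict suitable for safetensors metadata, or None if empty."""
    if not metadata:
        return None  # mirrors the commit: None or {} means "write no metadata"
    # Assumption: non-string values (steps, learning rate, ...) are stringified.
    return {str(key): str(value) for key, value in metadata.items()}
```

With such a helper, ``extract_lora_from_models.py`` passing ``{}`` and a training script passing numeric values would both produce input that ``save_file`` accepts.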
--- a/train_db.py
+++ b/train_db.py
--- a/train_network.py
+++ b/train_network.py