Merge pull request #178 from kohya-ss/dev

Dev
2026-04-15 08:36:41 +00:00 · 2023-02-11 16:16:15 +09:00
parent 43a41c6c43 d1ecfde487
commit b32abdd327
2 changed files with 24 additions and 10 deletions
--- a/README.md
+++ b/README.md
@@ -129,10 +129,16 @@ The majority of scripts is licensed under ASL 2.0 (including codes from Diffuser
    - For LoRAs where the activation word is unknown, this script compares the Text Encoder output with and without the LoRA applied to find out which tokens the LoRA affects. With luck, you can work out the activation word. LoRAs trained with captions do not seem to be interrogable this way.
    - Batch size can be large (like 64 or 128).
  - ``train_textual_inversion.py`` now supports multiple init words.
+  - The following feature has been reverted to the previous behavior. Sorry for the confusion:
+    > Now the number of data in each batch is limited to the number of actual images (not duplicated). Because a certain bucket may contain smaller number of actual images, so the batch may contain same (duplicated) images.
+  
  - ``lora_interrogator.py`` を ``network``フォルダに追加しました。使用法は ``python networks\lora_interrogator.py -h`` でご確認ください。
    - このスクリプトは、起動promptがわからないLoRAについて、LoRA適用前後のText Encoderの出力を比較することで、どのtokenの出力が変化しているかを調べます。運が良ければ起動用の単語が分かります。キャプション付きで学習されたLoRAは影響が広範囲に及ぶため、調査は難しいようです。
    - バッチサイズはわりと大きくできます（64や128など）。
  - ``train_textual_inversion.py`` で複数のinit_word指定が可能になりました。
+  - 次の機能を削除し元に戻しました。混乱を招き申し訳ありません。
+    >  これらのオプションによりbucketが細分化され、ひとつのバッチ内に同一画像が重複して存在することが増えたため、バッチサイズを``そのbucketの画像種類数``までに制限する機能を追加しました。
+
 - 10 Feb. 2023, 2023/2/10:
  - Updated ``requirements.txt`` to prevent upgrades with pip from taking a long time or failing.
  - ``resize_lora.py`` keeps the metadata of the model. ``dimension is resized from ...`` is added to the top of  ``ss_training_comment``.
--- a/library/train_util.py
+++ b/library/train_util.py
@@ -432,17 +432,25 @@ class BaseDataset(torch.utils.data.Dataset):
    # Build the index used for data lookup. This index is used to shuffle the dataset.
    self.buckets_indices: List(BucketBatchIndex) = []
    for bucket_index, bucket in enumerate(self.bucket_manager.buckets):
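-      # As buckets get subdivided, more buckets end up holding only one distinct image, which means
-      # a single batch would consist entirely of the same image, and that is surely undesirable.
-      # So limit the batch size to the number of distinct images.
-      # Even so, the same image can still land in the same batch, so fewer repeats should surely give a better-quality shuffle?
-      # TODO a mechanism for using regularization images across epochs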
-      num_of_image_types = len(set(bucket))
-      bucket_batch_size = min(self.batch_size, num_of_image_types)
-      batch_count = int(math.ceil(len(bucket) / bucket_batch_size))
-      # print(bucket_index, num_of_image_types, bucket_batch_size, batch_count)
+      batch_count = int(math.ceil(len(bucket) / self.batch_size))
      for batch_index in range(batch_count):
-        self.buckets_indices.append(BucketBatchIndex(bucket_index, bucket_batch_size, batch_index))
+        self.buckets_indices.append(BucketBatchIndex(bucket_index, self.batch_size, batch_index))
+
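+      # ↓ The code below is reverted because it increased the number of batches per bucket so much that it caused confusion.
+      #   Since the number of steps is random during training, having the same image more than once in a batch should not be very harmful.
+      #
+      # # As buckets get subdivided, more buckets end up holding only one distinct image, which means
+      # # a single batch would consist entirely of the same image, and that is surely undesirable.
+      # # So limit the batch size to the number of distinct images.
+      # # Even so, the same image can still land in the same batch, so fewer repeats should surely give a better-quality shuffle?
+      # # TO DO a mechanism for using regularization images across epochs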
+      # num_of_image_types = len(set(bucket))
+      # bucket_batch_size = min(self.batch_size, num_of_image_types)
+      # batch_count = int(math.ceil(len(bucket) / bucket_batch_size))
+      # # print(bucket_index, num_of_image_types, bucket_batch_size, batch_count)
+      # for batch_index in range(batch_count):
+      #   self.buckets_indices.append(BucketBatchIndex(bucket_index, bucket_batch_size, batch_index))
+      # ↑ End of the reverted code

    self.shuffle_buckets()
    self._length = len(self.buckets_indices)
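
The effect of this revert on the number of batches can be sketched outside the trainer. The snippet below is a minimal illustration, not the repository's code: `BucketBatchIndex` is a hypothetical stand-in with assumed field names, buckets are plain lists of image keys, and the `max(1, ...)` guard against empty buckets is an addition that does not appear in the diff.

```python
import math
from typing import List, NamedTuple, Sequence


class BucketBatchIndex(NamedTuple):
    # Hypothetical stand-in for the repository's BucketBatchIndex;
    # the field names are assumptions made for this sketch.
    bucket_index: int
    bucket_batch_size: int
    batch_index: int


def build_indices_fixed(buckets: Sequence[Sequence[str]], batch_size: int) -> List[BucketBatchIndex]:
    """Restored behavior: every bucket is split using the global batch size."""
    indices = []
    for bucket_index, bucket in enumerate(buckets):
        batch_count = int(math.ceil(len(bucket) / batch_size))
        for batch_index in range(batch_count):
            indices.append(BucketBatchIndex(bucket_index, batch_size, batch_index))
    return indices


def build_indices_limited(buckets: Sequence[Sequence[str]], batch_size: int) -> List[BucketBatchIndex]:
    """Reverted behavior: batch size is capped at the number of distinct images."""
    indices = []
    for bucket_index, bucket in enumerate(buckets):
        num_of_image_types = len(set(bucket))
        # max(1, ...) is an added guard for empty buckets, not part of the diff.
        bucket_batch_size = max(1, min(batch_size, num_of_image_types))
        batch_count = int(math.ceil(len(bucket) / bucket_batch_size))
        for batch_index in range(batch_count):
            indices.append(BucketBatchIndex(bucket_index, bucket_batch_size, batch_index))
    return indices


# Bucket 0 holds one image repeated 8 times; bucket 1 holds two images repeated 5 times each.
buckets = [["a"] * 8, ["b", "c"] * 5]

print(len(build_indices_fixed(buckets, batch_size=4)))    # 5
print(len(build_indices_limited(buckets, batch_size=4)))  # 13
```

With a batch size of 4, the restored logic yields 2 + 3 = 5 batches, while the reverted logic yields 8 + 5 = 13, because the cap shrinks the per-bucket batch size to 1 and 2. That growth in batch count per bucket is the confusion the commit comment refers to.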