Merge branch 'kohya-ss:dev' into dev

2026-04-16 08:52:45 +00:00 · 2023-10-11 15:41:19 +08:00
parent 917a37b2de 681034d001
commit a0670e45c1
2 changed files with 20 additions and 1 deletions
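
Note on the color codes in the README hunk of this commit: the seven mask colors listed for `--network_regional_mask_max_color_codes` follow a simple bit pattern. The 1-based position in the list, read as three bits, switches the red, green, and blue channels fully on or off. The sketch below reproduces that list and splits a tiny RGB mask into one boolean mask per color. It is an illustration only: `color_code_for_index` and `split_mask_by_color` are hypothetical helper names, and exact color matching and the pairing of list position with a region are assumptions, not the logic of `gen_img_diffusers.py` or `sdxl_gen_img.py`.

```python
# Color codes copied from the README entry, in the order listed there.
README_COLOR_CODES = [0x0000FF, 0x00FF00, 0x00FFFF, 0xFF0000, 0xFF00FF, 0xFFFF00, 0xFFFFFF]


def color_code_for_index(index: int) -> int:
    """Return the 0xRRGGBB code at 1-based position `index` (1..7) of the README list.

    Bit 2 of the index turns red on, bit 1 green, bit 0 blue.
    """
    if not 1 <= index <= 7:
        raise ValueError("index must be between 1 and 7")
    red = 0xFF if index & 0b100 else 0x00
    green = 0xFF if index & 0b010 else 0x00
    blue = 0xFF if index & 0b001 else 0x00
    return (red << 16) | (green << 8) | blue


def split_mask_by_color(pixels, max_color_codes):
    """Hypothetical helper: build one boolean mask per color code.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Assumes exact color matches; the real scripts may behave differently.
    """
    masks = []
    for index in range(1, max_color_codes + 1):
        code = color_code_for_index(index)
        target = ((code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF)
        masks.append([[pixel == target for pixel in row] for row in pixels])
    return masks


if __name__ == "__main__":
    # A 2x2 mask: blue, green on the first row; black, blue on the second.
    tiny_mask = [[(0, 0, 255), (0, 255, 0)], [(0, 0, 0), (0, 0, 255)]]
    for region_mask in split_mask_by_color(tiny_mask, 2):
        print(region_mask)
```

Black (0x000000) is not in the list, so in this sketch black pixels belong to no region.
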
--- a/README.md
+++ b/README.md
@@ -249,6 +249,21 @@ ControlNet-LLLite, a novel method for ControlNet with SDXL, is added. See [docum

 ## Change History

+### Oct 11, 2023 / 2023/10/11
+- Fix `make_captions_by_git.py` to work with the latest version of transformers.
+- Improve `gen_img_diffusers.py` and `sdxl_gen_img.py`. Both scripts now support the following options:
+  - The `--network_merge_n_models` option merges only some of the models. The remaining models aren't merged, so their multipliers can still be changed and the regional LoRA still works.
+  - `--network_regional_mask_max_color_codes` is added. Now you can use up to 7 regions.
+    - When this option is specified, the regional LoRA mask is color-code based instead of channel based. The value is the maximum number of color codes (up to 7).
+    - You can specify the mask for each LoRA by colors: 0x0000ff, 0x00ff00, 0x00ffff, 0xff0000, 0xff00ff, 0xffff00, 0xffffff.
+
+- `make_captions_by_git.py` が最新の transformers で動作するように修正しました。
+- `gen_img_diffusers.py` と `sdxl_gen_img.py` を更新し、以下のオプションを追加しました。
+  - `--network_merge_n_models` オプションで一部のモデルのみマージできます。残りのモデルはマージされないため、重みを変更したり、領域別LoRAを使用したりできます。
+  - `--network_regional_mask_max_color_codes` を追加しました。最大7つの領域を使用できます。
+    - このオプションを指定すると、領域別LoRAのマスクはチャンネルベースではなくカラーコードベースになります。値はカラーコードの最大数（最大7）です。
+    - 各LoRAに対してマスクをカラーで指定できます：0x0000ff、0x00ff00、0x00ffff、0xff0000、0xff00ff、0xffff00、0xffffff。
+
 ### Oct 9. 2023 / 2023/10/9

 - `tag_images_by_wd_14_tagger.py` now supports Onnx. If you use Onnx, TensorFlow is not required anymore. [#864](https://github.com/kohya-ss/sd-scripts/pull/864) Thanks to Isotr0py!
--- a/finetune/make_captions_by_git.py
+++ b/finetune/make_captions_by_git.py
@@ -52,6 +52,9 @@ def collate_fn_remove_corrupted(batch):


 def main(args):
+    r"""
+    As of transformers 4.30.2 this works even with batch size > 1, so the patch below is commented out
+
    # Patch GIT so that it works with a batch size greater than 1: for transformers 4.26.0
    org_prepare_input_ids_for_generation = GenerationMixin._prepare_input_ids_for_generation
    curr_batch_size = [args.batch_size]  # kept in a list so it can be replaced, since the last batch of the loop has fewer than batch_size items
@@ -65,6 +68,7 @@ def main(args):
        return input_ids

    GenerationMixin._prepare_input_ids_for_generation = _prepare_input_ids_for_generation_patch
+    """

    print(f"load images from {args.train_data_dir}")
    train_data_dir_path = Path(args.train_data_dir)
@@ -81,7 +85,7 @@ def main(args):
    def run_batch(path_imgs):
        imgs = [im for _, im in path_imgs]

-        curr_batch_size[0] = len(path_imgs)
+        # curr_batch_size[0] = len(path_imgs)
        inputs = git_processor(images=imgs, return_tensors="pt").to(DEVICE)  # images are in PIL format
        generated_ids = git_model.generate(pixel_values=inputs.pixel_values, max_length=args.max_length)
        captions = git_processor.batch_decode(generated_ids, skip_special_tokens=True)