mirror of
https://github.com/kohya-ss/sd-scripts.git
synced 2026-04-08 22:35:09 +00:00
add some new dataset settings
161
README.md
@@ -249,6 +249,109 @@ ControlNet-LLLite, a novel method for ControlNet with SDXL, is added. See [docum

 ## Change History

+### Work in progress
+
+- `train_network.py` and `sdxl_train_network.py` now record some dataset settings in the metadata of the trained model (`caption_prefix`, `caption_suffix`, `keep_tokens_separator`, `secondary_separator`, `enable_wildcard`).
+- Some features are added to the dataset subset settings.
+  - `secondary_separator` is added. It specifies a secondary tag separator; tags joined by it are not split apart by shuffling or dropping.
+    - Specify it like `secondary_separator=";;;"`. Tags joined by `secondary_separator` are handled together, as a single tag, during shuffling and dropping. See the example below.
+  - `enable_wildcard` is added. When set to `true`, the wildcard notation `{aaa|bbb|ccc}` can be used. See the example below.
+  - `keep_tokens_separator` can now be used twice in a caption. With `keep_tokens_separator="|||"`, the part after the second `|||` is not shuffled or dropped and remains at the end.
+  - These settings can be combined with the existing `caption_prefix` and `caption_suffix`. `caption_prefix` and `caption_suffix` are processed first, and then `enable_wildcard`, `keep_tokens_separator`, shuffling and dropping, and `secondary_separator` are processed in that order.
+  - The examples are [shown below](#example-of-dataset-settings--データセット設定の記述例).
+
|
- `train_network.py` および `sdxl_train_network.py` で、学習したモデルのメタデータに一部のデータセット設定が記録されるよう修正しました(`caption_prefix`、`caption_suffix`、`keep_tokens_separator`、`secondary_separator`、`enable_wildcard`)。
|
||||||
|
- データセットのサブセット設定にいくつかの機能を追加しました。
|
||||||
|
- シャッフルの対象とならないタグ分割識別子の指定 `secondary_separator` を追加しました。`secondary_separator=";;;"` のように指定します。`secondary_separator` で区切ることで、その部分はシャッフル、drop 時にまとめて扱われます。詳しくは記述例をご覧ください。
|
||||||
|
- `enable_wildcard` を追加しました。`true` にするとワイルドカード記法 `{aaa|bbb|ccc}` が使えます。詳しくは記述例をご覧ください。
|
||||||
|
- `keep_tokens_separator` をキャプション内に 2 つ使えるようにしました。たとえば `keep_tokens_separator="|||"` と指定したとき、`1girl, hatsune miku, vocaloid ||| stage, mic ||| best quality, rating: general` とキャプションを指定すると、二番目の `|||` で分割された部分はシャッフル、drop されず末尾に残ります。
|
||||||
|
- 既存の機能 `caption_prefix` と `caption_suffix` とあわせて使えます。`caption_prefix` と `caption_suffix` は一番最初に処理され、その後、ワイルドカード、`keep_tokens_separator`、シャッフルおよび drop、`secondary_separator` の順に処理されます。
|
||||||
|
|
||||||
+#### Example of dataset settings / データセット設定の記述例:
+
+```toml
+[general]
+flip_aug = true
+color_aug = false
+resolution = [1024, 1024]
+
+[[datasets]]
+batch_size = 6
+enable_bucket = true
+bucket_no_upscale = true
+caption_extension = ".txt"
+keep_tokens_separator= "|||"
+shuffle_caption = true
+caption_tag_dropout_rate = 0.1
+secondary_separator = ";;;" # subset 側に書くこともできます / can be written in the subset side
+enable_wildcard = true # 同上 / same as above
+
+  [[datasets.subsets]]
+  image_dir = "/path/to/image_dir"
+  num_repeats = 1
+
+  # ||| の前後はカンマは不要です(自動的に追加されます) / No comma is required before and after ||| (it is added automatically)
+  caption_prefix = "1girl, hatsune miku, vocaloid |||"
+
+  # ||| の後はシャッフル、drop されず残ります / After |||, it is not shuffled or dropped and remains
+  # 単純に文字列として連結されるので、カンマなどは自分で入れる必要があります / It is simply concatenated as a string, so you need to put commas yourself
+  caption_suffix = ", anime screencap ||| masterpiece, rating: general"
+```
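As a rough illustration of how the `caption_prefix` and `caption_suffix` from the settings above might be combined with a caption file before the later steps run, the sketch below joins them as plain strings. The function name `apply_prefix_suffix`, the single-space join, and the sample caption `stage, microphone, smile` are assumptions made for this example; they are not taken from this commit.

```python
def apply_prefix_suffix(caption: str, prefix: str = "", suffix: str = "") -> str:
    """Attach prefix and suffix to a caption by plain string concatenation."""
    # Assumption: the pieces are joined with a single space and no comma is
    # inserted here, which is why the suffix above starts with its own comma.
    if prefix:
        caption = prefix + " " + caption
    if suffix:
        caption = caption + " " + suffix
    return caption


raw = apply_prefix_suffix(
    "stage, microphone, smile",  # hypothetical content of a .txt caption file
    prefix="1girl, hatsune miku, vocaloid |||",
    suffix=", anime screencap ||| masterpiece, rating: general",
)
print(raw)
```

The resulting string contains `|||` twice, so the prefix becomes the fixed leading part and `masterpiece, rating: general` becomes the fixed trailing part when `keep_tokens_separator` is applied afterwards.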
+
+#### Example of caption, secondary_separator notation: `secondary_separator = ";;;"`
+
+```txt
+1girl, hatsune miku, vocaloid, upper body, looking at viewer, sky;;;cloud;;;day, outdoors
+```
+The part `sky;;;cloud;;;day` is never split apart: when shuffling and dropping are enabled, it is processed as a whole (as one tag), and it is then replaced with `sky,cloud,day`. For example, the caption becomes `vocaloid, 1girl, upper body, sky,cloud,day, outdoors, hatsune miku` (shuffled) or `vocaloid, 1girl, outdoors, looking at viewer, upper body, hatsune miku` (the whole group dropped).
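The grouping behavior can be sketched as follows. This is a minimal illustration rather than the repository's code: the function name `shuffle_with_secondary` and the `rng` parameter are hypothetical, and tag dropping is left out for brevity. It relies only on the order described above, where the caption is split on `caption_separator`, shuffled, and then has `secondary_separator` replaced.

```python
import random


def shuffle_with_secondary(caption, secondary_separator=";;;", caption_separator=",", rng=None):
    """Shuffle tags, keeping tags joined by the secondary separator together."""
    rng = rng or random.Random()
    # Splitting on "," leaves "sky;;;cloud;;;day" as one token.
    tokens = [t.strip() for t in caption.split(caption_separator) if t.strip()]
    rng.shuffle(tokens)
    caption = ", ".join(tokens)
    # Only after shuffling is the secondary separator turned into a normal one.
    return caption.replace(secondary_separator, caption_separator)


result = shuffle_with_secondary(
    "1girl, hatsune miku, vocaloid, upper body, looking at viewer, sky;;;cloud;;;day, outdoors",
    rng=random.Random(0),
)
print(result)
```

Whatever order the shuffle produces, `sky,cloud,day` stays adjacent in the output because it was a single token when the shuffle happened.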
+
+#### Example of caption, enable_wildcard notation: `enable_wildcard = true`
+
+```txt
+1girl, hatsune miku, vocaloid, upper body, looking at viewer, {simple|white} background
+```
+Either `simple` or `white` is selected at random, so the tag becomes `simple background` or `white background`.
+
+```txt
+1girl, hatsune miku, vocaloid, {{retro style}}
+```
+If you want to include a literal `{` or `}` in the tag string, double it as `{{` or `}}` (in this example, the tag actually used for training is `{retro style}`).
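The wildcard handling can be tried out in isolation with the sketch below. It follows the same steps as the wildcard code added to `BaseDataset` in this commit (escape doubled braces, substitute `{a|b|c}` with a random choice, unescape), but the standalone function name `expand_wildcards` and the `rng` parameter are assumptions added for the example.

```python
import random
import re


def expand_wildcards(caption, rng=random):
    """Replace each {a|b|c} with one random option; {{ and }} yield literal braces."""
    # Pick placeholder strings that do not already occur in the caption.
    open_mark, close_mark = "⦅", "⦆"
    while open_mark in caption or close_mark in caption:
        open_mark += "⦅"
        close_mark += "⦆"

    # Hide the escaped braces so the wildcard regex does not see them.
    caption = caption.replace("{{", open_mark).replace("}}", close_mark)

    # Substitute every {…} group with one of its "|"-separated options.
    caption = re.sub(r"\{([^}]+)\}", lambda m: rng.choice(m.group(1).split("|")), caption)

    # Restore the escaped braces as single literal braces.
    return caption.replace(open_mark, "{").replace(close_mark, "}")


print(expand_wildcards("looking at viewer, {simple|white} background"))
print(expand_wildcards("1girl, hatsune miku, vocaloid, {{retro style}}"))
```

The first call prints one of the two background variants; the second leaves the tag as `{retro style}` because the doubled braces are never matched as a wildcard.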
+
+#### Example of caption, `keep_tokens_separator` notation: `keep_tokens_separator = "|||"`
+
+```txt
+1girl, hatsune miku, vocaloid ||| stage, microphone, white shirt, smile ||| best quality, rating: general
+```
+The parts before the first `|||` and after the second `|||` stay fixed; only the middle part is shuffled and dropped. The caption becomes, for example, `1girl, hatsune miku, vocaloid, microphone, stage, white shirt, best quality, rating: general` or `1girl, hatsune miku, vocaloid, white shirt, smile, stage, microphone, best quality, rating: general`.
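A small sketch of the three-way split is given below. The splitting mirrors the `keep_tokens_separator` logic added in this commit, but the helper names `split_caption` and `shuffle_middle`, the `rng` parameter, and the choice to treat a caption without any separator as fully flexible are assumptions for illustration; tag dropping is omitted.

```python
import random


def split_caption(caption, keep_tokens_separator="|||", caption_separator=","):
    """Split a caption into fixed-prefix, flexible, and fixed-suffix token lists."""
    def to_tokens(part):
        return [t.strip() for t in part.split(caption_separator) if t.strip()]

    fixed_part, flex_part, suffix_part = "", caption, ""
    if keep_tokens_separator in caption:
        fixed_part, flex_part = caption.split(keep_tokens_separator, 1)
        # A second separator marks the start of the fixed trailing part.
        if keep_tokens_separator in flex_part:
            flex_part, suffix_part = flex_part.split(keep_tokens_separator, 1)
    return to_tokens(fixed_part), to_tokens(flex_part), to_tokens(suffix_part)


def shuffle_middle(caption, rng=None):
    """Shuffle only the flexible middle part and re-join all three parts."""
    rng = rng or random.Random()
    fixed, flex, suffix = split_caption(caption)
    rng.shuffle(flex)
    return ", ".join(fixed + flex + suffix)


caption = "1girl, hatsune miku, vocaloid ||| stage, microphone, white shirt, smile ||| best quality, rating: general"
print(shuffle_middle(caption, rng=random.Random(0)))
```

The output always starts with `1girl, hatsune miku, vocaloid` and ends with `best quality, rating: general`; only the order of the four middle tags varies.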
|
|
||||||
|
|
||||||
|
#### キャプション記述例、secondary_separator 記法:`secondary_separator = ";;;"` の場合
|
||||||
|
|
||||||
|
```txt
|
||||||
|
1girl, hatsune miku, vocaloid, upper body, looking at viewer, sky;;;cloud;;;day, outdoors
|
||||||
|
```
|
||||||
|
`sky;;;cloud;;;day` の部分はシャッフル、drop されず `sky,cloud,day` に置換されます。シャッフル、drop が有効な場合、まとめて(一つのタグとして)処理されます。つまり `vocaloid, 1girl, upper body, sky,cloud,day, outdoors, hatsune miku` (シャッフル)や `vocaloid, 1girl, outdoors, looking at viewer, upper body, hatsune miku` (drop されたケース)などになります。
|
||||||
|
|
||||||
|
#### キャプション記述例、ワイルドカード記法: `enable_wildcard = true` の場合
|
||||||
|
|
||||||
|
```txt
|
||||||
|
1girl, hatsune miku, vocaloid, upper body, looking at viewer, {simple|white} background
|
||||||
|
```
|
||||||
|
ランダムに `simple` または `white` が選ばれ、`simple background` または `white background` になります。
|
||||||
|
|
||||||
|
```txt
|
||||||
|
1girl, hatsune miku, vocaloid, {{retro style}}
|
||||||
|
```
|
||||||
|
タグ文字列に `{` や `}` そのものを含めたい場合は `{{` や `}}` のように二つ重ねてください(この例では実際に学習に用いられるキャプションは `{retro style}` になります)。
|
||||||
|
|
||||||
|
#### キャプション記述例、`keep_tokens_separator` 記法: `keep_tokens_separator = "|||"` の場合
|
||||||
|
|
||||||
|
```txt
|
||||||
|
1girl, hatsune miku, vocaloid ||| stage, microphone, white shirt, smile ||| best quality, rating: general
|
||||||
|
```
|
||||||
|
`1girl, hatsune miku, vocaloid, microphone, stage, white shirt, best quality, rating: general` や `1girl, hatsune miku, vocaloid, white shirt, smile, stage, microphone, best quality, rating: general` などになります。
|
||||||
|
|
||||||
|
|
||||||
 ### Feb 24, 2024 / 2024/2/24: v0.8.4

 - The log output has been improved. PR [#905](https://github.com/kohya-ss/sd-scripts/pull/905) Thanks to shirayu!
@@ -304,64 +407,6 @@ ControlNet-LLLite, a novel method for ControlNet with SDXL, is added. See [docum
|
|||||||
- 複数 GPU での学習時に `network_multiplier` を指定するとクラッシュする不具合が修正されました。 PR [#1084](https://github.com/kohya-ss/sd-scripts/pull/1084) fireicewolf 氏に感謝します。
|
- 複数 GPU での学習時に `network_multiplier` を指定するとクラッシュする不具合が修正されました。 PR [#1084](https://github.com/kohya-ss/sd-scripts/pull/1084) fireicewolf 氏に感謝します。
|
||||||
- ControlNet-LLLite の学習がエラーになる不具合を修正しました。
|
- ControlNet-LLLite の学習がエラーになる不具合を修正しました。
|
||||||

-### Jan 23, 2024 / 2024/1/23: v0.8.2
-
-- [Experimental] The `--fp8_base` option is added to the training scripts for LoRA etc. The base model (U-Net, and Text Encoder when training modules for Text Encoder) can be trained with fp8. PR [#1057](https://github.com/kohya-ss/sd-scripts/pull/1057) Thanks to KohakuBlueleaf!
-  - Please specify `--fp8_base` in `train_network.py` or `sdxl_train_network.py`.
-  - PyTorch 2.1 or later is required.
-  - If you use xformers with PyTorch 2.1, please see [xformers repository](https://github.com/facebookresearch/xformers) and install the appropriate version according to your CUDA version.
-  - The sample image generation during training consumes a lot of memory. It is recommended to turn it off.
-
-- [Experimental] The network multiplier can be specified for each dataset in the training scripts for LoRA etc.
-  - This is an experimental option and may be removed or changed in the future.
-  - For example, if you train with state A as `1.0` and state B as `-1.0`, you may be able to generate by switching between state A and B depending on the LoRA application rate.
-  - Also, if you prepare five states and train them as `0.2`, `0.4`, `0.6`, `0.8`, and `1.0`, you may be able to generate by switching the states smoothly depending on the application rate.
-  - Please specify `network_multiplier` in `[[datasets]]` in `.toml` file.
-- Some options are added to `networks/extract_lora_from_models.py` to reduce the memory usage.
-  - `--load_precision` option can be used to specify the precision when loading the model. If the model is saved in fp16, you can reduce the memory usage by specifying `--load_precision fp16` without losing precision.
-  - `--load_original_model_to` option can be used to specify the device to load the original model. `--load_tuned_model_to` option can be used to specify the device to load the derived model. The default is `cpu` for both options, but you can specify `cuda` etc. You can reduce the memory usage by loading one of them to GPU. This option is available only for SDXL.
-
-- The gradient synchronization in LoRA training with multi-GPU is improved. PR [#1064](https://github.com/kohya-ss/sd-scripts/pull/1064) Thanks to KohakuBlueleaf!
-- The code for Intel IPEX support is improved. PR [#1060](https://github.com/kohya-ss/sd-scripts/pull/1060) Thanks to akx!
-- Fixed a bug in multi-GPU Textual Inversion training.
-
- (実験的) LoRA等の学習スクリプトで、ベースモデル(U-Net、および Text Encoder のモジュール学習時は Text Encoder も)の重みを fp8 にして学習するオプションが追加されました。 PR [#1057](https://github.com/kohya-ss/sd-scripts/pull/1057) KohakuBlueleaf 氏に感謝します。
|
|
||||||
- `train_network.py` または `sdxl_train_network.py` で `--fp8_base` を指定してください。
|
|
||||||
- PyTorch 2.1 以降が必要です。
|
|
||||||
- PyTorch 2.1 で xformers を使用する場合は、[xformers のリポジトリ](https://github.com/facebookresearch/xformers) を参照し、CUDA バージョンに応じて適切なバージョンをインストールしてください。
|
|
||||||
- 学習中のサンプル画像生成はメモリを大量に消費するため、オフにすることをお勧めします。
|
|
||||||
- (実験的) LoRA 等の学習で、データセットごとに異なるネットワーク適用率を指定できるようになりました。
|
|
||||||
- 実験的オプションのため、将来的に削除または仕様変更される可能性があります。
|
|
||||||
- たとえば状態 A を `1.0`、状態 B を `-1.0` として学習すると、LoRA の適用率に応じて状態 A と B を切り替えつつ生成できるかもしれません。
|
|
||||||
- また、五段階の状態を用意し、それぞれ `0.2`、`0.4`、`0.6`、`0.8`、`1.0` として学習すると、適用率でなめらかに状態を切り替えて生成できるかもしれません。
|
|
||||||
- `.toml` ファイルで `[[datasets]]` に `network_multiplier` を指定してください。
|
|
||||||
- `networks/extract_lora_from_models.py` に使用メモリ量を削減するいくつかのオプションを追加しました。
|
|
||||||
- `--load_precision` で読み込み時の精度を指定できます。モデルが fp16 で保存されている場合は `--load_precision fp16` を指定して精度を変えずにメモリ量を削減できます。
|
|
||||||
- `--load_original_model_to` で元モデルを読み込むデバイスを、`--load_tuned_model_to` で派生モデルを読み込むデバイスを指定できます。デフォルトは両方とも `cpu` ですがそれぞれ `cuda` 等を指定できます。片方を GPU に読み込むことでメモリ量を削減できます。SDXL の場合のみ有効です。
|
|
||||||
- マルチ GPU での LoRA 等の学習時に勾配の同期が改善されました。 PR [#1064](https://github.com/kohya-ss/sd-scripts/pull/1064) KohakuBlueleaf 氏に感謝します。
|
|
||||||
- Intel IPEX サポートのコードが改善されました。PR [#1060](https://github.com/kohya-ss/sd-scripts/pull/1060) akx 氏に感謝します。
|
|
||||||
- マルチ GPU での Textual Inversion 学習の不具合を修正しました。
|
|
||||||
-
-- `.toml` example for network multiplier / ネットワーク適用率の `.toml` の記述例
-
-```toml
-[general]
-[[datasets]]
-resolution = 512
-batch_size = 8
-network_multiplier = 1.0
-
-... subset settings ...
-
-[[datasets]]
-resolution = 512
-batch_size = 8
-network_multiplier = -1.0
-
-... subset settings ...
-```
-
-
 Please read [Releases](https://github.com/kohya-ss/sd-scripts/releases) for recent updates.
最近の更新情報は [Release](https://github.com/kohya-ss/sd-scripts/releases) をご覧ください。
|
最近の更新情報は [Release](https://github.com/kohya-ss/sd-scripts/releases) をご覧ください。
|
||||||
|
|
||||||
@@ -60,6 +60,8 @@ class BaseSubsetParams:
     caption_separator: str = (",",)
     keep_tokens: int = 0
     keep_tokens_separator: str = (None,)
+    secondary_separator: Optional[str] = None
+    enable_wildcard: bool = False
     color_aug: bool = False
     flip_aug: bool = False
     face_crop_aug_range: Optional[Tuple[float, float]] = None
@@ -181,6 +183,8 @@ class ConfigSanitizer:
         "shuffle_caption": bool,
         "keep_tokens": int,
         "keep_tokens_separator": str,
+        "secondary_separator": str,
+        "enable_wildcard": bool,
         "token_warmup_min": int,
         "token_warmup_step": Any(float, int),
         "caption_prefix": str,
@@ -504,6 +508,8 @@ def generate_dataset_group_by_blueprint(dataset_group_blueprint: DatasetGroupBlu
           shuffle_caption: {subset.shuffle_caption}
           keep_tokens: {subset.keep_tokens}
           keep_tokens_separator: {subset.keep_tokens_separator}
+          secondary_separator: {subset.secondary_separator}
+          enable_wildcard: {subset.enable_wildcard}
           caption_dropout_rate: {subset.caption_dropout_rate}
           caption_dropout_every_n_epoches: {subset.caption_dropout_every_n_epochs}
           caption_tag_dropout_rate: {subset.caption_tag_dropout_rate}
@@ -364,6 +364,8 @@ class BaseSubset:
         caption_separator: str,
         keep_tokens: int,
         keep_tokens_separator: str,
+        secondary_separator: Optional[str],
+        enable_wildcard: bool,
         color_aug: bool,
         flip_aug: bool,
         face_crop_aug_range: Optional[Tuple[float, float]],
@@ -382,6 +384,8 @@ class BaseSubset:
         self.caption_separator = caption_separator
         self.keep_tokens = keep_tokens
         self.keep_tokens_separator = keep_tokens_separator
+        self.secondary_separator = secondary_separator
+        self.enable_wildcard = enable_wildcard
         self.color_aug = color_aug
         self.flip_aug = flip_aug
         self.face_crop_aug_range = face_crop_aug_range
@@ -410,6 +414,8 @@ class DreamBoothSubset(BaseSubset):
         caption_separator: str,
         keep_tokens,
         keep_tokens_separator,
+        secondary_separator,
+        enable_wildcard,
         color_aug,
         flip_aug,
         face_crop_aug_range,
@@ -431,6 +437,8 @@ class DreamBoothSubset(BaseSubset):
             caption_separator,
             keep_tokens,
             keep_tokens_separator,
+            secondary_separator,
+            enable_wildcard,
             color_aug,
             flip_aug,
             face_crop_aug_range,
@@ -466,6 +474,8 @@ class FineTuningSubset(BaseSubset):
         caption_separator,
         keep_tokens,
         keep_tokens_separator,
+        secondary_separator,
+        enable_wildcard,
         color_aug,
         flip_aug,
         face_crop_aug_range,
@@ -487,6 +497,8 @@ class FineTuningSubset(BaseSubset):
             caption_separator,
             keep_tokens,
             keep_tokens_separator,
+            secondary_separator,
+            enable_wildcard,
             color_aug,
             flip_aug,
             face_crop_aug_range,
@@ -519,6 +531,8 @@ class ControlNetSubset(BaseSubset):
         caption_separator,
         keep_tokens,
         keep_tokens_separator,
+        secondary_separator,
+        enable_wildcard,
         color_aug,
         flip_aug,
         face_crop_aug_range,
@@ -540,6 +554,8 @@ class ControlNetSubset(BaseSubset):
             caption_separator,
             keep_tokens,
             keep_tokens_separator,
+            secondary_separator,
+            enable_wildcard,
             color_aug,
             flip_aug,
             face_crop_aug_range,
@@ -675,15 +691,41 @@ class BaseDataset(torch.utils.data.Dataset):
         if is_drop_out:
             caption = ""
         else:
+            # process wildcards
+            if subset.enable_wildcard:
+                # wildcard is like '{aaa|bbb|ccc...}'
+                # escape the curly braces like {{ or }}
+                replacer1 = "⦅"
+                replacer2 = "⦆"
+                while replacer1 in caption or replacer2 in caption:
+                    replacer1 += "⦅"
+                    replacer2 += "⦆"
+
+                caption = caption.replace("{{", replacer1).replace("}}", replacer2)
+
+                # replace the wildcard
+                def replace_wildcard(match):
+                    return random.choice(match.group(1).split("|"))
+
+                caption = re.sub(r"\{([^}]+)\}", replace_wildcard, caption)
+
+                # unescape the curly braces
+                caption = caption.replace(replacer1, "{").replace(replacer2, "}")
+
             if subset.shuffle_caption or subset.token_warmup_step > 0 or subset.caption_tag_dropout_rate > 0:
                 fixed_tokens = []
                 flex_tokens = []
+                fixed_suffix_tokens = []
                 if (
                     hasattr(subset, "keep_tokens_separator")
                     and subset.keep_tokens_separator
                     and subset.keep_tokens_separator in caption
                 ):
                     fixed_part, flex_part = caption.split(subset.keep_tokens_separator, 1)
+                    if subset.keep_tokens_separator in flex_part:
+                        flex_part, fixed_suffix_part = flex_part.split(subset.keep_tokens_separator, 1)
+                        fixed_suffix_tokens = [t.strip() for t in fixed_suffix_part.split(subset.caption_separator) if t.strip()]
+
                     fixed_tokens = [t.strip() for t in fixed_part.split(subset.caption_separator) if t.strip()]
                     flex_tokens = [t.strip() for t in flex_part.split(subset.caption_separator) if t.strip()]
                 else:
@@ -718,7 +760,11 @@ class BaseDataset(torch.utils.data.Dataset):

                 flex_tokens = dropout_tags(flex_tokens)

-                caption = ", ".join(fixed_tokens + flex_tokens)
+                caption = ", ".join(fixed_tokens + flex_tokens + fixed_suffix_tokens)

+            # process secondary separator
+            if subset.secondary_separator:
+                caption = caption.replace(subset.secondary_separator, subset.caption_separator)
+
             # textual inversion support
             for str_from, str_to in self.replacements.items():
@@ -1774,6 +1820,8 @@ class ControlNetDataset(BaseDataset):
                 subset.caption_separator,
                 subset.keep_tokens,
                 subset.keep_tokens_separator,
+                subset.secondary_separator,
+                subset.enable_wildcard,
                 subset.color_aug,
                 subset.flip_aug,
                 subset.face_crop_aug_range,
@@ -3284,6 +3332,18 @@ def add_dataset_arguments(
         help="A custom separator to divide the caption into fixed and flexible parts. Tokens before this separator will not be shuffled. If not specified, '--keep_tokens' will be used to determine the fixed number of tokens."
         + " / captionを固定部分と可変部分に分けるためのカスタム区切り文字。この区切り文字より前のトークンはシャッフルされない。指定しない場合、'--keep_tokens'が固定部分のトークン数として使用される。",
     )
+    parser.add_argument(
+        "--secondary_separator",
+        type=str,
+        default=None,
+        help="a secondary separator for caption. This separator is replaced with caption_separator after dropping/shuffling caption"
+        + " / captionのセカンダリ区切り文字。この区切り文字はcaptionのドロップやシャッフル後にcaption_separatorに置き換えられる",
+    )
+    parser.add_argument(
+        "--enable_wildcard",
+        action="store_true",
+        help="enable wildcard for caption (e.g. '{image|picture|rendition}') / captionのワイルドカードを有効にする(例:'{image|picture|rendition}')",
+    )
     parser.add_argument(
         "--caption_prefix",
         type=str,
@@ -564,6 +564,11 @@ class NetworkTrainer:
                             "random_crop": bool(subset.random_crop),
                             "shuffle_caption": bool(subset.shuffle_caption),
                             "keep_tokens": subset.keep_tokens,
+                            "keep_tokens_separator": subset.keep_tokens_separator,
+                            "secondary_separator": subset.secondary_separator,
+                            "enable_wildcard": bool(subset.enable_wildcard),
+                            "caption_prefix": subset.caption_prefix,
+                            "caption_suffix": subset.caption_suffix,
                         }

                         image_dir_or_metadata_file = None