Update train_ti_README-ja.md

2026-04-08 22:35:09 +00:00 · 2023-01-26 22:22:37 +09:00
parent 835b0d54cd
commit 7c35aee042
1 changed files with 14 additions and 1 deletions
--- a/train_ti_README-ja.md
+++ b/train_ti_README-ja.md
@@ -26,7 +26,20 @@ accelerate launch --num_cpu_threads_per_process 1 train_textual_inversion.py
    --token_string=mychar4 --init_word=cute --num_vectors_per_token=4
 ```

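-Use ``--token_string`` to specify the token string for training. Make sure the training prompts contain this string (for example, ``mychar4 1girl`` if token_string is mychar4). This part of the prompt is replaced with the new Textual Inversion tokens during training.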
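+Use ``--token_string`` to specify the token string for training. __Make sure the training prompts contain this string (for example, ``mychar4 1girl`` if token_string is mychar4)__. This part of the prompt is replaced with the new Textual Inversion tokens during training.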
+
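+To check whether a prompt contains the token string, run with ``--debug_dataset``, which prints the token ids after replacement. If tokens with ids of ``49408`` or higher are present, as in the output below, the token string was found and replaced.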
+
+```
+input ids: tensor([[49406, 49408, 49409, 49410, 49411, 49412, 49413, 49414, 49415, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
+         49407, 49407, 49407, 49407, 49407, 49407, 49407]])
+```

 Words that the tokenizer already has (common words) cannot be used.
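
The check described in the added README text (looking for ids of ``49408`` or higher in the ``--debug_dataset`` output) can be sketched as below. This is only an illustration and not part of the commit: the helper name `new_token_ids` is hypothetical, and it assumes the base tokenizer uses ids 0 to 49407, so that newly added Textual Inversion tokens start at 49408.

```python
# Assumption: the base tokenizer's vocabulary covers ids 0..49407
# (49406 = start token, 49407 = end/padding token in the sample output),
# so tokens added for Textual Inversion start at 49408.
BASE_VOCAB_SIZE = 49408


def new_token_ids(input_ids, base_vocab_size=BASE_VOCAB_SIZE):
    """Return the ids in input_ids that lie outside the base vocabulary."""
    return [i for i in input_ids if i >= base_vocab_size]


# Start of the sample output shown above, cut off after the first padding ids.
sample = [49406, 49408, 49409, 49410, 49411, 49412, 49413, 49414, 49415, 49407, 49407]

found = new_token_ids(sample)
if found:
    print("token string found; new token ids:", found)
else:
    print("no new tokens: the prompt does not contain the token string")
```

An empty result would suggest that the caption does not contain the token string, so nothing was replaced for that prompt.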