Update anima_train_network.md

2026-04-17 01:12:41 +00:00 · 2026-02-18 17:53:41 +07:00
parent bdfc2cf0a0
commit 331499c6e1
1 changed files with 36 additions and 15 deletions
--- a/docs/anima_train_network.md
+++ b/docs/anima_train_network.md
@@ -48,7 +48,7 @@ Qwen-Image VAEとQwen-Image VAEは同じアーキテクチャですが、[Anima
 * **Arguments:** Uses the common `--pretrained_model_name_or_path` for the DiT model path, `--qwen3` for the Qwen3 text encoder, and `--vae` for the Qwen-Image VAE. The LLM adapter and T5 tokenizer can be specified separately with `--llm_adapter_path` and `--t5_tokenizer_path`.
 * **Incompatible arguments:** Stable Diffusion v1/v2 options such as `--v2`, `--v_parameterization` and `--clip_skip` are not used. `--fp8_base` is not supported.
 * **Timestep sampling:** Uses the same `--timestep_sampling` options as FLUX training (`sigma`, `uniform`, `sigmoid`, `shift`, `flux_shift`).
-* **LoRA:** Uses regex-based module selection and per-module rank/learning rate control (`network_reg_dims`, `network_reg_lrs`) instead of per-component arguments. Module exclusion/inclusion is controlled by `exclude_patterns` and `include_patterns`.
+* **LoRA:** Uses regex-based module selection and per-module rank/alpha/learning rate control (`network_reg_dims`, `network_reg_alphas`, `network_reg_lrs`) instead of per-component arguments. Module exclusion/inclusion is controlled by `exclude_patterns` and `include_patterns`.

 <details>
 <summary>日本語</summary>
@@ -60,7 +60,7 @@ Qwen-Image VAEとQwen-Image VAEは同じアーキテクチャですが、[Anima
 * **引数:** DiTモデルのパスには共通引数`--pretrained_model_name_or_path`を、Qwen3テキストエンコーダーには`--qwen3`を、Qwen-Image VAEには`--vae`を使用します。LLM AdapterとT5トークナイザーはそれぞれ`--llm_adapter_path`、`--t5_tokenizer_path`で個別に指定できます。
 * **一部引数の非互換性:** Stable Diffusion v1/v2向けの引数（例: `--v2`, `--v_parameterization`, `--clip_skip`）は使用されません。`--fp8_base`はサポートされていません。
 * **タイムステップサンプリング:** FLUX学習と同じ`--timestep_sampling`オプション（`sigma`、`uniform`、`sigmoid`、`shift`、`flux_shift`）を使用します。
-* **LoRA:** コンポーネント別の引数の代わりに、正規表現ベースのモジュール選択とモジュール単位のランク/学習率制御（`network_reg_dims`、`network_reg_lrs`）を使用します。モジュールの除外/包含は`exclude_patterns`と`include_patterns`で制御します。
+* **LoRA:** コンポーネント別の引数の代わりに、正規表現ベースのモジュール選択とモジュール単位のランク/アルファ/学習率制御（`network_reg_dims`、`network_reg_alphas`、`network_reg_lrs`）を使用します。モジュールの除外/包含は`exclude_patterns`と`include_patterns`で制御します。
 </details>

 ## 3. Preparation / 準備
@@ -330,20 +330,31 @@ Example to additionally exclude MLP layers:
 --network_args "exclude_patterns=['.*mlp.*']"
 ```

-### 5.2. Regex-based Rank and Learning Rate Control / 正規表現によるランク・学習率の制御
-
-You can specify different ranks (network_dim) and learning rates for modules matching specific regex patterns:
-
+### 5.2. Regex-based Rank, Alpha, and Learning Rate Control / 正規表現によるランク・アルファ・学習率の制御
+ 
+You can specify different ranks (network_dim), alphas (network_alpha), and learning rates for modules matching specific regex patterns:
+ 
 * `network_reg_dims`: Specify ranks for modules matching a regular expression. The format is a comma-separated string of `pattern=rank`.
    * Example: `--network_args "network_reg_dims=.*self_attn.*=8,.*cross_attn.*=4,.*mlp.*=8"`
    * This sets the rank to 8 for self-attention modules, 4 for cross-attention modules, and 8 for MLP modules.
+* `network_reg_alphas`: Specify alphas for modules matching a regular expression. The format is a comma-separated string of `pattern=alpha`.
+    * Example: `--network_args "network_reg_alphas=.*self_attn.*=4,.*mlp.*=8"`
+    * This sets the alpha to 4 for self-attention modules and 8 for MLP modules.
+    * Alpha controls the effective scaling of LoRA: `effective_scale = alpha / dim`. A lower alpha relative to dim reduces the LoRA's influence.
 * `network_reg_lrs`: Specify learning rates for modules matching a regular expression. The format is a comma-separated string of `pattern=lr`.
    * Example: `--network_args "network_reg_lrs=.*self_attn.*=1e-4,.*cross_attn.*=5e-5"`
    * This sets the learning rate to `1e-4` for self-attention modules and `5e-5` for cross-attention modules.
-
+ 
+**Priority order:**
+ 
+1. `network_reg_dims` sets the rank for matched modules. If `network_reg_alphas` is also specified and matches the same module, that alpha is used; otherwise the global `--network_alpha` is used.
+2. `network_reg_alphas` can override the alpha independently, even for modules not matched by `network_reg_dims`.
+3. Modules not matched by any regex pattern fall back to the global `--network_dim` and `--network_alpha`.
+4. `network_reg_lrs` overrides the learning rate independently of rank/alpha settings.
+ 
 **Notes:**
-
-* Settings via `network_reg_dims` and `network_reg_lrs` take precedence over the global `--network_dim` and `--learning_rate` settings.
+ 
+* Settings via `network_reg_dims`, `network_reg_alphas`, and `network_reg_lrs` take precedence over the global `--network_dim`, `--network_alpha`, and `--learning_rate` settings.
 * Patterns are matched using `re.fullmatch()` against the module's original name (e.g., `blocks.0.self_attn.q_proj`).

 ### 5.3. LLM Adapter LoRA / LLM Adapter LoRA
@@ -389,17 +400,27 @@ In preliminary tests, lowering the learning rate for the LLM Adapter seems to im

 パターンは`re.fullmatch()`を使用して完全なモジュール名に対してマッチングされます。

-### 5.2. 正規表現によるランク・学習率の制御
-
-正規表現にマッチするモジュールに対して、異なるランクや学習率を指定できます：
-
+### 5.2. 正規表現によるランク・アルファ・学習率の制御
+ 
+正規表現にマッチするモジュールに対して、異なるランク、アルファ、学習率を指定できます：
+ 
 * `network_reg_dims`: 正規表現にマッチするモジュールに対してランクを指定します。`pattern=rank`形式の文字列をカンマで区切って指定します。
    * 例: `--network_args "network_reg_dims=.*self_attn.*=8,.*cross_attn.*=4,.*mlp.*=8"`
+* `network_reg_alphas`: 正規表現にマッチするモジュールに対してアルファを指定します。`pattern=alpha`形式の文字列をカンマで区切って指定します。
+    * 例: `--network_args "network_reg_alphas=.*self_attn.*=4,.*mlp.*=8"`
+    * アルファはLoRAの実効的なスケーリングを制御します：`effective_scale = alpha / dim`。dimに対してアルファが低いほど、LoRAの影響が小さくなります。
 * `network_reg_lrs`: 正規表現にマッチするモジュールに対して学習率を指定します。`pattern=lr`形式の文字列をカンマで区切って指定します。
    * 例: `--network_args "network_reg_lrs=.*self_attn.*=1e-4,.*cross_attn.*=5e-5"`
-
+ 
+**優先順位:**
+ 
+1. `network_reg_dims`はマッチしたモジュールのランクを設定します。`network_reg_alphas`も指定されており同じモジュールにマッチする場合はそのアルファが使用されます。マッチしない場合はグローバルの`--network_alpha`が使用されます。
+2. `network_reg_alphas`は、`network_reg_dims`にマッチしていないモジュールに対してもアルファを独立して上書きできます。
+3. どの正規表現パターンにもマッチしないモジュールは、グローバルの`--network_dim`と`--network_alpha`にフォールバックします。
+4. `network_reg_lrs`はランク/アルファの設定とは独立して学習率を上書きします。
+ 
 **注意点:**
-* `network_reg_dims`および`network_reg_lrs`での設定は、全体設定である`--network_dim`や`--learning_rate`よりも優先されます。
+* `network_reg_dims`、`network_reg_alphas`、`network_reg_lrs`での設定は、全体設定である`--network_dim`、`--network_alpha`、`--learning_rate`よりも優先されます。
 * パターンはモジュールのオリジナル名（例: `blocks.0.self_attn.q_proj`）に対して`re.fullmatch()`でマッチングされます。

 ### 5.3. LLM Adapter LoRA
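
Review note: the resolution rules added in section 5.2 (per-pattern rank, alpha, and learning rate, with fallback to the global values, and `effective_scale = alpha / dim`) can be sketched as follows. This is a minimal illustration and not the trainer's actual code. The helper names (`parse_reg_arg`, `first_match`, `resolve`), the global values in the example, and every module name other than `blocks.0.self_attn.q_proj` are hypothetical. It also assumes that the first matching pattern wins and that patterns contain no commas, neither of which the documentation states.

```python
import re


def parse_reg_arg(arg, cast):
    """Parse 'pattern=value,pattern=value' into a list of (regex, value) pairs.

    Assumption: patterns contain no commas; the value follows the LAST '='.
    """
    pairs = []
    for item in arg.split(","):
        pattern, value = item.rsplit("=", 1)
        pairs.append((re.compile(pattern.strip()), cast(value.strip())))
    return pairs


def first_match(pairs, name, default):
    """Return the value of the first pattern that fully matches `name`.

    Assumption: first match wins. Falls back to `default` when nothing matches.
    """
    for regex, value in pairs:
        if regex.fullmatch(name):
            return value
    return default


def resolve(name, reg_dims, reg_alphas, reg_lrs,
            network_dim, network_alpha, learning_rate):
    """Resolve rank, alpha, and learning rate independently for one module."""
    dim = first_match(reg_dims, name, network_dim)
    alpha = first_match(reg_alphas, name, network_alpha)
    lr = first_match(reg_lrs, name, learning_rate)
    return {"dim": dim, "alpha": alpha, "lr": lr, "scale": alpha / dim}


# Pattern strings taken from the documentation examples.
reg_dims = parse_reg_arg(".*self_attn.*=8,.*cross_attn.*=4,.*mlp.*=8", int)
reg_alphas = parse_reg_arg(".*self_attn.*=4,.*mlp.*=8", float)
reg_lrs = parse_reg_arg(".*self_attn.*=1e-4,.*cross_attn.*=5e-5", float)

# Hypothetical global settings, chosen only for this example.
GLOBALS = dict(network_dim=16, network_alpha=8.0, learning_rate=2e-4)

# Matched by all three arguments.
self_attn = resolve("blocks.0.self_attn.q_proj",
                    reg_dims, reg_alphas, reg_lrs, **GLOBALS)

# Matched by dims and lrs only, so alpha falls back to the global value.
cross_attn = resolve("blocks.0.cross_attn.k_proj",
                     reg_dims, reg_alphas, reg_lrs, **GLOBALS)

# Matched by no pattern, so every setting falls back to the global value.
other = resolve("blocks.0.other.linear",
                reg_dims, reg_alphas, reg_lrs, **GLOBALS)

# re.fullmatch() must consume the whole name, hence the `.*` on both sides.
bare_pattern_matches = re.fullmatch("self_attn", "blocks.0.self_attn.q_proj") is not None
```

Under these assumptions, the cross-attention module ends up with rank 4 and the global alpha of 8, giving an effective scale of 2.0 where the unmatched module has 0.5. Lowering a module's rank through `network_reg_dims` without a matching `network_reg_alphas` entry therefore raises its effective scale, which may be worth keeping in mind when choosing patterns.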