17 Commits

Author SHA1 Message Date
Kohya S
6731d8a57f fix: update system prompt handling 2025-06-29 22:21:48 +09:00
Kohya S
884c1f37c4 fix: update to work with cache text encoder outputs (without disk) 2025-06-29 21:58:43 +09:00
Kohya S
935e0037dc feat: update lumina system prompt handling 2025-06-29 21:33:09 +09:00
青龍聖者@bdsqlsz
dfe1ab6c50 Merge pull request #21 from rockerBOO/lumina-torch-dynamo-gemma2
fix torch compile/dynamo for Gemma2
2025-03-02 18:31:13 +08:00
青龍聖者@bdsqlsz
b6e4194ea5 Merge pull request #20 from rockerBOO/lumina-system-prompt-special-token
Lumina system prompt special token
2025-03-02 18:30:49 +08:00
青龍聖者@bdsqlsz
b5d1f1caea Merge pull request #19 from rockerBOO/lumina-block-swap
Lumina block swap
2025-03-02 18:30:37 +08:00
rockerBOO
cad182d29a fix torch compile/dynamo for Gemma2 2025-02-28 18:35:19 -05:00
rockerBOO
9647f1e324 Fix validation block swap. Add custom offloading tests 2025-02-27 20:36:36 -05:00
rockerBOO
ce2610d29b Change system prompt to inject Prompt Start special token 2025-02-27 02:47:04 -05:00
rockerBOO
70403f6977 fix cache text encoder outputs if not using disk. small cleanup/alignment 2025-02-26 23:33:50 -05:00
sdbds
5f9047c8cf add truncation when > max_length 2025-02-26 01:00:35 +08:00
rockerBOO
025cca699b Fix samples, LoRA training. Add system prompt, use_flash_attn 2025-02-23 01:29:18 -05:00
rockerBOO
98efbc3bb7 Add documentation to model, use SDPA attention, sample images 2025-02-18 00:58:53 -05:00
sdbds
aa36c48685 update for always use gemma2 mask 2025-02-17 19:00:18 +08:00
rockerBOO
60a76ebb72 Add caching gemma2, add gradient checkpointing, refactor lumina model code 2025-02-16 01:06:34 -05:00
rockerBOO
a00b06bc97 Lumina 2 and Gemma 2 model loading 2025-02-15 14:56:11 -05:00
sdbds
d154e76c45 init 2025-02-12 16:30:05 +08:00