Fix text encoder parameter mismatch in SD 2.0/2.1 LoRA fine-tuning.
With the original text encoder and its corresponding parser, Stable Diffusion 2.0, 2.1, and 2.1-unclip checkpoints cannot be loaded properly, for two reasons: the text encoder transformer version differs, and the key names in the checkpoints' state_dict differ.
After this fix, all three versions have been tested and work correctly for LoRA fine-tuning.
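
As a rough illustration of the state_dict key name difference, the sketch below detects which text encoder key prefix a checkpoint uses and strips it before loading. It is not the code from this fix: the two prefix strings and the helper names detect_text_encoder_prefix and extract_text_encoder_params are assumptions made for the example, and a plain dict stands in for a real state_dict.

```python
from typing import Any, Dict, Optional

# Assumed prefixes, for illustration only (not taken from this change):
# an older layout and a newer layout for the text encoder weights.
OLD_PREFIX = "cond_stage_model.transformer."
NEW_PREFIX = "cond_stage_model.model."


def detect_text_encoder_prefix(state_dict: Dict[str, Any]) -> Optional[str]:
    """Return the text encoder key prefix found in the checkpoint, if any."""
    for prefix in (NEW_PREFIX, OLD_PREFIX):
        if any(key.startswith(prefix) for key in state_dict):
            return prefix
    return None


def extract_text_encoder_params(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Select the text encoder entries and strip the detected prefix."""
    prefix = detect_text_encoder_prefix(state_dict)
    if prefix is None:
        raise KeyError("no text encoder parameters found in state_dict")
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }
```

A parser that dispatches on the detected prefix can then build the matching text encoder before loading the stripped parameters, rather than assuming a single key layout for every checkpoint.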