
LLM Fine-Tuning Methods (1): LoRA [Case study: ChatGLM-LoRA] [New network layers are inserted into the original ChatGLM structure] [During fine-tuning, the original parameters are frozen and only the newly added layers are trained]

LoRA works by injecting trainable modules into the model. After pre-training converges, a large model contains many dense layers that perform matrix multiplications. Although these weight matrices are typically full-rank, the change they undergo during fine-tuning is small and, in matrix terms, has low rank. The injected trainable layers are there to learn this low-rank change for the downstream task while the rest of the model stays frozen, which greatly reduces the number of trainable parameters.

The approach resembles a matrix factorization. The trainable adapter matches the pretrained layer's dimension d: the input is first projected down from d to r through one linear layer, then mapped back from r to d through another, with r << d (r is the rank of the update). The weight update thus goes from a d x d matrix to d x r + r x d parameters, a large reduction. In the LoRA paper, matrix A is initialized with a random Gaussian and matrix B with zeros, so the update B·A is zero at the start of training.
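The decomposition above can be sketched in a few lines of NumPy. The dimensions here are toy sizes chosen for illustration (a real ChatGLM layer has d = 4096), and the variable names follow the paper's A/B convention rather than any particular codebase:

```python
import numpy as np

# Illustrative LoRA update; d and r are toy sizes, not ChatGLM's real dims.
d, r = 512, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d)).astype(np.float32)          # frozen pretrained weight
A = rng.standard_normal((r, d)).astype(np.float32) * 0.01   # Gaussian init (down-projection)
B = np.zeros((d, r), dtype=np.float32)                      # zero init, so B @ A = 0

def lora_forward(x):
    # Frozen path plus low-rank trainable path; only A and B would get gradients.
    return x @ W.T + (x @ A.T) @ B.T

x = rng.standard_normal((2, d)).astype(np.float32)
# Because B starts at zero, the adapted layer equals the frozen layer at step 0.
print(np.allclose(lora_forward(x), x @ W.T))  # True

print(d * d, d * r + r * d)  # 262144 full parameters vs 8192 LoRA parameters
```

With d = 512 and r = 8 the trainable parameter count drops from d² = 262,144 to 2·d·r = 8,192, i.e. about 3% of the original, and the ratio shrinks further as d grows.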

At inference time the pretrained weights are unchanged, so when switching downstream tasks you can simply load the corresponding saved LoRA weights. The factorization keeps the parameter count small, and the adapter path can run in parallel with (or be merged into) the base weights, so it adds almost no inference overhead, making this a good low-resource fine-tuning method.
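Merging the adapter back into the frozen weight shows why inference cost is unchanged. This is a hedged numerical sketch: `scale` plays the role of the paper's lora_alpha / r factor, and all matrices are random stand-ins:

```python
import numpy as np

d, r = 256, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d)).astype(np.float32)   # frozen base weight
A = rng.standard_normal((r, d)).astype(np.float32)
B = rng.standard_normal((d, r)).astype(np.float32)
scale = 32 / r  # lora_alpha / r scaling from the LoRA paper

# Fold the adapter into the base weight once, before serving:
W_merged = W + scale * (B @ A)

x = rng.standard_normal((3, d)).astype(np.float32)
y_separate = x @ W.T + scale * (x @ A.T) @ B.T  # adapter applied on the side
y_merged = x @ W_merged.T                       # single matmul, no extra cost
print(np.allclose(y_separate, y_merged, atol=1e-2))  # True
```

Because W + scale·B·A is just another d x d matrix, a merged model runs exactly as fast as the original; keeping the adapter separate instead makes task switching a matter of swapping small A/B checkpoints.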

A packaged implementation of the LoRA method:

GitHub - huggingface/peft: PEFT: State-of-the-art Parameter-Efficient Fine-Tuning

P-Tuning v2 and LoRA share the same low-resource recipe: freeze the large model's parameters and let a small module learn the low-rank change induced by fine-tuning. A known problem with both is catastrophic forgetting. The backbone parameters do not move during fine-tuning, while the small trainable module changes a great deal, which can introduce a large bias at inference and pull the model's original answering ability off course. Care must therefore be taken that the trainable module does not overfit the fine-tuning data, or the model loses its pretrained knowledge and forgets catastrophically.

It is best to mix general-purpose corpora into the fine-tuning data to avoid a strong bias toward the fine-tuning corpus. The InstructGPT paper notes the same issue during PPO: the model easily overfits the PPO data and degrades on general natural-language tasks, so the authors add SFT gradients and pretraining gradients into the PPO loss to mitigate this forgetting.
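The mixing idea can be sketched as follows; the function name and the 20% ratio are illustrative choices, not taken from any particular codebase:

```python
import random

def mix_corpora(finetune_data, general_data, general_ratio=0.2, seed=0):
    """Blend a fraction of general-domain samples into the fine-tuning set
    to reduce catastrophic forgetting. Purely illustrative."""
    rng = random.Random(seed)
    n_general = int(len(finetune_data) * general_ratio)
    mixed = list(finetune_data) + rng.sample(list(general_data), n_general)
    rng.shuffle(mixed)
    return mixed

# 100 fine-tuning samples plus 20% general samples -> 120 training samples.
mixed = mix_corpora(range(100), range(1000, 2000), general_ratio=0.2)
print(len(mixed))  # 120
```

The right ratio is task-dependent; the point is simply that some general data keeps the small trainable module anchored to the model's original distribution.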

```python
from transformers.integrations import TensorBoardCallback
from torch.utils.tensorboard import SummaryWriter
from transformers import TrainingArguments
from transformers import Trainer, HfArgumentParser
from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn as nn
from peft import get_peft_model, LoraConfig, TaskType
from dataclasses import dataclass, field
import datasets
import os

tokenizer = AutoTokenizer.from_pretrained("/data/pretrained_models/chatglm-6b", trust_remote_code=True)


@dataclass
class FinetuneArguments:
    dataset_path: str = field(default="data/alpaca")
    model_path: str = field(default="output")
    lora_rank: int = field(default=8)


class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)


def data_collator(features: list) -> dict:
    len_ids = [len(feature["input_ids"]) for feature in features]
    longest = max(len_ids)
    input_ids = []
    labels_list = []
    for ids_l, feature in sorted(zip(len_ids, features), key=lambda x: -x[0]):
        ids = feature["input_ids"]  # e.g. [37010, 12, 3461, 100, 294, 102, 104, 3539, 2549, ...]
        seq_len = feature["seq_len"]
        # Mask the prompt with -100 so only the response tokens contribute to the loss,
        # then pad the label sequence out to the longest example in the batch.
        labels = [-100] * (seq_len - 1) + ids[(seq_len - 1):] + [-100] * (longest - ids_l)
        ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
        _ids = torch.LongTensor(ids)
        labels_list.append(torch.LongTensor(labels))
        input_ids.append(_ids)
    input_ids = torch.stack(input_ids)  # e.g. torch.Size([6, 118])
    labels = torch.stack(labels_list)   # e.g. torch.Size([6, 118])
    return {"input_ids": input_ids, "labels": labels}


class ModifiedTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        return model(input_ids=inputs["input_ids"], labels=inputs["labels"]).loss

    def save_model(self, output_dir=None, _internal_call=False):
        from transformers.trainer import TRAINING_ARGS_NAME

        os.makedirs(output_dir, exist_ok=True)
        torch.save(self.args, os.path.join(output_dir, TRAINING_ARGS_NAME))
        # Save only the trainable (LoRA) parameters, not the full 6B model.
        saved_params = {k: v.to("cpu") for k, v in self.model.named_parameters() if v.requires_grad}
        torch.save(saved_params, os.path.join(output_dir, "adapter_model.bin"))


def main():
    writer = SummaryWriter()
    finetune_args, training_args = HfArgumentParser(
        (FinetuneArguments, TrainingArguments)
    ).parse_args_into_dataclasses()

    # init model
    model = AutoModel.from_pretrained(
        "/data/pretrained_models/chatglm-6b",
        load_in_8bit=False,
        trust_remote_code=True,
        device_map="auto",
    )
    model.gradient_checkpointing_enable()
    model.enable_input_require_grads()
    model.is_parallelizable = True
    model.model_parallel = True
    model.lm_head = CastOutputToFloat(model.lm_head)
    model.config.use_cache = False  # silence the warnings; re-enable for inference!

    print("Original model structure:\n", model)
    print("-" * 100)
    for name, param in model.named_parameters():
        print("layer name = ", name)
        print("-" * 100)

    # setup peft
    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=finetune_args.lora_rank,
        lora_alpha=32,
        lora_dropout=0.1,
    )
    model = get_peft_model(model, peft_config)
    print("=" * 200)
    print("Model structure after adding LoRA:\n", model)
    print("-" * 100)
    for name, param in model.named_parameters():
        print("layer name = ", name)
        print("-" * 100)

    # load dataset
    dataset = datasets.load_from_disk(finetune_args.dataset_path)  # 'data/alpaca'
    print(f"\n{len(dataset)=}\n")

    # start train
    trainer = ModifiedTrainer(
        model=model,
        train_dataset=dataset,
        args=training_args,
        callbacks=[TensorBoardCallback(writer)],
        data_collator=data_collator,
    )
    trainer.train()
    writer.close()

    # save model
    model.save_pretrained(training_args.output_dir)


if __name__ == "__main__":
    main()
```
```
ChatGLMForConditionalGeneration(
  (transformer): ChatGLMModel(
    (word_embeddings): Embedding(130528, 4096)
    (layers): ModuleList(
      (0-27): 28 x GLMBlock(
        (input_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
        (attention): SelfAttention(
          (rotary_emb): RotaryEmbedding()
          (query_key_value): Linear(in_features=4096, out_features=12288, bias=True)
          (dense): Linear(in_features=4096, out_features=4096, bias=True)
        )
        (post_attention_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
        (mlp): GLU(
          (dense_h_to_4h): Linear(in_features=4096, out_features=16384, bias=True)
          (dense_4h_to_h): Linear(in_features=16384, out_features=4096, bias=True)
        )
      )
    )
    (final_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): CastOutputToFloat(
    (0): Linear(in_features=4096, out_features=130528, bias=False)
  )
)
```

```
layer name =  transformer.word_embeddings.weight
layer name =  transformer.layers.0.input_layernorm.weight
layer name =  transformer.layers.0.input_layernorm.bias
layer name =  transformer.layers.0.attention.query_key_value.weight
layer name =  transformer.layers.0.attention.query_key_value.bias
layer name =  transformer.layers.0.attention.dense.weight
layer name =  transformer.layers.0.attention.dense.bias
layer name =  transformer.layers.0.post_attention_layernorm.weight
layer name =  transformer.layers.0.post_attention_layernorm.bias
layer name =  transformer.layers.0.mlp.dense_h_to_4h.weight
layer name =  transformer.layers.0.mlp.dense_h_to_4h.bias
layer name =  transformer.layers.0.mlp.dense_4h_to_h.weight
layer name =  transformer.layers.0.mlp.dense_4h_to_h.bias
... (the same parameter names repeat for transformer.layers.1 through transformer.layers.27)
```
  321. layer name = transformer.layers.13.attention.query_key_value.bias
  322. ----------------------------------------------------------------------------------------------------
  323. layer name = transformer.layers.13.attention.dense.weight
  324. ----------------------------------------------------------------------------------------------------
  325. layer name = transformer.layers.13.attention.dense.bias
  326. ----------------------------------------------------------------------------------------------------
  327. layer name = transformer.layers.13.post_attention_layernorm.weight
  328. ----------------------------------------------------------------------------------------------------
  329. layer name = transformer.layers.13.post_attention_layernorm.bias
  330. ----------------------------------------------------------------------------------------------------
  331. layer name = transformer.layers.13.mlp.dense_h_to_4h.weight
  332. ----------------------------------------------------------------------------------------------------
  333. layer name = transformer.layers.13.mlp.dense_h_to_4h.bias
  334. ----------------------------------------------------------------------------------------------------
  335. layer name = transformer.layers.13.mlp.dense_4h_to_h.weight
  336. ----------------------------------------------------------------------------------------------------
  337. layer name = transformer.layers.13.mlp.dense_4h_to_h.bias
  338. ----------------------------------------------------------------------------------------------------
  339. layer name = transformer.layers.14.input_layernorm.weight
  340. ----------------------------------------------------------------------------------------------------
  341. layer name = transformer.layers.14.input_layernorm.bias
  342. ----------------------------------------------------------------------------------------------------
  343. layer name = transformer.layers.14.attention.query_key_value.weight
  344. ----------------------------------------------------------------------------------------------------
  345. layer name = transformer.layers.14.attention.query_key_value.bias
  346. ----------------------------------------------------------------------------------------------------
  347. layer name = transformer.layers.14.attention.dense.weight
  348. ----------------------------------------------------------------------------------------------------
  349. layer name = transformer.layers.14.attention.dense.bias
  350. ----------------------------------------------------------------------------------------------------
  351. layer name = transformer.layers.14.post_attention_layernorm.weight
  352. ----------------------------------------------------------------------------------------------------
  353. layer name = transformer.layers.14.post_attention_layernorm.bias
  354. ----------------------------------------------------------------------------------------------------
  355. layer name = transformer.layers.14.mlp.dense_h_to_4h.weight
  356. ----------------------------------------------------------------------------------------------------
  357. layer name = transformer.layers.14.mlp.dense_h_to_4h.bias
  358. ----------------------------------------------------------------------------------------------------
  359. layer name = transformer.layers.14.mlp.dense_4h_to_h.weight
  360. ----------------------------------------------------------------------------------------------------
  361. layer name = transformer.layers.14.mlp.dense_4h_to_h.bias
  362. ----------------------------------------------------------------------------------------------------
  363. layer name = transformer.layers.15.input_layernorm.weight
  364. ----------------------------------------------------------------------------------------------------
  365. layer name = transformer.layers.15.input_layernorm.bias
  366. ----------------------------------------------------------------------------------------------------
  367. layer name = transformer.layers.15.attention.query_key_value.weight
  368. ----------------------------------------------------------------------------------------------------
  369. layer name = transformer.layers.15.attention.query_key_value.bias
  370. ----------------------------------------------------------------------------------------------------
  371. layer name = transformer.layers.15.attention.dense.weight
  372. ----------------------------------------------------------------------------------------------------
  373. layer name = transformer.layers.15.attention.dense.bias
  374. ----------------------------------------------------------------------------------------------------
  375. layer name = transformer.layers.15.post_attention_layernorm.weight
  376. ----------------------------------------------------------------------------------------------------
  377. layer name = transformer.layers.15.post_attention_layernorm.bias
  378. ----------------------------------------------------------------------------------------------------
  379. layer name = transformer.layers.15.mlp.dense_h_to_4h.weight
  380. ----------------------------------------------------------------------------------------------------
  381. layer name = transformer.layers.15.mlp.dense_h_to_4h.bias
  382. ----------------------------------------------------------------------------------------------------
  383. layer name = transformer.layers.15.mlp.dense_4h_to_h.weight
  384. ----------------------------------------------------------------------------------------------------
  385. layer name = transformer.layers.15.mlp.dense_4h_to_h.bias
  386. ----------------------------------------------------------------------------------------------------
  387. layer name = transformer.layers.16.input_layernorm.weight
  388. ----------------------------------------------------------------------------------------------------
  389. layer name = transformer.layers.16.input_layernorm.bias
  390. ----------------------------------------------------------------------------------------------------
  391. layer name = transformer.layers.16.attention.query_key_value.weight
  392. ----------------------------------------------------------------------------------------------------
  393. layer name = transformer.layers.16.attention.query_key_value.bias
  394. ----------------------------------------------------------------------------------------------------
  395. layer name = transformer.layers.16.attention.dense.weight
  396. ----------------------------------------------------------------------------------------------------
  397. layer name = transformer.layers.16.attention.dense.bias
  398. ----------------------------------------------------------------------------------------------------
  399. layer name = transformer.layers.16.post_attention_layernorm.weight
  400. ----------------------------------------------------------------------------------------------------
  401. layer name = transformer.layers.16.post_attention_layernorm.bias
  402. ----------------------------------------------------------------------------------------------------
  403. layer name = transformer.layers.16.mlp.dense_h_to_4h.weight
  404. ----------------------------------------------------------------------------------------------------
  405. layer name = transformer.layers.16.mlp.dense_h_to_4h.bias
  406. ----------------------------------------------------------------------------------------------------
  407. layer name = transformer.layers.16.mlp.dense_4h_to_h.weight
  408. ----------------------------------------------------------------------------------------------------
  409. layer name = transformer.layers.16.mlp.dense_4h_to_h.bias
  410. ----------------------------------------------------------------------------------------------------
  411. layer name = transformer.layers.17.input_layernorm.weight
  412. ----------------------------------------------------------------------------------------------------
  413. layer name = transformer.layers.17.input_layernorm.bias
  414. ----------------------------------------------------------------------------------------------------
  415. layer name = transformer.layers.17.attention.query_key_value.weight
  416. ----------------------------------------------------------------------------------------------------
  417. layer name = transformer.layers.17.attention.query_key_value.bias
  418. ----------------------------------------------------------------------------------------------------
  419. layer name = transformer.layers.17.attention.dense.weight
  420. ----------------------------------------------------------------------------------------------------
  421. layer name = transformer.layers.17.attention.dense.bias
  422. ----------------------------------------------------------------------------------------------------
  423. layer name = transformer.layers.17.post_attention_layernorm.weight
  424. ----------------------------------------------------------------------------------------------------
  425. layer name = transformer.layers.17.post_attention_layernorm.bias
  426. ----------------------------------------------------------------------------------------------------
  427. layer name = transformer.layers.17.mlp.dense_h_to_4h.weight
  428. ----------------------------------------------------------------------------------------------------
  429. layer name = transformer.layers.17.mlp.dense_h_to_4h.bias
  430. ----------------------------------------------------------------------------------------------------
  431. layer name = transformer.layers.17.mlp.dense_4h_to_h.weight
  432. ----------------------------------------------------------------------------------------------------
  433. layer name = transformer.layers.17.mlp.dense_4h_to_h.bias
  434. ----------------------------------------------------------------------------------------------------
  435. layer name = transformer.layers.18.input_layernorm.weight
  436. ----------------------------------------------------------------------------------------------------
  437. layer name = transformer.layers.18.input_layernorm.bias
  438. ----------------------------------------------------------------------------------------------------
  439. layer name = transformer.layers.18.attention.query_key_value.weight
  440. ----------------------------------------------------------------------------------------------------
  441. layer name = transformer.layers.18.attention.query_key_value.bias
  442. ----------------------------------------------------------------------------------------------------
  443. layer name = transformer.layers.18.attention.dense.weight
  444. ----------------------------------------------------------------------------------------------------
  445. layer name = transformer.layers.18.attention.dense.bias
  446. ----------------------------------------------------------------------------------------------------
  447. layer name = transformer.layers.18.post_attention_layernorm.weight
  448. ----------------------------------------------------------------------------------------------------
  449. layer name = transformer.layers.18.post_attention_layernorm.bias
  450. ----------------------------------------------------------------------------------------------------
  451. layer name = transformer.layers.18.mlp.dense_h_to_4h.weight
  452. ----------------------------------------------------------------------------------------------------
  453. layer name = transformer.layers.18.mlp.dense_h_to_4h.bias
  454. ----------------------------------------------------------------------------------------------------
  455. layer name = transformer.layers.18.mlp.dense_4h_to_h.weight
  456. ----------------------------------------------------------------------------------------------------
  457. layer name = transformer.layers.18.mlp.dense_4h_to_h.bias
  458. ----------------------------------------------------------------------------------------------------
  459. layer name = transformer.layers.19.input_layernorm.weight
  460. ----------------------------------------------------------------------------------------------------
  461. layer name = transformer.layers.19.input_layernorm.bias
  462. ----------------------------------------------------------------------------------------------------
  463. layer name = transformer.layers.19.attention.query_key_value.weight
  464. ----------------------------------------------------------------------------------------------------
  465. layer name = transformer.layers.19.attention.query_key_value.bias
  466. ----------------------------------------------------------------------------------------------------
  467. layer name = transformer.layers.19.attention.dense.weight
  468. ----------------------------------------------------------------------------------------------------
  469. layer name = transformer.layers.19.attention.dense.bias
  470. ----------------------------------------------------------------------------------------------------
  471. layer name = transformer.layers.19.post_attention_layernorm.weight
  472. ----------------------------------------------------------------------------------------------------
  473. layer name = transformer.layers.19.post_attention_layernorm.bias
  474. ----------------------------------------------------------------------------------------------------
  475. layer name = transformer.layers.19.mlp.dense_h_to_4h.weight
  476. ----------------------------------------------------------------------------------------------------
  477. layer name = transformer.layers.19.mlp.dense_h_to_4h.bias
  478. ----------------------------------------------------------------------------------------------------
  479. layer name = transformer.layers.19.mlp.dense_4h_to_h.weight
  480. ----------------------------------------------------------------------------------------------------
  481. layer name = transformer.layers.19.mlp.dense_4h_to_h.bias
  482. ----------------------------------------------------------------------------------------------------
  483. layer name = transformer.layers.20.input_layernorm.weight
  484. ----------------------------------------------------------------------------------------------------
  485. layer name = transformer.layers.20.input_layernorm.bias
  486. ----------------------------------------------------------------------------------------------------
  487. layer name = transformer.layers.20.attention.query_key_value.weight
  488. ----------------------------------------------------------------------------------------------------
  489. layer name = transformer.layers.20.attention.query_key_value.bias
  490. ----------------------------------------------------------------------------------------------------
  491. layer name = transformer.layers.20.attention.dense.weight
  492. ----------------------------------------------------------------------------------------------------
  493. layer name = transformer.layers.20.attention.dense.bias
  494. ----------------------------------------------------------------------------------------------------
  495. layer name = transformer.layers.20.post_attention_layernorm.weight
  496. ----------------------------------------------------------------------------------------------------
  497. layer name = transformer.layers.20.post_attention_layernorm.bias
  498. ----------------------------------------------------------------------------------------------------
  499. layer name = transformer.layers.20.mlp.dense_h_to_4h.weight
  500. ----------------------------------------------------------------------------------------------------
  501. layer name = transformer.layers.20.mlp.dense_h_to_4h.bias
  502. ----------------------------------------------------------------------------------------------------
  503. layer name = transformer.layers.20.mlp.dense_4h_to_h.weight
  504. ----------------------------------------------------------------------------------------------------
  505. layer name = transformer.layers.20.mlp.dense_4h_to_h.bias
  506. ----------------------------------------------------------------------------------------------------
  507. layer name = transformer.layers.21.input_layernorm.weight
  508. ----------------------------------------------------------------------------------------------------
  509. layer name = transformer.layers.21.input_layernorm.bias
  510. ----------------------------------------------------------------------------------------------------
  511. layer name = transformer.layers.21.attention.query_key_value.weight
  512. ----------------------------------------------------------------------------------------------------
  513. layer name = transformer.layers.21.attention.query_key_value.bias
  514. ----------------------------------------------------------------------------------------------------
  515. layer name = transformer.layers.21.attention.dense.weight
  516. ----------------------------------------------------------------------------------------------------
  517. layer name = transformer.layers.21.attention.dense.bias
  518. ----------------------------------------------------------------------------------------------------
  519. layer name = transformer.layers.21.post_attention_layernorm.weight
  520. ----------------------------------------------------------------------------------------------------
  521. layer name = transformer.layers.21.post_attention_layernorm.bias
  522. ----------------------------------------------------------------------------------------------------
  523. layer name = transformer.layers.21.mlp.dense_h_to_4h.weight
  524. ----------------------------------------------------------------------------------------------------
  525. layer name = transformer.layers.21.mlp.dense_h_to_4h.bias
  526. ----------------------------------------------------------------------------------------------------
  527. layer name = transformer.layers.21.mlp.dense_4h_to_h.weight
  528. ----------------------------------------------------------------------------------------------------
  529. layer name = transformer.layers.21.mlp.dense_4h_to_h.bias
  530. ----------------------------------------------------------------------------------------------------
  531. layer name = transformer.layers.22.input_layernorm.weight
  532. ----------------------------------------------------------------------------------------------------
  533. layer name = transformer.layers.22.input_layernorm.bias
  534. ----------------------------------------------------------------------------------------------------
  535. layer name = transformer.layers.22.attention.query_key_value.weight
  536. ----------------------------------------------------------------------------------------------------
  537. layer name = transformer.layers.22.attention.query_key_value.bias
  538. ----------------------------------------------------------------------------------------------------
  539. layer name = transformer.layers.22.attention.dense.weight
  540. ----------------------------------------------------------------------------------------------------
  541. layer name = transformer.layers.22.attention.dense.bias
  542. ----------------------------------------------------------------------------------------------------
  543. layer name = transformer.layers.22.post_attention_layernorm.weight
  544. ----------------------------------------------------------------------------------------------------
  545. layer name = transformer.layers.22.post_attention_layernorm.bias
  546. ----------------------------------------------------------------------------------------------------
  547. layer name = transformer.layers.22.mlp.dense_h_to_4h.weight
  548. ----------------------------------------------------------------------------------------------------
  549. layer name = transformer.layers.22.mlp.dense_h_to_4h.bias
  550. ----------------------------------------------------------------------------------------------------
  551. layer name = transformer.layers.22.mlp.dense_4h_to_h.weight
  552. ----------------------------------------------------------------------------------------------------
  553. layer name = transformer.layers.22.mlp.dense_4h_to_h.bias
  554. ----------------------------------------------------------------------------------------------------
  555. layer name = transformer.layers.23.input_layernorm.weight
  556. ----------------------------------------------------------------------------------------------------
  557. layer name = transformer.layers.23.input_layernorm.bias
  558. ----------------------------------------------------------------------------------------------------
  559. layer name = transformer.layers.23.attention.query_key_value.weight
  560. ----------------------------------------------------------------------------------------------------
  561. layer name = transformer.layers.23.attention.query_key_value.bias
  562. ----------------------------------------------------------------------------------------------------
  563. layer name = transformer.layers.23.attention.dense.weight
  564. ----------------------------------------------------------------------------------------------------
  565. layer name = transformer.layers.23.attention.dense.bias
  566. ----------------------------------------------------------------------------------------------------
  567. layer name = transformer.layers.23.post_attention_layernorm.weight
  568. ----------------------------------------------------------------------------------------------------
  569. layer name = transformer.layers.23.post_attention_layernorm.bias
  570. ----------------------------------------------------------------------------------------------------
  571. layer name = transformer.layers.23.mlp.dense_h_to_4h.weight
  572. ----------------------------------------------------------------------------------------------------
  573. layer name = transformer.layers.23.mlp.dense_h_to_4h.bias
  574. ----------------------------------------------------------------------------------------------------
  575. layer name = transformer.layers.23.mlp.dense_4h_to_h.weight
  576. ----------------------------------------------------------------------------------------------------
  577. layer name = transformer.layers.23.mlp.dense_4h_to_h.bias
  578. ----------------------------------------------------------------------------------------------------
  579. layer name = transformer.layers.24.input_layernorm.weight
  580. ----------------------------------------------------------------------------------------------------
  581. layer name = transformer.layers.24.input_layernorm.bias
  582. ----------------------------------------------------------------------------------------------------
  583. layer name = transformer.layers.24.attention.query_key_value.weight
  584. ----------------------------------------------------------------------------------------------------
  585. layer name = transformer.layers.24.attention.query_key_value.bias
  586. ----------------------------------------------------------------------------------------------------
  587. layer name = transformer.layers.24.attention.dense.weight
  588. ----------------------------------------------------------------------------------------------------
  589. layer name = transformer.layers.24.attention.dense.bias
  590. ----------------------------------------------------------------------------------------------------
  591. layer name = transformer.layers.24.post_attention_layernorm.weight
  592. ----------------------------------------------------------------------------------------------------
  593. layer name = transformer.layers.24.post_attention_layernorm.bias
  594. ----------------------------------------------------------------------------------------------------
  595. layer name = transformer.layers.24.mlp.dense_h_to_4h.weight
  596. ----------------------------------------------------------------------------------------------------
  597. layer name = transformer.layers.24.mlp.dense_h_to_4h.bias
  598. ----------------------------------------------------------------------------------------------------
  599. layer name = transformer.layers.24.mlp.dense_4h_to_h.weight
  600. ----------------------------------------------------------------------------------------------------
  601. layer name = transformer.layers.24.mlp.dense_4h_to_h.bias
  602. ----------------------------------------------------------------------------------------------------
  603. layer name = transformer.layers.25.input_layernorm.weight
  604. ----------------------------------------------------------------------------------------------------
  605. layer name = transformer.layers.25.input_layernorm.bias
  606. ----------------------------------------------------------------------------------------------------
  607. layer name = transformer.layers.25.attention.query_key_value.weight
  608. ----------------------------------------------------------------------------------------------------
  609. layer name = transformer.layers.25.attention.query_key_value.bias
  610. ----------------------------------------------------------------------------------------------------
  611. layer name = transformer.layers.25.attention.dense.weight
  612. ----------------------------------------------------------------------------------------------------
  613. layer name = transformer.layers.25.attention.dense.bias
  614. ----------------------------------------------------------------------------------------------------
  615. layer name = transformer.layers.25.post_attention_layernorm.weight
  616. ----------------------------------------------------------------------------------------------------
  617. layer name = transformer.layers.25.post_attention_layernorm.bias
  618. ----------------------------------------------------------------------------------------------------
  619. layer name = transformer.layers.25.mlp.dense_h_to_4h.weight
  620. ----------------------------------------------------------------------------------------------------
  621. layer name = transformer.layers.25.mlp.dense_h_to_4h.bias
  622. ----------------------------------------------------------------------------------------------------
  623. layer name = transformer.layers.25.mlp.dense_4h_to_h.weight
  624. ----------------------------------------------------------------------------------------------------
  625. layer name = transformer.layers.25.mlp.dense_4h_to_h.bias
  626. ----------------------------------------------------------------------------------------------------
  627. layer name = transformer.layers.26.input_layernorm.weight
  628. ----------------------------------------------------------------------------------------------------
  629. layer name = transformer.layers.26.input_layernorm.bias
  630. ----------------------------------------------------------------------------------------------------
  631. layer name = transformer.layers.26.attention.query_key_value.weight
  632. ----------------------------------------------------------------------------------------------------
  633. layer name = transformer.layers.26.attention.query_key_value.bias
  634. ----------------------------------------------------------------------------------------------------
  635. layer name = transformer.layers.26.attention.dense.weight
  636. ----------------------------------------------------------------------------------------------------
  637. layer name = transformer.layers.26.attention.dense.bias
  638. ----------------------------------------------------------------------------------------------------
  639. layer name = transformer.layers.26.post_attention_layernorm.weight
  640. ----------------------------------------------------------------------------------------------------
  641. layer name = transformer.layers.26.post_attention_layernorm.bias
  642. ----------------------------------------------------------------------------------------------------
  643. layer name = transformer.layers.26.mlp.dense_h_to_4h.weight
  644. ----------------------------------------------------------------------------------------------------
  645. layer name = transformer.layers.26.mlp.dense_h_to_4h.bias
  646. ----------------------------------------------------------------------------------------------------
  647. layer name = transformer.layers.26.mlp.dense_4h_to_h.weight
  648. ----------------------------------------------------------------------------------------------------
  649. layer name = transformer.layers.26.mlp.dense_4h_to_h.bias
  650. ----------------------------------------------------------------------------------------------------
  651. layer name = transformer.layers.27.input_layernorm.weight
  652. ----------------------------------------------------------------------------------------------------
  653. layer name = transformer.layers.27.input_layernorm.bias
  654. ----------------------------------------------------------------------------------------------------
  655. layer name = transformer.layers.27.attention.query_key_value.weight
  656. ----------------------------------------------------------------------------------------------------
  657. layer name = transformer.layers.27.attention.query_key_value.bias
  658. ----------------------------------------------------------------------------------------------------
  659. layer name = transformer.layers.27.attention.dense.weight
  660. ----------------------------------------------------------------------------------------------------
  661. layer name = transformer.layers.27.attention.dense.bias
  662. ----------------------------------------------------------------------------------------------------
  663. layer name = transformer.layers.27.post_attention_layernorm.weight
  664. ----------------------------------------------------------------------------------------------------
  665. layer name = transformer.layers.27.post_attention_layernorm.bias
  666. ----------------------------------------------------------------------------------------------------
  667. layer name = transformer.layers.27.mlp.dense_h_to_4h.weight
  668. ----------------------------------------------------------------------------------------------------
  669. layer name = transformer.layers.27.mlp.dense_h_to_4h.bias
  670. ----------------------------------------------------------------------------------------------------
  671. layer name = transformer.layers.27.mlp.dense_4h_to_h.weight
  672. ----------------------------------------------------------------------------------------------------
  673. layer name = transformer.layers.27.mlp.dense_4h_to_h.bias
  674. ----------------------------------------------------------------------------------------------------
  675. layer name = transformer.final_layernorm.weight
  676. ----------------------------------------------------------------------------------------------------
  677. layer name = transformer.final_layernorm.bias
  678. ----------------------------------------------------------------------------------------------------

After inserting the LoRA layers:

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): ChatGLMForConditionalGeneration(
      (transformer): ChatGLMModel(
        (word_embeddings): Embedding(130528, 4096)
        (layers): ModuleList(
          (0-27): 28 x GLMBlock(
            (input_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
            (attention): SelfAttention(
              (rotary_emb): RotaryEmbedding()
              (query_key_value): Linear(
                in_features=4096, out_features=12288, bias=True
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=12288, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (dense): Linear(in_features=4096, out_features=4096, bias=True)
            )
            (post_attention_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
            (mlp): GLU(
              (dense_h_to_4h): Linear(in_features=4096, out_features=16384, bias=True)
              (dense_4h_to_h): Linear(in_features=16384, out_features=4096, bias=True)
            )
          )
        )
        (final_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
      )
      (lm_head): CastOutputToFloat(
        (0): Linear(in_features=4096, out_features=130528, bias=False)
      )
    )
  )
)
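In this structure only the injected `lora_A`/`lora_B` weights are trainable; `get_peft_model` freezes everything else. The selection logic amounts to filtering parameter names for the `lora_` substring, which can be sketched on a few names taken from layer 0 of the listing (no model loading required):

```python
# Sketch: with PEFT, only parameters whose names contain "lora_" keep
# requires_grad=True; the base model weights are frozen. Filtering the
# printed names reproduces the trainable set for one layer.

layer_names = [
    "base_model.model.transformer.layers.0.input_layernorm.weight",
    "base_model.model.transformer.layers.0.attention.query_key_value.weight",
    "base_model.model.transformer.layers.0.attention.query_key_value.lora_A.default.weight",
    "base_model.model.transformer.layers.0.attention.query_key_value.lora_B.default.weight",
    "base_model.model.transformer.layers.0.mlp.dense_4h_to_h.weight",
]

trainable = [n for n in layer_names if "lora_" in n]
print(trainable)
# only the lora_A / lora_B weights of query_key_value remain
```

This matches what the per-parameter listing shows: each layer gains exactly two new entries, `...query_key_value.lora_A.default.weight` and `...query_key_value.lora_B.default.weight`, and these are the only weights the optimizer updates.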
  1. layer name = base_model.model.transformer.word_embeddings.weight
  2. ----------------------------------------------------------------------------------------------------
  3. layer name = base_model.model.transformer.layers.0.input_layernorm.weight
  4. ----------------------------------------------------------------------------------------------------
  5. layer name = base_model.model.transformer.layers.0.input_layernorm.bias
  6. ----------------------------------------------------------------------------------------------------
  7. layer name = base_model.model.transformer.layers.0.attention.query_key_value.weight
  8. ----------------------------------------------------------------------------------------------------
  9. layer name = base_model.model.transformer.layers.0.attention.query_key_value.bias
  10. ----------------------------------------------------------------------------------------------------
  11. layer name = base_model.model.transformer.layers.0.attention.query_key_value.lora_A.default.weight
  12. ----------------------------------------------------------------------------------------------------
  13. layer name = base_model.model.transformer.layers.0.attention.query_key_value.lora_B.default.weight
  14. ----------------------------------------------------------------------------------------------------
  15. layer name = base_model.model.transformer.layers.0.attention.dense.weight
  16. ----------------------------------------------------------------------------------------------------
  17. layer name = base_model.model.transformer.layers.0.attention.dense.bias
  18. ----------------------------------------------------------------------------------------------------
  19. layer name = base_model.model.transformer.layers.0.post_attention_layernorm.weight
  20. ----------------------------------------------------------------------------------------------------
  21. layer name = base_model.model.transformer.layers.0.post_attention_layernorm.bias
  22. ----------------------------------------------------------------------------------------------------
  23. layer name = base_model.model.transformer.layers.0.mlp.dense_h_to_4h.weight
  24. ----------------------------------------------------------------------------------------------------
  25. layer name = base_model.model.transformer.layers.0.mlp.dense_h_to_4h.bias
  26. ----------------------------------------------------------------------------------------------------
  27. layer name = base_model.model.transformer.layers.0.mlp.dense_4h_to_h.weight
  28. ----------------------------------------------------------------------------------------------------
  29. layer name = base_model.model.transformer.layers.0.mlp.dense_4h_to_h.bias
  30. ----------------------------------------------------------------------------------------------------
  31. layer name = base_model.model.transformer.layers.1.input_layernorm.weight
  32. ----------------------------------------------------------------------------------------------------
  33. layer name = base_model.model.transformer.layers.1.input_layernorm.bias
  34. ----------------------------------------------------------------------------------------------------
  35. layer name = base_model.model.transformer.layers.1.attention.query_key_value.weight
  36. ----------------------------------------------------------------------------------------------------
  37. layer name = base_model.model.transformer.layers.1.attention.query_key_value.bias
  38. ----------------------------------------------------------------------------------------------------
  39. layer name = base_model.model.transformer.layers.1.attention.query_key_value.lora_A.default.weight
  40. ----------------------------------------------------------------------------------------------------
  41. layer name = base_model.model.transformer.layers.1.attention.query_key_value.lora_B.default.weight
  42. ----------------------------------------------------------------------------------------------------
  43. layer name = base_model.model.transformer.layers.1.attention.dense.weight
  44. ----------------------------------------------------------------------------------------------------
  45. layer name = base_model.model.transformer.layers.1.attention.dense.bias
  46. ----------------------------------------------------------------------------------------------------
  47. layer name = base_model.model.transformer.layers.1.post_attention_layernorm.weight
  48. ----------------------------------------------------------------------------------------------------
  49. layer name = base_model.model.transformer.layers.1.post_attention_layernorm.bias
  50. ----------------------------------------------------------------------------------------------------
  51. layer name = base_model.model.transformer.layers.1.mlp.dense_h_to_4h.weight
  52. ----------------------------------------------------------------------------------------------------
  53. layer name = base_model.model.transformer.layers.1.mlp.dense_h_to_4h.bias
  54. ----------------------------------------------------------------------------------------------------
  55. layer name = base_model.model.transformer.layers.1.mlp.dense_4h_to_h.weight
  56. ----------------------------------------------------------------------------------------------------
  57. layer name = base_model.model.transformer.layers.1.mlp.dense_4h_to_h.bias
  58. ----------------------------------------------------------------------------------------------------
  59. layer name = base_model.model.transformer.layers.2.input_layernorm.weight
  60. ----------------------------------------------------------------------------------------------------
  61. layer name = base_model.model.transformer.layers.2.input_layernorm.bias
  62. ----------------------------------------------------------------------------------------------------
  63. layer name = base_model.model.transformer.layers.2.attention.query_key_value.weight
  64. ----------------------------------------------------------------------------------------------------
  65. layer name = base_model.model.transformer.layers.2.attention.query_key_value.bias
  66. ----------------------------------------------------------------------------------------------------
  67. layer name = base_model.model.transformer.layers.2.attention.query_key_value.lora_A.default.weight
  68. ----------------------------------------------------------------------------------------------------
  69. layer name = base_model.model.transformer.layers.2.attention.query_key_value.lora_B.default.weight
  70. ----------------------------------------------------------------------------------------------------
  71. layer name = base_model.model.transformer.layers.2.attention.dense.weight
  72. ----------------------------------------------------------------------------------------------------
  73. layer name = base_model.model.transformer.layers.2.attention.dense.bias
  74. ----------------------------------------------------------------------------------------------------
  75. layer name = base_model.model.transformer.layers.2.post_attention_layernorm.weight
  76. ----------------------------------------------------------------------------------------------------
  77. layer name = base_model.model.transformer.layers.2.post_attention_layernorm.bias
  78. ----------------------------------------------------------------------------------------------------
  79. layer name = base_model.model.transformer.layers.2.mlp.dense_h_to_4h.weight
  80. ----------------------------------------------------------------------------------------------------
  81. layer name = base_model.model.transformer.layers.2.mlp.dense_h_to_4h.bias
  82. ----------------------------------------------------------------------------------------------------
  83. layer name = base_model.model.transformer.layers.2.mlp.dense_4h_to_h.weight
  84. ----------------------------------------------------------------------------------------------------
  85. layer name = base_model.model.transformer.layers.2.mlp.dense_4h_to_h.bias
  86. ----------------------------------------------------------------------------------------------------
  87. layer name = base_model.model.transformer.layers.3.input_layernorm.weight
  88. ----------------------------------------------------------------------------------------------------
  89. layer name = base_model.model.transformer.layers.3.input_layernorm.bias
  90. ----------------------------------------------------------------------------------------------------
  91. layer name = base_model.model.transformer.layers.3.attention.query_key_value.weight
  92. ----------------------------------------------------------------------------------------------------
  93. layer name = base_model.model.transformer.layers.3.attention.query_key_value.bias
  94. ----------------------------------------------------------------------------------------------------
  95. layer name = base_model.model.transformer.layers.3.attention.query_key_value.lora_A.default.weight
  96. ----------------------------------------------------------------------------------------------------
  97. layer name = base_model.model.transformer.layers.3.attention.query_key_value.lora_B.default.weight
  98. ----------------------------------------------------------------------------------------------------
  99. layer name = base_model.model.transformer.layers.3.attention.dense.weight
  100. ----------------------------------------------------------------------------------------------------
  101. layer name = base_model.model.transformer.layers.3.attention.dense.bias
  102. ----------------------------------------------------------------------------------------------------
  103. layer name = base_model.model.transformer.layers.3.post_attention_layernorm.weight
  104. ----------------------------------------------------------------------------------------------------
  105. layer name = base_model.model.transformer.layers.3.post_attention_layernorm.bias
  106. ----------------------------------------------------------------------------------------------------
  107. layer name = base_model.model.transformer.layers.3.mlp.dense_h_to_4h.weight
  108. ----------------------------------------------------------------------------------------------------
  109. layer name = base_model.model.transformer.layers.3.mlp.dense_h_to_4h.bias
  110. ----------------------------------------------------------------------------------------------------
  111. layer name = base_model.model.transformer.layers.3.mlp.dense_4h_to_h.weight
  112. ----------------------------------------------------------------------------------------------------
  113. layer name = base_model.model.transformer.layers.3.mlp.dense_4h_to_h.bias
  114. ----------------------------------------------------------------------------------------------------
  115. layer name = base_model.model.transformer.layers.4.input_layernorm.weight
  116. ----------------------------------------------------------------------------------------------------
  117. layer name = base_model.model.transformer.layers.4.input_layernorm.bias
  118. ----------------------------------------------------------------------------------------------------
  119. layer name = base_model.model.transformer.layers.4.attention.query_key_value.weight
  120. ----------------------------------------------------------------------------------------------------
  121. layer name = base_model.model.transformer.layers.4.attention.query_key_value.bias
  122. ----------------------------------------------------------------------------------------------------
  123. layer name = base_model.model.transformer.layers.4.attention.query_key_value.lora_A.default.weight
  124. ----------------------------------------------------------------------------------------------------
  125. layer name = base_model.model.transformer.layers.4.attention.query_key_value.lora_B.default.weight
  126. ----------------------------------------------------------------------------------------------------
  127. layer name = base_model.model.transformer.layers.4.attention.dense.weight
  128. ----------------------------------------------------------------------------------------------------
  129. layer name = base_model.model.transformer.layers.4.attention.dense.bias
  130. ----------------------------------------------------------------------------------------------------
  131. layer name = base_model.model.transformer.layers.4.post_attention_layernorm.weight
  132. ----------------------------------------------------------------------------------------------------
  133. layer name = base_model.model.transformer.layers.4.post_attention_layernorm.bias
  134. ----------------------------------------------------------------------------------------------------
  135. layer name = base_model.model.transformer.layers.4.mlp.dense_h_to_4h.weight
  136. ----------------------------------------------------------------------------------------------------
  137. layer name = base_model.model.transformer.layers.4.mlp.dense_h_to_4h.bias
  138. ----------------------------------------------------------------------------------------------------
  139. layer name = base_model.model.transformer.layers.4.mlp.dense_4h_to_h.weight
  140. ----------------------------------------------------------------------------------------------------
  141. layer name = base_model.model.transformer.layers.4.mlp.dense_4h_to_h.bias
  142. ----------------------------------------------------------------------------------------------------
  143. layer name = base_model.model.transformer.layers.5.input_layernorm.weight
  144. ----------------------------------------------------------------------------------------------------
  145. layer name = base_model.model.transformer.layers.5.input_layernorm.bias
  146. ----------------------------------------------------------------------------------------------------
  147. layer name = base_model.model.transformer.layers.5.attention.query_key_value.weight
  148. ----------------------------------------------------------------------------------------------------
  149. layer name = base_model.model.transformer.layers.5.attention.query_key_value.bias
  150. ----------------------------------------------------------------------------------------------------
  151. layer name = base_model.model.transformer.layers.5.attention.query_key_value.lora_A.default.weight
  152. ----------------------------------------------------------------------------------------------------
  153. layer name = base_model.model.transformer.layers.5.attention.query_key_value.lora_B.default.weight
  154. ----------------------------------------------------------------------------------------------------
  155. layer name = base_model.model.transformer.layers.5.attention.dense.weight
  156. ----------------------------------------------------------------------------------------------------
  157. layer name = base_model.model.transformer.layers.5.attention.dense.bias
  158. ----------------------------------------------------------------------------------------------------
  159. layer name = base_model.model.transformer.layers.5.post_attention_layernorm.weight
  160. ----------------------------------------------------------------------------------------------------
  161. layer name = base_model.model.transformer.layers.5.post_attention_layernorm.bias
  162. ----------------------------------------------------------------------------------------------------
  163. layer name = base_model.model.transformer.layers.5.mlp.dense_h_to_4h.weight
  164. ----------------------------------------------------------------------------------------------------
  165. layer name = base_model.model.transformer.layers.5.mlp.dense_h_to_4h.bias
  166. ----------------------------------------------------------------------------------------------------
  167. layer name = base_model.model.transformer.layers.5.mlp.dense_4h_to_h.weight
  168. ----------------------------------------------------------------------------------------------------
  169. layer name = base_model.model.transformer.layers.5.mlp.dense_4h_to_h.bias
  170. ----------------------------------------------------------------------------------------------------
  171. layer name = base_model.model.transformer.layers.6.input_layernorm.weight
  172. ----------------------------------------------------------------------------------------------------
  173. layer name = base_model.model.transformer.layers.6.input_layernorm.bias
  174. ----------------------------------------------------------------------------------------------------
  175. layer name = base_model.model.transformer.layers.6.attention.query_key_value.weight
  176. ----------------------------------------------------------------------------------------------------
  177. layer name = base_model.model.transformer.layers.6.attention.query_key_value.bias
  178. ----------------------------------------------------------------------------------------------------
  179. layer name = base_model.model.transformer.layers.6.attention.query_key_value.lora_A.default.weight
  180. ----------------------------------------------------------------------------------------------------
  181. layer name = base_model.model.transformer.layers.6.attention.query_key_value.lora_B.default.weight
  182. ----------------------------------------------------------------------------------------------------
  183. layer name = base_model.model.transformer.layers.6.attention.dense.weight
  184. ----------------------------------------------------------------------------------------------------
  185. layer name = base_model.model.transformer.layers.6.attention.dense.bias
  186. ----------------------------------------------------------------------------------------------------
  187. layer name = base_model.model.transformer.layers.6.post_attention_layernorm.weight
  188. ----------------------------------------------------------------------------------------------------
  189. layer name = base_model.model.transformer.layers.6.post_attention_layernorm.bias
  190. ----------------------------------------------------------------------------------------------------
  191. layer name = base_model.model.transformer.layers.6.mlp.dense_h_to_4h.weight
  192. ----------------------------------------------------------------------------------------------------
  193. layer name = base_model.model.transformer.layers.6.mlp.dense_h_to_4h.bias
  194. ----------------------------------------------------------------------------------------------------
  195. layer name = base_model.model.transformer.layers.6.mlp.dense_4h_to_h.weight
  196. ----------------------------------------------------------------------------------------------------
  197. layer name = base_model.model.transformer.layers.6.mlp.dense_4h_to_h.bias
  198. ----------------------------------------------------------------------------------------------------
  199. layer name = base_model.model.transformer.layers.7.input_layernorm.weight
  200. ----------------------------------------------------------------------------------------------------
  201. layer name = base_model.model.transformer.layers.7.input_layernorm.bias
  202. ----------------------------------------------------------------------------------------------------
  203. layer name = base_model.model.transformer.layers.7.attention.query_key_value.weight
  204. ----------------------------------------------------------------------------------------------------
  205. layer name = base_model.model.transformer.layers.7.attention.query_key_value.bias
  206. ----------------------------------------------------------------------------------------------------
  207. layer name = base_model.model.transformer.layers.7.attention.query_key_value.lora_A.default.weight
  208. ----------------------------------------------------------------------------------------------------
  209. layer name = base_model.model.transformer.layers.7.attention.query_key_value.lora_B.default.weight
  210. ----------------------------------------------------------------------------------------------------
  211. layer name = base_model.model.transformer.layers.7.attention.dense.weight
  212. ----------------------------------------------------------------------------------------------------
  213. layer name = base_model.model.transformer.layers.7.attention.dense.bias
  214. ----------------------------------------------------------------------------------------------------
  215. layer name = base_model.model.transformer.layers.7.post_attention_layernorm.weight
  216. ----------------------------------------------------------------------------------------------------
  217. layer name = base_model.model.transformer.layers.7.post_attention_layernorm.bias
  218. ----------------------------------------------------------------------------------------------------
  219. layer name = base_model.model.transformer.layers.7.mlp.dense_h_to_4h.weight
  220. ----------------------------------------------------------------------------------------------------
  221. layer name = base_model.model.transformer.layers.7.mlp.dense_h_to_4h.bias
  222. ----------------------------------------------------------------------------------------------------
  223. layer name = base_model.model.transformer.layers.7.mlp.dense_4h_to_h.weight
  224. ----------------------------------------------------------------------------------------------------
  225. layer name = base_model.model.transformer.layers.7.mlp.dense_4h_to_h.bias
  226. ----------------------------------------------------------------------------------------------------
  227. layer name = base_model.model.transformer.layers.8.input_layernorm.weight
  228. ----------------------------------------------------------------------------------------------------
  229. layer name = base_model.model.transformer.layers.8.input_layernorm.bias
  230. ----------------------------------------------------------------------------------------------------
  231. layer name = base_model.model.transformer.layers.8.attention.query_key_value.weight
  232. ----------------------------------------------------------------------------------------------------
  233. layer name = base_model.model.transformer.layers.8.attention.query_key_value.bias
  234. ----------------------------------------------------------------------------------------------------
  235. layer name = base_model.model.transformer.layers.8.attention.query_key_value.lora_A.default.weight
  236. ----------------------------------------------------------------------------------------------------
  237. layer name = base_model.model.transformer.layers.8.attention.query_key_value.lora_B.default.weight
  238. ----------------------------------------------------------------------------------------------------
  239. layer name = base_model.model.transformer.layers.8.attention.dense.weight
  240. ----------------------------------------------------------------------------------------------------
  241. layer name = base_model.model.transformer.layers.8.attention.dense.bias
  242. ----------------------------------------------------------------------------------------------------
  243. layer name = base_model.model.transformer.layers.8.post_attention_layernorm.weight
  244. ----------------------------------------------------------------------------------------------------
  245. layer name = base_model.model.transformer.layers.8.post_attention_layernorm.bias
  246. ----------------------------------------------------------------------------------------------------
  247. layer name = base_model.model.transformer.layers.8.mlp.dense_h_to_4h.weight
  248. ----------------------------------------------------------------------------------------------------
  249. layer name = base_model.model.transformer.layers.8.mlp.dense_h_to_4h.bias
  250. ----------------------------------------------------------------------------------------------------
  251. layer name = base_model.model.transformer.layers.8.mlp.dense_4h_to_h.weight
  252. ----------------------------------------------------------------------------------------------------
  253. layer name = base_model.model.transformer.layers.8.mlp.dense_4h_to_h.bias
  254. ----------------------------------------------------------------------------------------------------
  255. layer name = base_model.model.transformer.layers.9.input_layernorm.weight
  256. ----------------------------------------------------------------------------------------------------
  257. layer name = base_model.model.transformer.layers.9.input_layernorm.bias
  258. ----------------------------------------------------------------------------------------------------
  259. layer name = base_model.model.transformer.layers.9.attention.query_key_value.weight
  260. ----------------------------------------------------------------------------------------------------
  261. layer name = base_model.model.transformer.layers.9.attention.query_key_value.bias
  262. ----------------------------------------------------------------------------------------------------
  263. layer name = base_model.model.transformer.layers.9.attention.query_key_value.lora_A.default.weight
  264. ----------------------------------------------------------------------------------------------------
  265. layer name = base_model.model.transformer.layers.9.attention.query_key_value.lora_B.default.weight
  266. ----------------------------------------------------------------------------------------------------
  267. layer name = base_model.model.transformer.layers.9.attention.dense.weight
  268. ----------------------------------------------------------------------------------------------------
  269. layer name = base_model.model.transformer.layers.9.attention.dense.bias
  270. ----------------------------------------------------------------------------------------------------
  271. layer name = base_model.model.transformer.layers.9.post_attention_layernorm.weight
  272. ----------------------------------------------------------------------------------------------------
  273. layer name = base_model.model.transformer.layers.9.post_attention_layernorm.bias
  274. ----------------------------------------------------------------------------------------------------
  275. layer name = base_model.model.transformer.layers.9.mlp.dense_h_to_4h.weight
  276. ----------------------------------------------------------------------------------------------------
  277. layer name = base_model.model.transformer.layers.9.mlp.dense_h_to_4h.bias
  278. ----------------------------------------------------------------------------------------------------
  279. layer name = base_model.model.transformer.layers.9.mlp.dense_4h_to_h.weight
  280. ----------------------------------------------------------------------------------------------------
  281. layer name = base_model.model.transformer.layers.9.mlp.dense_4h_to_h.bias
  282. ----------------------------------------------------------------------------------------------------
  283. layer name = base_model.model.transformer.layers.10.input_layernorm.weight
  284. ----------------------------------------------------------------------------------------------------
  285. layer name = base_model.model.transformer.layers.10.input_layernorm.bias
  286. ----------------------------------------------------------------------------------------------------
  287. layer name = base_model.model.transformer.layers.10.attention.query_key_value.weight
  288. ----------------------------------------------------------------------------------------------------
  289. layer name = base_model.model.transformer.layers.10.attention.query_key_value.bias
  290. ----------------------------------------------------------------------------------------------------
  291. layer name = base_model.model.transformer.layers.10.attention.query_key_value.lora_A.default.weight
  292. ----------------------------------------------------------------------------------------------------
  293. layer name = base_model.model.transformer.layers.10.attention.query_key_value.lora_B.default.weight
  294. ----------------------------------------------------------------------------------------------------
  295. layer name = base_model.model.transformer.layers.10.attention.dense.weight
  296. ----------------------------------------------------------------------------------------------------
  297. layer name = base_model.model.transformer.layers.10.attention.dense.bias
  298. ----------------------------------------------------------------------------------------------------
  299. layer name = base_model.model.transformer.layers.10.post_attention_layernorm.weight
  300. ----------------------------------------------------------------------------------------------------
  301. layer name = base_model.model.transformer.layers.10.post_attention_layernorm.bias
  302. ----------------------------------------------------------------------------------------------------
  303. layer name = base_model.model.transformer.layers.10.mlp.dense_h_to_4h.weight
  304. ----------------------------------------------------------------------------------------------------
  305. layer name = base_model.model.transformer.layers.10.mlp.dense_h_to_4h.bias
  306. ----------------------------------------------------------------------------------------------------
  307. layer name = base_model.model.transformer.layers.10.mlp.dense_4h_to_h.weight
  308. ----------------------------------------------------------------------------------------------------
  309. layer name = base_model.model.transformer.layers.10.mlp.dense_4h_to_h.bias
  310. ----------------------------------------------------------------------------------------------------
  ...... (layers 11–22 repeat the identical pattern shown for layer 10 above: in every transformer layer, the trainable LoRA modules lora_A.default.weight and lora_B.default.weight appear only on attention.query_key_value, while all other weights/biases are the frozen pretrained parameters) ......
  647. layer name = base_model.model.transformer.layers.23.input_layernorm.weight
  648. ----------------------------------------------------------------------------------------------------
  649. layer name = base_model.model.transformer.layers.23.input_layernorm.bias
  650. ----------------------------------------------------------------------------------------------------
  651. layer name = base_model.model.transformer.layers.23.attention.query_key_value.weight
  652. ----------------------------------------------------------------------------------------------------
  653. layer name = base_model.model.transformer.layers.23.attention.query_key_value.bias
  654. ----------------------------------------------------------------------------------------------------
  655. layer name = base_model.model.transformer.layers.23.attention.query_key_value.lora_A.default.weight
  656. ----------------------------------------------------------------------------------------------------
  657. layer name = base_model.model.transformer.layers.23.attention.query_key_value.lora_B.default.weight
  658. ----------------------------------------------------------------------------------------------------
  659. layer name = base_model.model.transformer.layers.23.attention.dense.weight
  660. ----------------------------------------------------------------------------------------------------
  661. layer name = base_model.model.transformer.layers.23.attention.dense.bias
  662. ----------------------------------------------------------------------------------------------------
  663. layer name = base_model.model.transformer.layers.23.post_attention_layernorm.weight
  664. ----------------------------------------------------------------------------------------------------
  665. layer name = base_model.model.transformer.layers.23.post_attention_layernorm.bias
  666. ----------------------------------------------------------------------------------------------------
  667. layer name = base_model.model.transformer.layers.23.mlp.dense_h_to_4h.weight
  668. ----------------------------------------------------------------------------------------------------
  669. layer name = base_model.model.transformer.layers.23.mlp.dense_h_to_4h.bias
  670. ----------------------------------------------------------------------------------------------------
  671. layer name = base_model.model.transformer.layers.23.mlp.dense_4h_to_h.weight
  672. ----------------------------------------------------------------------------------------------------
  673. layer name = base_model.model.transformer.layers.23.mlp.dense_4h_to_h.bias
  674. ----------------------------------------------------------------------------------------------------
  675. layer name = base_model.model.transformer.layers.24.input_layernorm.weight
  676. ----------------------------------------------------------------------------------------------------
  677. layer name = base_model.model.transformer.layers.24.input_layernorm.bias
  678. ----------------------------------------------------------------------------------------------------
  679. layer name = base_model.model.transformer.layers.24.attention.query_key_value.weight
  680. ----------------------------------------------------------------------------------------------------
  681. layer name = base_model.model.transformer.layers.24.attention.query_key_value.bias
  682. ----------------------------------------------------------------------------------------------------
  683. layer name = base_model.model.transformer.layers.24.attention.query_key_value.lora_A.default.weight
  684. ----------------------------------------------------------------------------------------------------
  685. layer name = base_model.model.transformer.layers.24.attention.query_key_value.lora_B.default.weight
  686. ----------------------------------------------------------------------------------------------------
  687. layer name = base_model.model.transformer.layers.24.attention.dense.weight
  688. ----------------------------------------------------------------------------------------------------
  689. layer name = base_model.model.transformer.layers.24.attention.dense.bias
  690. ----------------------------------------------------------------------------------------------------
  691. layer name = base_model.model.transformer.layers.24.post_attention_layernorm.weight
  692. ----------------------------------------------------------------------------------------------------
  693. layer name = base_model.model.transformer.layers.24.post_attention_layernorm.bias
  694. ----------------------------------------------------------------------------------------------------
  695. layer name = base_model.model.transformer.layers.24.mlp.dense_h_to_4h.weight
  696. ----------------------------------------------------------------------------------------------------
  697. layer name = base_model.model.transformer.layers.24.mlp.dense_h_to_4h.bias
  698. ----------------------------------------------------------------------------------------------------
  699. layer name = base_model.model.transformer.layers.24.mlp.dense_4h_to_h.weight
  700. ----------------------------------------------------------------------------------------------------
  701. layer name = base_model.model.transformer.layers.24.mlp.dense_4h_to_h.bias
  702. ----------------------------------------------------------------------------------------------------
  703. layer name = base_model.model.transformer.layers.25.input_layernorm.weight
  704. ----------------------------------------------------------------------------------------------------
  705. layer name = base_model.model.transformer.layers.25.input_layernorm.bias
  706. ----------------------------------------------------------------------------------------------------
  707. layer name = base_model.model.transformer.layers.25.attention.query_key_value.weight
  708. ----------------------------------------------------------------------------------------------------
  709. layer name = base_model.model.transformer.layers.25.attention.query_key_value.bias
  710. ----------------------------------------------------------------------------------------------------
  711. layer name = base_model.model.transformer.layers.25.attention.query_key_value.lora_A.default.weight
  712. ----------------------------------------------------------------------------------------------------
  713. layer name = base_model.model.transformer.layers.25.attention.query_key_value.lora_B.default.weight
  714. ----------------------------------------------------------------------------------------------------
  715. layer name = base_model.model.transformer.layers.25.attention.dense.weight
  716. ----------------------------------------------------------------------------------------------------
  717. layer name = base_model.model.transformer.layers.25.attention.dense.bias
  718. ----------------------------------------------------------------------------------------------------
  719. layer name = base_model.model.transformer.layers.25.post_attention_layernorm.weight
  720. ----------------------------------------------------------------------------------------------------
  721. layer name = base_model.model.transformer.layers.25.post_attention_layernorm.bias
  722. ----------------------------------------------------------------------------------------------------
  723. layer name = base_model.model.transformer.layers.25.mlp.dense_h_to_4h.weight
  724. ----------------------------------------------------------------------------------------------------
  725. layer name = base_model.model.transformer.layers.25.mlp.dense_h_to_4h.bias
  726. ----------------------------------------------------------------------------------------------------
  727. layer name = base_model.model.transformer.layers.25.mlp.dense_4h_to_h.weight
  728. ----------------------------------------------------------------------------------------------------
  729. layer name = base_model.model.transformer.layers.25.mlp.dense_4h_to_h.bias
  730. ----------------------------------------------------------------------------------------------------
  731. layer name = base_model.model.transformer.layers.26.input_layernorm.weight
  732. ----------------------------------------------------------------------------------------------------
  733. layer name = base_model.model.transformer.layers.26.input_layernorm.bias
  734. ----------------------------------------------------------------------------------------------------
  735. layer name = base_model.model.transformer.layers.26.attention.query_key_value.weight
  736. ----------------------------------------------------------------------------------------------------
  737. layer name = base_model.model.transformer.layers.26.attention.query_key_value.bias
  738. ----------------------------------------------------------------------------------------------------
  739. layer name = base_model.model.transformer.layers.26.attention.query_key_value.lora_A.default.weight
  740. ----------------------------------------------------------------------------------------------------
  741. layer name = base_model.model.transformer.layers.26.attention.query_key_value.lora_B.default.weight
  742. ----------------------------------------------------------------------------------------------------
  743. layer name = base_model.model.transformer.layers.26.attention.dense.weight
  744. ----------------------------------------------------------------------------------------------------
  745. layer name = base_model.model.transformer.layers.26.attention.dense.bias
  746. ----------------------------------------------------------------------------------------------------
  747. layer name = base_model.model.transformer.layers.26.post_attention_layernorm.weight
  748. ----------------------------------------------------------------------------------------------------
  749. layer name = base_model.model.transformer.layers.26.post_attention_layernorm.bias
  750. ----------------------------------------------------------------------------------------------------
  751. layer name = base_model.model.transformer.layers.26.mlp.dense_h_to_4h.weight
  752. ----------------------------------------------------------------------------------------------------
  753. layer name = base_model.model.transformer.layers.26.mlp.dense_h_to_4h.bias
  754. ----------------------------------------------------------------------------------------------------
  755. layer name = base_model.model.transformer.layers.26.mlp.dense_4h_to_h.weight
  756. ----------------------------------------------------------------------------------------------------
  757. layer name = base_model.model.transformer.layers.26.mlp.dense_4h_to_h.bias
  758. ----------------------------------------------------------------------------------------------------
  759. layer name = base_model.model.transformer.layers.27.input_layernorm.weight
  760. ----------------------------------------------------------------------------------------------------
  761. layer name = base_model.model.transformer.layers.27.input_layernorm.bias
  762. ----------------------------------------------------------------------------------------------------
  763. layer name = base_model.model.transformer.layers.27.attention.query_key_value.weight
  764. ----------------------------------------------------------------------------------------------------
  765. layer name = base_model.model.transformer.layers.27.attention.query_key_value.bias
  766. ----------------------------------------------------------------------------------------------------
  767. layer name = base_model.model.transformer.layers.27.attention.query_key_value.lora_A.default.weight
  768. ----------------------------------------------------------------------------------------------------
  769. layer name = base_model.model.transformer.layers.27.attention.query_key_value.lora_B.default.weight
  770. ----------------------------------------------------------------------------------------------------
  771. layer name = base_model.model.transformer.layers.27.attention.dense.weight
  772. ----------------------------------------------------------------------------------------------------
  773. layer name = base_model.model.transformer.layers.27.attention.dense.bias
  774. ----------------------------------------------------------------------------------------------------
  775. layer name = base_model.model.transformer.layers.27.post_attention_layernorm.weight
  776. ----------------------------------------------------------------------------------------------------
  777. layer name = base_model.model.transformer.layers.27.post_attention_layernorm.bias
  778. ----------------------------------------------------------------------------------------------------
  779. layer name = base_model.model.transformer.layers.27.mlp.dense_h_to_4h.weight
  780. ----------------------------------------------------------------------------------------------------
  781. layer name = base_model.model.transformer.layers.27.mlp.dense_h_to_4h.bias
  782. ----------------------------------------------------------------------------------------------------
  783. layer name = base_model.model.transformer.layers.27.mlp.dense_4h_to_h.weight
  784. ----------------------------------------------------------------------------------------------------
  785. layer name = base_model.model.transformer.layers.27.mlp.dense_4h_to_h.bias
  786. ----------------------------------------------------------------------------------------------------
  787. layer name = base_model.model.transformer.final_layernorm.weight
  788. ----------------------------------------------------------------------------------------------------
  789. layer name = base_model.model.transformer.final_layernorm.bias
  790. ----------------------------------------------------------------------------------------------------
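A dump like the one above can be produced by iterating over the model's `named_parameters()`; after `get_peft_model` wraps the base ChatGLM model, only the injected `lora_A`/`lora_B` matrices remain trainable while everything else is frozen. Below is a minimal sketch (the helper name `print_layer_names` and the 100-character separator are my own choices, not part of the original script):

```python
def print_layer_names(named_params):
    """Print every parameter name in the style of the dump above, and
    split names into LoRA-injected vs. frozen base-model parameters."""
    lora_names, base_names = [], []
    for name, _param in named_params:
        print(f"layer name = {name}")
        print("-" * 100)
        if "lora_A" in name or "lora_B" in name:
            lora_names.append(name)
        else:
            base_names.append(name)
    return lora_names, base_names

# With a PEFT-wrapped model you would call:
#   lora_names, base_names = print_layer_names(model.named_parameters())
```

For ChatGLM-6B with LoRA applied to `query_key_value` (as in the dump above), each of the 28 transformer layers contributes exactly one `lora_A` and one `lora_B` entry, and those are the only trainable parameters.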




References:

- ChatGLM-6B model fine-tuning in practice (using the ADGEN ad-generation dataset, sequence length up to 2048), CSDN blog
- https://songshanhu.csdn.net/64425c1dae650e245cfead85.html
- https://devpress.csdn.net/chuangye/6438f5c2986c660f3cf93bbb.html
- Differences between p-tuning v2 and LoRA for low-resource fine-tuning of large language models, Zhihu
- [NLP] [Large models] LoRA, an extremely low-resource method for fine-tuning large models, with BLOOM-LoRA implementation code, Zhihu
