[Study notes]
Model structure:
- LlamaConfig {
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 1,
- "eos_token_id": 2,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 11008,
- "max_position_embeddings": 2048,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 32,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-06,
- "rope_scaling": null,
- "rope_theta": 10000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.38.2",
- "use_cache": true,
- "vocab_size": 32000
- }
Key points: 32 layers, 32 attention heads, and a vocabulary size of 32000.
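A few derived quantities follow directly from the config values above. The sketch below is only arithmetic on those numbers; the variable names `head_dim` and `kv_groups` are illustrative and are not fields of the config itself.

```python
# Values copied from the LlamaConfig printed above.
config = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "num_hidden_layers": 32,
    "vocab_size": 32000,
}

# Each attention head works on an equal slice of the hidden dimension.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Number of query heads that share one key/value head.
# A value of 1 means plain multi-head attention (no grouped-query sharing).
kv_groups = config["num_attention_heads"] // config["num_key_value_heads"]

print(head_dim)   # 128
print(kv_groups)  # 1
```

Because `num_key_value_heads` equals `num_attention_heads` here, the k_proj and v_proj weights have the same shape as q_proj, which matches the sizes listed in these notes.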
Parameters and their sizes:
- model.embed_tokens.weight torch.Size([32000, 4096])
-
- model.layers.0.self_attn.q_proj.weight torch.Size([4096, 4096])
- model.layers.0.self_attn.k_proj.weight torch.Size([4096, 4096])
- model.layers.0.self_attn.v_proj.weight torch.Size([4096, 4096])
- model.layers.0.self_attn.o_proj.weight torch.Size([4096, 4096])
- model.layers.0.mlp.gate_proj.weight torch.Size([11008, 4096])
- model.layers.0.mlp.up_proj.weight torch.Size([11008, 4096])
- model.layers.0.mlp.down_proj.weight torch.Size([4096, 11008])
- model.layers.0.input_layernorm.weight torch.Size([4096])
- model.layers.0.post_attention_layernorm.weight torch.Size([4096])
- …
- model.layers.31.self_attn.q_proj.weight torch.Size([4096, 4096])
- model.layers.31.self_attn.k_proj.weight torch.Size([4096, 4096])
- model.layers.31.self_attn.v_proj.weight torch.Size([4096, 4096])
- model.layers.31.self_attn.o_proj.weight torch.Size([4096, 4096])
- model.layers.31.mlp.gate_proj.weight torch.Size([11008, 4096])
- model.layers.31.mlp.up_proj.weight torch.Size([11008, 4096])
- model.layers.31.mlp.down_proj.weight torch.Size([4096, 11008])
- model.layers.31.input_layernorm.weight torch.Size([4096])
- model.layers.31.post_attention_layernorm.weight torch.Size([4096])
-
- model.norm.weight torch.Size([4096])
- lm_head.weight torch.Size([32000, 4096])
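The shapes listed above are enough to work out the total parameter count by hand. Below is a minimal sketch that sums them, assuming every layer hidden behind the "…" has the same shapes as layers 0 and 31, and that there are no bias terms (consistent with `"attention_bias": false`). The function name `count_llama_params` is hypothetical, not part of any library.

```python
def count_llama_params(hidden_size=4096, intermediate_size=11008,
                       num_layers=32, vocab_size=32000):
    """Sum the weight sizes listed above (assumes identical layers, no biases)."""
    embed = vocab_size * hidden_size            # model.embed_tokens.weight
    attn = 4 * hidden_size * hidden_size        # q_proj, k_proj, v_proj, o_proj
    mlp = 3 * intermediate_size * hidden_size   # gate_proj, up_proj, down_proj
    norms = 2 * hidden_size                     # input + post-attention layernorm
    per_layer = attn + mlp + norms
    final_norm = hidden_size                    # model.norm.weight
    lm_head = vocab_size * hidden_size          # separate, since tie_word_embeddings is false
    return embed + num_layers * per_layer + final_norm + lm_head


total = count_llama_params()
print(f"{total:,}")  # 6,738,415,616
```

At 2 bytes per parameter in bfloat16, that comes to roughly 13.5 GB for the weights alone.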