[LoRA][Fine-tuning] Qwen-VL/Qwen-VL-chat fine-tuning issues_assertionerror: only support self-attention curren

Author: 小舞很执着 | 2024-08-22 20:53:33

assertionerror: only support self-attention currently

A summary of the problems I ran into while fine-tuning Qwen-VL with LoRA.

Model training

Error 1: "erfinv_cuda" not implemented for 'BFloat16'

RuntimeError: "erfinv_cuda" not implemented for 'BFloat16'

Following the suggestion in GitHub issue 253, modify Qwen-VL-Chat/visual.py as follows.

# visual.py, line 18
# from torch.nn.init import trunc_normal_
from torch.nn.init import normal_

# visual.py, line 117
# trunc_normal_(self.query, std=.02)
normal_(self.query, std=.02)

# visual.py, line 132
# trunc_normal_(m.weight, std=.02)
normal_(m.weight, std=.02)

The error clearly comes from an incompatibility between erfinv_cuda and BFloat16. In principle, setting --bf16 to False should avoid it; however, the upstream guidance reads:

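We support mixed-precision training, so you can set --bf16 True or --fp16 True. Empirically, if your machine supports bf16, we recommend using bf16, which keeps training consistent with our pre-training and alignment training; this is also why we made it the default.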

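Also, normal_ differs from trunc_normal_ in that trunc_normal_ restricts the sampled values to the interval [a, b], so the two initializers can produce different maximum and minimum initial values. I am not sure whether this makes a large difference to training: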

torch.nn.init.trunc_normal_(tensor, mean=0.0, std=1.0, a=-2.0, b=2.0, generator=None)
Fill the input Tensor with values drawn from a truncated normal distribution.
The values are effectively drawn from the normal distribution N(mean, std²) with values outside [a, b] redrawn until they are within the bounds. The method used for generating the random values works best when a ≤ mean ≤ b.

torch.nn.init.normal_(tensor, mean=0.0, std=1.0, generator=None)

Fill the input Tensor with values drawn from the normal distribution N(mean, std²).
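
As a rough check on how much the swap matters: in PyTorch's trunc_normal_, a and b are absolute cutoffs rather than multiples of std, so with std=.02 and the default bounds the cutoffs sit about 100 standard deviations from the mean. The sketch below uses only the standard library; truncation_stats is a hypothetical helper written for this post, not part of PyTorch. It estimates how much probability mass of the untruncated normal falls outside [a, b].

```python
import math


def truncation_stats(std, a=-2.0, b=2.0, mean=0.0):
    """Return (lower cutoff in sigmas, upper cutoff in sigmas, mass outside [a, b]).

    Hypothetical helper: describes N(mean, std^2) relative to absolute cutoffs a and b.
    """
    lo_sigma = (a - mean) / std
    hi_sigma = (b - mean) / std
    # P(X < a) + P(X > b) for X ~ N(mean, std^2), via the complementary error function.
    tail = 0.5 * math.erfc(-lo_sigma / math.sqrt(2)) + 0.5 * math.erfc(hi_sigma / math.sqrt(2))
    return lo_sigma, hi_sigma, tail


# std=.02 as used in visual.py, with the default cutoffs a=-2, b=2.
print(truncation_stats(0.02))
# For comparison: std=1.0, where the same cutoffs are only 2 sigmas away.
print(truncation_stats(1.0))
```

Under these assumptions the mass outside the cutoffs is numerically zero for std=.02, which suggests normal_ and trunc_normal_ draw from almost the same distribution here. This says nothing about other effects on training, which I have not measured.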

Error 2: Only Support Self-Attention Currently

AssertionError: Only Support Self-Attention Currently

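Following the suggestion from Zhihu user 魩雨, modify Qwen-VL-Chat/visual.py as follows. That post also covers some other errors (DeepSpeed-related ones, for example) that I have not hit in training so far; I will verify them if I run into them later.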

# visual.py, line 192
# Original check (object identity):
# assert query is key, 'Only Support Self-Attention Currently'
# Replacement (value comparison):
assert torch.allclose(query, key), 'Only Support Self-Attention Currently'
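
The change above replaces an identity test (is) with a value test. The sketch below illustrates the difference with plain Python lists instead of tensors; values_close is a hypothetical stand-in for a closeness check and does not reproduce torch.allclose's exact tolerance formula.

```python
import math


def values_close(q, k, rel_tol=1e-5, abs_tol=1e-8):
    """Element-wise closeness for two flat sequences (rough analogue of a value check)."""
    return len(q) == len(k) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(q, k)
    )


query = [0.1, 0.2, 0.3]
alias = query           # the very same object
copied = list(query)    # equal values, but a different object

identity_alias = query is alias             # identity holds for an alias
identity_copy = query is copied             # identity fails for a copy
value_copy = values_close(query, copied)    # value check still passes for a copy

print(identity_alias, identity_copy, value_copy)
```

The relaxed assertion therefore accepts a key that is a separate object holding the same data as query. The trade-off is that it compares every element on each forward pass, and it would also accept inputs that merely happen to be equal.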

Model merging

Error 3: 'QWenTokenizer' object has no attribute 'IMAGE_ST'

AttributeError: 'QWenTokenizer' object has no attribute 'IMAGE_ST'

Following the suggestion in GitHub issue 287, modify tokenization_qwen.py as follows.

class QWenTokenizer(PreTrainedTokenizer):
    ...
    # Original position of the call (commented out); it is moved below the attribute assignments.
    # super().__init__(**kwargs)
    self.image_start_tag = image_start_tag
    self.image_end_tag = image_end_tag
    self.image_pad_tag = image_pad_tag
    self.ref_start_tag = ref_start_tag
    self.ref_end_tag = ref_end_tag
    self.box_start_tag = box_start_tag
    self.box_end_tag = box_end_tag
    self.quad_start_tag = quad_start_tag
    self.quad_end_tag = quad_end_tag
    self.IMAGE_ST = (
        ref_start_tag, ref_end_tag,
        box_start_tag, box_end_tag,
        quad_start_tag, quad_end_tag,
        image_start_tag, image_end_tag,
        image_pad_tag
    )
    super().__init__(**kwargs)
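
One plausible reading of why the reordering helps, assuming the base-class initializer calls back into methods that read these attributes (an assumption; the issue is not quoted here). The toy classes below are hypothetical and only mimic that call pattern; they are not the real QWenTokenizer or PreTrainedTokenizer.

```python
class Base:
    def __init__(self, **kwargs):
        # Stand-in for a base initializer that consults subclass state during construction.
        self.n_special = len(self.special_tokens())


class BrokenTokenizer(Base):
    def __init__(self, image_start_tag="<img>", **kwargs):
        super().__init__(**kwargs)           # runs before IMAGE_ST exists
        self.IMAGE_ST = (image_start_tag,)

    def special_tokens(self):
        return self.IMAGE_ST


class FixedTokenizer(Base):
    def __init__(self, image_start_tag="<img>", **kwargs):
        self.IMAGE_ST = (image_start_tag,)   # set the attribute first
        super().__init__(**kwargs)           # base initializer can now read it

    def special_tokens(self):
        return self.IMAGE_ST


try:
    BrokenTokenizer()
except AttributeError as exc:
    print("broken:", exc)

print("fixed:", FixedTokenizer().n_special)
```

In this sketch the only difference between the two classes is where super().__init__ is called, which mirrors the edit to tokenization_qwen.py.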

Error 4: size mismatch for base_model.model.transformer.wte.modules_to_save.default.weight

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.wte.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 4096]) from checkpoint, the shape in current model is torch.Size([151860, 4096]).
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 4096]) from checkpoint, the shape in current model is torch.Size([151860, 4096]).

For reference, see issue 415: modify line 45 of tokenization_qwen.py in the Qwen-VL model folder to fill in the 76 missing (untrained) tokens.
The code change is:

# EXTRAS = tuple((f"<|extra_{i}|>" for i in range(205)))
EXTRAS = tuple((f"<|extra_{i}|>" for i in range(281)))

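The model currently reports self.tokenizer.n_vocab as 151860, which is the 151643 entries in qwen.tiktoken plus 217 special tokens. The model config, however, specifies "vocab_size": 151936, so after LoRA fine-tuning Qwen-VL the embedding shapes no longer line up: 76 tokens are missing.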
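
The arithmetic behind the new range(281) can be checked directly. This is a minimal sketch using only the numbers quoted in this post; the constant names are my own.

```python
BASE_VOCAB = 151643     # entries in qwen.tiktoken (from the text)
SPECIAL_TOTAL = 217     # special tokens currently defined (from the text)
EXTRAS_BEFORE = 205     # range(205) in the original EXTRAS line
CONFIG_VOCAB = 151936   # "vocab_size" in the model config

n_vocab = BASE_VOCAB + SPECIAL_TOTAL    # what the tokenizer reports
missing = CONFIG_VOCAB - n_vocab        # tokens needed to match the config
extras_after = EXTRAS_BEFORE + missing  # new argument for range(...)

print(n_vocab, missing, extras_after)  # 151860 76 281
```

Padding EXTRAS by the 76 missing entries is what brings the tokenizer size up to the 151936 rows seen in the checkpoint.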

If you use Qwen-VL-Chat, you do not need to worry about this, because finetune.py contains:

if lora_args.q_lora or "chat" in model_args.model_name_or_path.lower():
    modules_to_save = None
else:
    modules_to_save = ["wte", "lm_head"]

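With q_lora or a chat model, these two modules (wte and lm_head) are not added to modules_to_save, so the mismatched parameters never appear.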
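
The condition quoted from finetune.py can be wrapped in a small function to see which inputs trigger it. pick_modules_to_save is a hypothetical helper and the paths are made-up examples. Note that the check is a substring match on the model path, so any path containing "chat" takes the chat branch.

```python
def pick_modules_to_save(q_lora, model_name_or_path):
    """Mirror of the finetune.py condition quoted above (hypothetical wrapper)."""
    if q_lora or "chat" in model_name_or_path.lower():
        return None
    return ["wte", "lm_head"]


# Example inputs; the paths are illustrative only.
print(pick_modules_to_save(False, "Qwen/Qwen-VL-Chat"))
print(pick_modules_to_save(False, "Qwen/Qwen-VL"))
print(pick_modules_to_save(True, "Qwen/Qwen-VL"))
print(pick_modules_to_save(False, "/data/chatbot/Qwen-VL"))
```

The last call shows that a base Qwen-VL checkpoint stored under a directory whose name happens to contain "chat" would also skip wte and lm_head.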
