Attempting to load Qwen-VL-Chat-Int4 fails during GPTQ layer conversion:

>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> model_dir = "qwen/Qwen-VL-Chat-Int4"
>>> tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True).eval()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/ssd1/miniconda3/envs/pytorch2.1.2/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 511, in from_pretrained
    return model_class.from_pretrained(
  File "/ssd1/miniconda3/envs/pytorch2.1.2/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2945, in from_pretrained
    model = quantizer.convert_model(model)
  File "/ssd1/miniconda3/envs/pytorch2.1.2/lib/python3.8/site-packages/optimum/gptq/quantizer.py", line 229, in convert_model
    self._replace_by_quant_layers(model, layers_to_be_replaced)
  File "/ssd1/miniconda3/envs/pytorch2.1.2/lib/python3.8/site-packages/optimum/gptq/quantizer.py", line 256, in _replace_by_quant_layers
    QuantLinear = dynamically_import_QuantLinear(
TypeError: dynamically_import_QuantLinear() got an unexpected keyword argument 'disable_exllamav2'
The failure is a version mismatch: this optimum release passes a disable_exllamav2 keyword to auto_gptq's dynamically_import_QuantLinear(), which the installed auto-gptq build does not accept. Pinning optimum to an earlier release that does not pass this keyword resolves it:

pip install optimum==1.12.0
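
As a quick sanity check after the pin, before retrying the model load, confirm what the active environment actually resolved (a minimal sketch; the auto-gptq check assumes that package is installed alongside optimum):

# Verify the installed versions; importlib.metadata reads package
# metadata without importing the libraries themselves.
from importlib.metadata import version

print(version("optimum"))    # expect 1.12.0 after the pin
print(version("auto-gptq"))  # the library whose function rejected the keyword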