python run.py --datasets ceval_gen --hf-path /root/model/Shanghai_AI_Laboratory/internlm2-chat-7b/ --tokenizer-path /root/model/Shanghai_AI_Laboratory/internlm2-chat-7b/ --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
python run.py --datasets ceval_gen --hf-path /share/temp/model_repos/internlm2-chat-7b/ --tokenizer-path /share/temp/model_repos/internlm2-chat-7b/ --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
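If a run like this fails or silently produces nothing, one quick first check (not from the original post; the path and kwargs below simply mirror the first command above) is whether the checkpoint loads outside OpenCompass with the same options. Note that in transformers the tokenizer kwarg for truncation direction is truncation_side:

# Minimal sanity check (a sketch): load the model and tokenizer with the
# same options the run.py command passes, before launching the evaluation.
# Assumption: the checkpoint path matches the --hf-path used above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '/root/model/Shanghai_AI_Laboratory/internlm2-chat-7b/'

tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    padding_side='left',
    truncation_side='left',   # transformers' name for truncation='left'
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map='auto',
)
print(type(tokenizer).__name__, model.config.model_type)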
This evaluation run produced no results, for reasons I could not determine.
I rebuilt the environment, and there were still no results; internlm-chat-7b, however, did produce results.
After deleting everything and rebuilding the environment from scratch, I re-ran the evaluation and finally got results.
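In hindsight, a faster way to tell whether a run silently produced nothing is to look for prediction and result files in the work directory. A minimal sketch, assuming OpenCompass's default layout of outputs/default/<timestamp>/{predictions,results,summary}:

# Check whether the latest run produced any prediction/result files.
# Assumption: the default work-dir layout outputs/default/<timestamp>/...
import glob
import os

runs = sorted(glob.glob(os.path.join('outputs', 'default', '*')))
if not runs:
    raise SystemExit('no runs found under outputs/default')
latest = runs[-1]  # most recent timestamped run directory
for sub in ('predictions', 'results', 'summary'):
    files = glob.glob(os.path.join(latest, sub, '**', '*'), recursive=True)
    print(f'{sub}: {sum(os.path.isfile(f) for f in files)} files')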
The configuration file for the TurboMind (lmdeploy) evaluation is as follows:
from mmengine.config import read_base
from opencompass.models.turbomind import TurboMindModel

with read_base():
    # choose a list of datasets
    from .datasets.ceval.ceval_gen_5f30c7 import ceval_datasets

datasets = sum((v for k, v in locals().items() if k.endswith('_datasets')), [])

internlm2_meta_template = dict(
    round=[
        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
    ],
    eos_token_id=92542
)

# config for internlm2-chat-7b
internlm2_chat_7b = dict(
    type=TurboMindModel,
    abbr='internlm2-chat-7b-turbomind',
    path='internlm/internlm2-chat-7b',
    engine_config=dict(session_len=2048, max_batch_size=32, rope_scaling_factor=1.0),
    gen_config=dict(top_k=1, top_p=0.8, temperature=1.0, max_new_tokens=100),
    max_out_len=100,
    max_seq_len=2048,
    batch_size=32,
    concurrency=32,
    meta_template=internlm2_meta_template,
    run_cfg=dict(num_gpus=1, num_procs=1),
    end_str='<|im_end|>'
)

models = [internlm2_chat_7b]
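Before launching the full evaluation, it can be worth smoke-testing the TurboMind backend on its own. A minimal sketch, assuming lmdeploy is installed; the engine and generation settings mirror the engine_config and gen_config values above:

# Smoke test for the TurboMind engine (a sketch, not part of the original
# post): load the same model with the same engine/generation settings and
# generate one reply. Assumes lmdeploy is installed.
from lmdeploy import GenerationConfig, TurbomindEngineConfig, pipeline

pipe = pipeline(
    'internlm/internlm2-chat-7b',
    backend_config=TurbomindEngineConfig(session_len=2048, rope_scaling_factor=1.0),
)
gen_config = GenerationConfig(top_k=1, top_p=0.8, temperature=1.0, max_new_tokens=100)
print(pipe(['Briefly introduce yourself.'], gen_config=gen_config))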
Then enter the following on the command line:
~/opencompass# python run.py configs/eval_internlm2_chat_7b_turbomind.py -w outputs/turbomind/internlm2-chat-7b
The evaluation starts:
Evaluation results:
As the results show, the evaluation scores of internlm2-chat-7b deployed with lmdeploy are noticeably improved!