
Two ways to run local inference on a Hugging Face Llama model

Method 1: load the model directly with `LlamaForCausalLM` and call `generate`.

```python
from transformers import LlamaForCausalLM, AutoTokenizer

# Path to the locally downloaded Hugging Face model
hf_model_path = './Llama-2-7b'
model = LlamaForCausalLM.from_pretrained(hf_model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(hf_model_path)

prompt = "Hey, are you conscious? Can you talk to me?"
# Move the tokenized inputs to the model's device (with device_map="auto"
# the model may sit on GPU while the tensors default to CPU)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate
generate_ids = model.generate(inputs.input_ids, max_length=30)
res = tokenizer.batch_decode(generate_ids, skip_special_tokens=True,
                             clean_up_tokenization_spaces=False)[0]
print(res)
```
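With no sampling flags set, `model.generate` above decodes greedily: at each step it appends the highest-probability next token, stopping at `max_length` or at the end-of-sequence token. A minimal plain-Python sketch of that loop (the `next_token_logits` function here is a hypothetical toy stand-in for the model's forward pass, not the real model):

```python
# Greedy decoding sketch over a toy vocabulary {0..5}, where 0 is EOS.
EOS = 0

def next_token_logits(ids):
    # Hypothetical toy scorer: strongly favors (last token + 1),
    # wrapping to EOS once the last token reaches 5.
    last = ids[-1]
    nxt = last + 1 if last < 5 else EOS
    return {tok: (1.0 if tok == nxt else 0.0) for tok in range(6)}

def greedy_generate(input_ids, max_length):
    ids = list(input_ids)
    while len(ids) < max_length:
        logits = next_token_logits(ids)
        tok = max(logits, key=logits.get)  # argmax over the vocabulary
        ids.append(tok)
        if tok == EOS:                     # stop at end-of-sequence
            break
    return ids

print(greedy_generate([1, 2], max_length=30))  # → [1, 2, 3, 4, 5, 0]
```

The real `generate` works the same way conceptually, but scores come from a transformer forward pass and stopping is governed by `eos_token_id` and the generation config.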

Method 2: use the `transformers` text-generation pipeline.

```python
import torch
import transformers
from transformers import AutoTokenizer

# Path to the locally downloaded Hugging Face model
hf_model_path = './Llama-2-7b'
tokenizer = AutoTokenizer.from_pretrained(hf_model_path)

pipeline = transformers.pipeline(
    "text-generation",
    model=hf_model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
sequences = pipeline(
    'I liked "Breaking Bad" and "Band of Brothers". Do you have any recommendations of other shows I might like?\n',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
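Here `do_sample=True` with `top_k=10` means that at each step the sampler keeps only the 10 highest-scoring tokens, renormalizes their probabilities, and draws from that reduced set. A rough plain-Python illustration of the top-k filtering step (a sketch with a made-up score table, not the transformers implementation):

```python
import random

def top_k_sample(scores, k, rng):
    """Sample a token from the k highest-scoring entries of a token->score map."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]  # keep the top-k tokens
    weights = [scores[t] for t in top]  # rng.choices normalizes these weights
    return rng.choices(top, weights=weights, k=1)[0]

# Hypothetical next-token scores for illustration
scores = {"the": 5.0, "a": 3.0, "dog": 2.0, "cat": 1.5, "xyzzy": 0.1}
rng = random.Random(0)
picks = {top_k_sample(scores, k=2, rng=rng) for _ in range(100)}
print(picks)  # every draw comes from the top-2 tokens: "the" and "a"
```

Lower `top_k` makes output more deterministic; higher values allow more diverse (and riskier) continuations.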
