
LLM-Intro to Large Language Models

LLM

some LLMs do not make their model architecture and weights available to users

what is an LLM?

Llama 2 70B model

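  • 2 files

    • parameters file
      • the parameters (weights) of the neural network
      • each parameter is stored as a 2-byte floating-point number, so 70 billion parameters take about 140 GB
    • code that runs the parameters (inference)
      • can be written in C, Python, etc.
      • in C, about 500 lines of code with no dependencies are enough
      • a self-contained package (no network connection needed)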
  • how to get the parameters?

    • lossily compress a large chunk of text (about 10 TB) using about 6,000 GPUs for 12 days (cost about $2 million) into a file of about 140 GB, a kind of zip file that holds the gestalt of the text as weights (parameters)
  • what the neural network does is try to predict the next word in a sequence. the parameters are dispersed throughout the network; the neurons are connected to each other and fire in a certain way

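  • prediction is closely related to compression: a model that predicts the next word well can be used to compress text efficiently

  • an LLM produces text in the correct form and fills it in with its knowledge. it does not reproduce a verbatim copy of the text it was trained on.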

  • how does it work?
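
The sizes quoted in the notes above can be checked with simple arithmetic. The sketch below assumes 1 GB = 1e9 bytes and 1 TB = 1e12 bytes; the function name is only illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the notes.
N_PARAMS = 70e9        # Llama 2 70B: 70 billion parameters
BYTES_PER_PARAM = 2    # each parameter is a 2-byte float
TRAINING_TEXT_TB = 10  # roughly 10 TB of training text


def parameter_file_gb(n_params: float, bytes_per_param: int) -> float:
    """Size of the raw parameter file in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


size_gb = parameter_file_gb(N_PARAMS, BYTES_PER_PARAM)
# How much smaller the parameter file is than the text it was trained on.
ratio = (TRAINING_TEXT_TB * 1e12) / (size_gb * 1e9)

print(f"parameter file: about {size_gb:.0f} GB")
print(f"text-to-parameters ratio: about {ratio:.0f}x")
```

Unlike a real zip file, this compression is lossy: the original text cannot be recovered exactly from the parameters.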

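
To make the link between next-word prediction and compression concrete, here is a minimal sketch using a toy character-level bigram model. It is not how an LLM is implemented (an LLM uses a large neural network and predicts tokens); the corpus, function names, and add-one smoothing are assumptions chosen for illustration. The idea it shows is that a symbol predicted with probability p can be encoded in about -log2(p) bits, so better prediction means a shorter encoding.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def next_char_probs(counts, prev, vocab):
    """Add-one smoothed probability of each next character given `prev`."""
    total = sum(counts[prev].values()) + len(vocab)
    return {c: (counts[prev][c] + 1) / total for c in vocab}


def code_length_bits(counts, text, vocab):
    """Bits an ideal entropy coder would need when guided by the model."""
    bits = 0.0
    for prev, nxt in zip(text, text[1:]):
        bits += -math.log2(next_char_probs(counts, prev, vocab)[nxt])
    return bits


# Hypothetical toy corpus; a real model is trained on terabytes of text.
corpus = "the cat sat on the mat. the cat ate the rat. " * 20
vocab = sorted(set(corpus))
model = train_bigram(corpus)

# Most likely character to follow "t" according to the model.
probs_after_t = next_char_probs(model, "t", vocab)
best_guess = max(probs_after_t, key=probs_after_t.get)

# Encoding cost with the model versus a code that treats all characters alike.
model_bits = code_length_bits(model, corpus, vocab)
uniform_bits = (len(corpus) - 1) * math.log2(len(vocab))

print("most likely character after 't':", repr(best_guess))
print(f"bits with bigram model: {model_bits:.0f}")
print(f"bits with uniform code: {uniform_bits:.0f}")
```

Because the bigram model predicts the next character better than a uniform guess, it needs fewer bits to encode the same text.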

training stage

  • pre-training

    • expensive
    • result: a base model, which is a document generator
    • it’s about knowledge
    • trained on internet documents
  • fine tuning

    • cheaper
    • result: an assistant model
    • it’s about alignment
    • trained on Q&A documents
    • train on high-quality conversations (questions and answers). write labeling instructions that specify how the assistant should behave
    • focus on quality, not quantity
  • stage 3 (optional)

    • use comparison labels
    • reinforcement learning from human feedback (RLHF)


  • labeling is a human-machine collaboration


  • ranking of LLMs

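
The comparison labels used in stage 3 can be turned into a training signal in several ways. The notes do not say which one is used; the sketch below assumes a Bradley-Terry style pairwise loss, which is a common choice for training a reward model. The function name and the scores are hypothetical.

```python
import math


def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(score_chosen - score_rejected)).

    It is small when the reward model scores the answer the human labeler
    preferred above the one they rejected, and large otherwise.
    """
    margin = score_chosen - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward-model scores for two answers to the same question.
agrees = preference_loss(2.0, -1.0)     # model ranks the pair like the labeler
disagrees = preference_loss(-1.0, 2.0)  # model ranks the pair the other way

print(f"loss when the model agrees with the labeler:    {agrees:.3f}")
print(f"loss when the model disagrees with the labeler: {disagrees:.3f}")
```

A labeler only has to say which of two answers is better, which is usually easier than writing an ideal answer from scratch.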

LLM scaling laws:

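  • performance improves predictably with more N (number of parameters) and more D (amount of training text): a bigger model trained on more text is a better model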

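
As a rough illustration of the scaling-law idea, the sketch below assumes a parametric form in which the loss is an irreducible term plus one term that shrinks with N and one that shrinks with D. This functional form appears in published scaling-law fits, but the notes do not give a formula, and the constants here are placeholders chosen only for illustration, not fitted results.

```python
def predicted_loss(n_params: float, n_tokens: float,
                   e: float = 1.7, a: float = 400.0, b: float = 400.0,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """Assumed form: loss = e + a / N**alpha + b / D**beta.

    n_params is N (number of parameters), n_tokens is D (training tokens).
    All constants are illustrative placeholders, not measured values.
    """
    return e + a / n_params ** alpha + b / n_tokens ** beta


small = predicted_loss(7e9, 2e11)       # smaller model, less data
more_data = predicted_loss(7e9, 2e12)   # same model, 10x the data
bigger = predicted_loss(70e9, 2e12)     # 10x the parameters as well

for name, loss in [("small", small), ("more data", more_data), ("bigger", bigger)]:
    print(f"{name:>10}: predicted loss {loss:.3f}")
```

Under this assumed form, increasing either N or D always lowers the predicted loss, which matches the claim in the notes.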

  • tool use and multimodality. some LLMs, such as GPT, can now use different tools to help answer questions: a browser, a calculator, a Python interpreter.

  • future directions of development for LLMs

give LLMs a System 2 ability


  • LLMs currently have only System 1 (instinctive) thinking
  • goal: convert time into accuracy, i.e. let the model think longer to give a more accurate answer

self-improvement


  • in a narrow domain, self-improvement is possible

customization

experts in certain domains

future of LLMs

