This series uses the RAG survey from Tongji University [1] and the ACL 2023 tutorial on retrieval-augmented language models (RALM) [2] as its reference material, and walks through the past and present of Retrieval-Augmented Generation (RAG), covering the overview, evaluation methods, retrievers, generators, augmentation methods, multimodal RAG, and more.
This post is the overview: it introduces the basic concepts of RAG and how the techniques are categorized.
The GPT series, LLaMA, Gemini, and other large models have achieved impressive results, but for domain-specific or highly specialized user queries, especially when the query goes beyond the model's training data or requires up-to-date information, they readily produce incorrect information or hallucinations, which makes them hard to deploy in practice. Retrieval-Augmented Generation (RAG) is one way to mitigate these limitations: it integrates retrieval over external data into the generation process, so the model can give accurate and relevant responses.
[3] introduced the concept of RAG. As shown in the figure below, given a query, maximum inner product search (MIPS) first retrieves the k most relevant documents; their encodings z are then combined with the query encoding x and fed to the generator, which produces the final output.
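To make the retrieval step concrete, here is a minimal sketch of MIPS scoring, assuming dense query and document vectors have already been produced by some encoder; the random vectors below are illustrative stand-ins, not part of [3]'s actual implementation:

```python
import numpy as np

def mips_top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 5) -> np.ndarray:
    """Maximum inner product search (MIPS): score every document vector
    against the query vector and return the indices of the k best matches."""
    scores = doc_vecs @ query_vec        # one inner product per document
    return np.argsort(-scores)[:k]       # indices sorted by descending score

# Stand-in encodings: 100 "document" vectors z and one query vector x.
doc_vecs = np.random.randn(100, 64)
query_vec = np.random.randn(64)

top_k = mips_top_k(query_vec, doc_vecs, k=5)
# In [3], the generator then conditions on the query encoding x together with
# each retrieved document encoding z to produce the final output.
print(top_k)
```

Production systems replace this brute-force scan with an approximate nearest-neighbor index, but the scoring rule is the same.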
RAG has developed through several stages. Early work focused on optimizations during Transformer-based pre-training, followed by a relatively quiet period. With the arrival of ChatGPT, the research focus shifted to exploiting LLM capabilities for greater controllability and for evolving application needs; most work concentrated on the inference stage, with comparatively little on fine-tuning. After GPT-4 appeared, the focus shifted again toward hybrid approaches that combine inference, fine-tuning, and pre-training.
The figure below shows a simple RAG application flow from [1]. It consists of four steps: user input, document retrieval, combining the documents with the prompt, and obtaining the output.
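As a rough illustration of those four steps, the sketch below uses a toy word-overlap retriever and a hypothetical `call_llm` placeholder standing in for whatever model API is actually used; none of these names come from [1]:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call (hosted API or local model)."""
    return f"<answer generated from: {prompt[:40]}...>"

def rag_answer(query: str, docs: list[str]) -> str:
    context = "\n".join(retrieve(query, docs))            # step 2: document retrieval
    prompt = f"Context:\n{context}\n\nQuestion: {query}"  # step 3: combine docs with the prompt
    return call_llm(prompt)                               # step 4: obtain the output

docs = ["RAG retrieves documents before generating.",
        "Vector databases store chunk embeddings."]
print(rag_answer("What does RAG retrieve?", docs))        # step 1: user input
```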
The figure below shows the basic makeup of retrieval-augmented generation as defined in [2]; the necessary condition is that an external datastore is used at inference time to assist generation.
[2] frames retrieval-augmented generation around three questions: what to retrieve, how to use the retrieved results, and when to retrieve. Representative methods are listed in parentheses below.
Overall, RAG combines information retrieval with in-context learning to boost LLM capabilities. Current research generally breaks RAG down into three steps: retrieval, generation, and augmentation.
The figure below shows the three RAG frameworks side by side. From left to right the scope gradually widens, with each framework on the right subsuming the techniques of the one to its left. This part can also be read alongside IVAN ILIN's blog post Advanced RAG Techniques: an Illustrated Overview.
Naive RAG consists mainly of a retrieval module and a reading module; Advanced RAG adds query rewriting and reranking on top of Naive RAG; Modular RAG builds on Advanced RAG and is more diverse and more flexible.
Naive RAG is the basic "retrieve-then-read" framework, carried out in three steps: indexing, retrieval, and generation.
Indexing covers document conversion (turning PDF, HTML, Word, Markdown, etc. into plain text), chunking, encoding the chunks into vectors, and building the index (typically with a vector library or database such as Faiss or Milvus), as sketched below.
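A minimal sketch of this indexing stage, assuming the documents have already been converted to plain text: the fixed-size chunker and the random "embeddings" are stand-ins for a real splitter and sentence encoder, while the Faiss calls (`IndexFlatIP`, `add`, `search`, `normalize_L2`) are its standard brute-force inner-product API:

```python
import numpy as np
import faiss  # pip install faiss-cpu

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split plain text into fixed-size character chunks with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def embed(chunks: list[str], dim: int = 384) -> np.ndarray:
    """Stand-in encoder: replace with a real sentence embedding model."""
    rng = np.random.default_rng(0)
    vecs = rng.standard_normal((len(chunks), dim)).astype("float32")
    faiss.normalize_L2(vecs)           # normalized vectors -> inner product = cosine
    return vecs

# Plain text after format conversion (inline here to keep the sketch self-contained).
text = "RAG pipelines convert PDF, HTML and Word files to plain text before chunking. " * 10
chunks = chunk(text)
vectors = embed(chunks)

index = faiss.IndexFlatIP(vectors.shape[1])   # flat maximum-inner-product index
index.add(vectors)

q = embed(["example user query"])
scores, ids = index.search(q, 3)              # retrieve the 3 closest chunks
print([chunks[i] for i in ids[0]])
```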
Limitations of Naive RAG:
Advanced RAG mainly optimizes the retrieval pipeline of Naive RAG, most notably through the query rewriting and reranking mentioned above.
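The sketch below shows roughly where those two additions sit around a basic retriever; `rewrite_query` and `cross_encoder_score` are hypothetical placeholders for an LLM-based rewriter and a cross-encoder reranker, not a specific system's API:

```python
def rewrite_query(query: str) -> str:
    """Hypothetical pre-retrieval step: in practice an LLM rewrites or expands
    the raw user query into a form better suited to the retriever."""
    return query.strip().lower()

def cross_encoder_score(query: str, doc: str) -> float:
    """Hypothetical post-retrieval reranker: in practice a cross-encoder scores
    each (query, document) pair jointly; word overlap is a crude stand-in."""
    return len(set(query.split()) & set(doc.lower().split()))

def advanced_retrieve(query: str, first_stage, k: int = 3, candidates: int = 20) -> list[str]:
    q = rewrite_query(query)                          # pre-retrieval: query rewriting
    docs = first_stage(q, candidates)                 # first stage recalls a larger candidate set
    docs = sorted(docs, key=lambda d: cross_encoder_score(q, d), reverse=True)
    return docs[:k]                                   # post-retrieval: keep the reranked top-k

# `first_stage` can be any recall function, e.g. the Faiss search sketched earlier.
corpus = ["query rewriting reformulates the user question",
          "reranking reorders first-stage candidates",
          "chunking splits documents into pieces"]
first_stage = lambda q, n: corpus[:n]
print(advanced_retrieve("How does reranking work", first_stage))
```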
Modular RAG combines a variety of methods to improve the performance of each module. The modular RAG paradigm is increasingly becoming the norm in the field, allowing either a serialized pipeline or end-to-end training across multiple modules.
This line of work again centers on the retrieval pipeline, aiming to strike a balance between retrieval efficiency and the amount of useful information carried in the context.
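As a rough illustration of the "serialized pipeline" idea above, modules can be expressed as interchangeable callables over a shared state and composed in whatever order a given application needs; the module names below are illustrative, not a fixed taxonomy from [1]:

```python
from typing import Callable

# Each module maps a shared state dict to an updated state dict, so modules can
# be added, removed, or reordered without touching the others.
Module = Callable[[dict], dict]

def rewrite(state: dict) -> dict:
    state["query"] = state["query"].strip()
    return state

def retrieve(state: dict) -> dict:
    state["docs"] = [d for d in state["corpus"] if state["query"].lower() in d.lower()]
    return state

def generate(state: dict) -> dict:
    state["answer"] = f"Answer built from {len(state['docs'])} retrieved document(s)."
    return state

def run_pipeline(state: dict, modules: list[Module]) -> dict:
    for m in modules:                     # serialized execution of the chosen modules
        state = m(state)
    return state

state = {"query": " modular RAG ", "corpus": ["Modular RAG composes modules freely."]}
print(run_pipeline(state, [rewrite, retrieve, generate])["answer"])
```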
Hi, I'm BrownSearch, an NLP researcher. If this post helped you, a like or a bookmark goes a long way toward supporting my writing; your feedback keeps the updates coming! Follow me if you want more content on LLMs and retrieval.
[1]Gao Y, Xiong Y, Gao X, et al. Retrieval-augmented generation for large language models: A survey[J]. arXiv preprint arXiv:2312.10997, 2023.
[2]Asai A, Min S, Zhong Z, et al. Retrieval-based language models and applications[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts). 2023: 41-46.
[3]Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459-9474.
[4]Ram O, Levine Y, Dalmedigos I, et al. In-context retrieval-augmented language models[J]. arXiv preprint arXiv:2302.00083, 2023.
[5]Khandelwal U, Levy O, Jurafsky D, et al. Generalization through Memorization: Nearest Neighbor Language Models[C]//International Conference on Learning Representations. 2020.
[6]He J, Neubig G, Berg-Kirkpatrick T. Efficient Nearest Neighbor Language Models[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021: 5703-5714.
[7]Shi W, Min S, Yasunaga M, et al. Replug: Retrieval-augmented black-box language models[J]. arXiv preprint arXiv:2301.12652, 2023.
[8]Guu K, Lee K, Tung Z, et al. Retrieval augmented language model pre-training[C]//International conference on machine learning. PMLR, 2020: 3929-3938.
[9]Nishikawa S, Ri R, Yamada I, et al. EASE: Entity-Aware Contrastive Learning of Sentence Embedding[C]//Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022: 3870-3885.
[10]Kang M, Kwak J M, Baek J, et al. Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation[J]. arXiv preprint arXiv:2305.18846, 2023.
[11]Borgeaud S, Mensch A, Hoffmann J, et al. Improving language models by retrieving from trillions of tokens[C]//International conference on machine learning. PMLR, 2022: 2206-2240.
[12]Khattab O, Santhanam K, Li X L, et al. Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP[J]. arXiv preprint arXiv:2212.14024, 2022.
[13]Liang H, Zhang W, Li W, et al. InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions[J]. arXiv preprint arXiv:2304.05684, 2023.
[14]Wang Y, Li P, Sun M, et al. Self-knowledge guided retrieval augmentation for large language models[J]. arXiv preprint arXiv:2310.05002, 2023.
[15]Jiang Z, Xu F F, Gao L, et al. Active retrieval augmented generation[J]. arXiv preprint arXiv:2305.06983, 2023.
[16]Huang J, Ping W, Xu P, et al. Raven: In-context learning with retrieval augmented encoder-decoder language models[J]. arXiv preprint arXiv:2308.07922, 2023.
[17]Izacard G, Lewis P, Lomeli M, et al. Few-shot learning with retrieval augmented language models[J]. arXiv preprint arXiv:2208.03299, 2022.
[18]Li X, Liu Z, Xiong C, et al. Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data[J]. arXiv preprint arXiv:2305.19912, 2023.
[19]Jiang H, Wu Q, Lin C Y, et al. Llmlingua: Compressing prompts for accelerated inference of large language models[J]. arXiv preprint arXiv:2310.05736, 2023.
[20]Litman R, Anschel O, Tsiper S, et al. Scatter: selective context attentional scene text recognizer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 11962-11972.
[21]Jiang H, Wu Q, Lin C Y, et al. LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models[C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023: 13358-13376.
[22]Xu F, Shi W, Choi E. Recomp: Improving retrieval-augmented LMs with compression and selective augmentation[J]. arXiv preprint arXiv:2310.04408, 2023.
[23]Xu P, Ping W, Wu X, et al. Retrieval meets long context large language models[J]. arXiv preprint arXiv:2310.03025, 2023.
[24]Chen H, Pasunuru R, Weston J, et al. Walking down the memory maze: Beyond context limit through interactive reading[J]. arXiv preprint arXiv:2310.05029, 2023.
[25]Cheng X, Luo D, Chen X, et al. Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory[J]. arXiv preprint arXiv:2305.02437, 2023.
[26]Adrian H. Raudaschl. Forget rag, the future is rag-fusion. https://towardsdatascience.com/forget-rag-the-future-is-rag-fusion-1147298d8ad1, 2023.
[27]Li X, Nie E, Liang S. From classification to generation: Insights into crosslingual retrieval augmented ICL[J]. arXiv preprint arXiv:2311.06595, 2023.
[28]Yu W, Iter D, Wang S, et al. Generate rather than retrieve: Large language models are strong context generators[J]. arXiv preprint arXiv:2209.10063, 2022.
[29]Cheng D, Huang S, Bi J, et al. UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation[J]. arXiv preprint arXiv:2303.08518, 2023.
[30]Dai Z, Zhao V Y, Ma J, et al. Promptagator: Few-shot Dense Retrieval From 8 Examples[C]//The Eleventh International Conference on Learning Representations. 2022.
[31]Ma X, Gong Y, He P, et al. Query Rewriting for Retrieval-Augmented Large Language Models[J]. arXiv preprint arXiv:2305.14283, 2023.
[32]Sun Z, Wang X, Tay Y, et al. Recitation-augmented language models[J]. arXiv preprint arXiv:2210.01296, 2022.
[33]Shao Z, Gong Y, Shen Y, et al. Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy[J]. arXiv preprint arXiv:2305.15294, 2023.
[34]Zheng H S, Mishra S, Chen X, et al. Take a step back: evoking reasoning via abstraction in large language models[J]. arXiv preprint arXiv:2310.06117, 2023.
[35]Gao L, Ma X, Lin J, et al. Precise zero-shot dense retrieval without relevance labels[J]. arXiv preprint arXiv:2212.10496, 2022.