Prompt Engineering: A Comprehensive Overview

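Prompt engineering, also known as in-context prompting, refers to methods for interacting with an LLM to steer its behavior toward a desired outcome without updating the model weights. In prompt engineering, the description of the task is embedded in the input. For example, instead of implicitly giving the model certain parameters, the task is posed directly as a question. Prompt engineering typically works by converting one or more tasks into a prompt-based dataset and training a language model through so-called "prompt-based learning".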



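  • Instruction: the specific task or instruction you want the model to perform.

  • Context: external information or additional context that steers the language model toward a better response.

  • Input data: the content or question provided by the user.

  • Output indicator: the type or format of the output.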
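As a minimal sketch, the four elements above can be assembled into a single prompt string. The function name `build_prompt`, its parameter names, and the example context sentence are illustrative assumptions, not part of any library.

```python
def build_prompt(instruction, context="", input_data="", output_indicator=""):
    """Join the non-empty prompt elements, one per paragraph."""
    parts = [instruction, context, input_data, output_indicator]
    return "\n\n".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    instruction="Classify the sentiment of the text as positive or negative.",
    context="The text is a one-sentence movie review.",  # hypothetical context
    input_data="Text: I'll bet the video game is a lot more fun than the film.",
    output_indicator="Sentiment:",
)
print(prompt)
```

Not every prompt needs all four elements; empty ones are simply skipped.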


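  • Start simple: prompt design is an iterative process that takes a lot of experimentation to get the best results. Start with a simple prompt and keep adding elements and context as you aim for better results.

  • The instruction: use commands such as "Write", "Classify", "Summarize", "Translate", or "Order" to tell the model what to do, and build effective prompts for a variety of simple tasks.

  • Specificity: the more specific and detailed the prompt is about the instruction and the task, the better the results. Providing examples in the prompt is especially effective for getting the output in a particular format.

  • Avoid imprecision: as in effective communication, the more direct the message, the more effectively it gets across.

  • To do or not to do: another common tip when designing prompts is to say what to do rather than what not to do.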


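By now it is clear that improving prompts helps achieve better results on different tasks. That is the whole idea behind prompt engineering. This section covers more advanced prompt engineering techniques that let us accomplish more complex and interesting tasks. All test cases were produced with text-davinci-003.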

2.1 Zero-shot and Few-shot

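Zero-shot and few-shot are the most basic prompting techniques. An LLM that has been trained on large amounts of data and instruction-tuned can perform zero-shot tasks, that is, the text is fed directly to the model to obtain an answer.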

For example, zero-shot input:

Text: I'll bet the video game is a lot more fun than the film.
Sentiment:

Output:

Positive - The speaker expresses that they think the video game is more enjoyable than the film.

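Few-shot learning presents a set of high-quality demonstrations on the target task, each consisting of an input and the desired output. Because the model first sees good examples, it can better understand human intent and the criteria for the kind of answer that is wanted. Few-shot learning therefore usually performs better than zero-shot learning. However, it comes at the cost of more token consumption, and the prompt may hit the context length limit when the input and output texts are long.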

For example, few-shot input:

Text: (lawrence bounces) all over the stage, dancing, running, sweating, mopping his face and generally displaying the wacky talent that brought him fame in the first place.
Sentiment: positive
Text: despite all evidence to the contrary, this clunker has somehow managed to pose as an actual feature movie, the kind that charges full admission and gets hyped on tv and purports to amuse small children and ostensible adults.
Sentiment: negative
Text: for the first time in years, de niro digs deep emotionally, perhaps because he's been stirred by the powerful work of his co-stars.
Sentiment: positive
Text: I'll bet the video game is a lot more fun than the film.
Sentiment:
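The few-shot prompt above follows a regular pattern, so it can be generated from a list of demonstrations. The sketch below is one way to do that; the helper name `few_shot_prompt` and the two shortened reviews are hypothetical. Passing an empty demonstration list yields the zero-shot form of the same prompt.

```python
def few_shot_prompt(demos, query):
    """Format (text, label) demonstrations, then the unlabeled query."""
    blocks = [f"Text: {text}\nSentiment: {label}" for text, label in demos]
    blocks.append(f"Text: {query}\nSentiment:")
    return "\n".join(blocks)


# Hypothetical, shortened demonstrations for illustration only.
demos = [
    ("a warm, funny and well-acted film.", "positive"),
    ("a tedious clunker from start to finish.", "negative"),
]
query = "I'll bet the video game is a lot more fun than the film."

zero_shot = few_shot_prompt([], query)
few_shot = few_shot_prompt(demos, query)
print(few_shot)
```

The few-shot prompt is strictly longer than the zero-shot one, which is the token cost mentioned above.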



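2.2 Chain-of-Thought (CoT) Prompting

CoT prompting generates a sequence of short sentences that describe the reasoning step by step, known as reasoning chains or rationales, and that eventually lead to the final answer. The benefit of CoT is more pronounced for complex reasoning tasks, and it is more evident with large models (e.g., more than 50B parameters).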


For example, few-shot CoT input:

Question: Tom and Elizabeth have a competition to climb a hill. Elizabeth takes 30 minutes to climb the hill. Tom takes four times as long as Elizabeth does to climb the hill. How many hours does it take Tom to climb up the hill?
Answer: It takes Tom 30*4 = <<30*4=120>>120 minutes to climb the hill. It takes Tom 120/60 = <<120/60=2>>2 hours to climb the hill. So the answer is 2.
===
Question: Jack is a soccer player. He needs to buy two pairs of socks and a pair of soccer shoes. Each pair of socks cost $9.50, and the shoes cost $92. Jack has $40. How much more money does Jack need?
Answer: The total cost of two pairs of socks is $9.50 x 2 = $<<9.5*2=19>>19. The total cost of the socks and the shoes is $19 + $92 = $<<19+92=111>>111. Jack need $111 - $40 = $<<111-40=71>>71 more. So the answer is 71.
===
Question: Marty has 100 centimeters of ribbon that he must cut into 4 equal parts. Each of the cut parts must be divided into 5 equal parts. How long will each final cut be?
Answer:

Output:

Marty has 100 cm of ribbon that needs to be cut into 4 equal parts. Each of the four cuts will be 100/4 = <<100/4=25>>25 cm long. Marty must then divide each of the four cuts into 5 equal parts. Each of the five final cuts will be 25/5 = <<25/5=5>>5 cm long. So the answer is 5 cm.

For example, zero-shot CoT input:

Question: Marty has 100 centimeters of ribbon that he must cut into 4 equal parts. Each of the cut parts must be divided into 5 equal parts. How long will each final cut be?
Answer: Let's think step by step.

Output:

Step 1: Marty has to cut 100 centimeters of ribbon into 4 equal parts. So, each of the parts will be 25 centimeters long.
Step 2: Marty has to divide each of the 4 parts into 5 equal parts. So, each of the final cut parts will be 5 centimeters long.
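A small sketch of how the two CoT variants might be handled in code: building the zero-shot CoT prompt, removing the `<<...>>` calculator annotations that appear in the few-shot demonstrations, and pulling the final number out of a "So the answer is ..." sentence. All function names and the regular expressions are assumptions made for illustration.

```python
import re

COT_TRIGGER = "Let's think step by step."


def zero_shot_cot_prompt(question):
    """Append the zero-shot CoT trigger phrase to a question."""
    return f"Question: {question}\nAnswer: {COT_TRIGGER}"


def strip_calc_annotations(text):
    """Remove <<expr=value>> calculator annotations from a reasoning chain."""
    return re.sub(r"<<[^>]*>>", "", text)


def extract_answer(completion):
    """Return the number after 'answer is', or None if that phrase is absent."""
    match = re.search(r"answer is\s*\$?(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None


few_shot_output = (
    "Each of the five final cuts will be 25/5 = <<25/5=5>>5 cm long. "
    "So the answer is 5 cm."
)
zero_shot_output = (
    "Step 2: Marty has to divide each of the 4 parts into 5 equal parts. "
    "So, each of the final cut parts will be 5 centimeters long."
)

print(strip_calc_annotations(few_shot_output))
print(extract_answer(few_shot_output))   # few-shot demos fix the answer format
print(extract_answer(zero_shot_output))  # no fixed phrase to match here
```

The few-shot demonstrations pin down the answer format, which makes extraction simple. The zero-shot output shown above has no such fixed phrase, so this extractor returns `None` for it and a different parsing rule would be needed.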

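2.3 Instruction Prompting

Instructed LMs (e.g., InstructGPT) fine-tune a pretrained model on high-quality data so that the LM better understands user intent and follows instructions. RLHF is a commonly used method. The benefit of instruction-style fine-tuning is that the model becomes more aligned with human intent, which greatly reduces the cost of communication.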


For example, input:

Describe what is quantum physics to a 6-year-old.

Output:

Quantum physics is like a super special game. It's all about understanding the very tiny things that make up our world. It's like trying to find out how puzzle pieces fit together to make a big picture.

In-context instruction learning combines few-shot learning with instruction prompting. The prompt contains several demonstrations drawn from different tasks, each consisting of an instruction, a task input, and an output. Note that the original experiments covered only classification tasks, and the instruction prompt lists all label options.

For example, in-context instruction learning input:

Definition: Determine the speaker of the dialogue, "agent" or "customer".
Input: I have successfully booked your tickets.
Output: agent
Definition: Determine which category the question asks for, "Quantity" or "Location".
Input: What's the oldest building in US?
Output: Location
Definition: Classify the sentiment of the given movie review, "positive" or "negative".
Input: i'll bet the video game is a lot more fun than the film.
Output:
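The prompt above repeats a Definition / Input / Output pattern across tasks, which a short helper can reproduce. This is only a sketch; `instruction_demo` and `icil_prompt` are hypothetical names, and blank lines are used between demonstrations purely for readability.

```python
def instruction_demo(definition, input_text, output=None):
    """Format one demonstration; leave Output empty for the target task."""
    lines = [f"Definition: {definition}", f"Input: {input_text}"]
    lines.append("Output:" if output is None else f"Output: {output}")
    return "\n".join(lines)


def icil_prompt(demos, target):
    """demos: (definition, input, output) tuples; target: (definition, input)."""
    blocks = [instruction_demo(*demo) for demo in demos]
    blocks.append(instruction_demo(*target))
    return "\n\n".join(blocks)


demos = [
    ('Determine the speaker of the dialogue, "agent" or "customer".',
     "I have successfully booked your tickets.", "agent"),
    ('Determine which category the question asks for, "Quantity" or "Location".',
     "What's the oldest building in US?", "Location"),
]
target = (
    'Classify the sentiment of the given movie review, "positive" or "negative".',
    "i'll bet the video game is a lot more fun than the film.",
)
prompt = icil_prompt(demos, target)
print(prompt)
```

Each definition spells out the allowed labels, matching the note that the instruction contains all label options.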



2.4 Self-Consistency Sampling



For example, input:

When I was 6 my sister was half my age. Now I’m 70 how old is my sister?

Output:

Your sister is now 35 years old.


Few-shot CoT input:

Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?
A: We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees. The answer is 6.
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars. The answer is 5.
Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Leah had 32 chocolates and Leah’s sister had 42. That means there were originally 32 + 42 = 74 chocolates. 35 have been eaten. So in total they still have 74 - 35 = 39 chocolates. The answer is 39.
Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of lollipops he has given to Denny must have been 20 - 12 = 8 lollipops. The answer is 8.
Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?
A: He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so in total he has 7 + 2 = 9 toys. The answer is 9.
Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?
A: There are 4 days from monday to thursday. 5 computers were added each day. That means in total 4 * 5 = 20 computers were added. There were 9 computers in the beginning, so now there are 9 + 20 = 29 computers. The answer is 29.
Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?
A: Michael initially had 58 balls. He lost 23 on Tuesday, so after that he has 58 - 23 = 35 balls. On Wednesday he lost 2 more so now he has 35 - 2 = 33 balls. The answer is 33.
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A: She bought 5 bagels for $3 each. This means she spent 5
Q: When I was 6 my sister was half my age. Now I’m 70 how old is my sister?
A:


Output 1:

When you were 6, your sister was 6/2 = 3 years old. Now you are 70 years old. That means your sister is 70 - 3 = 67 years old. The answer is 67.

Output 2:

When you were 6, your sister was 6 / 2 = 3 years old. Now that you are 70 years old, your sister must be 70 / 2 = 35 years old. The answer is 35.

Output 3:

When I was 6, my sister was 3 (half my age). 70 years have passed since then. That means my sister is now 3 + 70 = 73 years old. The answer is 73.
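Self-consistency samples several reasoning paths for the same few-shot CoT prompt and keeps the answer that the most paths agree on. The sketch below shows only the aggregation step, assuming each completion ends with "The answer is N."; the function names, the tie-handling rule, and the five-sample list are hypothetical.

```python
import re
from collections import Counter


def extract_answer(completion):
    """Return the number after 'answer is', or None if it is missing."""
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None


def self_consistent_answer(completions):
    """Majority vote over extracted answers; None when there is no clear winner."""
    answers = [a for a in map(extract_answer, completions) if a is not None]
    if not answers:
        return None
    ranked = Counter(answers).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: sample more paths before deciding
    return ranked[0][0]


# The three sampled outputs shown above end in three different answers.
shown = ["The answer is 67.", "The answer is 35.", "The answer is 73."]
# Hypothetical larger sample in which one answer dominates.
more = shown + ["The answer is 67.", "The answer is 67."]

print(self_consistent_answer(shown))
print(self_consistent_answer(more))
```

The three outputs above each give a different answer, so a vote over them is a tie. In practice more paths are sampled (with a non-zero temperature) until one answer clearly dominates.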

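2.5 Automatic Prompt Engineer (APE)

APE is a method that searches over a pool of model-generated candidate instructions, filters the candidate set according to a chosen score function, and finally selects the candidate instruction with the highest score. The process can be summarized in three stages: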

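  1. Prompt the LLM to generate candidate instructions from a small set of demonstrations given as input-output pairs, e.g. {{Given desired input-output pairs}}\n\nThe instruction is

  2. Score each candidate instruction with the chosen score function and keep the highest-scoring candidates.

  3. Use an iterative Monte Carlo search to improve the best candidates by proposing semantically similar variants through prompts such as: Generate a variation of the following instruction while keeping the semantic meaning.\n\nInput: ...\n\nOutput:...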
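The scoring-and-selection part of APE can be sketched as follows. Exact-match accuracy on a small labeled set is used here as one possible score function, and `toy_llm` is a hard-coded stand-in for a real model call so that the example runs offline; every name in the block is an assumption for illustration.

```python
def toy_llm(prompt):
    """Stand-in for a model call: obeys two keyword 'instructions'."""
    instruction, rest = prompt.split("\n\nInput: ", 1)
    word = rest.rsplit("\n\nOutput:", 1)[0]
    if "uppercase" in instruction.lower():
        return word.upper()
    if "reverse" in instruction.lower():
        return word[::-1]
    return word


def execution_accuracy(instruction, dataset, llm):
    """Fraction of (input, output) pairs the instruction reproduces exactly."""
    hits = sum(
        llm(f"{instruction}\n\nInput: {x}\n\nOutput:").strip() == y
        for x, y in dataset
    )
    return hits / len(dataset)


def select_best(candidates, dataset, llm):
    """Return (score, instruction) for the highest-scoring candidate."""
    scored = [(execution_accuracy(c, dataset, llm), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])


dataset = [("cat", "CAT"), ("dog", "DOG")]
candidates = [
    "Reverse the word.",
    "Write the word in uppercase.",
    "Repeat the word.",
]
best_score, best_instruction = select_best(candidates, dataset, toy_llm)
print(best_score, best_instruction)
```

In a real setup the candidates would come from the generation prompt in the first stage, and the winner would be fed to the variation prompt in the third stage before being scored again.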

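To construct CoT prompts automatically, Shum et al. (2023) propose augment-prune-select, a three-step process:

  1. Augment: generate multiple pseudo chains of thought for a given question using few-shot or zero-shot CoT prompting.

  2. Prune: prune the pseudo chains according to whether the generated answer matches the ground truth.

  3. Select: apply a variance-reduced policy gradient strategy to learn a probability distribution over the selected examples, treating that distribution as the policy and the validation-set accuracy as the reward.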
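The prune step is the simplest of the three to illustrate: keep only the pseudo chains whose final answer equals the ground truth. The sketch assumes chains end with "The answer is N."; the function names and the three chains are hypothetical, and the policy-gradient select step is not covered here.

```python
import re


def final_answer(chain):
    """Return the number after 'answer is', or None if it is missing."""
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", chain)
    return match.group(1) if match else None


def prune(pseudo_chains, ground_truth):
    """Keep chains whose extracted answer matches the labeled answer."""
    return [c for c in pseudo_chains if final_answer(c) == ground_truth]


# Hypothetical pseudo chains for a question whose labeled answer is 5.
pseudo_chains = [
    "100 / 4 = 25 cm per part. 25 / 5 = 5 cm per piece. The answer is 5.",
    "100 / 5 = 20 cm per part. 20 / 4 = 5 cm per piece. The answer is 5.",
    "100 / 4 = 25 cm per part. 25 - 5 = 20 cm per piece. The answer is 20.",
]
kept = prune(pseudo_chains, "5")
print(len(kept))
```

Matching on the final answer alone keeps the second chain even though it divides in the wrong order, so pruning filters wrong answers but does not verify the reasoning itself.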

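Zhang et al. (2023) instead use clustering to sample questions and then generate chains. They observed that LLMs tend to make certain types of mistakes. Errors of one type can be similar in the embedding space and are therefore grouped together. By sampling only one or a few examples from frequent-error clusters, we can avoid too many wrong demonstrations of a single error type and collect a diverse set of examples.

  1. Question clustering: embed the questions and cluster them with k-means.

  2. Demonstration selection: select a set of representative questions, one from each cluster. The samples in each cluster are sorted by distance to the cluster centroid, and those closest to the centroid are selected first.

  3. Rationale generation: use zero-shot CoT to generate reasoning chains for the selected questions, and build a few-shot prompt to run inference.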
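The first two steps can be sketched with a small pure-Python k-means. The two-dimensional vectors below are hand-made stand-ins for real sentence embeddings, the centroids are initialized from the first k points to keep the run deterministic, and all names and questions are illustrative assumptions.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def select_demos(questions, embeddings, k):
    """Pick, for each cluster, the question closest to the centroid."""
    centroids, assign = kmeans(embeddings, k)
    demos = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            best = min(members, key=lambda i: math.dist(embeddings[i], centroids[c]))
            demos.append(questions[best])
    return demos


questions = ["ribbon length?", "train speed?", "rope length?",
             "car speed?", "string length?", "bike speed?"]
embeddings = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]

selected = select_demos(questions, embeddings, k=2)
# Step 3 would send each of these prompts to the model to obtain a rationale.
prompts = [f"Q: {q}\nA: Let's think step by step." for q in selected]
print(selected)
```

With real data, the embeddings would come from a sentence encoder and a library k-means implementation would replace the hand-written loop.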


3.1 Useful Tools

  • OpenAI Cookbook has many in-depth examples for how to utilize LLM efficiently.

  • LangChain, a library for combining language models with other components to build applications.

  • Prompt Engineering Guide repo contains a pretty comprehensive collection of education materials on prompt engineering.

  • learnprompting.org

  • PromptPerfect

  • Semantic Kernel

3.2 Datasets

  • Anthropic's Red Team dataset (paper)

  • Awesome ChatGPT Prompts

  • DiffusionDB

  • Midjourney Prompts

  • P3 - Public Pool of Prompts

  • PartiPrompts

  • Real Toxicity Prompts

  • Stable Diffusion Dataset

  • WritingPrompts

3.3 Related Papers


  1. Prompt Engineering, https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/#chain-of-thought-cot

  2. Prompt Engineering Guide, https://www.promptingguide.ai/zh

  3. https://platform.openai.com/playground

  4. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, arXiv:2201.11903

  5. Large Language Models Are Human-Level Prompt Engineers, arXiv:2211.01910

  6. Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data



