ArXiv: https://arxiv.org/pdf/2211.10435
Project page: https://reasonwithpal.com/
This bridges an important gap in chain-of-thought-like methods, where reasoning chains can be correct but produce an incorrect answer.
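This paper proposes the Program-Aided Language Model (PAL).
Compared with chain-of-thought (CoT), each exemplar contains a reasoning path that interleaves natural language with Python code. The exemplar ends with the complete program only and does not state the final answer. Guided by this prompt, the LLM reasons about the target test sample and generates code, and the final answer is obtained by running that code in a Python interpreter.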
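The last step, handing the generated program to a Python interpreter, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the helper name run_generated_program is hypothetical, the generated text is hard-coded instead of coming from a model, and the sketch assumes the program defines a function called solution().

```python
def run_generated_program(program_text: str):
    """Execute model-generated code and return the value of its solution() function.

    Assumption: the generated program defines solution(). In practice the code is
    untrusted and should run in a sandbox; plain exec() is used here only for brevity.
    """
    namespace = {}
    exec(program_text, namespace)
    return namespace["solution"]()


# Stand-in for what the model might return for the bagel question.
generated = '''
def solution():
    """Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"""
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    money_left = money_initial - money_spent
    return money_left
'''

answer = run_generated_program(generated)
print(answer)  # 8
```

The model only has to get the reasoning steps right; the arithmetic itself is delegated to the interpreter.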
Figure: a reasoning path that contains both natural language and Python code.
Figure: comparison of PAL and CoT.
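Constructing exemplars
For each evaluation dataset, if prior work already provides exemplars, they are used directly; otherwise, 3 to 6 annotated examples are randomly sampled as exemplars.
Names used in the code of a reasoning path should also match the entities they stand for in the question, and are written with words separated by underscores.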
For example, a variable that describes the number of apples in the basket should have a name such as num_apples_in_basket. This keeps the generated code linked to the entities in the question.
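As a small illustration of this naming convention, the sketch below solves a made-up question (the question text and numbers are hypothetical, not taken from any dataset) with variable names that mirror the entities in the question.

```python
def solution():
    """Hypothetical question: a basket holds 12 apples and 5 of them are eaten. How many apples are left?"""
    # Each name repeats the wording of the question, with underscores between words.
    num_apples_in_basket = 12
    num_apples_eaten = 5
    num_apples_left = num_apples_in_basket - num_apples_eaten
    return num_apples_left


print(solution())  # 7
```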
Datasets:
GSM8K-Hard:
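The authors built a new dataset by heuristically replacing numbers in the questions with larger ones. On this dataset they found that in 50% of cases the LLM produced a correct line of reasoning but still predicted a wrong final answer because of arithmetic errors on the larger numbers.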
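One way such a heuristic could look is sketched below. It is an assumption-laden illustration, not the authors' script: the function name enlarge_one_number, the choice to replace exactly one number, and the 7-digit default are all hypothetical choices made for this example.

```python
import random
import re
from typing import Tuple


def enlarge_one_number(question: str, rng: random.Random, max_digits: int = 7) -> Tuple[str, int, int]:
    """Replace one integer in the question with a random max_digits-digit integer.

    Returns the new question, the original value, and the replacement value.
    """
    matches = list(re.finditer(r"\d+", question))
    if not matches:
        raise ValueError("question contains no integer to replace")
    target = rng.choice(matches)
    new_value = rng.randint(10 ** (max_digits - 1), 10 ** max_digits - 1)
    new_question = question[:target.start()] + str(new_value) + question[target.end():]
    return new_question, int(target.group()), new_value


question = "Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"
hard_question, old_value, new_value = enlarge_one_number(question, random.Random(0))
print(hard_question)
```

After such a substitution the reference answer no longer applies and has to be recomputed, for example by rerunning a known-correct program with the new number.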
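Figure: example prompt for symbolic reasoning.
Figure: experimental results on mathematical reasoning.
Prompts that combine a programming language with natural language were designed separately for three task types: mathematical reasoning, symbolic reasoning, and algorithmic tasks.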
MATH_CHAT_BETA_PROMPT = '''
Let's use python to solve math problems. Here are three examples how to do it,
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
def solution():
    """Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"""
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    money_left = money_initial - money_spent
    result = money_left
    return result

Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?
def solution():
    """Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?"""
    golf_balls_initial = 58
    golf_balls_lost_tuesday = 23
    golf_balls_lost_wednesday = 2
    golf_balls_left = golf_balls_initial - golf_balls_lost_tuesday - golf_balls_lost_wednesday
    result = golf_balls_left
    return result

Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?
def solution():
    """There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?"""
    computers_initial = 9
    computers_per_day = 5
    num_days = 4  # 4 days between monday and thursday
    computers_added = computers_per_day * num_days
    computers_total = computers_initial + computers_added
    result = computers_total
    return result

How about this question?
Q: {question}
'''.strip()
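The {question} placeholder at the end of the template is meant to be filled with the test question before the prompt is sent to the model. A minimal sketch of that step is shown below; PROMPT_TEMPLATE is a shortened stand-in for the full prompt above, and build_prompt and the sample question are hypothetical.

```python
# Shortened stand-in for the full few-shot prompt (one exemplar only).
PROMPT_TEMPLATE = '''
Let's use python to solve math problems. Here is an example how to do it,
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
def solution():
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_left = money_initial - bagels * bagel_cost
    return money_left

How about this question?
Q: {question}
'''.strip()


def build_prompt(question: str) -> str:
    """Fill the {question} slot of the template with the test question."""
    return PROMPT_TEMPLATE.format(question=question)


prompt = build_prompt("Tom has 3 pens and buys 4 more. How many pens does he have?")
print(prompt.splitlines()[-1])
```

str.format works here because the template contains no other curly braces; if an exemplar's code used dict or set literals, those braces would need to be doubled, or the slot could be filled with str.replace instead.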