Least-to-Most Prompting

Least-to-Most prompting (LtM) takes chain-of-thought prompting (CoT prompting) a step further: it first breaks a problem down into subproblems and then solves them one by one. The technique is inspired by real-world teaching strategies used with children.

As with chain-of-thought prompting, the problem to be solved is decomposed into a set of subproblems that build on one another. In a second step, these subproblems are solved one at a time. Unlike chain of thought, however, the solutions to previous subproblems are fed back into the prompt that attempts to solve the next one.
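
The loop below is a minimal sketch of that idea in Python. It assumes a hypothetical `call_llm` helper (any completion API would do) and a naive line-based parsing of the decomposition; it is meant to show the control flow, not a production implementation.

```python
# A minimal sketch of an LtM prompt chain. `call_llm` is a hypothetical
# stand-in for whatever completion API you use, and the line-based parsing
# of the decomposition is deliberately naive.

def call_llm(prompt: str) -> str:
    """Hypothetical helper: send `prompt` to an LLM and return its text output."""
    raise NotImplementedError

def least_to_most(problem: str) -> str:
    # Step 1: ask the model to decompose the problem into subproblems.
    decomposition = call_llm(
        f"{problem}\n\n"
        "What subproblems must be solved before answering this question? "
        "List one per line."
    )
    subproblems = [line.strip() for line in decomposition.splitlines() if line.strip()]

    # Step 2: solve the subproblems one at a time, feeding each answer
    # back into the prompt for the next subproblem.
    context = problem
    answer = ""
    for sub in subproblems:
        answer = call_llm(f"{context}\n\n{sub}\nLet's go step by step.")
        context += f"\n\n{sub}\n{answer}"
    return answer
```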

Diagram of the LtM process

Example: responding to a customer inquiry

Let's ask a slightly more complicated customer service question:

Customer inquiry:

I bought a T-shirt from your Arnold collection on March 1st. I saw that it was on discount, so I bought a shirt that was originally $30 at 40% off. I see that you now have a new T-shirt discount of 50% off. I'm wondering if I can return the shirt and have enough store credit to buy two of your T-shirts?

Instructions:

You are a customer service representative tasked with kindly responding to customer inquiries. Returns are allowed within 30 days. Today's date is March 29th. There is currently a 50% discount on all shirts. Shirt prices in your store range from $18 to $100. Do not make up any information about discount policies.

Dear customer,

Thank you for purchasing a T-shirt from our Arnold collection. We are glad to be of service.

Unfortunately, we cannot accept your return, because you are past our 30-day return policy. However, we are currently running a special promotion with 50% off all T-shirts, and you can use your store credit to purchase two T-shirts.


This response is incorrect (the customer is still within the return window), so let's try breaking the problem down into subproblems:

Customer inquiry:

I bought a T-shirt from your Arnold collection on March 1st. I saw that it was on discount, so I bought a shirt that was originally $30 at 40% off. I see that you now have a new T-shirt discount of 50% off. I'm wondering if I can return the shirt and have enough store credit to buy two of your T-shirts?

Instructions:

You are a customer service representative tasked with kindly responding to customer inquiries. Returns are allowed within 30 days. Today's date is March 29th. There is currently a 50% discount on all shirts. Shirt prices in your store range from $18 to $100. Do not make up any information about discount policies.

What subproblems must be solved before this inquiry can be answered?

  1. The date on which the customer purchased the shirt;
  2. The current discount promotion;
  3. The price range of shirts in the store;
  4. Whether the customer has enough store credit;
  5. The store's return policy.


Let's try to solve the first subproblem:

Customer inquiry:

I bought a T-shirt from your Arnold collection on March 1st. I saw that it was on discount, so I bought a shirt that was originally $30 at 40% off. I see that you now have a new T-shirt discount of 50% off. I'm wondering if I can return the shirt and have enough store credit to buy two of your T-shirts?

Instructions:

You are a customer service representative tasked with kindly responding to customer inquiries. Returns are allowed within 30 days. Today's date is March 29th. There is currently a 50% discount on all shirts. Shirt prices in your store range from $18 to $100. Do not make up any information about discount policies.

Determine whether the customer is within the 30-day return window. Let's go step by step.

Dear customer, thank you very much for your support of the Arnold collection! We are happy to help. Our return policy allows returns within 30 days of purchase. Since you made your purchase on March 1st, you are still within the return window and may return the shirt.

In addition, we are currently running a 50% off promotion, and you can use your store credit to purchase two Arnold collection T-shirts. Our shirts range from $18 to $100, so you can pick the ones that suit your needs.

If you have any other questions, please feel free to contact us. We are happy to help!

By solving just the first subproblem we were able to solve the entire problem. If GPT-3 had not returned an answer right away, we could have solved the next subproblem, and so on, until it did. Note the prompt phrase Let's go step by step. It is not strictly required, but it works well for this example.
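
As a quick sanity check on the date reasoning (the year is not given in the prompt; 2023 below is an arbitrary choice):

```python
from datetime import date

# Purchase on March 1, inquiry on March 29; the year is arbitrary for this check.
days_elapsed = (date(2023, 3, 29) - date(2023, 3, 1)).days
print(days_elapsed)        # 28
print(days_elapsed <= 30)  # True: the customer is still inside the 30-day window
```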

Example: letter concatenation

LtM was originally introduced using few-shot prompting rather than an explicit instruction to break the problem into multiple steps (as shown above). In addition, it can sometimes be implemented with a single prompt rather than a chain of prompts. Let's look at the problem of concatenating the last letter of each input word: for example, given the input words think, machine, the output should be ke.
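
For reference, the target behavior of the task itself fits in a couple of lines of Python (this is the ground truth we want the model to reproduce, not a prompting technique):

```python
def last_letter_concatenation(words):
    # Ground truth for the task: take the last letter of each word and join them.
    return "".join(word.strip()[-1] for word in words)

assert last_letter_concatenation(["think", "machine"]) == "ke"
assert last_letter_concatenation(["think", "machine", "learning"]) == "keg"
```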

First attempt: standard prompting

Standard prompting with few-shot examples performs very poorly on this task, even with a more advanced model such as text-davinci-003.

Q: think, machine

A: ke

Q: learning, reasoning, generalization

A: ggn

Q: artificial, intelligence

A: le

Q: transformer, language, vision

A: ren

Q: foo,bar,baz,blip

A:

lip

Second attempt: chain of thought

Chain of thought performs much better than standard prompting. This is because the model is now allowed to first extract the last letter of each word on its own, reducing the task to the simpler act of grouping the letters it has already collected. However, this approach can still start to break down on longer inputs.

Q: think, machine

A: The last letter of "think" is "k". The last letter of "machine" is "e". So "think, machine" is "ke".

Q: learning, reasoning, generalization

A: The last letter of "learning" is "g". The last letter of "reasoning" is "n". The last letter of "generalization" is "n". So "learning, reasoning, generalization" is "ggn".

Q: artificial, intelligence

A: The last letter of "artificial" is "l". The last letter of "intelligence" is "e". So "artificial, intelligence" is "le".

Q: transformer, language, vision

A: The last letter of "transformer" is "r". The last letter of "language" is "e". The last letter of "vision" is "n". So "transformer, language, vision" is "ren".

Q: foo,bar,baz,blip

A:

The last letter of "foo" is "o". The last letter of "bar" is "r". The last letter of "baz" is "z". The last letter of "blip" is "p". So "foo,bar,baz,blip" is "orzp".

Third attempt: LtM (single prompt)

With LtM, we augment the chain-of-thought idea by restating the result of the previous concatenation at every step. This keeps each step simple: only one new letter has to be concatenated at a time. The approach performs very well, producing correct results for 12 or even more words.

This approach may look very similar to chain of thought, but it is conceptually quite different. Here, every step reintroduces the result of the previous concatenation. In the case of "think, machine, learning", instead of concatenating the individual letters "k", "e", "g", the model concatenates "k" and "e", and then "ke" and "g". Because the previous result is carried forward, the model can now generalize to much longer chains: it takes the incremental result along at each step while only having to do a small amount of work within each step.
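
The incremental bookkeeping that LtM asks the model to spell out corresponds to this simple loop, shown here only to make the intuition concrete:

```python
def concat_like_ltm(words):
    # Carry the running result forward and add exactly one new letter per step,
    # mirroring the reasoning the LtM exemplars spell out.
    result = words[0].strip()[-1]
    for word in words[1:]:
        result = result + word.strip()[-1]   # e.g. "ke" + "g" -> "keg"
    return result

assert concat_like_ltm(["think", "machine", "learning"]) == "keg"
```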

Q: think, machine

A: The last letter of "think" is "k". The last letter of "machine" is "e". Concatenating "k" and "e" gives "ke". So "think, machine" output "ke".

Q: think, machine, learning

A: "think, machine" outputs "ke". The last letter of "learning" is "g". Concatenating "ke" and "g" gives "keg". So "think, machine, learning" is "keg".

Q: transformer, language

A: The last letter of "transformer" is "r". The last letter of "language" is "e". Concatenating "r" and "e" gives "re". So "transformer, language" is "re".

Q: transformer, language, vision

A: "transformer, language" outputs "re". The last letter of "vision" is "n". Concatenating "re" and "n" gives "ren". So "transformer, language, vision" is "ren".

Q: foo,bar,baz,blip,learn,prompting,world,shaking,event,dancefloor,prisma,giraffe

A:

The last letter of "foo" is "o". The last letter of "bar" is "r". Concatenating "o" and "r" gives "or". The last letter of "baz" is "z". Concatenating "or" and "z" gives "orz". The last letter of "blip" is "p". Concatenating "orz" and "p" gives "orzp". The last letter of "learn" is "n". Concatenating "orzp" and "n" gives "orzpn". The last letter of "prompting" is "g". Concatenating "orzpn" and "g" gives "orzpng". The last letter of "world" is "d". Concatenating "orzpng" and "d" gives "orzpngd". The last letter of "shaking" is "g". Concatenating "orzpngd" and "g" gives "orzpngdg". The last letter of "event" is "t". Concatenating "orzpngdg" and "t" gives "orzpngdgt".

Results

On the letter-concatenation problem with 12 words, chain of thought reaches 34% accuracy, while LtM reaches 74% (the paper uses text-davinci-002 as the model).

Example: compositional generalization (SCAN)

The SCAN benchmark requires the model to translate natural language into sequences of actions. For example, the sentence "run left and walk twice" is translated to "TURN_LEFT + RUN + WALK * 2". Language models perform especially poorly when confronted with sequences that are longer than those in their training set.
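
To make the target action-sequence format concrete, here is a toy interpreter for a small fragment of the command language (an illustrative sketch only, not the benchmark's official grammar; constructs such as "around" and "opposite" are deliberately left out):

```python
# Toy interpreter for a fragment of SCAN: primitives, left/right,
# twice/thrice, and the "and"/"after" connectives.
PRIMITIVES = {"walk": "WALK", "run": "RUN", "jump": "JUMP", "look": "LOOK"}
REPEATS = {"twice": 2, "thrice": 3}

def expand(command: str) -> str:
    if " and " in command:                       # "X and Y" keeps the order
        left, right = command.split(" and ", 1)
        return f"{expand(left)} + {expand(right)}"
    if " after " in command:                     # "X after Y" swaps the order
        left, right = command.split(" after ", 1)
        return f"{expand(right)} + {expand(left)}"

    tokens = command.split()
    if tokens[-1] in REPEATS:                    # trailing twice/thrice repeats the phrase
        inner = expand(" ".join(tokens[:-1]))
        times = REPEATS[tokens[-1]]
        return f"({inner}) * {times}" if "+" in inner else f"{inner} * {times}"
    if len(tokens) == 2 and tokens[1] in ("left", "right") and tokens[0] in PRIMITIVES:
        return f"TURN_{tokens[1].upper()} + {PRIMITIVES[tokens[0]]}"   # turn first, then act
    if len(tokens) == 1 and tokens[0] in PRIMITIVES:
        return PRIMITIVES[tokens[0]]
    raise ValueError(f"construct not handled in this sketch: {command!r}")

assert expand("run left and walk twice") == "TURN_LEFT + RUN + WALK * 2"
```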

First attempt: standard prompting

Using simple standard prompting, text-davinci-003 gets impressively far, but it still fails.

Q: turn left

A: TURN LEFT

Q: turn right

A: TURN RIGHT

Q: jump left

A: TURN LEFT + JUMP

Q: run right

A: TURN RIGHT + RUN

Q: look twice

A: LOOK * 2

Q: run and look twice

A: RUN + LOOK * 2

Q: jump right thrice

A: (TURN RIGHT + JUMP) * 3

Q: walk after run

A: RUN + WALK

Q: turn opposite left

A: TURN LEFT * 2

Q: turn around left

A: TURN LEFT * 4

Q: turn opposite right

A: TURN RIGHT * 2

Q: turn around right

A: TURN RIGHT * 4

Q: walk opposite left

A: TURN LEFT * 2 + WALK

Q: walk around left

A: (TURN LEFT + WALK) * 4

Q: "jump around left twice after walk opposite left thrice"

A:

(TURN LEFT * 2 + WALK) * 3 + (TURN LEFT + JUMP) * 2

Second attempt: LtM, first step - Reduction

Here we use two different prompts. The first prompt reduces the input problem into a sequence of simpler steps. The second prompt maps this reduced sequence of steps to the actual actions.

Both prompts are fairly long, so a compressed, Python-like notation is used for the actions in order to save tokens.

The first step breaks the natural-language description down into a more explicit, yet still human-like, language. This helps the mapping step resolve the expression in order. For example, "jump around left twice" is reduced to the sequence "jump left" → "jump around left" → "jump around left twice". Likewise, the reduction step is used to spell out the concepts of repetition (twice, thrice, and so on).
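
A minimal sketch of this two-prompt chain is shown below, assuming a hypothetical `call_llm` helper; `REDUCTION_EXAMPLES` and `MAPPING_EXAMPLES` are placeholders standing in for the long few-shot exemplar lists given below and in the next step.

```python
REDUCTION_EXAMPLES = "..."  # placeholder for the reduction exemplars shown below
MAPPING_EXAMPLES = "..."    # placeholder for the mapping exemplars shown in the next step

def call_llm(prompt: str) -> str:
    """Hypothetical helper: send `prompt` to an LLM and return its text output."""
    raise NotImplementedError

def solve_scan(command: str) -> str:
    # Prompt 1 (reduction): break the command into simpler sub-expressions.
    reduction = call_llm(f"{REDUCTION_EXAMPLES}\n\nQ: {command}\nA:")
    # Prompt 2 (mapping): inject the reduction, then ask for the action sequence.
    return call_llm(f"{MAPPING_EXAMPLES}\n\n{reduction}\n\nQ: \"{command}\"\nA:")
```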

Q: look right after look twice

A: "look right after look twice" can be solved by: "look right", "look twice".

Q: jump opposite right thrice and walk

A: "jump opposite right thrice" can be solved by: "jump opposite right", "jump opposite right thrice". "walk" can be solved by: "walk". So, "jump opposite right thrice and walk" can be solved by: "jump opposite right", "jump opposite right thrice", "walk".

Q: run left twice and run right

A: "run left twice" can be solved by: "run left", "run left twice". "run right" can be solved by "run right". So, "run left twice and run right" can be solved by: "run left", "run left twice", "run right".

Q: run opposite right

A: "run opposite right" can be solved by "run opposite right".

Q: look opposite right thrice after walk

A: "look opposite right thrice" can be solved by: "look opposite right", "look opposite right thrice". "walk" can be solved by "walk". So, "look opposite right thrice after walk" can be solved by: "look opposite right", "look opposite right thrice", "walk".

Q: jump around right

A: "jump around right" can be solved by: "jump right", "jump around right". So, "jump around right" can be solved by: "jump right", "jump around right".

Q: look around right thrice and walk

A: "look around right thrice" can be solved by: "look right", "look around right", "look around right thrice". "walk" can be solved by "walk". So, "look around right thrice and walk" can be solved by: "look right", "look around right", "look around right thrice", "walk".

Q: turn right after run right thrice

A: "turn right" can be solved by: "turn right". "run right thrice" can be solved by: "run right", "run right thrice". So, "turn right after run right thrice" can be solved by: "turn right", "run right", "run right thrice".

Q: jump around left twice after walk opposite left thrice

A:

"jump around left twice" can be solved by: "jump left", "jump around left", "jump around left twice". "walk opposite left thrice" can be solved by: "walk opposite left", "walk opposite left thrice". So, "jump around left twice after walk opposite left thrice" can be solved by: "jump left", "jump around left", "jump around left twice", "walk opposite left", "walk opposite left thrice".

Second attempt: LtM, second step - Mapping

In the second step, we take the output of the reduction and again use a fairly long prompt (with 14 cases) to translate the reduced natural-language description into a sequence of actions.

Here we inject the output of the first step:

"jump around left twice" can be solved by: "jump left", "jump around left", "jump around left twice". "walk opposite left thrice" can be solved by: "walk opposite left", "walk opposite left thrice". So, "jump around left twice after walk opposite left thrice" can be solved by: "jump left", "jump around left", "jump around left twice", "walk opposite left", "walk opposite left thrice".

into the LLM.

Q: turn left

A: "turn left" outputs "TURN LEFT".

Q: turn right

A: "turn right" outputs "TURN RIGHT".

Q: jump left

A: The output of "jump left" concatenates: the output of "turn left", the output of "jump". "turn left" outputs "TURN LEFT". "jump" outputs "JUMP". So concatenating the output of "turn left" and the out- put of "jump" leads to "TURN LEFT" + "JUMP". So the output of "jump left" is "TURN LEFT" + "JUMP".

Q: run right

A: The output of "run right" concatenates: the output of "turn right", the output of "run". "turn right" outputs "TURN RIGHT". "run" outputs "RUN". So concatenating the output of "turn right" and the output of "run" leads to "TURN RIGHT" + "RUN". So the output of "run right" is "TURN RIGHT" + "RUN".

Q: look twice

A: The output of "look twice" concatenates: the output of "look", the output of "look". "look" outputs "LOOK". So repeating the output of "look" two times leads to "LOOK" * 2. So the output of "look twice" is "LOOK" * 2.

Q: run and look twice

A: The output of "run and look twice" concatenates: the output of "run", the output of "look twice". "run" outputs "RUN". "look twice" outputs "LOOK" * 2. So concatenating the output of "run" and the output of "look twice" leads to "RUN" + "LOOK" * 2. So the output of "run and look twice" is "RUN" + "LOOK" * 2.

Q: jump right thrice

A: The output of "jump right thrice" concatenates: the output of "jump right", the output of "jump right", the output of "jump right". "jump right" outputs "TURN RIGHT" + "JUMP". So repeating the output of "jump right" three times leads to ("TURN RIGHT" + "JUMP") * 3. So the output of "jump right thrice" is ("TURN RIGHT" + "JUMP") * 3.

Q: walk after run

A: The output of "walk after run" concatenates: the output of "run", the output of "walk". "run" outputs "RUN". "walk" outputs "WALK". So concatenating the output of "run" and the output of "walk" leads to "RUN" + "WALK". So the output of "walk after run" is "RUN" + "WALK".

Q: turn opposite left

A: The output of "turn opposite left" concatenates: the output of "turn left", the output of "turn left". "turn left" outputs "TURN LEFT". So repeating the output of "turn left" twice leads to "TURN LEFT" * 2. So the output of "turn opposite left" is "TURN LEFT" * 2.

Q: turn around left

A: The output of "turn around left" concatenates: the output of "turn left", the output of "turn left", the output of "turn left", the output of "turn left". "turn left" outputs "TURN LEFT". So repeating the output of "turn left" four times leads to "TURN LEFT" * 4. So the output of "turn around left" is "TURN LEFT" * 4.

Q: turn opposite right

A: The output of "turn opposite right" concatenates: the output of "turn right", the output of "turn right". "turn right" outputs "TURN RIGHT". So repeating the output of "turn right" twice leads to "TURN RIGHT" * 2. So the output of "turn opposite right" is "TURN RIGHT" * 2.

Q: turn around right

A: The output of "turn around right" concatenates: the output of "turn right", the output of "turn right", the output of "turn right", the output of "turn right". "turn right" outputs "TURN RIGHT". So repeating the output of "turn right" four times leads to "TURN RIGHT" * 4. So the output of "turn around right" is "TURN RIGHT" * 4.

Q: walk opposite left

A: The output of "walk opposite left" concatenates: the output of "turn opposite left", the output of "walk". "turn opposite left" outputs "TURN LEFT" * 2. "walk" outputs "WALK". So concatenating the output of "turn opposite left" and the output of "walk" leads to "TURN LEFT" * 2 + "WALK". So the output of "walk opposite left" is "TURN LEFT" * 2 + "WALK".

Q: walk around left

A: The output of "walk around left" concatenates: the output of "walk left", the output of "walk left", the output of "walk left", the output of "walk left". "walk left" outputs "TURN LEFT" + "WALK". So repeating the output of "walk around left" four times leads to ("TURN LEFT" + "WALK") * 4. So the output of "walk around left" is ("TURN LEFT" + "WALK") * 4.

Q: "jump around left twice after walk opposite left thrice"

A:

The output of "jump around left twice after walk opposite left thrice" concatenates: the output of "walk opposite left thrice", the output of "jump around left twice". "walk opposite left thrice" outputs "TURN LEFT" * 2 + "WALK" * 3. "jump around left twice" outputs ("TURN LEFT" + "JUMP") * 4. So concatenating the output of "walk opposite left thrice" and the output of "jump around left twice" leads to "TURN LEFT" * 2 + "WALK" * 3 + ("TURN LEFT" + "JUMP") * 4. So the output of "jump around left twice after walk opposite left thrice" is "TURN LEFT" * 2 + "WALK" * 3 + ("TURN LEFT" + "JUMP") * 4.

Conclusion

LtM brings several improvements:

  • higher accuracy than chain of thought
  • better generalization to problems harder than those in the prompt
  • dramatically better performance on compositional generalization, in particular on the SCAN benchmark

Standard prompting with text-davinci-002 (the model used in the paper) solves 6% of the SCAN problems, while LtM prompting achieves a remarkable 76% success rate. The results are even more striking with code-davinci-002, where LtM reaches a 99.7% success rate.

Related papers:

  • Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., Cui, C., Bousquet, O., Le, Q., & Chi, E. (2022). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
  • Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain of Thought Prompting Elicits Reasoning in Large Language Models.
  • Lake, B. M., & Baroni, M. (2018). Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks. https://doi.org/10.48550/arXiv.1711.00350
