TY - JOUR
T1 - Metacognitive Capabilities of LLMs
T2 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024
AU - Didolkar, Aniket
AU - Goyal, Anirudh
AU - Ke, Nan Rosemary
AU - Guo, Siyuan
AU - Valko, Michal
AU - Lillicrap, Timothy
AU - Rezende, Danilo
AU - Bengio, Yoshua
AU - Mozer, Michael
AU - Arora, Sanjeev
N1 - Publisher Copyright:
© 2024 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2024
Y1 - 2024
AB - Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including the ability to name the skills and procedures to apply to a given task. We explore this primarily in the context of math reasoning, developing a prompt-guided interaction procedure to get a powerful LLM to assign sensible skill labels to math questions, followed by having it perform semantic clustering to obtain coarser families of skill labels. These coarse skill labels look interpretable to humans. To validate that these skill labels are meaningful and relevant to the LLM's reasoning processes, we perform the following experiments. (a) We ask GPT-4 to assign skill labels to training questions in the math datasets GSM8K and MATH. (b) When using an LLM to solve the test questions, we present it with the full list of skill labels and ask it to identify the skill needed. Then it is presented with randomly selected exemplar solved questions associated with that skill label. This improves accuracy on GSM8K and MATH for several strong LLMs, including code-assisted models. The methodology presented is domain-agnostic, even though this article applies it to math problems.
UR - http://www.scopus.com/inward/record.url?scp=105000468190&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105000468190&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:105000468190
SN - 1049-5258
VL - 37
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 9 December 2024 through 15 December 2024
ER -