TY - GEN
T1 - Factual Probing Is [MASK]: Learning vs. Learning to Recall
T2 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021
AU - Zhong, Zexuan
AU - Friedman, Dan
AU - Chen, Danqi
N1 - Funding Information:
We thank Zhengbao Jiang for answering questions about LPAQA. We thank the members of the Princeton NLP group and the anonymous reviewers for their valuable comments and feedback. This work is supported in part by a Graduate Fellowship at Princeton University.
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
AB - Petroni et al. (2019) demonstrated that it is possible to retrieve world facts from a pre-trained language model by expressing them as cloze-style prompts, and to interpret the model’s prediction accuracy as a lower bound on the amount of factual information it encodes. Subsequent work has attempted to tighten this estimate by searching for better prompts, using a disjoint set of facts as training data. In this work, we make two complementary contributions to better understand these factual probing techniques. First, we propose OPTIPROMPT, a novel and efficient method which directly optimizes prompts in continuous embedding space. We find that this simple method is able to predict an additional 6.4% of facts in the LAMA benchmark. Second, we raise a more important question: Can we really interpret these probing results as a lower bound? Is it possible that these prompt-search methods learn from the training data too? We find, somewhat surprisingly, that the training data used by these methods contains certain regularities of the underlying fact distribution, and that all existing prompt methods, including ours, are able to exploit them for better fact prediction. We conduct a set of control experiments to disentangle “learning” from “learning to recall”, providing a more detailed picture of what different prompts can reveal about pre-trained language models.
UR - http://www.scopus.com/inward/record.url?scp=85136834500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136834500&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85136834500
T3 - NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 5017
EP - 5033
BT - NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
Y2 - 6 June 2021 through 11 June 2021
ER -
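
Note: the abstract describes OPTIPROMPT only at a high level (“directly optimizes prompts in continuous embedding space”). The following is a minimal sketch of that idea, not the authors’ implementation. It assumes PyTorch and HuggingFace transformers, a frozen bert-base-cased masked LM, single-token object labels, and a toy fact list; the prompt length, learning rate, and the helper batch_inputs are illustrative choices, not taken from the paper.

# Sketch of continuous-prompt optimization for cloze-style factual probing.
# Assumptions (not from the record): PyTorch + HuggingFace transformers,
# frozen bert-base-cased, single-token objects, hypothetical toy facts.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
model.eval()
for p in model.parameters():          # freeze the LM; only prompt vectors train
    p.requires_grad = False

n_prompt = 5                          # number of trainable prompt vectors (illustrative)
hidden = model.config.hidden_size
prompt_embeds = torch.nn.Parameter(torch.randn(n_prompt, hidden) * 0.02)
optimizer = torch.optim.Adam([prompt_embeds], lr=3e-3)

word_embeds = model.get_input_embeddings()   # frozen token-embedding table

def batch_inputs(subject: str):
    """Build [CLS] subject [V1..Vm] [MASK] [SEP] as input embeddings."""
    subj_ids = tokenizer(subject, add_special_tokens=False,
                         return_tensors="pt").input_ids[0]
    cls_id = torch.tensor([tokenizer.cls_token_id])
    sep_id = torch.tensor([tokenizer.sep_token_id])
    mask_id = torch.tensor([tokenizer.mask_token_id])
    fixed = word_embeds(torch.cat([cls_id, subj_ids]))   # [CLS] + subject tokens
    tail = word_embeds(torch.cat([mask_id, sep_id]))     # [MASK] + [SEP]
    inputs = torch.cat([fixed, prompt_embeds, tail]).unsqueeze(0)
    mask_pos = fixed.size(0) + n_prompt                  # index of the [MASK] slot
    return inputs, mask_pos

# Toy training facts for one relation (e.g., capital-of); hypothetical data.
facts = [("France", "Paris"), ("Japan", "Tokyo"), ("Italy", "Rome")]

for step in range(100):
    total = 0.0
    optimizer.zero_grad()
    for subj, obj in facts:
        inputs, mask_pos = batch_inputs(subj)
        logits = model(inputs_embeds=inputs).logits[0, mask_pos]
        gold = tokenizer(obj, add_special_tokens=False).input_ids[0]  # single-token object
        loss = torch.nn.functional.cross_entropy(logits.unsqueeze(0),
                                                 torch.tensor([gold]))
        loss.backward()                # gradients flow only into prompt_embeds
        total += loss.item()
    optimizer.step()
    if step % 20 == 0:
        print(f"step {step}: mean loss {total / len(facts):.3f}")

Keeping the LM frozen and training only the inserted embedding vectors is what allows prediction accuracy to be read as a probe of the pre-trained model rather than of a fine-tuned classifier; whether the learned vectors nonetheless absorb regularities of the training-fact distribution is exactly the question the paper’s control experiments address.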