TY - GEN
T1 - Referral Augmentation for Zero-Shot Information Retrieval
AU - Tang, Michael
AU - Yao, Shunyu
AU - Yang, John
AU - Narasimhan, Karthik
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals: text from other documents that cite or link to the given document. We find that RAR provides significant performance gains for tasks across paper retrieval, entity retrieval, and open-domain question-answering in both zero-shot and in-domain (e.g., fine-tuned) settings. We examine how RAR provides especially strong improvements on more structured tasks, and can greatly outperform generative text expansion techniques such as DocT5Query (Nogueira et al., 2019) and Query2Doc (Wang et al., 2023), with a 37% and 21% absolute improvement on ACL paper retrieval, respectively. We also compare three ways to aggregate referrals for RAR. Overall, we believe RAR can help revive and re-contextualize the classic information retrieval idea of using anchor texts to improve the representations of documents in a wide variety of corpuses in the age of neural retrieval.
UR - https://www.scopus.com/pages/publications/85205315275
UR - https://www.scopus.com/pages/publications/85205315275#tab=citedBy
U2 - 10.18653/v1/2024.findings-acl.798
DO - 10.18653/v1/2024.findings-acl.798
M3 - Conference contribution
AN - SCOPUS:85205315275
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 13452
EP - 13461
BT - The 62nd Annual Meeting of the Association for Computational Linguistics
A2 - Ku, Lun-Wei
A2 - Martins, Andre
A2 - Srikumar, Vivek
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Y2 - 11 August 2024 through 16 August 2024
ER -