TY - JOUR
T1 - A new robust relevance model in the language model framework
AU - Li, Xiaoyan
N1 - Funding Information:
This work was supported in part by the Center for Intelligent Information Retrieval at the University of Massachusetts at Amherst, by a Mount Holyoke College Start-up Research Grant, and by DARPA under contract number HR0011-06-C-0023. Any opinions, findings and conclusions or recommendations expressed in this material are the author’s and do not necessarily reflect those of the sponsors. Earlier versions of this work first appeared in a technical report (Li, 2005) and were then presented at the Fourth IASTED International Conference on Communications, Internet and Information Technology (Li, 2006).
PY - 2008/5
Y1 - 2008/5
N2 - In this paper, a new robust relevance model is proposed that can be applied to both pseudo and true relevance feedback in the language-modeling framework for document retrieval. The new relevance model differs from other relevance models in at least three ways. First, the proposed model brings the original query back into the relevance model by treating it as a short, special document, in addition to a number of top-ranked documents returned from the first-round retrieval for pseudo feedback, or a number of relevant documents for true relevance feedback. Second, instead of using a uniform prior as in the original relevance model proposed by Lavrenko and Croft, documents are assigned different priors according to their lengths (in terms) and their ranks in the first-round retrieval. Third, the probability of a term in the relevance model is further adjusted by its probability in a background language model. In both the pseudo and true relevance cases, we compare the performance of our model to that of two baselines: the original relevance model and a linear combination model. Our experimental results show that the proposed model outperforms both baselines in terms of mean average precision.
AB - In this paper, a new robust relevance model is proposed that can be applied to both pseudo and true relevance feedback in the language-modeling framework for document retrieval. The new relevance model differs from other relevance models in at least three ways. First, the proposed model brings the original query back into the relevance model by treating it as a short, special document, in addition to a number of top-ranked documents returned from the first-round retrieval for pseudo feedback, or a number of relevant documents for true relevance feedback. Second, instead of using a uniform prior as in the original relevance model proposed by Lavrenko and Croft, documents are assigned different priors according to their lengths (in terms) and their ranks in the first-round retrieval. Third, the probability of a term in the relevance model is further adjusted by its probability in a background language model. In both the pseudo and true relevance cases, we compare the performance of our model to that of two baselines: the original relevance model and a linear combination model. Our experimental results show that the proposed model outperforms both baselines in terms of mean average precision.
KW - Feedback
KW - Language modeling
KW - Query expansion
KW - Relevance models
UR - http://www.scopus.com/inward/record.url?scp=40649118468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40649118468&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2007.07.005
DO - 10.1016/j.ipm.2007.07.005
M3 - Article
AN - SCOPUS:40649118468
SN - 0306-4573
VL - 44
SP - 991
EP - 1007
JO - Information Processing and Management
JF - Information Processing and Management
IS - 3
ER -