TY - JOUR
T1 - An information-pattern-based approach to novelty detection
AU - Li, Xiaoyan
AU - Croft, W. Bruce
N1 - Funding Information:
This paper is based on the PhD thesis work of the first author (Li, 2006) at the University of Massachusetts, Amherst. The work was supported in part by the Center for Intelligent Information Retrieval, by SPAWARSYSCEN-SD grant numbers N66001-99-1-8912 and N66001-1-8903, and in part by the Defense Advanced Research Projects Agency (DARPA) under contract number HR0011-06-C-0023. Any opinions, findings and conclusions or recommendations expressed in this material are the authors’ and do not necessarily reflect those of the sponsors.
PY - 2008/5
Y1 - 2008/5
N2 - In this paper, a new novelty detection approach based on the identification of sentence level information patterns is proposed. First, "novelty" is redefined based on the proposed information patterns, and several different types of information patterns are given corresponding to different types of users' information needs. Second, a thorough analysis of sentence level information patterns is elaborated using data from the TREC novelty tracks, including sentence lengths, named entities (NEs), and sentence level opinion patterns. Finally, a unified information-pattern-based approach to novelty detection (ip-BAND) is presented for both specific NE topics and more general topics. Experiments on novelty detection on data from the TREC 2002, 2003 and 2004 novelty tracks show that the proposed approach significantly improves the performance of novelty detection in terms of precision at top ranks. Future research directions are suggested.
AB - In this paper, a new novelty detection approach based on the identification of sentence level information patterns is proposed. First, "novelty" is redefined based on the proposed information patterns, and several different types of information patterns are given corresponding to different types of users' information needs. Second, a thorough analysis of sentence level information patterns is elaborated using data from the TREC novelty tracks, including sentence lengths, named entities (NEs), and sentence level opinion patterns. Finally, a unified information-pattern-based approach to novelty detection (ip-BAND) is presented for both specific NE topics and more general topics. Experiments on novelty detection on data from the TREC 2002, 2003 and 2004 novelty tracks show that the proposed approach significantly improves the performance of novelty detection in terms of precision at top ranks. Future research directions are suggested.
KW - Information patterns
KW - Information retrieval
KW - Named entities
KW - Novelty detection
KW - Question answering
UR - http://www.scopus.com/inward/record.url?scp=40649103141&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40649103141&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2007.09.013
DO - 10.1016/j.ipm.2007.09.013
M3 - Article
AN - SCOPUS:40649103141
SN - 0306-4573
VL - 44
SP - 1159
EP - 1188
JO - Information Processing and Management
JF - Information Processing and Management
IS - 3
ER -