Novelty detection based on sentence level patterns

Xiaoyan Li, W. Bruce Croft

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

32 Scopus citations

Abstract

The detection of new information in a document stream is an important component of many potential applications. In this paper, a new novelty detection approach based on the identification of sentence level patterns is proposed. Given a user's information need, some patterns in sentences, such as combinations of query words, named entities and phrases, may contain more important and relevant information than single words. Therefore, the proposed novelty detection approach focuses on the identification of previously unseen query-related patterns in sentences. Specifically, a query is preprocessed and represented with patterns that include both query words and required answer types. These patterns are used to retrieve sentences, which are then determined to be novel if it is likely that a new answer is present. An analysis of patterns in sentences was performed with data from the TREC 2002 novelty track, and experiments on novelty detection were carried out on data from the TREC 2003 and 2004 novelty tracks. The experimental results show that the proposed pattern-based approach significantly outperforms all three baselines in terms of precision at top ranks.
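The pipeline the abstract outlines (represent the query by its words and a required answer type, retrieve matching sentences, then flag a sentence as novel when it likely carries a new answer) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `detect_novel`, the pre-tagged entity format, the `LOCATION` type label, and the rule "novel if an entity of the required type has not been seen before" are all simplifying assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: a sentence is its text plus pre-extracted
# (entity_text, entity_type) pairs; entity extraction itself is out of scope.
Sentence = Tuple[str, List[Tuple[str, str]]]


def detect_novel(sentences: Iterable[Sentence], query_words: Set[str],
                 answer_type: str) -> List[str]:
    """Return sentences that match the query and carry an unseen answer candidate."""
    seen_answers: Set[str] = set()
    novel: List[str] = []
    for text, entities in sentences:
        tokens = {tok.strip(".,;:!?").lower() for tok in text.split()}
        # Retrieval step (toy assumption): require at least one query word.
        if not tokens & query_words:
            continue
        # Candidate answers: entities whose type matches the required answer type.
        candidates = {ent.lower() for ent, etype in entities if etype == answer_type}
        # Novelty step (toy assumption): any previously unseen candidate counts.
        if candidates - seen_answers:
            novel.append(text)
        seen_answers |= candidates
    return novel


# Made-up example stream; the query asks "where" about a summit.
stream = [
    ("The summit was held in Geneva.", [("Geneva", "LOCATION")]),
    ("Delegates met again in Geneva for the summit.", [("Geneva", "LOCATION")]),
    ("A follow-up summit took place in Oslo.", [("Oslo", "LOCATION")]),
    ("Oslo is known for its fjords.", [("Oslo", "LOCATION")]),
]
result = detect_novel(stream, {"summit"}, "LOCATION")
```

In this made-up stream, the second sentence is relevant but repeats a known answer, and the fourth never matches the query, so only the first and third sentences are returned.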

Original language: English (US)
Title of host publication: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 744-751
Number of pages: 8
ISBN (Print): 1595931406, 9781595931405
DOIs: https://doi.org/10.1145/1099554.1099734
State: Published - 2005
Event: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management - Bremen, Germany
Duration: Oct 31 2005 - Nov 5 2005

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Country: Germany
City: Bremen
Period: 10/31/05 - 11/5/05

All Science Journal Classification (ASJC) codes

  • Business, Management and Accounting (all)

Keywords

  • Information patterns
  • Named entities
  • Novelty detection


Cite this

Li, X., & Croft, W. B. (2005). Novelty detection based on sentence level patterns. In CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management (pp. 744-751). (International Conference on Information and Knowledge Management, Proceedings). Association for Computing Machinery. https://doi.org/10.1145/1099554.1099734