Improving novelty detection for general topics using sentence level information patterns

Xiaoyan Li, W. Bruce Croft

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

24 Scopus citations

Abstract

The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identification of sentence level information patterns is proposed. First, the information-pattern concept for novelty detection is presented, with an emphasis on new information patterns for general topics (queries) that cannot simply be turned into specific questions whose answers are specific named entities (NEs). We then present a thorough analysis of sentence level information patterns on data from the TREC novelty tracks, including sentence lengths, named entities, and sentence level opinion patterns. This analysis provides guidelines for applying those patterns in novelty detection, particularly for general topics. Finally, a unified pattern-based approach to novelty detection for both general and specific topics is presented, with a focus on the new method for handling general topics. Experimental results show that the proposed approach significantly improves the performance of novelty detection for general topics, as well as the overall performance for all topics from the 2002-2004 TREC novelty tracks.

Original language: English (US)
Title of host publication: Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006
Pages: 238-247
Number of pages: 10
DOIs
State: Published - 2006
Externally published: Yes
Event: 15th ACM Conference on Information and Knowledge Management, CIKM 2006 - Arlington, VA, United States
Duration: Nov 6 2006 – Nov 11 2006

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 15th ACM Conference on Information and Knowledge Management, CIKM 2006
Country/Territory: United States
City: Arlington, VA
Period: 11/6/06 – 11/11/06

All Science Journal Classification (ASJC) codes

  • General Decision Sciences
  • General Business, Management and Accounting

Keywords

  • Information patterns
  • Named entities
  • Novelty detection
