Boosting for Document Routing

Raj D. Iyer, David D. Lewis, Robert E. Schapire, Yoram Singer, Amit Singhal

Research output: Contribution to conference › Paper › peer-review

33 Scopus citations

Abstract

RankBoost is a recently proposed algorithm for learning ranking functions. It is simple to implement and has strong justifications from computational learning theory. We describe the algorithm and present experimental results on applying it to the document routing problem. The first set of results applies RankBoost to a text representation produced using modern term weighting methods. Performance of RankBoost is somewhat inferior to that of a state-of-the-art routing algorithm which is, however, more complex and less theoretically justified than RankBoost. RankBoost achieves comparable performance to the state-of-the-art algorithm when combined with feature or example selection heuristics. Our second set of results examines the behavior of RankBoost when it has to learn not only a ranking function but also all aspects of term weighting from raw data. Performance is usually, though not always, less good here, but the term weighting functions implicit in the resulting ranking functions are intriguing, and the approach could easily be adapted to mixtures of textual and nontextual data.

Original language: English (US)
Pages: 70-77
Number of pages: 8
State: Published - 2000
Event: 9th International Conference on Information and Knowledge Management (CIKM 2000) - McLean, VA, United States
Duration: Nov 10 2000 → …

Conference

Conference: 9th International Conference on Information and Knowledge Management (CIKM 2000)
Country/Territory: United States
City: McLean, VA
Period: 11/10/00 → …

All Science Journal Classification (ASJC) codes

  • General Business, Management and Accounting
  • General Decision Sciences

Keywords

  • boosting
  • ranking
  • routing
  • supervised learning
  • text representation
