Training algorithms for linear text classifiers

David D. Lewis, Robert E. Schapire, James P. Callan, Ron Papka

Research output: Contribution to journal › Conference article › peer-review

370 Scopus citations

Abstract

Systems for text retrieval, routing, categorization and other IR tasks rely heavily on linear classifiers. We propose that two machine learning algorithms, the Widrow-Hoff and EG algorithms, be used in training linear text classifiers. In contrast to most IR methods, theoretical analysis provides performance guarantees and guidance on parameter settings for these algorithms. Experimental data is presented showing Widrow-Hoff and EG to be more effective than the widely used Rocchio algorithm on several categorization and routing tasks.
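For readers unfamiliar with the two algorithms named in the abstract, the following is a minimal sketch of their commonly cited textbook update rules. It is not taken from this record. The function names, the learning rate `eta`, the squared-error form of the updates, and the toy data are all assumptions made for illustration, and the paper's own parameter settings and feature representation are not reproduced here.

```python
import math


def dot(w, x):
    """Inner product of a weight vector and a feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x))


def widrow_hoff_step(w, x, y, eta):
    """One Widrow-Hoff (LMS) step: additive update along the negative
    gradient of the squared error (w.x - y)^2."""
    err = dot(w, x) - y
    return [wj - 2.0 * eta * err * xj for wj, xj in zip(w, x)]


def eg_step(w, x, y, eta):
    """One exponentiated-gradient (EG) step: multiplicative update followed
    by renormalization, so weights stay positive and sum to 1."""
    err = dot(w, x) - y
    unnorm = [wj * math.exp(-2.0 * eta * err * xj) for wj, xj in zip(w, x)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def train(step, w0, examples, eta, epochs=1):
    """Apply an update rule to each (x, y) example in order (hypothetical helper)."""
    w = list(w0)
    for _ in range(epochs):
        for x, y in examples:
            w = step(w, x, y, eta)
    return w


# Toy data (assumed): two one-hot "documents", the first relevant (y = 1).
examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
w_wh = train(widrow_hoff_step, [0.0, 0.0], examples, eta=0.25, epochs=50)
w_eg = train(eg_step, [0.5, 0.5], examples, eta=0.5, epochs=50)
```

The contrast the sketch is meant to show is that Widrow-Hoff changes weights additively from any starting point, while EG rescales a positive weight vector that is kept normalized, so EG is typically started from uniform weights.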

Original language: English (US)
Pages (from-to): 298-306
Number of pages: 9
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
State: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 96 - Zurich, Switz
Duration: Aug 18 1996 → Aug 22 1996

All Science Journal Classification (ASJC) codes

  • Management Information Systems
  • Hardware and Architecture
