Abstract
Two recently implemented machine-learning algorithms, RIPPER and sleeping-experts for phrases, are evaluated on a number of large text categorization problems. These algorithms both construct classifiers that allow the "context" of a word w to affect how (or even whether) the presence or absence of w will contribute to a classification. However, RIPPER and sleeping-experts differ radically in many other respects: differences include different notions as to what constitutes a context, different ways of combining contexts to construct a classifier, different methods to search for a combination of contexts, and different criteria as to what contexts should be included in such a combination. In spite of these differences, both RIPPER and sleeping-experts perform extremely well across a wide variety of categorization problems, generally outperforming previously applied learning methods. We view this result as a confirmation of the usefulness of classifiers that represent contextual information.
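To make the idea of "context" concrete, here is a minimal Python sketch. It is not the RIPPER or sleeping-experts implementation. It assumes a toy rule format in which a rule fires only when all of its words occur in the document, so the other words in a rule act as the context for a word w. The function name, the rules, and the example documents are all hypothetical.

```python
def rule_classify(document, rules):
    """Return True if every word of at least one rule occurs in the document."""
    words = set(document.lower().split())
    # A rule is a set of words; it fires only if it is a subset of the document's words.
    return any(rule <= words for rule in rules)


# Hypothetical rules for a hypothetical "finance" category.
# The word "bank" contributes only when "loan" or "interest" is also present.
RULES = [
    {"bank", "loan"},
    {"bank", "interest"},
]

doc_finance = "the bank raised the interest rate"
doc_river = "we walked along the river bank"

print(rule_classify(doc_finance, RULES))
print(rule_classify(doc_river, RULES))
```

Both documents contain "bank", but only the first also contains a context word from one of the rules, so only the first is assigned to the category. A classifier that scored "bank" on its own could not separate the two documents.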
Original language | English (US) |
---|---|
Pages (from-to) | 141-173 |
Number of pages | 33 |
Journal | ACM Transactions on Information Systems |
Volume | 17 |
Issue number | 2 |
State | Published - Apr 1999 |
All Science Journal Classification (ASJC) codes
- Information Systems
- General Business, Management and Accounting
- Computer Science Applications
Keywords
- Algorithms
- Experimentation
- H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval
- I.2.6 [Artificial Intelligence]: Learning - concept learning; parameter learning
- I.5.4 [Pattern Recognition]: Applications - text processing