Abstract
Fully unsupervised topic models have found fantastic success in document clustering and classification. However, these models often suffer from the tendency to learn less-than-meaningful or even redundant topics when the data is biased towards a set of features. For this reason, we propose an approach based upon the nonnegative matrix factorization (NMF) model, deemed Guided NMF, that incorporates user-designed seed word supervision. Our experimental results demonstrate the promise of this model and illustrate that it is competitive with other methods of this ilk with only very little supervision information.
Original language | English (US) |
---|---|
Pages (from-to) | 3265-3269 |
Number of pages | 5 |
Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Volume | 2021-June |
DOIs | |
State | Published - 2021 |
Externally published | Yes |
Event | 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada Duration: Jun 6 2021 → Jun 11 2021 |
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Electrical and Electronic Engineering
Keywords
- Seed words
- Supervised nonnegative matrix factorization
- Supervised topic models