Abstract
An efficient sparse modeling pipeline for classifying human actions from video is developed. Spatio-temporal features that characterize local changes in the image are first extracted. A class-structured dictionary encoding the individual actions of interest is then learned. Classification is based on reconstruction: the label assigned to each video comes from the optimal sparse linear combination of the learned basis vectors (action primitives) representing the actions. A deep-layer model with low computational cost, which learns the inter-class correlations of the data, is added to increase discriminative power. Despite its simplicity and low computational cost, the method outperforms previously reported results on virtually all standard datasets.
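The reconstruction-based classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one already-learned dictionary per class, uses plain ISTA as a stand-in sparse coder, and omits feature extraction, dictionary learning, and the second-layer model entirely. The names `ista`, `classify`, and `lam` are hypothetical, and the data are synthetic.

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 using ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def classify(x, dictionaries, lam=0.1):
    """Return the index of the class dictionary with the lowest sparse-coding cost."""
    costs = []
    for D in dictionaries:
        a = ista(D, x, lam)
        residual = 0.5 * np.sum((x - D @ a) ** 2)
        costs.append(residual + lam * np.sum(np.abs(a)))
    return int(np.argmin(costs))

# Toy data (assumption): three random unit-norm dictionaries, one per class.
rng = np.random.default_rng(0)
dictionaries = []
for _ in range(3):
    D = rng.standard_normal((20, 8))
    D /= np.linalg.norm(D, axis=0)
    dictionaries.append(D)

# A synthetic feature vector built from two atoms of class 1.
x = dictionaries[1][:, [0, 3]] @ np.array([1.5, -2.0])
predicted = classify(x, dictionaries)
print("predicted class:", predicted)
```

In this toy setting each candidate class is scored by how cheaply its atoms reconstruct the input, so a vector built from one class's atoms should be matched to that class. In practice the dictionaries would be learned from the extracted spatio-temporal features rather than drawn at random.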
Original language | English (US) |
---|---|
Pages (from-to) | 1-15 |
Number of pages | 15 |
Journal | International Journal of Computer Vision |
Volume | 100 |
Issue number | 1 |
State | Published - Oct 2012 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition
- Artificial Intelligence
Keywords
- Action classification
- Dictionary learning
- Sparse modeling
- Supervised learning