TY - CONF
T1 - Learning ordered representations with nested dropout
AU - Rippel, Oren
AU - Gelbart, Michael A.
AU - Adams, Ryan P.
N1 - Publisher Copyright: Copyright 2014 by the author(s).
PY - 2014
Y1 - 2014
AB - In this paper, we present results on ordered representations of data in which different dimensions have different degrees of importance. To learn these representations, we introduce nested dropout, a procedure for stochastically removing coherent nested sets of hidden units in a neural network. We first present a sequence of theoretical results for the special case of a semi-linear autoencoder, rigorously showing that applying nested dropout enforces identifiability of the units and leads to an exact equivalence with PCA. We then extend the algorithm to deep models and demonstrate the relevance of ordered representations to a number of applications. Specifically, we use the ordered property of the learned codes to construct hash-based data structures that permit very fast retrieval: time logarithmic in the database size and independent of the dimensionality of the representation. This allows codes hundreds of times longer than is currently feasible for retrieval, so we avoid the diminished quality associated with short codes while still performing retrieval at speeds competitive with existing methods. We also show that ordered representations are a promising way to learn adaptive compression for efficient online data reconstruction.
UR - http://www.scopus.com/inward/record.url?scp=84919832906&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84919832906&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84919832906
T3 - 31st International Conference on Machine Learning, ICML 2014
SP - 3746
EP - 3756
BT - 31st International Conference on Machine Learning, ICML 2014
PB - International Machine Learning Society (IMLS)
T2 - 31st International Conference on Machine Learning, ICML 2014
Y2 - 21 June 2014 through 26 June 2014
ER -
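
Note: the abstract above describes nested dropout as stochastically removing coherent nested sets of hidden units, so that lower-indexed units are dropped least often and an ordering over code dimensions emerges. As a rough illustration only (a minimal sketch, not the paper's implementation; the truncated-geometric prior, parameter rho, and all names here are assumptions), the prefix-masking step might look like:

    import numpy as np

    def nested_dropout_mask(k, rho=0.9, rng=np.random.default_rng(0)):
        # Sample a prefix mask for a k-unit code layer. An index b is
        # drawn from a truncated geometric distribution over {1..k};
        # units 1..b are kept and units b+1..k are zeroed, so earlier
        # units survive most often and carry the most information.
        p = rho ** np.arange(k)   # unnormalized weights, assumed prior
        p = p / p.sum()
        b = rng.choice(k, p=p) + 1
        mask = np.zeros(k)
        mask[:b] = 1.0
        return mask

    # Illustrative use on a code vector from an autoencoder bottleneck:
    code = np.random.default_rng(1).standard_normal(8)
    print(nested_dropout_mask(8) * code)

In training, such a mask would be sampled per example and applied to the autoencoder's code layer before decoding, which is what induces the ordered (PCA-like) structure the abstract refers to.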