TY - CONF
T1 - Geometry of polysemy
AU - Mu, Jiaqi
AU - Bhat, Suma
AU - Viswanath, Pramod
N1 - Funding Information:
Anna Krakovská was supported by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic and Slovak Academy of Sciences (VEGA 2/0081/19); Martina Chvosteková was supported by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic and Slovak Academy of Sciences (VEGA 2/0081/19, VEGA 2/0096/21) and the Czech Academy of Sciences (Praemium Academiae awarded to M. Paluš).
Publisher Copyright:
© ICLR 2017 - Conference Track Proceedings. All rights reserved.
PY - 2017
Y1 - 2017
N2 - Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation and (c) lexeme (as a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (one-dimensional subspace), which we use to disambiguate senses using a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call K-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus - yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word,sense) examples and a host of empirical studies to intuit and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results.
AB - Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation and (c) lexeme (as a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (one-dimensional subspace), which we use to disambiguate senses using a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call K-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus - yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word,sense) examples and a host of empirical studies to intuit and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results.
UR - http://www.scopus.com/inward/record.url?scp=85083682601&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083682601&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85083682601
T2 - 5th International Conference on Learning Representations, ICLR 2017
Y2 - 24 April 2017 through 26 April 2017
ER -