Evaluating Vector-Space Models of Word Representation, or, The Unreasonable Effectiveness of Counting Words Near Other Words

Aida Nematzadeh, Stephan C. Meylan, Thomas L. Griffiths

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Scopus citations

Abstract

Vector-space models of semantics represent words as continuously-valued vectors and measure similarity based on the distance or angle between those vectors. Such representations have become increasingly popular due to the recent development of methods that allow them to be efficiently estimated from very large amounts of data. However, the idea of relating similarity to distance in a spatial representation has been criticized by cognitive scientists, as human similarity judgments have many properties that are inconsistent with the geometric constraints that a distance metric must obey. We show that two popular vector-space models, Word2Vec and GloVe, are unable to capture certain critical aspects of human word association data as a consequence of these constraints. However, a probabilistic topic model estimated from a relatively small curated corpus qualitatively reproduces the asymmetric patterns seen in the human data. We also demonstrate that a simple co-occurrence frequency performs similarly to reduced-dimensionality vector-space models on medium-size corpora, at least for relatively frequent words.
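The geometric constraint at issue can be illustrated with a minimal sketch. It assumes toy co-occurrence counts invented for this example (not data from the paper) and hypothetical helper names (`count`, `vector`, `cosine`, `conditional`). Cosine similarity between count vectors is symmetric by construction, whereas a conditional probability computed from the same counts can differ by direction, which is the kind of asymmetry reported in human word association data.

```python
import math

# Hypothetical symmetric co-occurrence counts (illustrative only, not from the paper).
pair_counts = {
    ("baby", "boy"): 8,
    ("baby", "cry"): 12,
    ("boy", "girl"): 40,
    ("boy", "school"): 32,
}
VOCAB = ["baby", "boy", "cry", "girl", "school"]


def count(a, b):
    """Number of times a and b co-occur, regardless of order."""
    return pair_counts.get((a, b), 0) + pair_counts.get((b, a), 0)


def vector(word):
    """Represent a word by its raw co-occurrence counts over the vocabulary."""
    return [count(word, c) for c in VOCAB]


def cosine(u, v):
    """Cosine of the angle between two vectors; symmetric in u and v."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def conditional(response, cue):
    """P(response | cue) estimated from counts; generally not symmetric."""
    total = sum(count(cue, c) for c in VOCAB)
    return count(cue, response) / total if total else 0.0


if __name__ == "__main__":
    print(cosine(vector("baby"), vector("boy")))
    print(cosine(vector("boy"), vector("baby")))
    print(conditional("boy", "baby"))
    print(conditional("baby", "boy"))
```

In this toy setup the two cosine values are identical, while the two conditional probabilities differ because "boy" co-occurs with many more words overall than "baby" does. This is only a sketch of the symmetry constraint under the stated assumptions, not a reproduction of the models or evaluation used in the paper.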

Original language: English (US)
Title of host publication: CogSci 2017 - Proceedings of the 39th Annual Meeting of the Cognitive Science Society
Subtitle of host publication: Computational Foundations of Cognition
Publisher: The Cognitive Science Society
Pages: 859-864
Number of pages: 6
ISBN (Electronic): 9780991196760
State: Published - 2017
Externally published: Yes
Event: 39th Annual Meeting of the Cognitive Science Society: Computational Foundations of Cognition, CogSci 2017 - London, United Kingdom
Duration: Jul 26 2017 → Jul 29 2017

Publication series

Name: CogSci 2017 - Proceedings of the 39th Annual Meeting of the Cognitive Science Society: Computational Foundations of Cognition

Conference

Conference: 39th Annual Meeting of the Cognitive Science Society: Computational Foundations of Cognition, CogSci 2017
Country/Territory: United Kingdom
City: London
Period: 7/26/17 → 7/29/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Human-Computer Interaction
  • Cognitive Neuroscience

Keywords

  • vector-space models
  • word associations
  • word representations
