TY - JOUR
T1 - Modeling human performance in statistical word segmentation
AU - Frank, Michael C.
AU - Goldwater, Sharon
AU - Griffiths, Thomas L.
AU - Tenenbaum, Joshua B.
N1 - Funding Information:
We gratefully acknowledge Elissa Newport and Richard Aslin for many valuable discussions of this work and thank LouAnn Gerken, Pierre Perruchet, and two anonymous reviewers for comments on the paper. Portions of the data in this paper were reported at the Cognitive Science conference in Frank, Goldwater, Mansinghka, Griffiths, and Tenenbaum (2007). We acknowledge NSF Grant #BCS-0631518, and the first author was supported by a Jacob Javits Graduate Fellowship and NSF DDRIG #0746251.
PY - 2010/11
Y1 - 2010/11
N2 - The ability to discover groupings in continuous stimuli on the basis of distributional information is present across species and across perceptual modalities. We investigate the nature of the computations underlying this ability using statistical word segmentation experiments in which we vary the length of sentences, the amount of exposure, and the number of words in the languages being learned. Although the results are intuitive from the perspective of a language learner (longer sentences, less training, and a larger language all make learning more difficult), standard computational proposals fail to capture several of these results. We describe how probabilistic models of segmentation can be modified to take into account some notion of memory or resource limitations in order to provide a closer match to human performance.
KW - Computational modeling
KW - Language acquisition
KW - Statistical learning
KW - Word segmentation
UR - http://www.scopus.com/inward/record.url?scp=77957664574&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77957664574&partnerID=8YFLogxK
U2 - 10.1016/j.cognition.2010.07.005
DO - 10.1016/j.cognition.2010.07.005
M3 - Article
C2 - 20832060
AN - SCOPUS:77957664574
SN - 0010-0277
VL - 117
SP - 107
EP - 125
JO - Cognition
JF - Cognition
IS - 2
ER -