TY - JOUR
T1 - Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
AU - Goldstein, Ariel
AU - Grinstein-Dabush, Avigail
AU - Schain, Mariano
AU - Wang, Haocheng
AU - Hong, Zhuoqiao
AU - Aubrey, Bobbi
AU - Nastase, Samuel A.
AU - Zada, Zaid
AU - Ham, Eric
AU - Feder, Amir
AU - Gazula, Harshvardhan
AU - Buchnik, Eliav
AU - Doyle, Werner
AU - Devore, Sasha
AU - Dugan, Patricia
AU - Reichart, Roi
AU - Friedman, Daniel
AU - Brenner, Michael
AU - Hassidim, Avinatan
AU - Devinsky, Orrin
AU - Flinker, Adeen
AU - Hasson, Uri
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. Using stringent zero-shot mapping we demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns. The common geometric patterns allow us to predict the brain embedding in IFG of a given left-out word based solely on its geometrical relationship to other non-overlapping words in the podcast. Furthermore, we show that contextual embeddings capture the geometry of IFG embeddings better than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
UR - http://www.scopus.com/inward/record.url?scp=85189028464&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85189028464&partnerID=8YFLogxK
DO - 10.1038/s41467-024-46631-y
M3 - Article
AN - SCOPUS:85189028464
SN - 2041-1723
VL - 15
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 2768
ER -