Scaling Data from Multiple Sources

Ted Enamorado, Gabriel López-Moctezuma, Marc Ratkovic

Research output: Contribution to journal (article, peer-reviewed)

1 Scopus citation

Abstract

We introduce a method for scaling two datasets from different sources. The proposed method estimates a latent factor common to both datasets as well as an idiosyncratic factor unique to each. In addition, it offers a flexible modeling strategy that permits the scaled locations to be a function of covariates, and efficient implementation allows for inference through resampling. A simulation study shows that our proposed method improves over existing alternatives in capturing the variation common to both datasets, as well as the latent factors specific to each. We apply our proposed method to vote and speech data from the 112th U.S. Senate. We recover a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives, while recovering the words most associated with each senator's location. In addition, we estimate a word-specific subspace that ranges from national security to budget concerns, and a vote-specific subspace with Tea Party senators on one extreme and senior committee leaders on the other.
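To make the idea of a common factor plus dataset-specific factors concrete, here is a minimal sketch in Python. It is not the estimator proposed in the article: it substitutes a simple stand-in (principal angles between the leading principal subspaces of two matrices) and runs on simulated data. The function name `shared_and_idiosyncratic`, the rank `k`, and all simulation settings are assumptions made for illustration only.

```python
import numpy as np


def shared_and_idiosyncratic(X1, X2, k=2):
    """Illustrative stand-in, not the article's method.

    Rows of X1 and X2 are the same n units; columns are features
    from two different sources. Returns a unit-norm shared score
    vector and one idiosyncratic score vector per dataset.
    """
    # Center each column so the scores describe variation around the mean.
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)

    # Leading k left singular vectors span each dataset's principal subspace.
    U1 = np.linalg.svd(X1, full_matrices=False)[0][:, :k]
    U2 = np.linalg.svd(X2, full_matrices=False)[0][:, :k]

    # Principal angles: the top singular pair of U1'U2 gives the direction
    # in the first subspace that is closest to the second subspace.
    P, _, _ = np.linalg.svd(U1.T @ U2)
    shared = U1 @ P[:, 0]

    def top_residual(X):
        # Remove the shared direction, then take the leading remaining score.
        R = X - np.outer(shared, shared @ X)
        return np.linalg.svd(R, full_matrices=False)[0][:, 0]

    return shared, top_residual(X1), top_residual(X2)


# Simulated data (assumed setup): one common factor z and one
# source-specific factor per dataset, plus small noise.
rng = np.random.default_rng(0)
n = 200
z, u1, u2 = rng.standard_normal((3, n))
X1 = (np.outer(z, rng.standard_normal(30))
      + np.outer(u1, rng.standard_normal(30))
      + 0.1 * rng.standard_normal((n, 30)))
X2 = (np.outer(z, rng.standard_normal(40))
      + np.outer(u2, rng.standard_normal(40))
      + 0.1 * rng.standard_normal((n, 40)))

shared, idio1, idio2 = shared_and_idiosyncratic(X1, X2)

# Scores are identified only up to sign, so compare absolute correlations.
print(abs(np.corrcoef(shared, z)[0, 1]))
print(abs(np.corrcoef(idio1, u1)[0, 1]))
print(abs(np.corrcoef(idio2, u2)[0, 1]))
```

In this toy setting the recovered scores should track the simulated factors closely. The sketch omits the covariate-dependent locations and resampling-based inference that the abstract describes.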

Original language: English (US)
Pages (from-to): 212-235
Number of pages: 24
Journal: Political Analysis
Volume: 29
Issue number: 2
State: Published - Apr 2021

All Science Journal Classification (ASJC) codes

  • Sociology and Political Science
  • Political Science and International Relations

Keywords

  • U.S. Senate
  • multidimensional scaling
  • principal component analysis
