Twitter vs. printed English: An information-theoretic comparison

Emma Glennon, Lalitha Sankar, H. Vincent Poor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally mediated communication in general are also discussed.
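
As an illustration of the quantity the abstract refers to, the following is a minimal sketch of a letter-based n-gram entropy estimate. It is not the authors' code: the abstract does not specify the estimator, the preprocessing, or the corpora, so the plug-in (maximum-likelihood) estimator, the overlapping n-gram windows, and the function names `ngram_counts`, `block_entropy`, and `conditional_entropy` are all assumptions made here. The conditional form H_n - H_{n-1} follows Shannon's classic definition for printed English.

```python
import math
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping character n-grams in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def block_entropy(text: str, n: int) -> float:
    """Plug-in entropy (in bits) of the empirical n-gram distribution.

    Assumption: a simple maximum-likelihood estimate with no bias
    correction; this underestimates entropy on small samples.
    """
    if n <= 0:
        return 0.0
    counts = ngram_counts(text, n)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text: str, n: int) -> float:
    """Bits per letter given the previous n-1 letters: H_n - H_{n-1}."""
    return block_entropy(text, n) - block_entropy(text, n - 1)


if __name__ == "__main__":
    # Hypothetical toy strings, not data from the paper.
    samples = {
        "printed": "the quick brown fox jumps over the lazy dog",
        "tweet": "omg thx 4 the RT!! c u l8r :) #fox",
    }
    for name, text in samples.items():
        for n in (1, 2, 3):
            print(name, n, round(conditional_entropy(text, n), 3))
```

On strings this short the estimates are not meaningful; comparing corpora as the paper does would need far larger samples and a consistent choice of alphabet (case folding, punctuation, emoticons), none of which is stated in the abstract.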

Original language: English (US)
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 3069-3072
Number of pages: 4
State: Published - 2012
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: Mar 25 2012 → Mar 30 2012

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country/Territory: Japan
City: Kyoto
Period: 3/25/12 → 3/30/12

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Keywords

  • Twitter
  • computer mediated communication
  • information entropy
  • information theory
  • redundancy
