LatLong: Diagnosing wide-area latency changes for CDNs

Yaping Zhu, Benjamin Helsley, Jennifer L. Rexford, Aspi Siganporia, Sridhar Srinivasan

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Minimizing user-perceived latency is crucial for Content Distribution Networks (CDNs) hosting interactive services. Latency may increase for many reasons, such as interdomain routing changes and the CDN's own load-balancing policies. CDNs need greater visibility into the causes of latency increases, so they can adapt by directing traffic to different servers or paths. In this paper, we propose a tool for CDNs to diagnose large latency increases, based on passive measurements of performance, traffic, and routing. Separating the many causes from the effects is challenging. We propose a decision tree for classifying latency changes, and determine how to distinguish traffic shifts from increases in latency for existing servers, routers, and paths. Another challenge is that network operators group related clients to reduce measurement and control overhead, but the clients in a region may use multiple servers and paths during a measurement interval. We propose metrics that quantify the latency contributions across sets of servers and routers. Based on the design, we implement the LatLong tool for diagnosing large latency increases for CDNs. We use LatLong to analyze a month of data from Google's CDN, and find that nearly 1% of the daily latency changes increase delay by more than 100 msec. An increase of 100 msec is significant, since these are daily averages over groups of clients, and our study focuses only on latency-sensitive traffic. More than 40% of these increases coincide with interdomain routing changes, and more than one-third involve a shift in traffic to different servers. This is the first work to diagnose latency problems in a large, operational CDN from purely passive measurements. Through case studies of individual events, we identify research challenges for managing wide-area latency for CDNs.

Original language: English (US)
Article number: 6233056
Pages (from-to): 333-345
Number of pages: 13
Journal: IEEE Transactions on Network and Service Management
Volume: 9
Issue number: 3
State: Published - 2012

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Electrical and Electronic Engineering

Keywords

  • Network diagnosis
  • content distribution networks (CDNs)
  • latency increases
