Measuring the effects of Internet path faults on reactive routing

Nick Feamster, David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

44 Scopus citations

Abstract

Empirical evidence suggests that reactive routing systems improve resilience to Internet path failures. They detect and route around faulty paths based on measurements of path performance. This paper seeks to understand why and under what circumstances these techniques are effective. To do so, this paper correlates end-to-end active probing experiments, loss-triggered traceroutes of Internet paths, and BGP routing messages. These correlations shed light on three questions about Internet path failures: (1) Where do failures appear? (2) How long do they last? (3) How do they correlate with BGP routing instability? Data collected over 13 months from an Internet testbed of 31 topologically diverse hosts suggests that most path failures last less than fifteen minutes. Failures that appear in the network core correlate better with BGP instability than failures that appear close to end hosts. On average, most failures precede BGP messages by about four minutes, but there is often increased BGP traffic both before and after failures. Our findings suggest that reactive routing is most effective between hosts that have multiple connections to the Internet. The data set also suggests that passive observations of BGP routing messages could be used to predict about 20% of impending failures, allowing re-routing systems to react more quickly to failures.

Original language: English (US)
Title of host publication: Performance Evaluation Review
Pages: 126-137
Number of pages: 12
Edition: 1
State: Published - Jun 2003
Event: ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems - San Diego, CA, United States
Duration: Jun 10 2003 → Jun 14 2003

Publication series

Name: Performance Evaluation Review
Number: 1
Volume: 31
ISSN (Print): 0163-5999
ISSN (Electronic): 0163-5999

Other

Other: ACM SIGMETRICS 2003 - International Conference on Measurement and Modeling of Computer Systems
Country/Territory: United States
City: San Diego, CA
Period: 6/10/03 → 6/14/03

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Keywords

  • Experimentation
  • Measurement
  • Performance
  • Reliability
