Network architecture for joint failure recovery and traffic engineering

Martin Suchara, Dahai Xu, Robert Doverspike, David Johnson, Jennifer L. Rexford

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

72 Scopus citations

Abstract

Today's networks typically handle traffic engineering (e.g., tuning the routing-protocol parameters to optimize the flow of traffic) and failure recovery (e.g., pre-installed backup paths) independently. In this paper, we propose a unified way to balance load efficiently under a wide range of failure scenarios. Our architecture supports flexible splitting of traffic over multiple precomputed paths, with efficient path-level failure detection and automatic load balancing over the remaining paths. We propose two candidate solutions that differ in how the routers rebalance the load after a failure, leading to a trade-off between router complexity and load-balancing performance. We present and solve the optimization problems that compute the configuration state for each router. Our experiments with traffic measurements and topology data (including shared risks in the underlying transport network) from a large ISP identify a "sweet spot" that achieves near-optimal load balancing under a variety of failure scenarios, with a relatively small amount of state in the routers. We believe that our solution for joint traffic engineering and failure recovery will appeal to Internet Service Providers as well as the operators of data-center networks.
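
As an illustration of the idea of splitting traffic over precomputed paths and rebalancing over the survivors, the following is a minimal sketch and not the paper's algorithm. It assumes each path carries a fixed splitting weight and that, when a path is detected as failed, the router simply renormalizes the weights of the remaining paths. The function name `rebalance`, the path labels, and the weights are hypothetical and chosen only for this example.

```python
from typing import Dict, Set


def rebalance(weights: Dict[str, float], failed: Set[str]) -> Dict[str, float]:
    """Renormalize splitting weights over the paths that are still alive.

    Assumption (not taken from the paper): surviving paths keep their
    relative proportions, and the weights are rescaled to sum to 1.
    """
    alive = {path: w for path, w in weights.items() if path not in failed}
    total = sum(alive.values())
    if total <= 0:
        raise ValueError("no surviving path with positive weight")
    return {path: w / total for path, w in alive.items()}


# Hypothetical configuration: three precomputed paths between one router pair.
weights = {"p1": 0.5, "p2": 0.25, "p3": 0.25}

# Path-level failure detection reports p1 as down; traffic shifts to p2 and p3.
after_failure = rebalance(weights, {"p1"})
print(after_failure)
```

A rule of this kind needs only one weight per path in the router. A scheme that stores different weights for each combination of failed paths would need more state in exchange for finer control, which is the kind of complexity versus performance trade-off the abstract describes.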

Original language: English (US)
Title of host publication: SIGMETRICS'11 - Proceedings of the 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
Publisher: Association for Computing Machinery
Pages: 97-108
Number of pages: 12
Volume: 39
Edition: 1 SPEC. ISSUE
ISBN (Print): 9781450302623
State: Published - 2011
Event: 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2011 - San Jose, United States
Duration: Jun 7 2011 – Jun 11 2011

Conference

Conference: 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2011
Country/Territory: United States
City: San Jose
Period: 6/7/11 – 6/11/11

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Keywords

  • Failure recovery
  • Network architecture
  • Optimization
