Abstract
Today's networks typically handle traffic engineering (e.g., tuning the routing-protocol parameters to optimize the flow of traffic) and failure recovery (e.g., pre-installed backup paths) independently. In this paper, we propose a unified way to balance load efficiently under a wide range of failure scenarios. Our architecture supports flexible splitting of traffic over multiple precomputed paths, with efficient path-level failure detection and automatic load balancing over the remaining paths. We propose two candidate solutions that differ in how the routers rebalance the load after a failure, leading to a trade-off between router complexity and load-balancing performance. We present and solve the optimization problems that compute the configuration state for each router. Our experiments with traffic measurements and topology data (including shared risks in the underlying transport network) from a large ISP identify a "sweet spot" that achieves near-optimal load balancing under a variety of failure scenarios, with a relatively small amount of state in the routers. We believe that our solution for joint traffic engineering and failure recovery will appeal to Internet Service Providers as well as the operators of data-center networks.
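To make the idea of rebalancing over the remaining paths concrete, the sketch below shows one simple possibility: when a path is detected as failed, the router drops it and rescales the splitting ratios of the surviving paths in proportion to their original values. This is an illustrative assumption, not necessarily either of the two candidate solutions in the paper, whose details the abstract does not give. The function name `rebalance`, the path labels, and the ratios are all hypothetical.

```python
def rebalance(split_ratios, failed_paths):
    """Rescale the splitting ratios of surviving paths so they sum to 1.

    split_ratios: dict mapping a path label to its share of traffic (hypothetical values).
    failed_paths: set of path labels detected as failed.
    Assumption: proportional rescaling; the paper's actual schemes may differ.
    """
    # Keep only the paths that are still usable.
    alive = {p: w for p, w in split_ratios.items() if p not in failed_paths}
    total = sum(alive.values())
    if total <= 0:
        raise ValueError("no surviving path carries traffic")
    # Renormalize so the surviving shares again sum to 1.
    return {p: w / total for p, w in alive.items()}


# Hypothetical configuration: three precomputed paths between one router pair.
ratios = {"p1": 0.5, "p2": 0.3, "p3": 0.2}
after = rebalance(ratios, {"p2"})
print(after)
```

In this sketch the router needs only one ratio per path. A scheme that stores a separate set of ratios for each failure scenario could balance load better at the cost of more router state, which is the kind of trade-off the abstract describes.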
Original language | English (US) |
---|---|
Title of host publication | SIGMETRICS'11 - Proceedings of the 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems |
Publisher | Association for Computing Machinery |
Pages | 97-108 |
Number of pages | 12 |
Volume | 39 |
Edition | 1 SPEC. ISSUE |
ISBN (Print) | 9781450302623 |
State | Published - 2011 |
Event | 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2011 - San Jose, United States; Duration: Jun 7 2011 → Jun 11 2011 |
Conference
Conference | 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2011 |
---|---|
Country/Territory | United States |
City | San Jose |
Period | 6/7/11 → 6/11/11 |
All Science Journal Classification (ASJC) codes
- Software
- Hardware and Architecture
- Computer Networks and Communications
Keywords
- Failure recovery
- Network architecture
- Optimization