TY - GEN
T1 - Building bug-tolerant routers with virtualization
AU - Caesar, Matthew
AU - Rexford, Jennifer L.
PY - 2008/12/1
Y1 - 2008/12/1
N2 - Implementation bugs are a highly critical problem in wide-area networks. The software running on core routers is subject to vulnerabilities, coding mistakes, and misconfiguration. Unfortunately, these problems are often found after deployment in live networks, where they lead to outages, make networks prone to attack, and involve a challenging process to localize and debug. In this work, we propose a bug-tolerant router that runs multiple diverse copies of router software in parallel, such that each copy is unlikely to fail at the same time as the others. Diversity is achieved by varying the ordering and timing of routing messages, running different routing protocols, running code written by different implementers, etc. Because each copy is different, each copy will likely have a different output during an error, and hence a simple voting procedure is then used to decide which copy's output will drive packet forwarding and control-plane communication with other routers. In this paper we motivate our design, describe some design decisions and tradeoffs, and then conclude with a description of our ongoing work in building a prototype of this architecture.
UR - http://www.scopus.com/inward/record.url?scp=63749095205&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63749095205&partnerID=8YFLogxK
U2 - 10.1145/1397718.1397730
DO - 10.1145/1397718.1397730
M3 - Conference contribution
AN - SCOPUS:63749095205
SN - 9781605581811
T3 - SIGCOMM 2008 Conference and the Co-located Workshops - PRESTO'08: Proceedings of the ACM Workshop on Programmable Routers for Extensible Services of Tomorrow
SP - 51
EP - 56
BT - SIGCOMM 2008 Conference and the Co-located Workshops - PRESTO'08
T2 - SIGCOMM 2008 Conference and the Co-located Workshops - PRESTO'08: ACM Workshop on Programmable Routers for Extensible Services of Tomorrow
Y2 - 22 August 2008
ER -