Tolerating slowdowns in replicated state machines using copilots

Khiem Ngo, Siddhartha Sen, Wyatt Lloyd

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Replicated state machines are linearizable, fault-tolerant groups of replicas that are coordinated using a consensus algorithm. Copilot replication is the first 1-slowdown-tolerant consensus protocol: it delivers normal latency despite the slowdown of any one replica. Copilot uses two distinguished replicas, the pilot and the copilot, to proactively add redundancy to all stages of processing a client's command. Copilot uses dependencies and deduplication to resolve potentially differing orderings proposed by the pilots. To prevent dependencies from allowing either pilot to slow down the group, Copilot uses fast takeovers that allow a fast pilot to complete the ongoing work of a slow pilot. Copilot includes two optimizations, ping-pong batching and null dependency elimination, that improve its performance when there are 0 and 1 slow pilots, respectively. Our evaluation of Copilot shows that its performance is lower than, but competitive with, that of Multi-Paxos and EPaxos when no replicas are slow. When a replica is slow, Copilot is the only protocol that avoids high latencies.
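The deduplication idea mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each client command reaches both pilots and is identified by a (client_id, seq) pair. The `Entry` type, the `execute_deduplicated` helper, and the simple interleaving used to combine the two logs are all hypothetical and chosen only for illustration. The abstract states that Copilot resolves differing orderings using dependencies, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    client_id: int
    seq: int        # per-client sequence number (assumed identifier scheme)
    command: str


def execute_deduplicated(ordered_entries):
    """Apply each (client_id, seq) command once, skipping later duplicates."""
    seen = set()
    executed = []
    for entry in ordered_entries:
        key = (entry.client_id, entry.seq)
        if key in seen:
            continue  # the other pilot's copy was already executed
        seen.add(key)
        executed.append(entry.command)
    return executed


# Both pilots log the same two client commands, but in different orders.
pilot_log = [Entry(1, 1, "put x=1"), Entry(2, 1, "put y=2")]
copilot_log = [Entry(2, 1, "put y=2"), Entry(1, 1, "put x=1")]

# Hypothetical combined order: plain interleaving (pilot, copilot, pilot, ...).
# This stands in for the dependency-based ordering, which is not modeled here.
combined = [e for pair in zip(pilot_log, copilot_log) for e in pair]
result = execute_deduplicated(combined)
```

In this sketch every replica that processes the same combined order executes each command exactly once, even though the command appears in both logs.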

Original language: English (US)
Title of host publication: Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
Publisher: USENIX Association
Pages: 583-598
Number of pages: 16
ISBN (Electronic): 9781939133199
State: Published - 2020
Event: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020 - Virtual, Online
Duration: Nov 4 2020 to Nov 6 2020

Publication series

Name: Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020

Conference

Conference: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
City: Virtual, Online
Period: 11/4/20 to 11/6/20

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
