TY - JOUR
T1 - C5: Cloned Concurrency Control that Always Keeps Up
T2 - 49th International Conference on Very Large Data Bases, VLDB 2023
AU - Helt, Jeffrey
AU - Sharma, Abhinav
AU - Abadi, Daniel J.
AU - Lloyd, Wyatt
AU - Faleiro, Jose M.
N1 - Funding Information:
We thank the anonymous reviewers and our shepherd for their feedback. We also thank Princeton’s systems group for their comments on earlier drafts of this paper. This work was supported by the NSF under grants CNS-1824130 and IIS-1910613.
Publisher Copyright:
© 2022 VLDB Endowment.
PY - 2022
Y1 - 2022
AB - Asynchronously replicated primary-backup databases are commonly deployed to improve availability and offload read-only transactions. To both apply replicated writes from the primary and serve read-only transactions, the backups implement a cloned concurrency control protocol. The protocol ensures read-only transactions always return a snapshot of state that previously existed on the primary. This compels the backup to exactly copy the commit order resulting from the primary’s concurrency control. Existing cloned concurrency control protocols guarantee this by limiting the backup’s parallelism. As a result, the primary’s concurrency control executes some workloads with more parallelism than these protocols. In this paper, we prove that this parallelism gap leads to unbounded replication lag, where writes can take arbitrarily long to replicate to the backup and which has led to catastrophic failures in production systems. We then design C5, the first cloned concurrency protocol to provide bounded replication lag. We implement two versions of C5: Our evaluation in MyRocks, a widely deployed database, demonstrates C5 provides bounded replication lag. Our evaluation in Cicada, a recent in-memory database, demonstrates C5 keeps up with even the fastest of primaries.
UR - http://www.scopus.com/inward/record.url?scp=85140404075&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140404075&partnerID=8YFLogxK
U2 - 10.14778/3561261.3561262
DO - 10.14778/3561261.3561262
M3 - Conference article
AN - SCOPUS:85140404075
SN - 2150-8097
VL - 16
SP - 1
EP - 14
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 1
Y2 - 28 August 2023 through 1 September 2023
ER -