TY - GEN
T1 - Replex
T2 - 2016 USENIX Annual Technical Conference, USENIX ATC 2016
AU - Tai, Amy
AU - Wei, Michael
AU - Freedman, Michael J.
AU - Abraham, Ittai
AU - Malkhi, Dahlia
PY - 2016/1/1
Y1 - 2016/1/1
N2 - The need for scalable, high-performance datastores has led to the development of NoSQL databases, which achieve scalability by partitioning data over a single key. However, programmers often need to query data with other keys, which data stores provide by either querying every partition, eliminating the benefits of partitioning, or replicating additional indexes, wasting the benefits of data replication. In this paper, we show there is no need to compromise scalability for functionality. We present Replex, a datastore that enables efficient querying on multiple keys by rethinking data placement during replication. Traditionally, a data store is first globally partitioned, then each partition is replicated identically to multiple nodes. Instead, Replex relies on a novel replication unit, termed replex, which partitions a full copy of the data based on its unique key. Replexes eliminate any additional overhead to maintaining indices, at the cost of increasing recovery complexity. To address this issue, we also introduce hybrid replexes, which enable a rich design space for trading off steady-state performance with faster recovery. We build, parameterize, and evaluate Replex on multiple dimensions and find that Replex surpasses the steady-state and failure recovery performance of Hyper- Dex, a state-of-the-art multi-key data store.
AB - The need for scalable, high-performance datastores has led to the development of NoSQL databases, which achieve scalability by partitioning data over a single key. However, programmers often need to query data with other keys, which data stores provide by either querying every partition, eliminating the benefits of partitioning, or replicating additional indexes, wasting the benefits of data replication. In this paper, we show there is no need to compromise scalability for functionality. We present Replex, a datastore that enables efficient querying on multiple keys by rethinking data placement during replication. Traditionally, a data store is first globally partitioned, then each partition is replicated identically to multiple nodes. Instead, Replex relies on a novel replication unit, termed replex, which partitions a full copy of the data based on its unique key. Replexes eliminate any additional overhead to maintaining indices, at the cost of increasing recovery complexity. To address this issue, we also introduce hybrid replexes, which enable a rich design space for trading off steady-state performance with faster recovery. We build, parameterize, and evaluate Replex on multiple dimensions and find that Replex surpasses the steady-state and failure recovery performance of Hyper- Dex, a state-of-the-art multi-key data store.
UR - http://www.scopus.com/inward/record.url?scp=85019854372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019854372&partnerID=8YFLogxK
M3 - Conference contribution
T3 - Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016
SP - 337
EP - 350
BT - Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016
PB - USENIX Association
Y2 - 22 June 2016 through 24 June 2016
ER -