Harnessing Correlations in Distributed Erasure-Coded Key-Value Stores

Ramy E. Ali, Viveck R. Cadambe

Research output: Contribution to journal › Article › peer-review

Abstract

Motivated by applications of distributed storage systems to key-value stores, the multi-version coding problem has been formulated to efficiently store frequently updated data in asynchronous decentralized storage systems. Inspired by consistency requirements in distributed systems, the main goal in the multi-version coding problem is to ensure that the latest possible version of the data is decodable even if the data updates have not reached all the servers in the system. In this paper, we study the storage cost of ensuring consistency for the case where the data versions are correlated, in contrast to previous work where the data versions were treated as being independent. We provide multi-version code constructions showing that, depending on the degree of correlation, the storage cost can be significantly smaller than that of previous constructions, despite the asynchrony and the decentralized nature of the system. Our achievability results are based on Reed-Solomon codes and random binning. Through an information-theoretic converse, we show that our multi-version codes are asymptotically nearly optimal, within a factor of 2, in certain interesting regimes.

Original language: English (US)
Article number: 8737969
Pages (from-to): 5907-5920
Number of pages: 14
Journal: IEEE Transactions on Communications
Volume: 67
Issue number: 9
State: Published - Sep 2019

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
