The multi-version coding problem is a recently formulated information-theoretic framework to study the storage cost of consistent key-value data stores. Previous work on multi-version coding considered a completely decentralized asynchronous system where the nodes (servers) are not aware of which updates (versions) of the data are received by the other nodes. In this paper, we relax this assumption and study a system where a node acquires side information of the versions propagated to some other nodes based on the network topology. Specifically, we study a storage system with n nodes over a graph that stores nu totally ordered versions of an object (message). Each node receives a subset of these nu versions. A node is aware of which versions that were received by its neighbors in the network graph. Our code constructions show that the side information can result in a better storage cost as compared with the case where the nodes do not exchange side information for some regimes at the expense of the additional latency and the negligible communication overhead of exchanging the side information. Through an information-theoretic converse, we identify surprising scenarios where exchanging tremendous amount of side information does not reduce the storage cost. Finally, we present a case study over Amazon web services (AWS) that demonstrates the potential storage cost reductions of our code constructions.
All Science Journal Classification (ASJC) codes
- Electrical and Electronic Engineering