Achieving Accountable MapReduce in cloud computing

Zhifeng Xiao, Yang Xiao

Research output: Contribution to journal, Article

38 Scopus citations

Abstract

MapReduce is a programming model for processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in the face of various network abnormalities. However, fault tolerance does not guarantee that each working machine is accountable: a malicious node may intentionally misrepresent its processing result during mapping or reducing, making the final results inaccurate and untrustworthy. In this paper, we propose Accountable MapReduce, which holds each machine responsible for its behavior. In our approach, we set up a group of auditors to perform an Accountability Test (A-test) that checks all of the working machines and detects malicious nodes in real time. The A-test can be implemented with different options depending upon how the auditors are assigned. To optimize resource utilization, we also formalize the Optimal Worker and Auditor Assignment (OWAA) problem, which aims to find the optimal number of workers and auditors in order to minimize the total processing time. Our evaluation results show that the A-test can be practically and effectively applied to existing cloud platforms employing MapReduce.
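
The abstract does not spell out how an auditor checks a worker, so the following is only a minimal sketch of one plausible reading: an auditor re-executes a random sample of a worker's map inputs with a trusted reference function and flags any record whose claimed output differs. The names `word_count_map`, `tampering_map`, `run_worker`, and `a_test`, the word-count job, and the sampling strategy are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

KV = Tuple[str, int]
MapFn = Callable[[str], List[KV]]


def word_count_map(record: str) -> List[KV]:
    """Trusted reference map function (assumed job): emit (word, 1) per word."""
    return [(word, 1) for word in record.split()]


def tampering_map(record: str) -> List[KV]:
    """Hypothetical malicious worker: silently drops the first emitted pair."""
    return word_count_map(record)[1:]


def run_worker(map_fn: MapFn, records: List[str]) -> Dict[int, List[KV]]:
    """A worker maps every record in its split; outputs are keyed by record index."""
    return {index: map_fn(record) for index, record in enumerate(records)}


def a_test(
    records: List[str],
    claimed: Dict[int, List[KV]],
    sample_size: int,
    rng: random.Random,
) -> List[int]:
    """Illustrative auditor check (not the paper's exact procedure).

    Re-executes the reference map function on a random sample of records and
    returns the sorted indices whose claimed output differs as a multiset.
    """
    sample = rng.sample(range(len(records)), min(sample_size, len(records)))
    mismatches = []
    for index in sample:
        expected = Counter(word_count_map(records[index]))
        reported = Counter(claimed.get(index, []))
        if expected != reported:
            mismatches.append(index)
    return sorted(mismatches)
```

Under these assumptions, calling `a_test` on the output of `run_worker(word_count_map, records)` returns an empty list, while the output of `run_worker(tampering_map, records)` is flagged on every sampled non-empty record. A smaller `sample_size` reduces the auditor's work at the cost of a lower chance of catching a worker that tampers with only a few records, which is the kind of worker-versus-auditor trade-off the OWAA problem is described as addressing.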

Original language: English (US)
Pages (from-to): 1-13
Number of pages: 13
Journal: Future Generation Computer Systems
Volume: 30
Issue number: 1
State: Published - Jan 1 2014

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
