Achieving Accountable MapReduce in cloud computing

Zhifeng Xiao, Yang Xiao

Research output: Contribution to journal (Article)

37 Citations (Scopus)

Abstract

MapReduce is a programming model that is capable of processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in case of various network abnormalities. However, fault tolerance does not guarantee that each working machine will be completely accountable; when nodes are malicious, they may intentionally misrepresent the processing result during mapping or reducing, and they may thus make the final results inaccurate and untrustworthy. In this paper, we propose Accountable MapReduce, which forces each machine to be held responsible for its behaviors. In our approach, we set up a group of auditors to perform an Accountability Test (A-test) that checks all of the working machines and detects malicious nodes in real time. The A-test can be implemented with different options depending upon how the auditors are assigned. To optimize resource utilization, we also formalize the Optimal Worker and Auditor Assignment (OWAA) problem, which aims to find the optimal number of workers and auditors in order to minimize the total processing time. Our evaluation results show that the A-test can be practically and effectively applied to existing cloud platforms employing MapReduce.
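
The abstract does not spell out how the A-test works internally, so the following is only a minimal sketch of the general idea it describes: an auditor re-executes part of a worker's map task and compares the result with what the worker reported. Every name here (`word_count`, `honest_worker`, `malicious_worker`, `a_test`) and the random-sampling strategy are assumptions made for illustration, not the authors' design.

```python
import random
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical type for a map function: one input record -> partial counts.
MapFn = Callable[[str], Dict[str, int]]


def word_count(record: str) -> Dict[str, int]:
    """Reference map function: count the words in one input record."""
    return dict(Counter(record.split()))


def honest_worker(records: List[str], map_fn: MapFn) -> List[Dict[str, int]]:
    """A worker that reports the true map output for every record."""
    return [map_fn(r) for r in records]


def malicious_worker(records: List[str], map_fn: MapFn) -> List[Dict[str, int]]:
    """A hypothetical cheater: inflates the counts for every even-indexed record."""
    reported = []
    for i, record in enumerate(records):
        result = map_fn(record)
        if i % 2 == 0:
            result = {word: count + 1 for word, count in result.items()}
        reported.append(result)
    return reported


def a_test(
    records: List[str],
    reported: List[Dict[str, int]],
    map_fn: MapFn,
    sample_size: int,
    rng: random.Random,
) -> bool:
    """Assumed auditor check: re-run map_fn on a random sample of records.

    Returns True when every sampled result matches the worker's report,
    and False as soon as one mismatch is found.
    """
    indices = rng.sample(range(len(records)), min(sample_size, len(records)))
    return all(map_fn(records[i]) == reported[i] for i in indices)


if __name__ == "__main__":
    records = ["a b a", "c d", "e", "f f"]
    rng = random.Random(0)
    print(a_test(records, honest_worker(records, word_count), word_count, 2, rng))
    print(a_test(records, malicious_worker(records, word_count), word_count, 4, rng))
```

In this sketch, auditing a larger sample raises the chance of catching a cheating worker but costs more auditor time, which is the kind of trade-off the OWAA problem in the abstract is concerned with.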

Original language: English (US)
Pages (from-to): 1-13
Number of pages: 13
Journal: Future Generation Computer Systems
Volume: 30
Issue number: 1
DOI: 10.1016/j.future.2013.07.001
State: Published - Jan 1 2014

Fingerprint

Cloud computing
Processing
Distributed computer systems
Fault tolerance

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

@article{f65283891e2b4453a54aa0a645c22d6a,
title = "Achieving Accountable MapReduce in cloud computing",
abstract = "MapReduce is a programming model that is capable of processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in case of various network abnormalities. However, fault-tolerance does not guarantee that each working machine will be completely accountable; when nodes are malicious, they may intentionally misrepresent the processing result during mapping or reducing, and they may thus make the final results inaccurate and untrustworthy. In this paper, we propose Accountable MapReduce, which forces each machine to be held responsible for its behaviors. In our approach, we set up a group of auditors to perform an Accountability Test (A-test) that checks all of the working machines and detects malicious nodes in real time. The A-test can be implemented with different options depending upon how the auditors are assigned. To optimize the utilization resource, we also formalize the Optimal Worker and Auditor Assignment (OWAA) problem, which is aimed at finding the optimal number of workers and auditors in order to minimize the total processing time. Our evaluation results show that the A-test can be practically and effectively applied to existing cloud platforms employing MapReduce.",
author = "Zhifeng Xiao and Yang Xiao",
year = "2014",
month = "1",
day = "1",
doi = "10.1016/j.future.2013.07.001",
language = "English (US)",
volume = "30",
pages = "1--13",
journal = "Future Generation Computer Systems",
issn = "0167-739X",
publisher = "Elsevier",
number = "1",
}

Achieving Accountable MapReduce in cloud computing. / Xiao, Zhifeng; Xiao, Yang.

In: Future Generation Computer Systems, Vol. 30, No. 1, 01.01.2014, p. 1-13.

TY - JOUR

T1 - Achieving Accountable MapReduce in cloud computing

AU - Xiao, Zhifeng

AU - Xiao, Yang

PY - 2014/1/1

Y1 - 2014/1/1

N2 - MapReduce is a programming model that is capable of processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in case of various network abnormalities. However, fault-tolerance does not guarantee that each working machine will be completely accountable; when nodes are malicious, they may intentionally misrepresent the processing result during mapping or reducing, and they may thus make the final results inaccurate and untrustworthy. In this paper, we propose Accountable MapReduce, which forces each machine to be held responsible for its behaviors. In our approach, we set up a group of auditors to perform an Accountability Test (A-test) that checks all of the working machines and detects malicious nodes in real time. The A-test can be implemented with different options depending upon how the auditors are assigned. To optimize the utilization resource, we also formalize the Optimal Worker and Auditor Assignment (OWAA) problem, which is aimed at finding the optimal number of workers and auditors in order to minimize the total processing time. Our evaluation results show that the A-test can be practically and effectively applied to existing cloud platforms employing MapReduce.

AB - MapReduce is a programming model that is capable of processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in case of various network abnormalities. However, fault-tolerance does not guarantee that each working machine will be completely accountable; when nodes are malicious, they may intentionally misrepresent the processing result during mapping or reducing, and they may thus make the final results inaccurate and untrustworthy. In this paper, we propose Accountable MapReduce, which forces each machine to be held responsible for its behaviors. In our approach, we set up a group of auditors to perform an Accountability Test (A-test) that checks all of the working machines and detects malicious nodes in real time. The A-test can be implemented with different options depending upon how the auditors are assigned. To optimize the utilization resource, we also formalize the Optimal Worker and Auditor Assignment (OWAA) problem, which is aimed at finding the optimal number of workers and auditors in order to minimize the total processing time. Our evaluation results show that the A-test can be practically and effectively applied to existing cloud platforms employing MapReduce.

UR - http://www.scopus.com/inward/record.url?scp=84883165397&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883165397&partnerID=8YFLogxK

U2 - 10.1016/j.future.2013.07.001

DO - 10.1016/j.future.2013.07.001

M3 - Article

AN - SCOPUS:84883165397

VL - 30

SP - 1

EP - 13

JO - Future Generation Computer Systems

JF - Future Generation Computer Systems

SN - 0167-739X

IS - 1

ER -