Sharc: Managing CPU and Network Bandwidth in Shared Clusters

Bhuvan Urgaonkar, Prashant Shenoy

Research output: Contribution to journal › Article

51 Citations (Scopus)

Abstract

In this paper, we argue the need for effective resource management mechanisms for sharing resources in commodity clusters. To address this issue, we present the design of Sharc - a system that enables resource sharing among applications in such clusters. Sharc depends on single node resource management mechanisms such as reservations or shares, and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources - CPU and network interface bandwidth - on a cluster-wide basis. Our techniques allow Sharc to 1) support reservation of CPU and network interface bandwidth for distributed applications, 2) dynamically allocate resources based on past usage, and 3) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256 node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate size clusters.

Original language: English (US)
Pages (from-to): 2-17
Number of pages: 16
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 15
Issue number: 1
DOI: 10.1109/TPDS.2004.1264781
State: Published - Jan 1 2004

Fingerprint

Program processors
Bandwidth
Interfaces (computer)

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

@article{065cc3731ab24bf88255af8874e29076,
title = "Sharc: Managing CPU and Network Bandwidth in Shared Clusters",
abstract = "In this paper, we argue the need for effective resource management mechanisms for sharing resources in commodity clusters. To address this issue, we present the design of Sharc - a system that enables resource sharing among applications in such clusters. Sharc depends on single node resource management mechanisms such as reservations or shares, and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources - CPU and network interface bandwidth - on a cluster-wide basis. Our techniques allow Sharc to 1) support reservation of CPU and network interface bandwidth for distributed applications, 2) dynamically allocate resources based on past usage, and 3) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256 node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate size clusters.",
author = "Bhuvan Urgaonkar and Prashant Shenoy",
year = "2004",
month = "1",
day = "1",
doi = "10.1109/TPDS.2004.1264781",
language = "English (US)",
volume = "15",
pages = "2--17",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "1",

}

Sharc: Managing CPU and Network Bandwidth in Shared Clusters. / Urgaonkar, Bhuvan; Shenoy, Prashant.

In: IEEE Transactions on Parallel and Distributed Systems, Vol. 15, No. 1, 01.01.2004, p. 2-17.

TY - JOUR

T1 - Sharc

T2 - Managing CPU and Network Bandwidth in Shared Clusters

AU - Urgaonkar, Bhuvan

AU - Shenoy, Prashant

PY - 2004/1/1

Y1 - 2004/1/1

AB - In this paper, we argue the need for effective resource management mechanisms for sharing resources in commodity clusters. To address this issue, we present the design of Sharc - a system that enables resource sharing among applications in such clusters. Sharc depends on single node resource management mechanisms such as reservations or shares, and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources - CPU and network interface bandwidth - on a cluster-wide basis. Our techniques allow Sharc to 1) support reservation of CPU and network interface bandwidth for distributed applications, 2) dynamically allocate resources based on past usage, and 3) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256 node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate size clusters.

UR - http://www.scopus.com/inward/record.url?scp=0742303481&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0742303481&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2004.1264781

DO - 10.1109/TPDS.2004.1264781

M3 - Article

AN - SCOPUS:0742303481

VL - 15

SP - 2

EP - 17

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 1

ER -