Modeling and analysis of dynamic coscheduling in parallel and distributed environments

Mark S. Squillante, Yanyong Zhang, Anand Sivasubramaniam, Natarajan Gautam, Hubertus Franke, Jose Moreira

Research output: Contribution to journal, Conference article

15 Scopus citations

Abstract

Scheduling in large-scale parallel systems has been and continues to be an important and challenging research problem. Several key factors, including the increasing use of off-the-shelf clusters of workstations to build such parallel systems, have resulted in the emergence of a new class of scheduling strategies, broadly referred to as dynamic coscheduling. Unfortunately, the size of both the design and performance spaces of these emerging scheduling strategies is quite large, due in part to the numerous dynamic interactions among the different components of the parallel computing environment as well as the wide range of applications and systems that can comprise the parallel environment. This in turn makes it difficult to fully explore the benefits and limitations of the various proposed dynamic coscheduling approaches for large-scale systems solely with the use of simulation and/or experimentation. To gain a better understanding of the fundamental properties of different dynamic coscheduling methods, we formulate a general mathematical model of this class of scheduling strategies within a unified framework that allows us to investigate a wide range of parallel environments. We derive a matrix-analytic analysis based on a stochastic decomposition and a fixed-point iteration. A large number of numerical experiments are performed in part to examine the accuracy of our approach. These numerical results are in excellent agreement with detailed simulation results. Our mathematical model and analysis are then used to explore several fundamental design and performance tradeoffs associated with the class of dynamic coscheduling policies across a broad spectrum of parallel computing environments.

Original language: English (US)
Pages (from-to): 43-54
Number of pages: 12
Journal: Performance Evaluation Review
Volume: 30
Issue number: 1
State: Published - 2002
Event: ACM SIGMETRICS 2002 International Conference on Measurement and Modeling of Computer Systems - Marina Del Rey, CA, United States
Duration: Jun 15 2002 to Jun 19 2002

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
