Adapting application execution in CMPs using helper threads

Yang Ding, Mahmut Kandemir, Padma Raghavan, Mary Jane Irwin

Research output: Contribution to journal, Article

7 Citations (Scopus)

Abstract

In parallel to the changes in both the architecture domain (the move toward chip multiprocessors, or CMPs) and the application domain (the move toward increasingly data-intensive workloads), issues such as performance, energy efficiency and CPU availability are becoming increasingly critical. CPU availability can change dynamically for several reasons, such as thermal overload, an increase in transient errors, or operating system scheduling. An important question in this context is how to adapt, in a CMP, the execution of a given application to changes in CPU availability at runtime. Our paper studies this problem, targeting the energy-delay product (EDP) as the main metric to optimize. We first observe that, in adapting application execution to varying CPU availability, one needs to consider the number of CPUs to use, the number of application threads to accommodate and the voltage/frequency levels to employ (if the CMP has this capability). We then propose to use helper threads to adapt application execution to changes in CPU availability, with the goal of minimizing the EDP. The helper thread runs in parallel with the application execution threads and tries to determine the ideal number of CPUs, threads and voltage/frequency levels to employ at any given point in execution. We illustrate this idea using four applications (Fast Fourier Transform, MultiGrid, LU decomposition and Conjugate Gradient) under different execution scenarios. The results collected through our experiments are very promising and indicate that significant EDP reductions are possible using helper threads. For example, we achieved up to 66.3%, 83.3%, 91.2%, and 94.2% savings in EDP when adjusting all the parameters properly in applications FFT, MG, LU, and CG, respectively. We also discuss how our approach can be extended to address multi-programmed workloads.
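As a rough illustration of the decision the abstract describes, the sketch below computes the energy-delay product (energy multiplied by execution time) and picks the combination of CPU count, thread count and frequency with the lowest EDP among those that fit the currently available CPUs. The cost model `toy_model`, the `Config` type and all numeric constants are hypothetical stand-ins invented for this example; they are not taken from the paper, whose helper thread would rely on its own measurements or estimates.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Config:
    cpus: int        # number of CPUs used
    threads: int     # number of application threads
    freq_ghz: float  # stand-in for a voltage/frequency level


def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy consumed times execution time."""
    return energy_joules * delay_seconds


def toy_model(cfg: Config, work: float = 100.0) -> Tuple[float, float]:
    """Hypothetical cost model (an assumption, not from the paper).

    Returns (energy, delay). Delay shrinks with usable parallelism and
    frequency; per-CPU power has a static part plus a cubic dynamic part.
    """
    parallelism = min(cfg.cpus, cfg.threads)
    # Assumed penalty when there are more threads than CPUs.
    oversubscription = 1.0 + 0.1 * max(0, cfg.threads - cfg.cpus)
    delay = work * oversubscription / (parallelism * cfg.freq_ghz)
    power = cfg.cpus * (1.0 + cfg.freq_ghz ** 3)
    return power * delay, delay


def best_config(
    available_cpus: int,
    thread_counts: Iterable[int],
    freqs: Iterable[float],
    model: Callable[[Config], Tuple[float, float]] = toy_model,
) -> Config:
    """Exhaustively search the small configuration space for minimum EDP."""
    candidates = (
        Config(c, t, f)
        for c, t, f in product(range(1, available_cpus + 1), thread_counts, freqs)
    )
    return min(candidates, key=lambda cfg: edp(*model(cfg)))


if __name__ == "__main__":
    # Re-run the search when CPU availability changes, e.g. from 4 to 2.
    for available in (4, 2):
        print(available, best_config(available, [1, 2, 4, 8], [1.0, 2.0]))
```

Exhaustive search is used here only because the toy space is tiny; a helper thread running alongside the application would presumably need a cheaper strategy.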

Original language: English (US)
Pages (from-to): 790-806
Number of pages: 17
Journal: Journal of Parallel and Distributed Computing
Volume: 69
Issue number: 9
DOI: 10.1016/j.jpdc.2009.04.004
State: Published - Sep 1 2009

Fingerprint

Chip multiprocessors
Thread
Program processors
Availability
Energy
Fast Fourier transforms
Workload
Voltage
LU decomposition
Conjugate Gradient
Electric potential
Overload
Fast Fourier transform
Energy Efficiency
Operating Systems
Energy efficiency
Scheduling
Optimise
Decomposition
Metric

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Ding, Yang; Kandemir, Mahmut; Raghavan, Padma; Irwin, Mary Jane. / Adapting application execution in CMPs using helper threads. In: Journal of Parallel and Distributed Computing. 2009; Vol. 69, No. 9. pp. 790-806.
@article{21a1df83e3314480ba62eff9ca0cf33e,
title = "Adapting application execution in CMPs using helper threads",
abstract = "In parallel to the changes in both the architecture domain (the move toward chip multiprocessors, or CMPs) and the application domain (the move toward increasingly data-intensive workloads), issues such as performance, energy efficiency and CPU availability are becoming increasingly critical. CPU availability can change dynamically for several reasons, such as thermal overload, an increase in transient errors, or operating system scheduling. An important question in this context is how to adapt, in a CMP, the execution of a given application to changes in CPU availability at runtime. Our paper studies this problem, targeting the energy-delay product (EDP) as the main metric to optimize. We first observe that, in adapting application execution to varying CPU availability, one needs to consider the number of CPUs to use, the number of application threads to accommodate and the voltage/frequency levels to employ (if the CMP has this capability). We then propose to use helper threads to adapt application execution to changes in CPU availability, with the goal of minimizing the EDP. The helper thread runs in parallel with the application execution threads and tries to determine the ideal number of CPUs, threads and voltage/frequency levels to employ at any given point in execution. We illustrate this idea using four applications (Fast Fourier Transform, MultiGrid, LU decomposition and Conjugate Gradient) under different execution scenarios. The results collected through our experiments are very promising and indicate that significant EDP reductions are possible using helper threads. For example, we achieved up to 66.3{\%}, 83.3{\%}, 91.2{\%}, and 94.2{\%} savings in EDP when adjusting all the parameters properly in applications FFT, MG, LU, and CG, respectively. We also discuss how our approach can be extended to address multi-programmed workloads.",
author = "Ding, Yang and Kandemir, Mahmut and Raghavan, Padma and Irwin, Mary Jane",
year = "2009",
month = "9",
day = "1",
doi = "10.1016/j.jpdc.2009.04.004",
language = "English (US)",
volume = "69",
pages = "790--806",
journal = "Journal of Parallel and Distributed Computing",
issn = "0743-7315",
publisher = "Academic Press Inc.",
number = "9",
}

Adapting application execution in CMPs using helper threads. / Ding, Yang; Kandemir, Mahmut; Raghavan, Padma; Irwin, Mary Jane.

In: Journal of Parallel and Distributed Computing, Vol. 69, No. 9, 01.09.2009, p. 790-806.

TY  - JOUR
T1  - Adapting application execution in CMPs using helper threads
AU  - Ding, Yang
AU  - Kandemir, Mahmut
AU  - Raghavan, Padma
AU  - Irwin, Mary Jane
PY  - 2009/9/1
Y1  - 2009/9/1
AB  - In parallel to the changes in both the architecture domain (the move toward chip multiprocessors, or CMPs) and the application domain (the move toward increasingly data-intensive workloads), issues such as performance, energy efficiency and CPU availability are becoming increasingly critical. CPU availability can change dynamically for several reasons, such as thermal overload, an increase in transient errors, or operating system scheduling. An important question in this context is how to adapt, in a CMP, the execution of a given application to changes in CPU availability at runtime. Our paper studies this problem, targeting the energy-delay product (EDP) as the main metric to optimize. We first observe that, in adapting application execution to varying CPU availability, one needs to consider the number of CPUs to use, the number of application threads to accommodate and the voltage/frequency levels to employ (if the CMP has this capability). We then propose to use helper threads to adapt application execution to changes in CPU availability, with the goal of minimizing the EDP. The helper thread runs in parallel with the application execution threads and tries to determine the ideal number of CPUs, threads and voltage/frequency levels to employ at any given point in execution. We illustrate this idea using four applications (Fast Fourier Transform, MultiGrid, LU decomposition and Conjugate Gradient) under different execution scenarios. The results collected through our experiments are very promising and indicate that significant EDP reductions are possible using helper threads. For example, we achieved up to 66.3%, 83.3%, 91.2%, and 94.2% savings in EDP when adjusting all the parameters properly in applications FFT, MG, LU, and CG, respectively. We also discuss how our approach can be extended to address multi-programmed workloads.
UR  - http://www.scopus.com/inward/record.url?scp=67651006179&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=67651006179&partnerID=8YFLogxK
U2  - 10.1016/j.jpdc.2009.04.004
DO  - 10.1016/j.jpdc.2009.04.004
M3  - Article
AN  - SCOPUS:67651006179
VL  - 69
SP  - 790
EP  - 806
JO  - Journal of Parallel and Distributed Computing
JF  - Journal of Parallel and Distributed Computing
SN  - 0743-7315
IS  - 9
ER  -