Conjugate gradient sparse solvers

Performance-power characteristics

Korad Malkowski, Ingyu Lee, Padma Raghavan, Mary Jane Irwin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

We characterize the performance and power attributes of the conjugate gradient (CG) sparse solver, which is widely used in scientific applications. We use cycle-accurate simulations with SimpleScalar and Wattch on a processor and memory architecture similar to that of a BlueGene/L node. We first demonstrate that substantial power savings can be obtained without performance degradation if low-power modes of caches can be utilized. We next show that if Dynamic Voltage Scaling (DVS) can be used, power and energy savings are possible, but only at the expense of performance. We then consider two simple memory subsystem optimizations, namely memory and level-2 cache prefetching. We demonstrate that when DVS and low-power cache modes are combined with these optimizations, performance can be improved significantly while power and energy are reduced. For example, execution time is reduced by 23%, power by 55%, and energy by 65% in the final configuration at 500 MHz relative to the original at 1 GHz. We also use our codes and the CG NAS benchmark code to demonstrate that performance and power profiles can vary significantly depending on matrix properties and the level of code tuning. These results indicate that architectural evaluations can benefit if traditional benchmarks are augmented with codes more representative of tuned scientific applications.
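The three percentages quoted in the abstract can be cross-checked with simple arithmetic. The sketch below assumes that the reported power figure is the average power over the run, so that energy equals average power multiplied by execution time; that interpretation and the function name `energy_reduction` are assumptions for illustration and do not come from the paper.

```python
def energy_reduction(time_reduction: float, power_reduction: float) -> float:
    """Fractional energy reduction implied by fractional reductions in
    execution time and in average power, assuming E = P_avg * t."""
    relative_time = 1.0 - time_reduction      # e.g. 0.77 of the baseline time
    relative_power = 1.0 - power_reduction    # e.g. 0.45 of the baseline power
    relative_energy = relative_power * relative_time
    return 1.0 - relative_energy


# Figures from the abstract: time reduced by 23%, power by 55%.
implied = energy_reduction(time_reduction=0.23, power_reduction=0.55)
print(f"implied energy reduction: {implied:.0%}")  # prints "implied energy reduction: 65%"
```

Under that assumption, 0.77 × 0.45 ≈ 0.35 of the baseline energy remains, which agrees with the 65% energy reduction reported.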

Original language: English (US)
Title of host publication: 20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Publisher: IEEE Computer Society
ISBN (Print): 1424400546, 9781424400546
DOIs: https://doi.org/10.1109/IPDPS.2006.1639595
State: Published - Jan 1 2006
Event: 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006 - Rhodes Island, Greece
Duration: Apr 25 2006 to Apr 29 2006

Publication series

Name: 20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Volume: 2006

Other

Other: 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006
Country: Greece
City: Rhodes Island
Period: 4/25/06 to 4/29/06

Fingerprint

Data storage equipment
Memory architecture
Energy conservation
Tuning
Degradation
Voltage scaling

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Malkowski, K., Lee, I., Raghavan, P., & Irwin, M. J. (2006). Conjugate gradient sparse solvers: Performance-power characteristics. In 20th International Parallel and Distributed Processing Symposium, IPDPS 2006 [1639595] (20th International Parallel and Distributed Processing Symposium, IPDPS 2006; Vol. 2006). IEEE Computer Society. https://doi.org/10.1109/IPDPS.2006.1639595
@inproceedings{89b70b5a769940cebde7aa1e30f510d9,
title = "Conjugate gradient sparse solvers: Performance-power characteristics",
author = "Korad Malkowski and Ingyu Lee and Padma Raghavan and Irwin, {Mary Jane}",
year = "2006",
month = "1",
day = "1",
doi = "10.1109/IPDPS.2006.1639595",
language = "English (US)",
isbn = "1424400546",
series = "20th International Parallel and Distributed Processing Symposium, IPDPS 2006",
publisher = "IEEE Computer Society",
booktitle = "20th International Parallel and Distributed Processing Symposium, IPDPS 2006",
address = "United States",
}
