A novel migration-based NUCA design for chip multiprocessors

Mahmut Kandemir, Feihui Li, Mary Jane Irwin, Seung Woo Son

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

35 Citations (Scopus)

Abstract

Chip Multiprocessors (CMPs) and Non-Uniform Cache Architectures (NUCAs) represent two emerging trends in computer architecture. Targeting future CMP based systems with NUCA type L2 caches, this paper proposes a novel data migration algorithm for parallel applications and evaluates it. The goal of this migration scheme is to determine a suitable location for each data block within a large L2 space at any given point during execution. A unique characteristic of the proposed scheme is that it models the problem of optimal data placement in the L2 cache space as a two-dimensional post office placement problem, presents a practical architectural implementation of this model, and gives a detailed evaluation of the proposed implementation. In our experimental evaluation, we also compare our approach to a previously-proposed NUCA management scheme using applications from the specomp suite, oltp, specjbb, and specweb. These experiments show that our migration approach generates about 35% improvement, on average, in average L2 access latency over the previous migration scheme, and these L2 latency savings translate, on average, to 9.5% improvement in IPC (instructions per cycle). We also observed during our experiments that both the careful initial placement of data (which itself triggers migrations within the L2 space) and subsequent migrations (due to interprocessor data sharing) play an important role in achieving our performance improvements.
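The abstract does not spell out the placement model, so the following is only a minimal sketch of the classical two-dimensional post office problem, not the paper's algorithm or its hardware implementation. It assumes that L2 banks sit on a grid, that access cost grows with Manhattan distance, and that each requesting core is weighted by its access count for a block. All names (`weighted_median`, `post_office_location`, `total_cost`) are hypothetical. Under these assumptions, the cost-minimizing location is the weighted median taken independently in each dimension.

```python
from typing import List, Tuple

Requester = Tuple[int, int, int]  # (x, y, access_count); a hypothetical encoding


def weighted_median(points: List[Tuple[int, int]]) -> int:
    """Return the smallest coordinate whose cumulative weight reaches half the total."""
    total = sum(weight for _, weight in points)
    running = 0
    for coord, weight in sorted(points):
        running += weight
        if 2 * running >= total:
            return coord
    raise ValueError("at least one requester is needed")


def post_office_location(requesters: List[Requester]) -> Tuple[int, int]:
    """Pick the grid position minimizing weighted Manhattan distance.

    Manhattan distance separates into an x term and a y term, so each
    coordinate can be solved on its own with a weighted median.
    """
    x = weighted_median([(rx, w) for rx, _, w in requesters])
    y = weighted_median([(ry, w) for _, ry, w in requesters])
    return x, y


def total_cost(location: Tuple[int, int], requesters: List[Requester]) -> int:
    """Sum of access_count * Manhattan distance from each requester to the location."""
    lx, ly = location
    return sum(w * (abs(rx - lx) + abs(ry - ly)) for rx, ry, w in requesters)
```

For example, with requesters `[(0, 0, 1), (3, 0, 1), (3, 4, 5)]` the sketch picks `(3, 4)`, the position of the heaviest requester, because that core alone accounts for more than half of all accesses.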

Original language: English (US)
Title of host publication: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
DOIs: https://doi.org/10.1109/SC.2008.5216918
State: Published - Dec 1 2008
Event: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008 - Austin, TX, United States
Duration: Nov 15 2008 to Nov 21 2008

Publication series

Name: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008

Other

Other: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
Country: United States
City: Austin, TX
Period: 11/15/08 to 11/21/08

Fingerprint

Post offices
Computer architecture
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Cite this

Kandemir, M., Li, F., Irwin, M. J., & Son, S. W. (2008). A novel migration-based NUCA design for chip multiprocessors. In 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008 [5216918] (2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008). https://doi.org/10.1109/SC.2008.5216918
Kandemir, Mahmut ; Li, Feihui ; Irwin, Mary Jane ; Son, Seung Woo. / A novel migration-based NUCA design for chip multiprocessors. 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008. 2008. (2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008).
@inproceedings{9f117959e47f470f926c6df85f98a9b5,
title = "A novel migration-based NUCA design for chip multiprocessors",
abstract = "Chip Multiprocessors (CMPs) and Non-Uniform Cache Architectures (NUCAs) represent two emerging trends in computer architecture. Targeting future CMP based systems with NUCA type L2 caches, this paper proposes a novel data migration algorithm for parallel applications and evaluates it. The goal of this migration scheme is to determine a suitable location for each data block within a large L2 space at any given point during execution. A unique characteristic of the proposed scheme is that it models the problem of optimal data placement in the L2 cache space as a two-dimensional post office placement problem, presents a practical architectural implementation of this model, and gives a detailed evaluation of the proposed implementation. In our experimental evaluation, we also compare our approach to a previously-proposed NUCA management scheme using applications from the specomp suite, oltp, specjbb, and specweb. These experiments show that our migration approach generates about 35{\%} improvement, on average, in average L2 access latency over the previous migration scheme, and these L2 latency savings translate, on average, to 9.5{\%} improvement in IPC (instructions per cycle). We also observed during our experiments that both the careful initial placement of data (which itself triggers migrations within the L2 space) and subsequent migrations (due to interprocessor data sharing) play an important role in achieving our performance improvements.",
author = "Mahmut Kandemir and Feihui Li and Irwin, {Mary Jane} and Son, {Seung Woo}",
year = "2008",
month = "12",
day = "1",
doi = "10.1109/SC.2008.5216918",
language = "English (US)",
isbn = "9781424428359",
series = "2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008",
booktitle = "2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008",

}

Kandemir, M, Li, F, Irwin, MJ & Son, SW 2008, A novel migration-based NUCA design for chip multiprocessors. in 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008., 5216918, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008, Austin, TX, United States, 11/15/08. https://doi.org/10.1109/SC.2008.5216918

A novel migration-based NUCA design for chip multiprocessors. / Kandemir, Mahmut; Li, Feihui; Irwin, Mary Jane; Son, Seung Woo.

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008. 2008. 5216918 (2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008).

TY - GEN

T1 - A novel migration-based NUCA design for chip multiprocessors

AU - Kandemir, Mahmut

AU - Li, Feihui

AU - Irwin, Mary Jane

AU - Son, Seung Woo

PY - 2008/12/1

Y1 - 2008/12/1

N2 - Chip Multiprocessors (CMPs) and Non-Uniform Cache Architectures (NUCAs) represent two emerging trends in computer architecture. Targeting future CMP based systems with NUCA type L2 caches, this paper proposes a novel data migration algorithm for parallel applications and evaluates it. The goal of this migration scheme is to determine a suitable location for each data block within a large L2 space at any given point during execution. A unique characteristic of the proposed scheme is that it models the problem of optimal data placement in the L2 cache space as a two-dimensional post office placement problem, presents a practical architectural implementation of this model, and gives a detailed evaluation of the proposed implementation. In our experimental evaluation, we also compare our approach to a previously-proposed NUCA management scheme using applications from the specomp suite, oltp, specjbb, and specweb. These experiments show that our migration approach generates about 35% improvement, on average, in average L2 access latency over the previous migration scheme, and these L2 latency savings translate, on average, to 9.5% improvement in IPC (instructions per cycle). We also observed during our experiments that both the careful initial placement of data (which itself triggers migrations within the L2 space) and subsequent migrations (due to interprocessor data sharing) play an important role in achieving our performance improvements.

UR - http://www.scopus.com/inward/record.url?scp=70350746352&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350746352&partnerID=8YFLogxK

U2 - 10.1109/SC.2008.5216918

DO - 10.1109/SC.2008.5216918

M3 - Conference contribution

AN - SCOPUS:70350746352

SN - 9781424428359

T3 - 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008

BT - 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008

ER -

Kandemir M, Li F, Irwin MJ, Son SW. A novel migration-based NUCA design for chip multiprocessors. In 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008. 2008. 5216918. (2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008). https://doi.org/10.1109/SC.2008.5216918