Locality-aware distributed loop scheduling for chip multiprocessors

L. Xue, Mahmut Kandemir, G. Chen, F. Li, O. Ozturk, R. Ramanarayanan, B. Vaidyanathan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Chip multiprocessors are becoming increasingly popular in the embedded domain because they have important advantages over their single-core counterparts in parallelism, power efficiency, validation, and verification. However, extracting maximum performance from these multiprocessors requires compiler support in the form of effective code parallelization. This paper presents and experimentally evaluates a locality-aware dynamic loop scheduling strategy that combines locality-aware distribution of loop iterations across parallel processors with dynamic load balancing at runtime. This hybrid scheme has been implemented and tested alongside four previously proposed loop scheduling schemes, including a locality-aware one. Our experimental analysis reveals that the proposed approach generates better results than all other scheduling schemes tested, static or dynamic. Our results also show that the improvements brought by the proposed scheme are consistent across experiments with different values of the major simulation parameters, such as the number of processors and the cache size per processor.

Original language: English (US)
Title of host publication: Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems
Pages: 251-256
Number of pages: 6
DOIs: https://doi.org/10.1109/VLSID.2007.97
State: Published - Dec 1 2007
Event: 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems, VLSID'07 - Bangalore, India
Duration: Jan 6 2007 → Jan 10 2007

Publication series

Name: Proceedings of the IEEE International Conference on VLSI Design
ISSN (Print): 1063-9667

Other

Other: 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems, VLSID'07
Country: India
City: Bangalore
Period: 1/6/07 → 1/10/07

Fingerprint

Scheduling
Dynamic loads
Resource allocation
Experiments

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Xue, L., Kandemir, M., Chen, G., Li, F., Ozturk, O., Ramanarayanan, R., & Vaidyanathan, B. (2007). Locality-aware distributed loop scheduling for chip multiprocessors. In Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems (pp. 251-256). [4092054] (Proceedings of the IEEE International Conference on VLSI Design). https://doi.org/10.1109/VLSID.2007.97
Xue, L. ; Kandemir, Mahmut ; Chen, G. ; Li, F. ; Ozturk, O. ; Ramanarayanan, R. ; Vaidyanathan, B. / Locality-aware distributed loop scheduling for chip multiprocessors. Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems. 2007. pp. 251-256 (Proceedings of the IEEE International Conference on VLSI Design).
@inproceedings{039f736fb9844682a5c9969f3868c9aa,
title = "Locality-aware distributed loop scheduling for chip multiprocessors",
abstract = "Chip multiprocessors are becoming increasingly popular in the embedded domain because they have important advantages over their single-core counterparts in parallelism, power efficiency, validation, and verification. However, extracting maximum performance from these multiprocessors requires compiler support in the form of effective code parallelization. This paper presents and experimentally evaluates a locality-aware dynamic loop scheduling strategy that combines locality-aware distribution of loop iterations across parallel processors with dynamic load balancing at runtime. This hybrid scheme has been implemented and tested alongside four previously proposed loop scheduling schemes, including a locality-aware one. Our experimental analysis reveals that the proposed approach generates better results than all other scheduling schemes tested, static or dynamic. Our results also show that the improvements brought by the proposed scheme are consistent across experiments with different values of the major simulation parameters, such as the number of processors and the cache size per processor.",
author = "L. Xue and Mahmut Kandemir and G. Chen and F. Li and O. Ozturk and R. Ramanarayanan and B. Vaidyanathan",
year = "2007",
month = "12",
day = "1",
doi = "10.1109/VLSID.2007.97",
language = "English (US)",
isbn = "0769527620",
series = "Proceedings of the IEEE International Conference on VLSI Design",
pages = "251--256",
booktitle = "Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems",
}

Xue, L, Kandemir, M, Chen, G, Li, F, Ozturk, O, Ramanarayanan, R & Vaidyanathan, B 2007, Locality-aware distributed loop scheduling for chip multiprocessors. in Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems., 4092054, Proceedings of the IEEE International Conference on VLSI Design, pp. 251-256, 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems, VLSID'07, Bangalore, India, 1/6/07. https://doi.org/10.1109/VLSID.2007.97

Locality-aware distributed loop scheduling for chip multiprocessors. / Xue, L.; Kandemir, Mahmut; Chen, G.; Li, F.; Ozturk, O.; Ramanarayanan, R.; Vaidyanathan, B.

Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems. 2007. p. 251-256 4092054 (Proceedings of the IEEE International Conference on VLSI Design).

TY - GEN

T1 - Locality-aware distributed loop scheduling for chip multiprocessors

AU - Xue, L.

AU - Kandemir, Mahmut

AU - Chen, G.

AU - Li, F.

AU - Ozturk, O.

AU - Ramanarayanan, R.

AU - Vaidyanathan, B.

PY - 2007/12/1

Y1 - 2007/12/1

N2 - Chip multiprocessors are becoming increasingly popular in the embedded domain because they have important advantages over their single-core counterparts in parallelism, power efficiency, validation, and verification. However, extracting maximum performance from these multiprocessors requires compiler support in the form of effective code parallelization. This paper presents and experimentally evaluates a locality-aware dynamic loop scheduling strategy that combines locality-aware distribution of loop iterations across parallel processors with dynamic load balancing at runtime. This hybrid scheme has been implemented and tested alongside four previously proposed loop scheduling schemes, including a locality-aware one. Our experimental analysis reveals that the proposed approach generates better results than all other scheduling schemes tested, static or dynamic. Our results also show that the improvements brought by the proposed scheme are consistent across experiments with different values of the major simulation parameters, such as the number of processors and the cache size per processor.

UR - http://www.scopus.com/inward/record.url?scp=48349122237&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=48349122237&partnerID=8YFLogxK

U2 - 10.1109/VLSID.2007.97

DO - 10.1109/VLSID.2007.97

M3 - Conference contribution

SN - 0769527620

SN - 9780769527628

T3 - Proceedings of the IEEE International Conference on VLSI Design

SP - 251

EP - 256

BT - Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems

ER -

Xue L, Kandemir M, Chen G, Li F, Ozturk O, Ramanarayanan R et al. Locality-aware distributed loop scheduling for chip multiprocessors. In Proceedings - 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems. 2007. p. 251-256. 4092054. (Proceedings of the IEEE International Conference on VLSI Design). https://doi.org/10.1109/VLSID.2007.97