Prefetch Tuning Optimizations

Diana Guttman, Meenakshi Arunachalam, Vlad Calina, Mahmut Kandemir

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter looks at methods that improve prefetching effectiveness, and therefore application performance, by drawing on the programmer's superior knowledge of the code. Prefetching is known to be extremely important for good performance on in-order architectures such as the Intel Xeon Phi coprocessor; however, the authors surprised even themselves by exposing techniques that show value on out-of-order cores as well. Often, simply tuning the compiler prefetching distance is an easy way for application developers to get better performance without having to rewrite their code. In some cases, the more labor-intensive method of adding prefetch intrinsics may be worthwhile.
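As a rough illustration of the second approach, the sketch below adds a prefetch intrinsic to an indirect-access loop. It is not taken from the chapter: the function `gather_sum`, the macro `PREFETCH_DISTANCE`, and its default of 64 elements are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, assumed here as a stand-in for whichever intrinsic the authors use. The distance is the tunable quantity; a suitable value depends on the workload and the machine.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in loop iterations. The best value is
 * workload- and machine-dependent and would be found by measurement. */
#ifndef PREFETCH_DISTANCE
#define PREFETCH_DISTANCE 64
#endif

/* Sums a[idx[i]] for i in [0, n). The indirect access a[idx[i]] is hard
 * for hardware prefetchers to predict, so a software prefetch is issued
 * PREFETCH_DISTANCE iterations ahead of the element being consumed. */
double gather_sum(const double *a, const size_t *idx, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0 = prefetch for read; third argument 3 =
             * high temporal locality (keep in all cache levels). */
            __builtin_prefetch(&a[idx[i + PREFETCH_DISTANCE]], 0, 3);
        }
        sum += a[idx[i]];
    }
    return sum;
}
```

Because a prefetch is only a hint, the result is identical with or without it; only the timing changes. Rebuilding with a different `-DPREFETCH_DISTANCE=...` value is one simple way to experiment with the distance.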

Original language: English (US)
Title of host publication: High Performance Parallelism Pearls
Subtitle of host publication: Multicore and Many-core Programming Approaches
Publisher: Elsevier Inc.
Pages: 401-419
Number of pages: 19
Volume: 2
ISBN (Electronic): 9780128038901
ISBN (Print): 9780128038192
DOI: https://doi.org/10.1016/B978-0-12-803819-2.00018-5
State: Published - Jul 23 2015

Fingerprint

Tuning
Personnel
Coprocessor

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Guttman, D., Arunachalam, M., Calina, V., & Kandemir, M. (2015). Prefetch Tuning Optimizations. In High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches (Vol. 2, pp. 401-419). Elsevier Inc. https://doi.org/10.1016/B978-0-12-803819-2.00018-5


