Integer linear programming approach for optimizing cache locality

Mahmut Kandemir, P. Banerjee, A. Choudhary, J. Ramanujam, E. Ayguade

Research output: Contribution to conference › Paper

10 Citations (Scopus)

Abstract

The actual performance of programs on modern processors that employ deep memory hierarchies is closely related to the performance of the memory subsystem. Compiler optimizations aimed at improving cache locality are critical in realizing the performance potential of powerful processors. For scientific applications, several loop transformations have been shown to be useful in improving both temporal and spatial locality. Recently, there has been some work in the area of data layout optimizations, i.e., changing the memory layouts of multi-dimensional arrays from the language-defined default such as column-major storage in Fortran. These memory layout optimizations affect the spatial locality characteristics of loop nests. This paper presents a technique based on integer linear programming (ILP) that attempts to derive the best combination of loop and data layout transformations. Prior attempts to unify loop and data layout transformations for programs consisting of a sequence of loop nests have been based on heuristics not only for transformations for a single loop nest but also for the sequence in which loop nests will be considered. The ILP formulation presented here obviates the need for such heuristics. Experimental results on a MIPS R10000 based system demonstrate the benefits of this approach, and show that the use of the ILP formulation does not increase the compilation time significantly.
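The coupling between loop order and array layout described in the abstract can be illustrated with a small sketch. This is not the paper's ILP formulation: it uses exhaustive enumeration over 0-1 choices as a stand-in for an ILP solver, and the two-nest program, the `NESTS` table, and the function names are hypothetical. It counts a reference as having spatial locality when the innermost loop index walks the array's fastest-varying dimension.

```python
from itertools import product

# Each nest lists its references as (array, subscript order); ("A", "ij") means A[i][j].
# Hypothetical two-nest program, not taken from the paper.
NESTS = [
    [("A", "ij"), ("B", "ji")],
    [("B", "ij"), ("C", "ji")],
]
ARRAYS = sorted({name for nest in NESTS for name, _ in nest})


def has_spatial_locality(subscripts, layout, inner_loop):
    """True if the innermost loop index walks the array's fastest-varying dimension."""
    # Row-major: the last subscript is contiguous; column-major: the first one is.
    fastest_dim = 1 if layout == "row" else 0
    return subscripts[fastest_dim] == inner_loop


def best_combination(nests, arrays):
    """Enumerate every choice of one layout per array and one inner loop per nest."""
    best = None
    for layouts in product(("row", "col"), repeat=len(arrays)):
        layout_of = dict(zip(arrays, layouts))
        for inner_loops in product("ij", repeat=len(nests)):
            # Score = number of references with unit stride in the innermost loop.
            score = sum(
                has_spatial_locality(subs, layout_of[name], inner)
                for nest, inner in zip(nests, inner_loops)
                for name, subs in nest
            )
            if best is None or score > best[0]:
                best = (score, layout_of, inner_loops)
    return best


score, layout_of, inner_loops = best_combination(NESTS, ARRAYS)
print(score, layout_of, inner_loops)
```

Because array `B` appears in both nests but has a single layout, the choice made for one nest constrains the other. Searching over all nests jointly, rather than fixing them one at a time in some chosen order, is the property the abstract attributes to the ILP formulation; a real implementation would hand the same 0-1 variables to a solver instead of enumerating them.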

Original language: English (US)
Pages: 500-509
Number of pages: 10
State: Published - Jan 1 1999
Event: Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 - Rhodes, Greece
Duration: Jun 20 1999 to Jun 25 1999

Conference

Conference: Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99
City: Rhodes, Greece
Period: 6/20/99 to 6/25/99

Fingerprint

Linear programming
Data storage equipment

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

Kandemir, M., Banerjee, P., Choudhary, A., Ramanujam, J., & Ayguade, E. (1999). Integer linear programming approach for optimizing cache locality (pp. 500-509). Paper presented at Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99, Rhodes, Greece.
@conference{ba958052479e47779bf9e2982cee6f51,
title = "Integer linear programming approach for optimizing cache locality",
author = "Mahmut Kandemir and P. Banerjee and A. Choudhary and J. Ramanujam and E. Ayguade",
year = "1999",
month = "1",
day = "1",
language = "English (US)",
pages = "500--509",
note = "Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 ; Conference date: 20-06-1999 Through 25-06-1999",

}


TY - CONF

T1 - Integer linear programming approach for optimizing cache locality

AU - Kandemir, Mahmut

AU - Banerjee, P.

AU - Choudhary, A.

AU - Ramanujam, J.

AU - Ayguade, E.

PY - 1999/1/1

Y1 - 1999/1/1

UR - http://www.scopus.com/inward/record.url?scp=0032680282&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032680282&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:0032680282

SP - 500

EP - 509

ER -
