Optimizing sparse matrix vector multiplication on emerging multicores

Orhan Kislal, Wei Ding, Mahmut Kandemir, Ilteris Demirkiran

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

After hitting the power wall, the dramatic shift in computer architecture from single-core to multicore/manycore designs has brought new challenges to high performance computing, especially for data-intensive applications. Sparse matrix-vector multiplication (SpMV) is one of the most important computations in this area and has therefore received a lot of attention in recent decades. In contrast to uniform/regular dense matrix computations, SpMV's irregular data access patterns, combined with the compact data structures used for storage, make SpMV optimization more complex than optimizing regular/dense matrix computations. In this work, we look at the SpMV optimization problem in the context of emerging multicores from an architecture-conscious perspective and propose an optimization strategy with three key components: mapping, scheduling, and data layout reorganization. Specifically, the mapping component derives a suitable iteration-to-core mapping; the scheduling component determines the execution order of the loop iterations assigned to each core of the target multicore architecture; and the data layout reorganization component prepares multiple memory layouts for the source (input) vector, customized for different row patterns. A distinguishing characteristic of our approach is that it is cache hierarchy aware: all three components take the underlying cache hierarchy of the target multicore architecture into account, so the derived solution is, in a sense, customized to the target architecture. We evaluate the proposed strategy using 10 sparse matrices on two different multicore systems. Our experimental evaluation reveals that the proposed optimization algorithm brings significant performance improvements (up to 26.5%) over the unoptimized case.
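For context, the kernel being optimized is the standard sparse matrix-vector product y = A*x. The sketch below is a minimal, unoptimized CSR-based SpMV baseline in C; it is purely illustrative (the field names row_ptr, col_idx, and vals are conventional CSR names, not taken from the paper) and does not reflect the authors' cache-hierarchy-aware mapping, scheduling, or layout reorganization.

/*
 * Minimal CSR SpMV baseline, y = A * x.
 * Illustrative sketch only; not the authors' optimized implementation.
 */
typedef struct {
    int     nrows;    /* number of rows                        */
    int    *row_ptr;  /* size nrows+1: start of each row's nonzeros */
    int    *col_idx;  /* size nnz: column index of each nonzero     */
    double *vals;     /* size nnz: nonzero values                   */
} csr_matrix;

void spmv_csr(const csr_matrix *A, const double *x, double *y)
{
    for (int i = 0; i < A->nrows; i++) {
        double sum = 0.0;
        /* Each row reads x[] at irregular positions given by col_idx;
         * this indirect access is the source of the cache-unfriendly
         * behavior discussed in the abstract. */
        for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++)
            sum += A->vals[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}

The indirect accesses to x through col_idx are what make locality hard to control; the paper's mapping, scheduling, and data layout reorganization components target exactly this behavior with respect to the target machine's cache hierarchy.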

Original language: English (US)
Title of host publication: 2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013
ISBN (Print): 9781479910106
DOIs: 10.1109/MuCoCoS.2013.6633600
State: Published - Dec 16 2013
Event: 2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013 - Edinburgh, United Kingdom
Duration: Sep 7 2013 - Sep 7 2013

Publication series

Name: 2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013

Other

Other: 2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013
Country: United Kingdom
City: Edinburgh
Period: 9/7/13 - 9/7/13

Fingerprint

  • Scheduling
  • Computer architecture
  • Data structures
  • Data storage equipment

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Kislal, O., Ding, W., Kandemir, M., & Demirkiran, I. (2013). Optimizing sparse matrix vector multiplication on emerging multicores. In 2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013 [6633600] (2013 IEEE 6th International Workshop on Multi-/Many-Core Computing Systems, MuCoCoS 2013). https://doi.org/10.1109/MuCoCoS.2013.6633600