TY - GEN
T1 - Locality-aware mapping and scheduling for multicores
AU - Ding, Wei
AU - Zhang, Yuanrui
AU - Kandemir, Mahmut
AU - Srinivas, Jithendra
AU - Yedlapalli, Praveen
PY - 2013/5/6
Y1 - 2013/5/6
AB - This paper presents a cache hierarchy-aware code mapping and scheduling strategy for multicore architectures. Our mapping strategy determines a loop iteration-to-core mapping by taking into account the application's data access patterns and the on-chip cache hierarchy. It employs a novel concept called 'core vectors' to obtain a mapping matrix that exploits data reuse at different layers of the cache hierarchy based on reuse distances, with the goal of maximizing data locality at each level while minimizing data dependences across cores. Our scheduling strategy, on the other hand, determines a schedule for the iterations assigned to each core, with the goal of reducing data reuse distances across cores for dependence-free loop nests. Our experimental evaluation shows that the proposed mapping scheme significantly reduces miss rates at all cache levels as well as application execution time, and that, when mapping is supported by scheduling, the reductions in cache miss rates and execution time become much larger.
UR - http://www.scopus.com/inward/record.url?scp=84876914722&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876914722&partnerID=8YFLogxK
U2 - 10.1109/CGO.2013.6495009
DO - 10.1109/CGO.2013.6495009
M3 - Conference contribution
AN - SCOPUS:84876914722
SN - 9781467355254
T3 - Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2013
BT - Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2013
T2 - 11th IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2013
Y2 - 23 February 2013 through 27 February 2013
ER -