A selective protection scheme of applications using asymmetrically reliable caches

Sanem Arslan, Haluk Rahmi Topcuoglu, Mahmut Kandemir, Oguz Tosun

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Cache structures in a multicore system are highly vulnerable to soft errors. Enabling fault tolerance capabilities on all cache structures in a system is inefficient in terms of performance and power consumption. In this study, we propose an enhanced protection mechanism for code segments, which are critical in terms of reliability, by utilizing asymmetrically reliable cores under performance and power constraints. Our proposed system contains at least one high-reliability core, which has an ECC-protected L1 cache, and several low-reliability cores, which have no protection mechanisms. Reliability-based critical code regions are assumed to be high-priority functions, which are extracted by examining the execution time percentages and the program's call graph in our framework, statically. Software threads that invoke one of the high-priority functions are bound to the high-reliability cores dynamically during the execution, while the threads that execute the remaining functions are bound to the low-reliability cores. As part of the experimental analysis, our proposed framework is compared with traditional fully protected and unprotected configurations with respect to performance, power and reliability metrics for various applications of the benchmarks. Our framework exploits the benefits of providing the reliability-based critical regions of the applications exclusively by offering notable power and cost savings with close performance and reliability values for the set of functions reported in the experimental results.
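The selection and binding steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The function names, the threshold value, the example profile, and the rule that callees of a selected function are also treated as high-priority are all assumptions made for the example; the abstract states only that execution time percentages and the call graph are examined statically. The optional binding helper relies on `os.sched_setaffinity`, which is available on Linux only.

```python
import os
from typing import Dict, List, Set


def select_high_priority(time_share: Dict[str, float],
                         call_graph: Dict[str, List[str]],
                         threshold: float) -> Set[str]:
    """Static step (assumed rule): pick functions whose share of execution
    time meets the threshold, then add everything they call, transitively."""
    selected = {fn for fn, share in time_share.items() if share >= threshold}
    stack = list(selected)
    while stack:
        fn = stack.pop()
        for callee in call_graph.get(fn, []):
            if callee not in selected:
                selected.add(callee)
                stack.append(callee)
    return selected


def pick_cores(function: str,
               high_priority: Set[str],
               reliable_cores: Set[int],
               unreliable_cores: Set[int]) -> Set[int]:
    """Dynamic step: choose the core set a thread should run on
    when it is about to invoke the given function."""
    return reliable_cores if function in high_priority else unreliable_cores


def bind_current_thread(cores: Set[int]) -> bool:
    """Bind the calling thread to the given cores where the OS supports it.
    Returns False on platforms without os.sched_setaffinity."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, cores)
    return True


# Hypothetical profile: fraction of total execution time per function.
example_time_share = {"main": 0.05, "solve": 0.60, "dot": 0.25, "log": 0.10}
# Hypothetical call graph: caller -> callees.
example_call_graph = {"main": ["solve", "log"], "solve": ["dot"]}

high_priority = select_high_priority(example_time_share, example_call_graph, 0.5)
reliable = {0}            # assumed: core 0 has the ECC-protected L1 cache
unreliable = {1, 2, 3}    # assumed: remaining cores are unprotected
```

In this sketch a thread about to call `solve` or `dot` would be directed to core 0, while a thread calling `log` would be directed to cores 1 to 3. A real system would also need a hook at function entry and exit to trigger the rebinding, which is outside the scope of this example.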

Original language: English (US)
Pages (from-to): 133-144
Number of pages: 12
Journal: Journal of Systems Architecture
Volume: 75
DOI: 10.1016/j.sysarc.2016.12.004
State: Published - Apr 1 2017

Fingerprint

Fault tolerance
Electric power utilization
Costs

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture

Cite this

Arslan, Sanem ; Topcuoglu, Haluk Rahmi ; Kandemir, Mahmut ; Tosun, Oguz. / A selective protection scheme of applications using asymmetrically reliable caches. In: Journal of Systems Architecture. 2017 ; Vol. 75. pp. 133-144.
@article{062c15128795464e9235ddb8b962e319,
title = "A selective protection scheme of applications using asymmetrically reliable caches",
abstract = "Cache structures in a multicore system are highly vulnerable to soft errors. Enabling fault tolerance capabilities on all cache structures in a system is inefficient in terms of performance and power consumption. In this study, we propose an enhanced protection mechanism for code segments, which are critical in terms of reliability, by utilizing asymmetrically reliable cores under performance and power constraints. Our proposed system contains at least one high-reliability core, which has an ECC-protected L1 cache, and several low-reliability cores, which have no protection mechanisms. Reliability-based critical code regions are assumed to be high-priority functions, which are extracted by examining the execution time percentages and the program's call graph in our framework, statically. Software threads that invoke one of the high-priority functions are bound to the high-reliability cores dynamically during the execution, while the threads that execute the remaining functions are bound to the low-reliability cores. As part of the experimental analysis, our proposed framework is compared with traditional fully protected and unprotected configurations with respect to performance, power and reliability metrics for various applications of the benchmarks. Our framework exploits the benefits of providing the reliability-based critical regions of the applications exclusively by offering notable power and cost savings with close performance and reliability values for the set of functions reported in the experimental results.",
author = "Sanem Arslan and Topcuoglu, {Haluk Rahmi} and Mahmut Kandemir and Oguz Tosun",
year = "2017",
month = "4",
day = "1",
doi = "10.1016/j.sysarc.2016.12.004",
language = "English (US)",
volume = "75",
pages = "133--144",
journal = "Journal of Systems Architecture",
issn = "1383-7621",
publisher = "Elsevier",

}

TY - JOUR

T1 - A selective protection scheme of applications using asymmetrically reliable caches

AU - Arslan, Sanem

AU - Topcuoglu, Haluk Rahmi

AU - Kandemir, Mahmut

AU - Tosun, Oguz

PY - 2017/4/1

Y1 - 2017/4/1

AB - Cache structures in a multicore system are highly vulnerable to soft errors. Enabling fault tolerance capabilities on all cache structures in a system is inefficient in terms of performance and power consumption. In this study, we propose an enhanced protection mechanism for code segments, which are critical in terms of reliability, by utilizing asymmetrically reliable cores under performance and power constraints. Our proposed system contains at least one high-reliability core, which has an ECC-protected L1 cache, and several low-reliability cores, which have no protection mechanisms. Reliability-based critical code regions are assumed to be high-priority functions, which are extracted by examining the execution time percentages and the program's call graph in our framework, statically. Software threads that invoke one of the high-priority functions are bound to the high-reliability cores dynamically during the execution, while the threads that execute the remaining functions are bound to the low-reliability cores. As part of the experimental analysis, our proposed framework is compared with traditional fully protected and unprotected configurations with respect to performance, power and reliability metrics for various applications of the benchmarks. Our framework exploits the benefits of providing the reliability-based critical regions of the applications exclusively by offering notable power and cost savings with close performance and reliability values for the set of functions reported in the experimental results.

UR - http://www.scopus.com/inward/record.url?scp=85008219319&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85008219319&partnerID=8YFLogxK

U2 - 10.1016/j.sysarc.2016.12.004

DO - 10.1016/j.sysarc.2016.12.004

M3 - Article

AN - SCOPUS:85008219319

VL - 75

SP - 133

EP - 144

JO - Journal of Systems Architecture

JF - Journal of Systems Architecture

SN - 1383-7621

ER -