TY - JOUR
T1 - Scheduling opportunities for asymmetrically reliable caches
AU - Arslan, Sanem
AU - Topcuoglu, Haluk Rahmi
AU - Kandemir, Mahmut Taylan
AU - Tosun, Oguz
N1 - Funding Information:
This research was supported by The Scientific and Technological Research Council of Turkey (TUBITAK) with a research grant (Project Number: 113E530).
Publisher Copyright:
© 2019 Elsevier Inc.
PY - 2019/4
Y1 - 2019/4
N2 - Modern systems are becoming more vulnerable to soft errors as technology scales. Applying fault-tolerance strategies to all structures in a system may lead to high energy consumption. Our framework, which uses asymmetrically reliable caches with at least one protected core and several unprotected cores, dynamically assigns the software threads executing critical code fragments to the protected core(s) using an FCFS-based algorithm. The framework can provide good trade-offs among reliability, performance, and power consumption compared with fully protected and unprotected systems. However, the FCFS-based scheduling algorithm may degrade system performance and unfairly slow down applications for some workloads. In this paper, a set of scheduling algorithms is proposed to improve both system performance and fairness. Various static priority techniques that require preliminary information about the applications (such as their execution order, cache usage, number of requests sent to the protected core(s), and total burst time spent on the protected core(s)) are implemented and evaluated. In addition, dynamic priority techniques are presented that aim to equalize either the total time applications spend on the protected core(s) or the progress of the applications’ requests. Extensive evaluations using multi-application workloads show that our static and dynamic priority scheduling techniques significantly improve system performance and fairness over the FCFS algorithm.
AB - Modern systems are becoming more vulnerable to soft errors as technology scales. Applying fault-tolerance strategies to all structures in a system may lead to high energy consumption. Our framework, which uses asymmetrically reliable caches with at least one protected core and several unprotected cores, dynamically assigns the software threads executing critical code fragments to the protected core(s) using an FCFS-based algorithm. The framework can provide good trade-offs among reliability, performance, and power consumption compared with fully protected and unprotected systems. However, the FCFS-based scheduling algorithm may degrade system performance and unfairly slow down applications for some workloads. In this paper, a set of scheduling algorithms is proposed to improve both system performance and fairness. Various static priority techniques that require preliminary information about the applications (such as their execution order, cache usage, number of requests sent to the protected core(s), and total burst time spent on the protected core(s)) are implemented and evaluated. In addition, dynamic priority techniques are presented that aim to equalize either the total time applications spend on the protected core(s) or the progress of the applications’ requests. Extensive evaluations using multi-application workloads show that our static and dynamic priority scheduling techniques significantly improve system performance and fairness over the FCFS algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85060194747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060194747&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2019.01.005
DO - 10.1016/j.jpdc.2019.01.005
M3 - Article
AN - SCOPUS:85060194747
VL - 126
SP - 134
EP - 151
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
SN - 0743-7315
ER -