Transient faults are expected a be a major design consideration in future microprocessors. Recent proposals for transient fault detection in processor cores have revolved around the idea of redundant threading, which involves redundant execution of a program across multiple execution contexts. This paper presents a new approach to redundant threading by bringing together the concepts of slice-level execution and value and control-flow locality into a novel partial redundant threading mechanism called SlicK. The purpose of redundant execution is to check the integrity of the outputs propagating out of the core (typically through stores). SlicK implements redundancy at the granularity of backward-slices of these output instructions and exploits value and control-flow locality to avoid redundantly executing slices that lead to predictable outputs, thereby avoiding redundant execution of a significant fraction of instructions while maintaining extremely low vulnerabilities for critical processor structures. We propose the microarchitecture of a backward-slice extractor called SliceEM that is able to identify backward slices without interrupting the instruction flow, and show how this extractor and a set of predictors can be integrated into a redundant threading mechanism to form SlicK. Detailed simulations with SPEC CPU2000 benchmarks show that SlicK can provide around 10.2% performance improvement over a well known redundant threading mechanism, buying back over 50% of the loss suffered due to redundant execution. SlicK can keep the Architectural Vulnerability Factors of processor structures to typically 0%-2%. More importantly, SlicK's slice-based mechanisms provide future opportunities for exploring interesting points in the performance-reliability design space based on market segment needs.
|Original language||English (US)|
|Number of pages||11|
|Journal||ACM SIGPLAN Notices|
|State||Published - Nov 1 2006|
All Science Journal Classification (ASJC) codes
- Computer Science(all)