RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques

Gurunath Kadam, Danfeng Zhang, Adwait Jog

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Graphics processing units (GPUs) are becoming default accelerators in many domains such as high-performance computing (HPC), deep learning, and virtual/augmented reality. Recently, GPUs have also shown significant speedups for a variety of security-sensitive applications such as encryption. These speedups stem largely from the high memory bandwidth and compute throughput of GPUs. One of the key features for optimizing memory bandwidth consumption in GPUs is intra-warp memory access coalescing, which merges memory requests originating from different threads of a single warp into as few cache lines as possible. However, this coalescing feature has also been shown to make GPUs prone to correlation timing attacks, as it exposes the relationship between the execution time and the number of coalesced accesses. Consequently, an attacker is able to correctly reveal an AES private key by repeatedly gathering encrypted data and execution time on a GPU. In this work, we propose a series of defense mechanisms to alleviate such timing attacks by carefully trading off performance for improved security. Specifically, we propose to randomize the coalescing logic such that the attacker finds it hard to guess the correct number of coalesced accesses generated. To this end, we propose to randomize: a) the granularity (called a subwarp) at which warp threads are grouped together for coalescing, and b) the threads selected by each subwarp for coalescing. Such randomization techniques result in three mechanisms: fixed-sized subwarp (FSS), random-sized subwarp (RSS), and random-threaded subwarp (RTS). We find that the combination of these security mechanisms offers a 24- to 961-times improvement in security against correlation timing attacks with 5 to 28% performance degradation.
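
The grouping idea in the abstract can be illustrated with a small software model. The Python sketch below is not the paper's hardware design. It only counts how many cache-line requests a warp would issue when its threads are split into subwarps that coalesce independently. The 32-thread warp, the 128-byte cache line, all function names, and the way random sizes and random thread assignments are drawn are assumptions made for illustration.

```python
import random

WARP_SIZE = 32    # assumption: threads per warp
LINE_BYTES = 128  # assumption: cache-line size in bytes


def coalesced_accesses(addresses, groups):
    """Count requests when each group of thread IDs coalesces on its own."""
    return sum(len({addresses[t] // LINE_BYTES for t in group}) for group in groups)


def fixed_sized_subwarps(size):
    """FSS-style grouping: consecutive threads in equal-sized subwarps."""
    return [list(range(i, min(i + size, WARP_SIZE))) for i in range(0, WARP_SIZE, size)]


def random_sized_subwarps(count, rng):
    """RSS-style grouping: consecutive threads, randomly placed boundaries."""
    cuts = sorted(rng.sample(range(1, WARP_SIZE), count - 1))
    bounds = [0] + cuts + [WARP_SIZE]
    return [list(range(a, b)) for a, b in zip(bounds, bounds[1:])]


def random_threaded_subwarps(size, rng):
    """RTS-style grouping: equal-sized subwarps filled with shuffled threads."""
    threads = list(range(WARP_SIZE))
    rng.shuffle(threads)
    return [threads[i:i + size] for i in range(0, WARP_SIZE, size)]


# Hypothetical access pattern: thread t reads byte address 16 * t,
# so every run of 8 consecutive threads shares one 128-byte line.
addrs = [16 * t for t in range(WARP_SIZE)]
rng = random.Random()

print(coalesced_accesses(addrs, [list(range(WARP_SIZE))]))  # whole warp: 4
print(coalesced_accesses(addrs, fixed_sized_subwarps(8)))   # 4
print(coalesced_accesses(addrs, fixed_sized_subwarps(4)))   # 8
print(coalesced_accesses(addrs, random_sized_subwarps(4, rng)))     # varies per run
print(coalesced_accesses(addrs, random_threaded_subwarps(8, rng)))  # varies per run
```

In this toy model, smaller or randomly formed subwarps tend to issue more requests than whole-warp coalescing, and the randomized variants give a request count that changes between runs. This loosely mirrors the performance-for-security trade-off the abstract describes, under the assumptions stated above.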

Original language: English (US)
Title of host publication: Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018
Publisher: IEEE Computer Society
Pages: 156-167
Number of pages: 12
ISBN (Electronic): 9781538636596
DOI: https://doi.org/10.1109/HPCA.2018.00023
State: Published - Mar 27 2018
Event: 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018 - Vienna, Austria
Duration: Feb 24 2018 to Feb 28 2018

Publication series

Name: Proceedings - International Symposium on High-Performance Computer Architecture
Volume: 2018-February
ISSN (Print): 1530-0897

Other

Other: 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018
Country: Austria
City: Vienna
Period: 2/24/18 to 2/28/18

Fingerprint

Data storage equipment
Bandwidth
Augmented reality
Cryptography
Particle accelerators
Side channel attack
Graphics processing unit
Throughput
Degradation
Deep learning

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture

Cite this

Kadam, G., Zhang, D., & Jog, A. (2018). RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques. In Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018 (pp. 156-167). (Proceedings - International Symposium on High-Performance Computer Architecture; Vol. 2018-February). IEEE Computer Society. https://doi.org/10.1109/HPCA.2018.00023
Kadam, Gurunath; Zhang, Danfeng; Jog, Adwait. / RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques. Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018. IEEE Computer Society, 2018. pp. 156-167 (Proceedings - International Symposium on High-Performance Computer Architecture).
@inproceedings{4bcd98a196fc4ffca731b4fd08215ec3,
title = "RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques",
abstract = "Graphics processing units (GPUs) are becoming default accelerators in many domains such as high-performance computing (HPC), deep learning, and virtual/augmented reality. Recently, GPUs have also shown significant speedups for a variety of security-sensitive applications such as encryptions. These speedups have largely benefited from the high memory bandwidth and compute throughput of GPUs. One of the key features to optimize the memory bandwidth consumption in GPUs is intra-warp memory access coalescing, which merges memory requests originating from different threads of a single warp into as few cache lines as possible. However, this coalescing feature is also shown to make the GPUs prone to the correlation timing attacks as it exposes the relationship between the execution time and the number of coalesced accesses. Consequently, an attacker is able to correctly reveal an AES private key via repeatedly gathering encrypted data and execution time on a GPU. In this work, we propose a series of defense mechanisms to alleviate such timing attacks by carefully trading off performance for improved security. Specifically, we propose to randomize the coalescing logic such that the attacker finds it hard to guess the correct number of coalesced accesses generated. To this end, we propose to randomize: a) the granularity (called as subwarp) at which warp threads are grouped together for coalescing, and b) the threads selected by each subwarp for coalescing. Such randomization techniques result in three mechanisms: fixed-sized subwarp (FSS), random-sized subwarp (RSS), and random-threaded subwarp (RTS). We find that the combination of these security mechanisms offers 24- to 961-times improvement in the security against the correlation timing attacks with 5 to 28{\%} performance degradation.",
author = "Gurunath Kadam and Danfeng Zhang and Adwait Jog",
year = "2018",
month = "3",
day = "27",
doi = "10.1109/HPCA.2018.00023",
language = "English (US)",
series = "Proceedings - International Symposium on High-Performance Computer Architecture",
publisher = "IEEE Computer Society",
pages = "156--167",
booktitle = "Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018",
address = "United States",

}

Kadam, G, Zhang, D & Jog, A 2018, RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques. in Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018. Proceedings - International Symposium on High-Performance Computer Architecture, vol. 2018-February, IEEE Computer Society, pp. 156-167, 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018, Vienna, Austria, 2/24/18. https://doi.org/10.1109/HPCA.2018.00023

RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques. / Kadam, Gurunath; Zhang, Danfeng; Jog, Adwait.

Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018. IEEE Computer Society, 2018. p. 156-167 (Proceedings - International Symposium on High-Performance Computer Architecture; Vol. 2018-February).

TY - GEN

T1 - RCoal

T2 - Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques

AU - Kadam, Gurunath

AU - Zhang, Danfeng

AU - Jog, Adwait

PY - 2018/3/27

Y1 - 2018/3/27

AB - Graphics processing units (GPUs) are becoming default accelerators in many domains such as high-performance computing (HPC), deep learning, and virtual/augmented reality. Recently, GPUs have also shown significant speedups for a variety of security-sensitive applications such as encryptions. These speedups have largely benefited from the high memory bandwidth and compute throughput of GPUs. One of the key features to optimize the memory bandwidth consumption in GPUs is intra-warp memory access coalescing, which merges memory requests originating from different threads of a single warp into as few cache lines as possible. However, this coalescing feature is also shown to make the GPUs prone to the correlation timing attacks as it exposes the relationship between the execution time and the number of coalesced accesses. Consequently, an attacker is able to correctly reveal an AES private key via repeatedly gathering encrypted data and execution time on a GPU. In this work, we propose a series of defense mechanisms to alleviate such timing attacks by carefully trading off performance for improved security. Specifically, we propose to randomize the coalescing logic such that the attacker finds it hard to guess the correct number of coalesced accesses generated. To this end, we propose to randomize: a) the granularity (called as subwarp) at which warp threads are grouped together for coalescing, and b) the threads selected by each subwarp for coalescing. Such randomization techniques result in three mechanisms: fixed-sized subwarp (FSS), random-sized subwarp (RSS), and random-threaded subwarp (RTS). We find that the combination of these security mechanisms offers 24- to 961-times improvement in the security against the correlation timing attacks with 5 to 28% performance degradation.

UR - http://www.scopus.com/inward/record.url?scp=85046741902&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046741902&partnerID=8YFLogxK

U2 - 10.1109/HPCA.2018.00023

DO - 10.1109/HPCA.2018.00023

M3 - Conference contribution

AN - SCOPUS:85046741902

T3 - Proceedings - International Symposium on High-Performance Computer Architecture

SP - 156

EP - 167

BT - Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018

PB - IEEE Computer Society

ER -

Kadam G, Zhang D, Jog A. RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques. In Proceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018. IEEE Computer Society. 2018. p. 156-167. (Proceedings - International Symposium on High-Performance Computer Architecture). https://doi.org/10.1109/HPCA.2018.00023