PhiRSA: Exploiting the Computing Power of Vector Instructions on Intel Xeon Phi for RSA

Yuan Zhao, Wuqiong Pan, Jingqiang Lin, Peng Liu, Cong Xue, Fangyu Zheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Efficient implementations of public-key cryptographic algorithms on general-purpose computing devices facilitate the application of cryptography to communication security. Existing solutions take two different directions: implementations on GPUs achieve high throughput but high latency, while those on CPUs have low throughput and low latency. Intel Xeon Phi is the first highly parallel coprocessor of the Many Integrated Core (MIC) architecture, with up to 61 cores and one 512-bit Vector Processing Unit (VPU) in each core, which offers the potential to achieve both high throughput and low latency. In this paper, we propose a vector-oriented Montgomery multiplication design based on the vector carry propagation chain (VCPC) method to fully exploit the computing power of vector instructions on Intel Xeon Phi. Two key features of our design sharply reduce the number of instructions: (1) organizing the additions in Montgomery multiplication into four VCPCs to save the overhead of handling carry bits; (2) computing the intermediate scalar variable q in every round without breaking the flow of VCPCs. Furthermore, we offer the optimal Montgomery multiplication implementation of our design on Intel Xeon Phi, which makes VPUs fully pipelined and maintains carry bits in vector mask registers. Based on the above, we implement RSA, named PhiRSA, and evaluate it on Intel Xeon Phi 7120P. For 1024-, 2048- and 4096-bit RSA, PhiRSA performs 258,370, 41,803 and 5,358 decryptions per second, and the latencies are 0.94, 5.84 and 45.54 ms, respectively. These results achieve 4.1 to 8.5 times the performance of existing RSA implementations on Intel Xeon Phi, and exhibit high throughput comparable to that of GPU implementations, but with far fewer parallel tasks, and low latency comparable to that of CPU implementations.
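For readers unfamiliar with the operation the abstract builds on, the following is a minimal scalar sketch of textbook word-serial Montgomery multiplication, in which a quotient digit q is computed in every round. It is not the paper's vectorized VCPC design and does not model vector instructions or mask registers; the function name `montgomery_multiply` and the parameters `word_bits`, `num_words` and `n_prime` are assumptions chosen for illustration.

```python
def montgomery_multiply(a, b, n, word_bits=32, num_words=None):
    """Return a * b * R^(-1) mod n, where R = 2^(word_bits * num_words).

    Illustrative sketch only: requires n odd and 0 <= a, b < n.
    Needs Python 3.8+ for pow(n, -1, base).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    if num_words is None:
        # Smallest word count such that R > n.
        num_words = -(-n.bit_length() // word_bits)

    base = 1 << word_bits
    mask = base - 1
    # n_prime = -n^(-1) mod 2^word_bits, precomputed once per modulus.
    n_prime = (-pow(n, -1, base)) % base

    t = 0
    for i in range(num_words):
        a_i = (a >> (i * word_bits)) & mask  # i-th word of a
        t += a_i * b                         # accumulate partial product
        q = ((t & mask) * n_prime) & mask    # per-round quotient digit q
        t += q * n                           # makes the low word of t zero
        t >>= word_bits                      # exact division by 2^word_bits

    # After the loop t < 2n, so one conditional subtraction suffices.
    if t >= n:
        t -= n
    return t
```

To multiply ordinary residues, operands would first be converted to Montgomery form (x * R mod n), and the final result converted back by calling `montgomery_multiply(x, 1, n)`. The additions `t += a_i * b` and `t += q * n` are the multi-word accumulations whose carry handling a vectorized design has to organize.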

Original language: English (US)
Title of host publication: Selected Areas in Cryptography – SAC 2016 - 23rd International Conference, Revised Selected Papers
Editors: Roberto Avanzi, Howard Heys
Publisher: Springer Verlag
Pages: 482-500
Number of pages: 19
ISBN (Print): 9783319694528
DOIs: https://doi.org/10.1007/978-3-319-69453-5_26
State: Published - Jan 1 2017
Event: 23rd International Conference on Selected Areas in Cryptography, SAC 2016 - St. John's, Canada
Duration: Aug 10 2016 → Aug 12 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10532 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 23rd International Conference on Selected Areas in Cryptography, SAC 2016
Country: Canada
City: St. John's
Period: 8/10/16 → 8/12/16

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Zhao, Y., Pan, W., Lin, J., Liu, P., Xue, C., & Zheng, F. (2017). PhiRSA: Exploiting the Computing Power of Vector Instructions on Intel Xeon Phi for RSA. In R. Avanzi, & H. Heys (Eds.), Selected Areas in Cryptography – SAC 2016 - 23rd International Conference, Revised Selected Papers (pp. 482-500). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10532 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-69453-5_26