Architecture-Centric Bottleneck Analysis for Deep Neural Network Applications

Jihyun Ryoo, Mengran Fan, Xulong Tang, Huaipan Jiang, Meena Arunachalam, Sharada Naveen, Mahmut T. Kandemir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The ever-growing complexity and popularity of machine learning and deep learning applications have created an urgent need for effective and efficient support for these applications on contemporary computing systems. In this paper, we thoroughly analyze four DNN algorithms on three widely used architectures (CPU, GPU, and Xeon Phi). The DNN algorithms we choose for evaluation are i) Unet, a Convolutional Neural Network (CNN) for biomedical image segmentation; ii) NMT, a Recurrent Neural Network (RNN) for neural machine translation; iii) ResNet-50; and iv) DenseNet, the latter two being CNNs for image processing. The ultimate goal of this paper is to answer four fundamental questions: i) do different DNNs exhibit similar behavior on a given execution platform? ii) does a given DNN exhibit different behaviors across different platforms? iii) for the same execution platform and the same DNN, do different execution phases behave differently? and iv) are the current major general-purpose platforms tuned sufficiently well for different DNN algorithms? Motivated by these questions, we conduct an in-depth investigation of running DNN applications on modern systems. Specifically, we first identify the most time-consuming functions (hotspot functions) across different networks and platforms. Next, we characterize the performance bottlenecks and discuss them in detail. Finally, we port selected hotspot functions to a cycle-accurate simulator and use the results to direct architectural optimizations that better support DNN applications.

Original language: English (US)
Title of host publication: Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 205-214
Number of pages: 10
ISBN (Electronic): 9781728145358
DOIs: https://doi.org/10.1109/HiPC.2019.00034
State: Published - Dec 2019
Event: 26th Annual IEEE International Conference on High Performance Computing, HiPC 2019 - Hyderabad, India
Duration: Dec 17 2019 to Dec 20 2019

Publication series

Name: Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019

Conference

Conference: 26th Annual IEEE International Conference on High Performance Computing, HiPC 2019
Country: India
City: Hyderabad
Period: 12/17/19 to 12/20/19

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality


  • Cite this

    Ryoo, J., Fan, M., Tang, X., Jiang, H., Arunachalam, M., Naveen, S., & Kandemir, M. T. (2019). Architecture-Centric Bottleneck Analysis for Deep Neural Network Applications. In Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019 (pp. 205-214). [8990516] (Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/HiPC.2019.00034