Learning to quantize deep neural networks: A competitive-collaborative approach

Md Fahim Faysal Khan, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Vijaykrishnan Narayanan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Reducing the model size and computation costs for dedicated AI accelerator designs, neural network quantization methods have attracted momentous attention recently. Unfortunately, merely minimizing quantization loss using constant discretization causes accuracy deterioration. In this paper, we propose an iterative accuracy-driven learning framework of competitive-collaborative quantization (CCQ) to gradually adapt the bit-precision of each individual layer. Orthogonal to prior quantization policies working with full precision for the first and last layers of the network, CCQ offers layer-wise competition for any target quantization policy with holistic layer fine-tuning to recover accuracy, where the state-of-the-art networks can be entirely quantized without any significant accuracy degradation.

Original languageEnglish (US)
Title of host publication2020 57th ACM/IEEE Design Automation Conference, DAC 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781450367257
DOIs
StatePublished - Jul 2020
Event57th ACM/IEEE Design Automation Conference, DAC 2020 - Virtual, San Francisco, United States
Duration: Jul 20 2020Jul 24 2020

Publication series

NameProceedings - Design Automation Conference
Volume2020-July
ISSN (Print)0738-100X

Conference

Conference57th ACM/IEEE Design Automation Conference, DAC 2020
CountryUnited States
CityVirtual, San Francisco
Period7/20/207/24/20

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modeling and Simulation

Fingerprint Dive into the research topics of 'Learning to quantize deep neural networks: A competitive-collaborative approach'. Together they form a unique fingerprint.

Cite this