A homotopy training algorithm for fully connected neural networks

Qipin Chen, Wenrui Hao

Research output: Contribution to journal › Article

Abstract

In this paper, we present a homotopy training algorithm (HTA) to solve optimization problems arising from fully connected neural networks with complicated structures. The HTA dynamically builds the neural network starting from a simplified version and ending with the fully connected network via adding layers and nodes adaptively. Therefore, the corresponding optimization problem is easy to solve at the beginning and connects to the original model via a continuous path guided by the HTA, which provides a high probability of obtaining a global minimum. By gradually increasing the complexity of the model along the continuous path, the HTA provides a rather good solution to the original loss function. This is confirmed by various numerical results including VGG models on CIFAR-10. For example, on the VGG13 model with batch normalization, HTA reduces the error rate by 11.86% on the test dataset compared with the traditional method. Moreover, the HTA also allows us to find the optimal structure for a fully connected neural network by building the neural network adaptively.
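The continuation idea described in the abstract can be illustrated on a toy problem. The sketch below is an assumption-laden simplification, not the paper's method: instead of growing a network layer by layer, it blends a simple convex loss `loss0` into a hard nonconvex target loss `loss1` via the homotopy H(w, t) = (1 - t)·loss0(w) + t·loss1(w), warm-starting plain gradient descent at each step of t. The function names and hyperparameters are illustrative choices.

```python
# Minimal homotopy-continuation sketch (illustrative only; the paper's HTA
# grows the network architecture itself rather than blending two fixed losses).

def num_grad(f, w, eps=1e-6):
    # Central-difference numerical gradient, to keep the sketch dependency-free.
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)

def homotopy_train(loss0, loss1, w0, steps=10, iters=200, lr=0.05):
    """Sweep t from 0 to 1, minimizing H(w, t) = (1-t)*loss0 + t*loss1,
    warm-starting each stage from the previous minimizer."""
    w = w0
    for k in range(steps + 1):
        t = k / steps
        H = lambda w, t=t: (1.0 - t) * loss0(w) + t * loss1(w)
        for _ in range(iters):
            w -= lr * num_grad(H, w)
    return w

if __name__ == "__main__":
    # loss1 is a double well with global minima at w = +/-1; plain descent
    # from w0 = -0.2 would fall into the w = -1 basin. The homotopy path,
    # starting from the convex loss0 centered at 1, steers w toward +1.
    w = homotopy_train(lambda w: (w - 1.0) ** 2,
                       lambda w: (w * w - 1.0) ** 2,
                       w0=-0.2)
    print(round(w, 3))  # close to 1.0
```

The design point mirrors the abstract: the problem at t = 0 is easy to solve, and the continuous path in t carries that solution into the harder landscape at t = 1, avoiding the nearby spurious basin.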

Original language: English (US)
Article number: 20190662
Journal: Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
Volume: 475
Issue number: 2231
DOI: 10.1098/rspa.2019.0662
State: Published - Nov 1 2019


All Science Journal Classification (ASJC) codes

  • Mathematics(all)
  • Engineering(all)
  • Physics and Astronomy(all)

Cite this

@article{48a68cddb7dd49b9bf5bcc8af68832d3,
title = "A homotopy training algorithm for fully connected neural networks",
abstract = "In this paper, we present a homotopy training algorithm (HTA) to solve optimization problems arising from fully connected neural networks with complicated structures. The HTA dynamically builds the neural network starting from a simplified version and ending with the fully connected network via adding layers and nodes adaptively. Therefore, the corresponding optimization problem is easy to solve at the beginning and connects to the original model via a continuous path guided by the HTA, which provides a high probability of obtaining a global minimum. By gradually increasing the complexity of the model along the continuous path, the HTA provides a rather good solution to the original loss function. This is confirmed by various numerical results including VGG models on CIFAR-10. For example, on the VGG13 model with batch normalization, HTA reduces the error rate by 11.86{\%} on the test dataset compared with the traditional method. Moreover, the HTA also allows us to find the optimal structure for a fully connected neural network by building the neural network adaptively.",
author = "Qipin Chen and Wenrui Hao",
year = "2019",
month = "11",
day = "1",
doi = "10.1098/rspa.2019.0662",
language = "English (US)",
volume = "475",
journal = "Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences",
issn = "0080-4630",
publisher = "Royal Society of London",
number = "2231",

}

A homotopy training algorithm for fully connected neural networks. / Chen, Qipin; Hao, Wenrui.

In: Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 475, No. 2231, 20190662, 01.11.2019.


TY - JOUR

T1 - A homotopy training algorithm for fully connected neural networks

AU - Chen, Qipin

AU - Hao, Wenrui

PY - 2019/11/1

Y1 - 2019/11/1

N2 - In this paper, we present a homotopy training algorithm (HTA) to solve optimization problems arising from fully connected neural networks with complicated structures. The HTA dynamically builds the neural network starting from a simplified version and ending with the fully connected network via adding layers and nodes adaptively. Therefore, the corresponding optimization problem is easy to solve at the beginning and connects to the original model via a continuous path guided by the HTA, which provides a high probability of obtaining a global minimum. By gradually increasing the complexity of the model along the continuous path, the HTA provides a rather good solution to the original loss function. This is confirmed by various numerical results including VGG models on CIFAR-10. For example, on the VGG13 model with batch normalization, HTA reduces the error rate by 11.86% on the test dataset compared with the traditional method. Moreover, the HTA also allows us to find the optimal structure for a fully connected neural network by building the neural network adaptively.

AB - In this paper, we present a homotopy training algorithm (HTA) to solve optimization problems arising from fully connected neural networks with complicated structures. The HTA dynamically builds the neural network starting from a simplified version and ending with the fully connected network via adding layers and nodes adaptively. Therefore, the corresponding optimization problem is easy to solve at the beginning and connects to the original model via a continuous path guided by the HTA, which provides a high probability of obtaining a global minimum. By gradually increasing the complexity of the model along the continuous path, the HTA provides a rather good solution to the original loss function. This is confirmed by various numerical results including VGG models on CIFAR-10. For example, on the VGG13 model with batch normalization, HTA reduces the error rate by 11.86% on the test dataset compared with the traditional method. Moreover, the HTA also allows us to find the optimal structure for a fully connected neural network by building the neural network adaptively.

UR - http://www.scopus.com/inward/record.url?scp=85076210360&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85076210360&partnerID=8YFLogxK

U2 - 10.1098/rspa.2019.0662

DO - 10.1098/rspa.2019.0662

M3 - Article

C2 - 31824229

AN - SCOPUS:85076210360

VL - 475

JO - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences

JF - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences

SN - 0080-4630

IS - 2231

M1 - 20190662

ER -