Pruning Recurrent Neural Networks for Improved Generalization Performance

C. Lee Giles, Christian W. Omlin

Research output: Contribution to journal › Article

53 Citations (Scopus)

Abstract

Determining the architecture of a neural network is an important issue for any learning task. For recurrent neural networks no general methods exist that permit the estimation of the number of layers of hidden neurons, the size of layers or the number of weights. We present a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks. We illustrate this heuristic by training a fully recurrent neural network on positive and negative strings of a regular grammar. We also show that rules extracted from networks trained with this pruning heuristic are more consistent with the rules to be learned. This performance improvement is obtained by pruning and retraining the networks. Simulations are shown for training and pruning a recurrent neural net on strings generated by two regular grammars, a randomly-generated 10-state grammar and an 8-state, triple-parity grammar. Further simulations indicate that this pruning method can have generalization performance superior to that obtained by training with weight decay.
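The abstract summarizes the method at a high level but does not spell out the pruning criterion itself. As a rough illustration only, the sketch below implements a generic magnitude-based prune-and-retrain loop for a small first-order recurrent network trained on strings of a simple regular grammar (even parity of 1s). The architecture, grammar, threshold, and all hyperparameters are assumptions made for this example and are not taken from the paper, which uses a fully recurrent second-order network.

# Illustrative sketch only: magnitude-based pruning of recurrent weights,
# followed by retraining with the pruned weights held at zero. The pruning
# criterion, network size, grammar, and learning rate are all assumptions.
import numpy as np

rng = np.random.default_rng(0)
H = 8          # number of recurrent (state) neurons -- assumed
LR = 0.1       # learning rate -- assumed

def make_strings(n, max_len):
    """Binary strings labeled 1 iff they contain an even number of 1s."""
    xs, ys = [], []
    for _ in range(n):
        s = rng.integers(0, 2, size=int(rng.integers(1, max_len + 1)))
        xs.append(s.astype(float))
        ys.append(1.0 if s.sum() % 2 == 0 else 0.0)
    return xs, ys

def forward(s, W_in, W_rec, w_out):
    """Run the recurrent net over one string; return output and hidden states."""
    hs = [np.zeros(H)]
    for x in s:
        hs.append(np.tanh(W_in * x + W_rec @ hs[-1]))
    y = 1.0 / (1.0 + np.exp(-w_out @ hs[-1]))
    return y, hs

def train(xs, ys, W_in, W_rec, w_out, epochs, rec_mask=None):
    """Plain backpropagation through time; rec_mask keeps pruned weights at zero."""
    for _ in range(epochs):
        for s, target in zip(xs, ys):
            y, hs = forward(s, W_in, W_rec, w_out)
            d_out = (y - target) * y * (1.0 - y)       # squared-error loss at final step
            g_out = d_out * hs[-1]
            dh = d_out * w_out
            g_in, g_rec = np.zeros(H), np.zeros((H, H))
            for t in range(len(s), 0, -1):             # backprop through time
                da = dh * (1.0 - hs[t] ** 2)
                g_in += da * s[t - 1]
                g_rec += np.outer(da, hs[t - 1])
                dh = W_rec.T @ da
            if rec_mask is not None:
                g_rec *= rec_mask                      # no updates to pruned weights
            W_in -= LR * g_in
            W_rec -= LR * g_rec
            w_out -= LR * g_out

def accuracy(xs, ys, W_in, W_rec, w_out):
    preds = [forward(s, W_in, W_rec, w_out)[0] > 0.5 for s in xs]
    return np.mean([p == (t > 0.5) for p, t in zip(preds, ys)])

# Train, prune the smallest recurrent weights, then retrain with the mask in place.
train_x, train_y = make_strings(200, max_len=8)
test_x, test_y = make_strings(200, max_len=16)         # longer strings probe generalization
W_in = rng.normal(0, 0.5, H)
W_rec = rng.normal(0, 0.5, (H, H))
w_out = rng.normal(0, 0.5, H)

train(train_x, train_y, W_in, W_rec, w_out, epochs=30)
print("before pruning:", accuracy(test_x, test_y, W_in, W_rec, w_out))

mask = (np.abs(W_rec) > np.quantile(np.abs(W_rec), 0.5)).astype(float)  # prune smallest 50%
W_rec *= mask
train(train_x, train_y, W_in, W_rec, w_out, epochs=30, rec_mask=mask)
print("after prune + retrain:", accuracy(test_x, test_y, W_in, W_rec, w_out))

In this sketch, pruning zeroes the smallest-magnitude recurrent weights and the mask keeps them at zero during retraining, mirroring the prune-then-retrain procedure described in the abstract; the specific heuristic used in the paper is not reproduced here.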

Original language: English (US)
Pages (from-to): 848-851
Number of pages: 4
Journal: IEEE Transactions on Neural Networks
Volume: 5
Issue number: 5
DOI: 10.1109/72.317740
State: Published - Sep 1994

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

@article{6ccfe1eedb5b424f807c2094b0490600,
title = "Pruning Recurrent Neural Networks for Improved Generalization Performance",
abstract = "Determining the architecture of a neural network is an important issue for any learning task. For recurrent neural networks no general methods exist that permit the estimation of the number of layers of hidden neurons, the size of layers or the number of weights. We present a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks. We illustrate this heuristic by training a fully recurrent neural network on positive and negative strings of a regular grammar. We also show that rules extracted from networks trained with this pruning heuristic are more consistent with the rules to be learned. This performance improvement is obtained by pruning and retraining the networks. Simulations are shown for training and pruning a recurrent neural net on strings generated by two regular grammars, a randomly-generated 10-state grammar and an 8-state, triple-parity grammar. Further simulations indicate that this pruning method can have generalization performance superior to that obtained by training with weight decay.",
author = "Giles, {C. Lee} and Omlin, {Christian W.}",
year = "1994",
month = "9",
doi = "10.1109/72.317740",
language = "English (US)",
volume = "5",
pages = "848--851",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
number = "5",

}

Pruning Recurrent Neural Networks for Improved Generalization Performance. / Giles, C. Lee; Omlin, Christian W.

In: IEEE Transactions on Neural Networks, Vol. 5, No. 5, 09.1994, p. 848-851.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Pruning Recurrent Neural Networks for Improved Generalization Performance

AU - Giles, C. Lee

AU - Omlin, Christian W.

PY - 1994/9

Y1 - 1994/9

AB - Determining the architecture of a neural network is an important issue for any learning task. For recurrent neural networks no general methods exist that permit the estimation of the number of layers of hidden neurons, the size of layers or the number of weights. We present a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks. We illustrate this heuristic by training a fully recurrent neural network on positive and negative strings of a regular grammar. We also show that rules extracted from networks trained with this pruning heuristic are more consistent with the rules to be learned. This performance improvement is obtained by pruning and retraining the networks. Simulations are shown for training and pruning a recurrent neural net on strings generated by two regular grammars, a randomly-generated 10-state grammar and an 8-state, triple-parity grammar. Further simulations indicate that this pruning method can have generalization performance superior to that obtained by training with weight decay.

UR - http://www.scopus.com/inward/record.url?scp=0028495332&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0028495332&partnerID=8YFLogxK

U2 - 10.1109/72.317740

DO - 10.1109/72.317740

M3 - Article

C2 - 18267860

AN - SCOPUS:0028495332

VL - 5

SP - 848

EP - 851

JO - IEEE Transactions on Neural Networks

JF - IEEE Transactions on Neural Networks

SN - 1045-9227

IS - 5

ER -