Using a random forest to inspire a neural network and improving on it

Suhang Wang, Charu Aggarwal, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Neural networks have become very popular in recent years because of the astonishing success of deep learning in domains such as image and speech recognition. In many of these domains, specific neural architectures, such as convolutional networks, fit the structure of the problem particularly well and can therefore perform remarkably effectively. However, the success of neural networks is not universal across all domains. Indeed, for learning problems without any special structure, or in cases where the data is somewhat limited, neural networks are known to underperform traditional machine learning methods such as random forests. In this paper, we show that a carefully designed neural network with a random forest structure can have better generalization ability. In fact, this architecture is more powerful than a random forest, because the back-propagation algorithm provides a more powerful and generalized way of constructing a decision tree. Furthermore, the approach is efficient to train, with cost only a small constant factor in the number of training examples. This efficiency allows multiple neural networks to be trained in order to improve generalization accuracy. Experimental results on 10 real-world benchmark datasets demonstrate the effectiveness of the proposed enhancements.
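The construction the abstract alludes to, giving a neural network the structure of a forest's trees, can be sketched for a single tree. The following is a minimal illustration (not the authors' implementation), assuming scikit-learn and NumPy: a fitted decision tree is encoded as a three-layer network whose first layer thresholds the split tests, whose second layer ANDs the splits along each root-to-leaf path, and whose output layer holds the leaf class distributions. With hard thresholds the network reproduces the tree; replacing them with steep sigmoids makes every layer differentiable, so the whole structure becomes trainable by back-propagation.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
t = tree.tree_

internal = [n for n in range(t.node_count) if t.children_left[n] != -1]
leaves = [n for n in range(t.node_count) if t.children_left[n] == -1]
idx = {n: j for j, n in enumerate(internal)}

# Layer 1: one "split" unit per internal node.
# Unit j computes 1[x[feature_j] > threshold_j] (sklearn sends
# x <= threshold to the left child, so firing means "went right").
W1 = np.zeros((X.shape[1], len(internal)))
b1 = np.zeros(len(internal))
for n in internal:
    W1[t.feature[n], idx[n]] = 1.0
    b1[idx[n]] = -t.threshold[n]

# Record each leaf's root-to-leaf path as (split unit, went_right) pairs.
paths = {}
def walk(n, path):
    if t.children_left[n] == -1:
        paths[n] = path
    else:
        walk(t.children_left[n], path + [(idx[n], 0)])
        walk(t.children_right[n], path + [(idx[n], 1)])
walk(0, [])

# Layer 2: one unit per leaf, an AND of the split decisions on its path.
# A right branch is satisfied when h_j = 1, a left branch when h_j = 0,
# so the unit fires only when the weighted sum reaches the path length.
W2 = np.zeros((len(internal), len(leaves)))
b2 = np.zeros(len(leaves))
for i, leaf in enumerate(leaves):
    for j, went_right in paths[leaf]:
        W2[j, i] = 1.0 if went_right else -1.0
        if not went_right:
            b2[i] += 1.0
    b2[i] -= len(paths[leaf]) - 0.5

# Output layer: each leaf unit votes with that leaf's class distribution.
W3 = t.value[leaves, 0, :]

def forward(X, step=lambda z: (z > 0).astype(float)):
    """Hard-threshold forward pass; swapping `step` for a steep sigmoid
    makes the network differentiable and trainable by back-propagation."""
    h1 = step(X @ W1 + b1)
    h2 = step(h1 @ W2 + b2)
    return h2 @ W3

pred = forward(X).argmax(axis=1)
agreement = (pred == tree.predict(X)).mean()
print(f"network/tree agreement: {agreement:.3f}")
```

The paper's approach goes further than this one-tree sketch, but the same encoding applied per tree, with the outputs combined across trees, is the forest-structured architecture that back-propagation then refines.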

Original language: English (US)
Title of host publication: Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
Editors: Nitesh Chawla, Wei Wang
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 1-9
Number of pages: 9
ISBN (Electronic): 9781611974874
State: Published - Jan 1 2017
Event: 17th SIAM International Conference on Data Mining, SDM 2017 - Houston, United States
Duration: Apr 27 2017 - Apr 29 2017

Publication series

Name: Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017

Other

Other: 17th SIAM International Conference on Data Mining, SDM 2017
Country: United States
City: Houston
Period: 4/27/17 - 4/29/17

Fingerprint

  • Neural networks
  • Image recognition
  • Backpropagation algorithms
  • Decision trees
  • Speech recognition
  • Learning systems

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications

Cite this

Wang, S., Aggarwal, C., & Liu, H. (2017). Using a random forest to inspire a neural network and improving on it. In N. Chawla, & W. Wang (Eds.), Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 (pp. 1-9). (Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017). Society for Industrial and Applied Mathematics Publications.
