Cloud-Based Parallel Machine Learning for Tool Wear Prediction

Dazhong Wu, Connor Jennings, Janis Terpenny, Soundar Kumara, Robert X. Gao

Research output: Contribution to journal (Article)

13 Citations (Scopus)

Abstract

The emergence of cloud computing, the industrial internet of things (IIoT), and new machine learning techniques has shown the potential to advance prognostics and health management (PHM) in smart manufacturing. While model-based PHM techniques provide insight into the progression of faults in mechanical components, they require certain assumptions about the underlying physical mechanisms of fault development in order to build predictive models. In situations where adequate prior knowledge of the underlying physics is lacking, data-driven PHM techniques have been increasingly applied in the field of smart manufacturing. One limitation of current data-driven methods is that large volumes of training data are required to make accurate predictions. Consequently, computational efficiency remains a primary challenge, especially when large volumes of sensor-generated data need to be processed in real-time applications. The objective of this research is to introduce a cloud-based parallel machine learning algorithm that is capable of training large-scale predictive models more efficiently. The random forests (RFs) algorithm is parallelized using the MapReduce data processing scheme. The MapReduce-based parallel random forests (PRFs) algorithm is implemented on a scalable cloud computing system with varying combinations of processors and memory. The effectiveness of this new method is demonstrated using condition monitoring data collected from milling experiments. By implementing RFs in parallel on the cloud, a significant increase in processing speed (a 14.7-fold speedup in training) has been achieved, together with high accuracy in predicting tool wear (an eightfold reduction in mean squared error (MSE)).
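The MapReduce-style parallelization of random forests described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a thread pool as a stand-in for MapReduce workers on a cloud cluster, a small hand-written regression tree, and synthetic data rather than the milling measurements. All function names (`map_train`, `reduce_merge`, `train_parallel_forest`, `make_synthetic_rows`) are hypothetical and chosen only for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def mean(values):
    return sum(values) / len(values)


def sse(rows):
    """Sum of squared errors of the targets around their mean."""
    targets = [y for _, y in rows]
    m = mean(targets)
    return sum((y - m) ** 2 for y in targets)


def build_tree(rows, rng, depth=0, max_depth=4, min_size=5):
    """Grow a small regression tree; rows are (features, target) pairs."""
    targets = [y for _, y in rows]
    if depth >= max_depth or len(rows) < 2 * min_size:
        return mean(targets)  # leaf: predict the mean target
    n_features = len(rows[0][0])
    # Random feature subset at each split, as in random forests.
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    best = None
    for f in candidates:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= threshold]
            right = [r for r in rows if r[0][f] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    if best is None:
        return mean(targets)
    _, f, threshold, left, right = best
    return (f, threshold,
            build_tree(left, rng, depth + 1, max_depth, min_size),
            build_tree(right, rng, depth + 1, max_depth, min_size))


def predict_tree(tree, x):
    # Internal nodes are tuples; leaves are plain floats.
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree


def map_train(task):
    """Map step: one worker trains several trees on bootstrap samples."""
    rows, n_trees, seed = task
    rng = random.Random(seed)  # per-worker seed keeps runs reproducible
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap resample
        forest.append(build_tree(sample, rng))
    return forest


def reduce_merge(forest_a, forest_b):
    """Reduce step: merge partial forests into one ensemble."""
    return forest_a + forest_b


def train_parallel_forest(rows, n_workers=4, trees_per_worker=3):
    tasks = [(rows, trees_per_worker, seed) for seed in range(n_workers)]
    # Assumption: threads stand in for distributed MapReduce workers.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_forests = list(pool.map(map_train, tasks))
    return reduce(reduce_merge, partial_forests, [])


def predict(forest, x):
    return mean([predict_tree(tree, x) for tree in forest])


def mse(forest, rows):
    return mean([(predict(forest, x) - y) ** 2 for x, y in rows])


def make_synthetic_rows(n, seed=0):
    """Synthetic stand-in data (not the milling data from the paper)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, 3.0 * x[0] + x[1] + rng.gauss(0.0, 0.05)))
    return rows
```

Calling `train_parallel_forest(make_synthetic_rows(80))` returns a list of trees, and `mse(forest, rows)` reports the ensemble error. Because each tree is trained independently on its own bootstrap sample, the map step needs no communication between workers, which is what makes random forests a natural fit for this scheme.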

Original language: English (US)
Article number: 041005
Journal: Journal of Manufacturing Science and Engineering, Transactions of the ASME
Volume: 140
Issue number: 4
DOI: 10.1115/1.4038002
State: Published - Jan 1 2018

Fingerprint

Learning systems
Wear of materials
Health
Cloud computing
Condition monitoring
Computational efficiency
Learning algorithms
Computer systems
Physics
Data storage equipment
Sensors
Processing
Experiments
Internet of things

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Mechanical Engineering
  • Computer Science Applications
  • Industrial and Manufacturing Engineering

Cite this

@article{98a11465edeb4bbb8d032e104712c2b8,
title = "Cloud-Based Parallel Machine Learning for Tool Wear Prediction",
abstract = "The emergence of cloud computing, industrial internet of things (IIoT), and new machine learning techniques have shown the potential to advance prognostics and health management (PHM) in smart manufacturing. While model-based PHM techniques provide insight into the progression of faults in mechanical components, certain assumptions on the underlying physical mechanisms for fault development are required to develop predictive models. In situations where there is a lack of adequate prior knowledge of the underlying physics, data-driven PHM techniques have been increasingly applied in the field of smart manufacturing. One of the limitations of current data-driven methods is that large volumes of training data are required to make accurate predictions. Consequently, computational efficiency remains a primary challenge, especially when large volumes of sensor-generated data need to be processed in real-time applications. The objective of this research is to introduce a cloud-based parallel machine learning algorithm that is capable of training large-scale predictive models more efficiently. The random forests (RFs) algorithm is parallelized using the MapReduce data processing scheme. The MapReduce-based parallel random forests (PRFs) algorithm is implemented on a scalable cloud computing system with varying combinations of processors and memories. The effectiveness of this new method is demonstrated using condition monitoring data collected from milling experiments. By implementing RFs in parallel on the cloud, a significant increase in the processing speed (14.7 times in terms of increase in training time) has been achieved, with a high prediction accuracy of tool wear (eight times in terms of reduction in mean squared error (MSE)).",
author = "Dazhong Wu and Connor Jennings and Janis Terpenny and Soundar Kumara and Gao, {Robert X.}",
year = "2018",
month = "1",
day = "1",
doi = "10.1115/1.4038002",
language = "English (US)",
volume = "140",
journal = "Journal of Manufacturing Science and Engineering, Transactions of the ASME",
issn = "1087-1357",
publisher = "American Society of Mechanical Engineers (ASME)",
number = "4",
}

Cloud-Based Parallel Machine Learning for Tool Wear Prediction. / Wu, Dazhong; Jennings, Connor; Terpenny, Janis; Kumara, Soundar; Gao, Robert X.

In: Journal of Manufacturing Science and Engineering, Transactions of the ASME, Vol. 140, No. 4, 041005, 01.01.2018.


TY - JOUR

T1 - Cloud-Based Parallel Machine Learning for Tool Wear Prediction

AU - Wu, Dazhong

AU - Jennings, Connor

AU - Terpenny, Janis

AU - Kumara, Soundar

AU - Gao, Robert X.

PY - 2018/1/1

Y1 - 2018/1/1

N2 - The emergence of cloud computing, industrial internet of things (IIoT), and new machine learning techniques have shown the potential to advance prognostics and health management (PHM) in smart manufacturing. While model-based PHM techniques provide insight into the progression of faults in mechanical components, certain assumptions on the underlying physical mechanisms for fault development are required to develop predictive models. In situations where there is a lack of adequate prior knowledge of the underlying physics, data-driven PHM techniques have been increasingly applied in the field of smart manufacturing. One of the limitations of current data-driven methods is that large volumes of training data are required to make accurate predictions. Consequently, computational efficiency remains a primary challenge, especially when large volumes of sensor-generated data need to be processed in real-time applications. The objective of this research is to introduce a cloud-based parallel machine learning algorithm that is capable of training large-scale predictive models more efficiently. The random forests (RFs) algorithm is parallelized using the MapReduce data processing scheme. The MapReduce-based parallel random forests (PRFs) algorithm is implemented on a scalable cloud computing system with varying combinations of processors and memories. The effectiveness of this new method is demonstrated using condition monitoring data collected from milling experiments. By implementing RFs in parallel on the cloud, a significant increase in the processing speed (14.7 times in terms of increase in training time) has been achieved, with a high prediction accuracy of tool wear (eight times in terms of reduction in mean squared error (MSE)).

UR - http://www.scopus.com/inward/record.url?scp=85042063538&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85042063538&partnerID=8YFLogxK

U2 - 10.1115/1.4038002

DO - 10.1115/1.4038002

M3 - Article

AN - SCOPUS:85042063538

VL - 140

JO - Journal of Manufacturing Science and Engineering, Transactions of the ASME

JF - Journal of Manufacturing Science and Engineering, Transactions of the ASME

SN - 1087-1357

IS - 4

M1 - 041005

ER -