### Abstract

This paper presents reduced-order modeling of time-series data for a special class of Markov models using symbolic dynamics. These models are constructed from the time-series signal by partitioning the data and then inferring a probabilistic finite state automaton (PFSA) from the resulting symbol sequence, capturing a finite history (or memory) of symbol strings. In the proposed approach, the size of the temporal memory of a symbol sequence is estimated from spectral properties of the stochastic matrix corresponding to a first-order Markov model of the symbol sequence. Then, agglomerative hierarchical clustering is used to cluster states of the corresponding full-order Markov model to construct a reduced-order Markov model with a non-deterministic algebraic structure. The model size is inferred using an information-theoretic criterion; the Markov parameters of the reduced-order model are identified from the original model by making use of a Bayesian inference rule. The paper also identifies theoretical bounds on the error induced in the reduced-size model in terms of the expected Hamming distance between the sequences generated by the original and final reduced-size models. The proposed concept is elucidated and validated by two examples on different data sets. The first example analyzes a set of time series of pressure oscillations in a swirl-stabilized combustor, where controlled protocols are used to induce flame instabilities. Variations in the complexity of the derived Markov model represent how the system operating condition changes from a stable to an unstable combustion regime. The second example is built upon a public data set from NASA's repository for prognosis of rolling-element bearings. It is shown that: (i) even with a small state space, the reduced-order models are able to achieve comparable performance, and (ii) the proposed approach provides flexibility in the selection of a reduced-order model for data representation and learning.
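The first two steps named in the abstract (partitioning the signal into symbols and forming the stochastic matrix of a first-order Markov model) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the equal-frequency partition, the add-one smoothing, the function names, and in particular the eigenvalue-decay rule in `estimate_depth` are assumptions chosen only to show how a memory size might be read off the spectrum of the transition matrix.

```python
import numpy as np


def symbolize(signal, n_symbols):
    """Map each sample to a symbol in 0..n_symbols-1 (equal-frequency partition, an assumption)."""
    # Interior quantile edges, so each cell holds roughly the same number of samples.
    edges = np.quantile(signal, np.linspace(0.0, 1.0, n_symbols + 1)[1:-1])
    return np.searchsorted(edges, signal, side="right")


def transition_matrix(symbols, n_symbols):
    """Row-stochastic matrix of a first-order Markov model, estimated by counting."""
    # Start from ones (add-one smoothing, an assumption) so that no row is all zeros.
    counts = np.ones((n_symbols, n_symbols))
    np.add.at(counts, (symbols[:-1], symbols[1:]), 1)
    return counts / counts.sum(axis=1, keepdims=True)


def estimate_depth(P, eps=0.05, max_depth=20):
    """Illustrative heuristic: smallest d with |lambda_2|**d <= eps.

    The paper's own spectral criterion may differ; this rule is a stand-in.
    """
    magnitudes = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    lam2 = magnitudes[1]  # second-largest eigenvalue magnitude
    for d in range(1, max_depth + 1):
        if lam2 ** d <= eps:
            return d
    return max_depth


# Synthetic demo signal (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(5000)) * 0.05 + rng.standard_normal(5000)
s = symbolize(x, n_symbols=4)
P = transition_matrix(s, n_symbols=4)
d = estimate_depth(P)
```

A slowly decaying second eigenvalue suggests that the symbol sequence retains information over a longer history, which is the intuition behind using spectral properties to pick the memory size before any state clustering is attempted.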

Original language | English (US) |
---|---|
Pages (from-to) | 68-81 |
Number of pages | 14 |
Journal | Signal Processing |
Volume | 149 |
DOIs | https://doi.org/10.1016/j.sigpro.2018.03.004 |
State | Published - Aug 1 2018 |

### All Science Journal Classification (ASJC) codes

- Control and Systems Engineering
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering

### Cite this

**Symbolic analysis-based reduced order Markov modeling of time series data.** / Jha, Devesh K.; Virani, Nurali; Reimann, Jan Severin; Srivastav, Abhishek; Ray, Asok.

*Signal Processing*, vol. 149, pp. 68-81. https://doi.org/10.1016/j.sigpro.2018.03.004

Research output: Contribution to journal › Article

TY - JOUR

T1 - Symbolic analysis-based reduced order Markov modeling of time series data

AU - Jha, Devesh K.

AU - Virani, Nurali

AU - Reimann, Jan Severin

AU - Srivastav, Abhishek

AU - Ray, Asok

PY - 2018/8/1

Y1 - 2018/8/1

UR - http://www.scopus.com/inward/record.url?scp=85044126828&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044126828&partnerID=8YFLogxK

U2 - 10.1016/j.sigpro.2018.03.004

DO - 10.1016/j.sigpro.2018.03.004

M3 - Article

AN - SCOPUS:85044126828

VL - 149

SP - 68

EP - 81

JO - Signal Processing

JF - Signal Processing

SN - 0165-1684

ER -