TY - GEN

T1 - A locally optimal algorithm for estimating a generating partition from an observed time series

AU - Miller, David J.

AU - Ghalyan, Najah F.

AU - Ray, Asok

PY - 2017/12/5

Y1 - 2017/12/5

N2 - Estimation of a generating partition is critical for symbolization of measurements from discrete-time dynamical systems, where a sequence of symbols from a (finite-cardinality) alphabet uniquely specifies the underlying time series. Such symbolization is useful for computing measures (e.g., Kolmogorov-Sinai entropy) to characterize the (possibly unknown) dynamical system. It is also useful for time series classification and anomaly detection. Previous work attemps to minimize a clustering objective function that measures discrepancy between a set of reconstruction values and the points from the time series. Unfortunately, the resulting algorithm is non-convergent, with no guarantee of finding even locally optimal solutions. The problem is a heuristic 'nearest neighbor' symbol assignment step. Alternatively, we introduce a new, locally optimal algorithm. We apply iterative 'nearest neighbor' symbol assignments with guaranteed discrepancy descent, by which joint, locally optimal symbolization of the time series is achieved. While some approaches use vector quantization to partition the state space, our approach only ensures a partition in the space consisting of the entire time series (effectively, clustering in an infinite-dimensional space). Our approach also amounts to a novel type of sliding block lossy source coding. We demonstrate improvement, with respect to several measures, over a popular method used in the literature.

AB - Estimation of a generating partition is critical for symbolization of measurements from discrete-time dynamical systems, where a sequence of symbols from a (finite-cardinality) alphabet uniquely specifies the underlying time series. Such symbolization is useful for computing measures (e.g., Kolmogorov-Sinai entropy) to characterize the (possibly unknown) dynamical system. It is also useful for time series classification and anomaly detection. Previous work attemps to minimize a clustering objective function that measures discrepancy between a set of reconstruction values and the points from the time series. Unfortunately, the resulting algorithm is non-convergent, with no guarantee of finding even locally optimal solutions. The problem is a heuristic 'nearest neighbor' symbol assignment step. Alternatively, we introduce a new, locally optimal algorithm. We apply iterative 'nearest neighbor' symbol assignments with guaranteed discrepancy descent, by which joint, locally optimal symbolization of the time series is achieved. While some approaches use vector quantization to partition the state space, our approach only ensures a partition in the space consisting of the entire time series (effectively, clustering in an infinite-dimensional space). Our approach also amounts to a novel type of sliding block lossy source coding. We demonstrate improvement, with respect to several measures, over a popular method used in the literature.

UR - http://www.scopus.com/inward/record.url?scp=85042254824&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85042254824&partnerID=8YFLogxK

U2 - 10.1109/MLSP.2017.8168162

DO - 10.1109/MLSP.2017.8168162

M3 - Conference contribution

AN - SCOPUS:85042254824

T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP

SP - 1

EP - 6

BT - 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Proceedings

A2 - Ueda, Naonori

A2 - Chien, Jen-Tzung

A2 - Matsui, Tomoko

A2 - Larsen, Jan

A2 - Watanabe, Shinji

PB - IEEE Computer Society

T2 - 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017

Y2 - 25 September 2017 through 28 September 2017

ER -