Generalized projected clustering in high-dimensional data streams

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Consider the problem of identifying dense subgroups of data points that exhibit strong correlations in a data stream. Such correlation-connected clusters are meaningful in many applications. However, the inherent sparsity of high-dimensional space means that the correlations are local to specific subspaces; moreover, a correlation can lie along an arbitrarily complex direction, which blinds most traditional methods. We present ACID, a framework that can effectively detect correlation-connected clusters in high-dimensional streams. It scales well with both the size of the stream and the dimensionality of the data, and it is robust against noise. Experiments on synthetic and real datasets demonstrate its effectiveness and efficiency.
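The abstract's central point, that a correlation may lie along a direction no single coordinate axis captures, can be illustrated with a small sketch. This is not the ACID framework, whose details are not given in this record. It is a hypothetical two-dimensional example that keeps a one-pass running covariance of a stream and compares the per-axis variances with the eigenvalues of the covariance matrix. The names `RunningCovariance2D` and `eigenvalues_2x2`, and the synthetic stream of points near the line y = 2x, are assumptions made purely for illustration.

```python
import math
import random


class RunningCovariance2D:
    """One-pass (Welford-style) mean and covariance for a stream of 2-D points."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.sxx = 0.0  # running sum of squared deviations in x
        self.syy = 0.0  # running sum of squared deviations in y
        self.sxy = 0.0  # running sum of cross deviations

    def update(self, x, y):
        """Fold one new point into the running statistics."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Welford update: old deviation times deviation from the new mean.
        self.sxx += dx * (x - self.mean_x)
        self.syy += dy * (y - self.mean_y)
        self.sxy += dx * (y - self.mean_y)

    def covariance(self):
        """Return the sample covariance entries (cxx, cxy, cyy)."""
        if self.n < 2:
            raise ValueError("need at least two points")
        d = self.n - 1
        return self.sxx / d, self.sxy / d, self.syy / d


def eigenvalues_2x2(cxx, cxy, cyy):
    """Eigenvalues (largest first) of the symmetric matrix [[cxx, cxy], [cxy, cyy]]."""
    half_trace = (cxx + cyy) / 2.0
    radius = math.hypot((cxx - cyy) / 2.0, cxy)
    return half_trace + radius, half_trace - radius


# Hypothetical stream: points scattered tightly around the line y = 2x.
rng = random.Random(0)
acc = RunningCovariance2D()
for _ in range(2000):
    t = rng.uniform(-5.0, 5.0)
    acc.update(t + rng.gauss(0.0, 0.1), 2.0 * t + rng.gauss(0.0, 0.1))

cxx, cxy, cyy = acc.covariance()
big, small = eigenvalues_2x2(cxx, cxy, cyy)
print("per-axis variances:", cxx, cyy)
print("eigenvalues:", big, small)
```

In this sketch both per-axis variances are large, so projecting onto either original axis shows no tight cluster, whereas the smallest eigenvalue is close to zero, which exposes the correlated direction. The running sums are updated one point at a time, so the stream is never stored.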

Original language: English (US)
Title of host publication: Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings
Pages: 772-778
Number of pages: 7
State: Published - Jul 6 2006
Event: 8th Asia-Pacific Web Conference, APWeb 2006: Frontiers of WWW Research and Development - Harbin, China
Duration: Jan 16 2006 to Jan 18 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3841 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Asia-Pacific Web Conference, APWeb 2006: Frontiers of WWW Research and Development
Country: China
City: Harbin
Period: 1/16/06 to 1/18/06

Fingerprint

High-dimensional Data
Data Streams
Scalability
Clustering
Experiments
High-dimensional
Sparsity
Subspace
Subgroup
Experiment

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Wang, T. (2006). Generalized projected clustering in high-dimensional data streams. In Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings (pp. 772-778). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3841 LNCS).
Wang, Ting. / Generalized projected clustering in high-dimensional data streams. Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings. 2006. pp. 772-778 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{8b8ad743095047e7b0350cea713c137d,
title = "Generalized projected clustering in high-dimensional data streams",
abstract = "Consider the problem of identifying dense subgroups of data points exhibiting strong correlations in data stream. Such correlation connected clusters are meaningful in many applications. However, the inherent sparsity of high-dimensional space means that the correlations are local for specific subspace, and moreover, the correlation itself can be of arbitrarily complex direction, which blinds most traditional methods. We present ACID, a framework that can effectively detect correlation connected clusters in high dimensional stream. It has high scalability on both the size of stream and the dimension of data, and is robust against noise. Experiments on synthetic and real datasets are done to show its effectiveness and efficiency.",
author = "Ting Wang",
year = "2006",
month = "7",
day = "6",
language = "English (US)",
isbn = "3540311424",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "772--778",
booktitle = "Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings",

}

Wang, T 2006, Generalized projected clustering in high-dimensional data streams. in Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3841 LNCS, pp. 772-778, 8th Asia-Pacific Web Conference, APWeb 2006: Frontiers of WWW Research and Development, Harbin, China, 1/16/06.

Generalized projected clustering in high-dimensional data streams. / Wang, Ting.

Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings. 2006. p. 772-778 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3841 LNCS).

TY - GEN

T1 - Generalized projected clustering in high-dimensional data streams

AU - Wang, Ting

PY - 2006/7/6

Y1 - 2006/7/6

N2 - Consider the problem of identifying dense subgroups of data points exhibiting strong correlations in data stream. Such correlation connected clusters are meaningful in many applications. However, the inherent sparsity of high-dimensional space means that the correlations are local for specific subspace, and moreover, the correlation itself can be of arbitrarily complex direction, which blinds most traditional methods. We present ACID, a framework that can effectively detect correlation connected clusters in high dimensional stream. It has high scalability on both the size of stream and the dimension of data, and is robust against noise. Experiments on synthetic and real datasets are done to show its effectiveness and efficiency.

UR - http://www.scopus.com/inward/record.url?scp=33745659451&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33745659451&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:33745659451

SN - 3540311424

SN - 9783540311423

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 772

EP - 778

BT - Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings

ER -

Wang T. Generalized projected clustering in high-dimensional data streams. In Frontiers of WWW Research and Development - APWeb 2006 - 8th Asia-Pacific Web Conference, Proceedings. 2006. p. 772-778. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).