PENS

An algorithm for density-based clustering in peer-to-peer systems

Mei Li, Guanling Lee, Wang-chien Lee, Anand Sivasubramaniam

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge raised in performing clustering in peer-to-peer environments, i.e., cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS can discover clusters and noise efficiently in P2P systems.
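The abstract does not give the details of PENS, so the following is not the authors' algorithm. It is only a minimal, generic sketch of density-based clustering (in the style of DBSCAN) as it might run on one peer's local data, to illustrate what "discovering clusters and noise" means. The function name `density_cluster` and the parameters `eps` and `min_pts` are assumptions introduced for illustration; the hierarchical cluster assembly across peers is not shown.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def density_cluster(points, eps, min_pts):
    """Generic density-based clustering sketch (hypothetical helper, not PENS).

    Returns one label per point: a cluster id (0, 1, ...) or NOISE.
    """
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already labeled
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # Point i is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point
        cluster_id += 1
    return labels
```

For example, calling `density_cluster` on two tight groups of three points plus one distant point, with `eps=1.5` and `min_pts=3`, labels the two groups as clusters and the distant point as noise. In a P2P setting, each peer would hold only part of the data, which is why the abstract identifies cluster assembly as the challenge.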

Original language: English (US)
Title of host publication: Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06
Volume: 152
DOI: https://doi.org/10.1145/1146847.1146886
State: Published - 2006
Event: 1st International Conference on Scalable Information Systems, InfoScale '06 - Hong Kong, China
Duration: May 30 2006 to Jun 1 2006

Other

Other: 1st International Conference on Scalable Information Systems, InfoScale '06
Country: China
City: Hong Kong
Period: 5/30/06 to 6/1/06

Fingerprint

Data mining
Parallel algorithms
Clustering algorithms

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction

Cite this

Li, M., Lee, G., Lee, W., & Sivasubramaniam, A. (2006). PENS: An algorithm for density-based clustering in peer-to-peer systems. In Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06 (Vol. 152). [1146886] https://doi.org/10.1145/1146847.1146886
Li, Mei; Lee, Guanling; Lee, Wang-chien; Sivasubramaniam, Anand. / PENS: An algorithm for density-based clustering in peer-to-peer systems. Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. Vol. 152. 2006.
@inproceedings{f9199643b20c4d46b8acd125ec1233a3,
title = "PENS: An algorithm for density-based clustering in peer-to-peer systems",
abstract = "Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge raised in performing clustering in peer-to-peer environments, i.e., cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS can discover clusters and noise efficiently in P2P systems.",
author = "Mei Li and Guanling Lee and Wang-chien Lee and Anand Sivasubramaniam",
year = "2006",
doi = "10.1145/1146847.1146886",
language = "English (US)",
isbn = "1595934286",
volume = "152",
booktitle = "Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06",

}

Li, M, Lee, G, Lee, W & Sivasubramaniam, A 2006, PENS: An algorithm for density-based clustering in peer-to-peer systems. in Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. vol. 152, 1146886, 1st International Conference on Scalable Information Systems, InfoScale '06, Hong Kong, China, 5/30/06. https://doi.org/10.1145/1146847.1146886

PENS: An algorithm for density-based clustering in peer-to-peer systems. / Li, Mei; Lee, Guanling; Lee, Wang-chien; Sivasubramaniam, Anand.

Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. Vol. 152 2006. 1146886.

TY - GEN

T1 - PENS

T2 - An algorithm for density-based clustering in peer-to-peer systems

AU - Li, Mei

AU - Lee, Guanling

AU - Lee, Wang-chien

AU - Sivasubramaniam, Anand

PY - 2006

Y1 - 2006

N2 - Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge raised in performing clustering in peer-to-peer environments, i.e., cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS can discover clusters and noise efficiently in P2P systems.

AB - Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge raised in performing clustering in peer-to-peer environments, i.e., cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS can discover clusters and noise efficiently in P2P systems.

UR - http://www.scopus.com/inward/record.url?scp=34547380798&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34547380798&partnerID=8YFLogxK

U2 - 10.1145/1146847.1146886

DO - 10.1145/1146847.1146886

M3 - Conference contribution

SN - 1595934286

SN - 9781595934284

VL - 152

BT - Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06

ER -

Li M, Lee G, Lee W, Sivasubramaniam A. PENS: An algorithm for density-based clustering in peer-to-peer systems. In Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. Vol. 152. 2006. 1146886 https://doi.org/10.1145/1146847.1146886