PENS: An algorithm for density-based clustering in peer-to-peer systems

Mei Li, Guanling Lee, Wang-chien Lee, Anand Sivasubramaniam

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge raised in performing clustering in peer-to-peer environments, i.e., cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. The complexity analysis of the algorithm demonstrates that PENS can discover clusters and noise efficiently in P2P systems.

Original language: English (US)
Title of host publication: Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06
DOIs: https://doi.org/10.1145/1146847.1146886
State: Published - Dec 1 2006
Event: 1st International Conference on Scalable Information Systems, InfoScale '06 - Hong Kong, China
Duration: May 30 2006 → Jun 1 2006

Publication series

Name: ACM International Conference Proceeding Series
Volume: 152

Other

Other: 1st International Conference on Scalable Information Systems, InfoScale '06
Country: China
City: Hong Kong
Period: 5/30/06 → 6/1/06

Fingerprint

Data mining
Parallel algorithms
Clustering algorithms

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Li, M., Lee, G., Lee, W., & Sivasubramaniam, A. (2006). PENS: An algorithm for density-based clustering in peer-to-peer systems. In Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06 [1146886] (ACM International Conference Proceeding Series; Vol. 152). https://doi.org/10.1145/1146847.1146886
Li, Mei; Lee, Guanling; Lee, Wang-chien; Sivasubramaniam, Anand. / PENS: An algorithm for density-based clustering in peer-to-peer systems. Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. 2006. (ACM International Conference Proceeding Series).
@inproceedings{f9199643b20c4d46b8acd125ec1233a3,
title = "PENS: An algorithm for density-based clustering in peer-to-peer systems",
author = "Mei Li and Guanling Lee and Wang-chien Lee and Anand Sivasubramaniam",
year = "2006",
month = "12",
day = "1",
doi = "10.1145/1146847.1146886",
language = "English (US)",
isbn = "1595934286",
series = "ACM International Conference Proceeding Series",
booktitle = "Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06",

}

Li, M, Lee, G, Lee, W & Sivasubramaniam, A 2006, PENS: An algorithm for density-based clustering in peer-to-peer systems. in Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06., 1146886, ACM International Conference Proceeding Series, vol. 152, 1st International Conference on Scalable Information Systems, InfoScale '06, Hong Kong, China, 5/30/06. https://doi.org/10.1145/1146847.1146886

TY - GEN

T1 - PENS

T2 - An algorithm for density-based clustering in peer-to-peer systems

AU - Li, Mei

AU - Lee, Guanling

AU - Lee, Wang-chien

AU - Sivasubramaniam, Anand

PY - 2006/12/1

Y1 - 2006/12/1

UR - http://www.scopus.com/inward/record.url?scp=34547380798&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34547380798&partnerID=8YFLogxK

U2 - 10.1145/1146847.1146886

DO - 10.1145/1146847.1146886

M3 - Conference contribution

AN - SCOPUS:34547380798

SN - 1595934286

SN - 9781595934284

T3 - ACM International Conference Proceeding Series

BT - Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06

ER -

Li M, Lee G, Lee W, Sivasubramaniam A. PENS: An algorithm for density-based clustering in peer-to-peer systems. In Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale '06. 2006. 1146886. (ACM International Conference Proceeding Series). https://doi.org/10.1145/1146847.1146886