Supporting K nearest neighbors query on high-dimensional data in P2P systems

Mei Li, Wang-chien Lee, Anand Sivasubramaniam, Jizhong Zhao

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Peer-to-peer (P2P) systems are widely used for sharing and exchanging data and resources among large numbers of computer nodes. Data objects identified by high-dimensional feature vectors, such as text, images, and genome sequences, are starting to leverage P2P technology. Most existing work focuses on queries over data objects with one or a few attributes and is therefore not applicable to high-dimensional data objects. In this study, we investigate K nearest neighbors (KNN) queries on high-dimensional data objects in P2P systems. We propose an efficient query algorithm, together with solutions to technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN queries on high-dimensional data in P2P systems.

Original language: English (US)
Pages (from-to): 234-247
Number of pages: 14
Journal: Frontiers of Computer Science in China
Volume: 2
Issue number: 3
DOI: 10.1007/s11704-008-0026-7
State: Published - Sep 1 2008

Fingerprint

P2P Systems
High-dimensional Data
Nearest Neighbor
Genes
Query
Search Space
Peer-to-peer Systems
Feature Vector
Leverage
Dimensionality
Sharing
Genome
Refinement
High-dimensional
Attribute
Resources
Object
Vertex of a graph
Demonstrate
Simulation

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

@article{8c36172786304c16a10a7ba1f6cf7497,
title = "Supporting K nearest neighbors query on high-dimensional data in P2P systems",
author = "Mei Li and Wang-chien Lee and Anand Sivasubramaniam and Jizhong Zhao",
year = "2008",
month = "9",
day = "1",
doi = "10.1007/s11704-008-0026-7",
language = "English (US)",
volume = "2",
pages = "234--247",
journal = "Frontiers of Computer Science",
issn = "2095-2228",
publisher = "Springer Science + Business Media",
number = "3",
}

Supporting K nearest neighbors query on high-dimensional data in P2P systems. / Li, Mei; Lee, Wang-chien; Sivasubramaniam, Anand; Zhao, Jizhong.

In: Frontiers of Computer Science in China, Vol. 2, No. 3, 01.09.2008, p. 234-247.

TY - JOUR
T1 - Supporting K nearest neighbors query on high-dimensional data in P2P systems
AU - Li, Mei
AU - Lee, Wang-chien
AU - Sivasubramaniam, Anand
AU - Zhao, Jizhong
PY - 2008/9/1
Y1 - 2008/9/1
UR - http://www.scopus.com/inward/record.url?scp=49549093429&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49549093429&partnerID=8YFLogxK
U2 - 10.1007/s11704-008-0026-7
DO - 10.1007/s11704-008-0026-7
M3 - Article
AN - SCOPUS:49549093429
VL - 2
SP - 234
EP - 247
JO - Frontiers of Computer Science
JF - Frontiers of Computer Science
SN - 2095-2228
IS - 3
ER -