TY - JOUR
T1 - Distributed In-Memory Processing of All k Nearest Neighbor Queries
AU - Chatzimilioudis, Georgios
AU - Costa, Constantinos
AU - Zeinalipour-Yazti, Demetrios
AU - Lee, Wang Chien
AU - Pitoura, Evaggelia
N1 - Funding Information:
This work was financially supported through an Appcampus Award by Microsoft, Nokia, and Aalto University (Finland) as well as an industrial sponsorship by MTN (Cyprus). It has also been supported by the third author's startup grant at the University of Cyprus, EU's FP7 "Mobility, Data Mining, and Privacy" project, and EU's COST Action MOVE "Knowledge Discovery for Moving Objects".
Publisher Copyright:
© 2015 IEEE.
PY - 2016/4/1
Y1 - 2016/4/1
N2 - A wide spectrum of Internet-scale mobile applications, ranging from social networking, gaming and entertainment to emergency response and crisis management, all require efficient and scalable All k Nearest Neighbor (AkNN) computations over millions of moving objects every few seconds to be operational. Most traditional techniques for computing AkNN queries are centralized, lacking both scalability and efficiency. Only recently, distributed techniques for shared-nothing cloud infrastructures have been proposed to achieve scalability for large datasets. These batch-oriented algorithms are sub-optimal due to inefficient data space partitioning and data replication among processing units. In this paper, we present Spitfire, a distributed algorithm that provides a scalable and high-performance AkNN processing framework. Our proposed algorithm deploys a fast load-balanced partitioning scheme along with an efficient replication-set selection algorithm, to provide fast main-memory computations of the exact AkNN results in a batch-oriented manner. We evaluate, both analytically and experimentally, how the pruning efficiency of the Spitfire algorithm plays a pivotal role in reducing communication and response time up to an order of magnitude, compared to three other state-of-the-art distributed AkNN algorithms executed in distributed main-memory.
AB - A wide spectrum of Internet-scale mobile applications, ranging from social networking, gaming and entertainment to emergency response and crisis management, all require efficient and scalable All k Nearest Neighbor (AkNN) computations over millions of moving objects every few seconds to be operational. Most traditional techniques for computing AkNN queries are centralized, lacking both scalability and efficiency. Only recently, distributed techniques for shared-nothing cloud infrastructures have been proposed to achieve scalability for large datasets. These batch-oriented algorithms are sub-optimal due to inefficient data space partitioning and data replication among processing units. In this paper, we present Spitfire, a distributed algorithm that provides a scalable and high-performance AkNN processing framework. Our proposed algorithm deploys a fast load-balanced partitioning scheme along with an efficient replication-set selection algorithm, to provide fast main-memory computations of the exact AkNN results in a batch-oriented manner. We evaluate, both analytically and experimentally, how the pruning efficiency of the Spitfire algorithm plays a pivotal role in reducing communication and response time up to an order of magnitude, compared to three other state-of-the-art distributed AkNN algorithms executed in distributed main-memory.
UR - http://www.scopus.com/inward/record.url?scp=84963728902&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963728902&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2015.2503768
DO - 10.1109/TKDE.2015.2503768
M3 - Article
AN - SCOPUS:84963728902
VL - 28
SP - 925
EP - 938
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
IS - 4
M1 - 7337428
ER -