Complex network analysis using parallel approximate motif counting

George M. Slota, Kamesh Madduri

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

Subgraph counting forms the basis of many complex network analysis metrics, including motif and anti-motif finding, relative graphlet frequency distance, and graphlet degree distribution agreements. Determining exact subgraph counts is computationally very expensive. In recent work, we present FASCIA, a shared-memory parallel algorithm and implementation for approximate subgraph counting. FASCIA uses a dynamic programming-based approach and is significantly faster than exhaustive enumeration, while generating high-quality approximations of subgraph counts. However, the memory usage of the dynamic programming step prohibits us from applying FASCIA to very large graphs. In this paper, we introduce a distributed-memory parallelization of FASCIA by partitioning the graph and the dynamic programming table. We discuss a new collective communication scheme to make the dynamic programming step memory-efficient. These optimizations enable scaling to much larger networks than before. We also present a simple parallelization strategy for distributed subgraph counting on smaller networks. The new additions let us use subgraph counts as graph signatures for a large network collection, and we analyze this collection using various subgraph count-based graph analytics.
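
As an illustration of how a dynamic programming table can yield approximate subgraph counts, the following is a minimal serial sketch. It assumes a color-coding scheme (random vertex colors, with a table indexed by vertex and color set); the abstract does not name the technique, so this is an assumption. The sketch counts only simple paths on k vertices (k >= 2), not general templates, and it is not the authors' implementation. The function names `count_colorful_paths` and `estimate_path_count` are hypothetical.

```python
import math
import random
from collections import defaultdict


def count_colorful_paths(adj, colors, k):
    """Count undirected simple paths on k vertices (k >= 2) whose vertices
    all carry distinct colors, for one fixed coloring."""
    full = (1 << k) - 1
    # table[v][mask]: number of colorful paths ending at v that use
    # exactly the colors in mask (a bitmask over the k colors).
    table = {v: defaultdict(int) for v in adj}
    for v in adj:
        table[v][1 << colors[v]] = 1
    for _ in range(k - 1):
        nxt = {v: defaultdict(int) for v in adj}
        for v in adj:
            bit = 1 << colors[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    # Extend a path ending at u by v only if v's color is unused.
                    if not mask & bit:
                        nxt[v][mask | bit] += cnt
        table = nxt
    # Each undirected path is found once from each of its two endpoints.
    return sum(table[v][full] for v in adj) // 2


def estimate_path_count(adj, k, iterations, seed=0):
    """Average colorful counts over random colorings and rescale by the
    probability that a fixed k-vertex path is colorful (k! / k^k)."""
    rng = random.Random(seed)
    p_colorful = math.factorial(k) / k ** k
    total = 0
    for _ in range(iterations):
        colors = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, colors, k)
    return total / (iterations * p_colorful)
```

In this sketch the table holds one entry per vertex and color subset, so its size grows with both the graph and the template. That gives a rough sense of why the memory usage of the dynamic programming step becomes the limiting factor on very large graphs.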

Original language: English (US)
Title of host publication: Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014
Publisher: IEEE Computer Society
Pages: 405-414
Number of pages: 10
ISBN (Print): 9780769552071
DOIs: https://doi.org/10.1109/IPDPS.2014.50
State: Published - Jan 1 2014
Event: 28th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2014 - Phoenix, AZ, United States
Duration: May 19 2014 to May 23 2014

Publication series

Name: Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS
ISSN (Print): 1530-2075
ISSN (Electronic): 2332-1237

Other

Other: 28th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2014
Country: United States
City: Phoenix, AZ
Period: 5/19/14 to 5/23/14

Fingerprint

Complex networks
Electric network analysis
Dynamic programming
Data storage equipment
Parallel algorithms
Communication

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software

Cite this

Slota, G. M., & Madduri, K. (2014). Complex network analysis using parallel approximate motif counting. In Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014 (pp. 405-414). [6877274] (Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS). IEEE Computer Society. https://doi.org/10.1109/IPDPS.2014.50

Slota, George M. ; Madduri, Kamesh. / Complex network analysis using parallel approximate motif counting. Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014. IEEE Computer Society, 2014. pp. 405-414 (Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS).
@inproceedings{f0d9bb4bf8d74d9790783230ae0464f4,
title = "Complex network analysis using parallel approximate motif counting",
abstract = "Subgraph counting forms the basis of many complex network analysis metrics, including motif and anti-motif finding, relative graphlet frequency distance, and graphlet degree distribution agreements. Determining exact subgraph counts is computationally very expensive. In recent work, we present FASCIA, a shared-memory parallel algorithm and implementation for approximate subgraph counting. FASCIA uses a dynamic programming-based approach and is significantly faster than exhaustive enumeration, while generating high-quality approximations of subgraph counts. However, the memory usage of the dynamic programming step prohibits us from applying FASCIA to very large graphs. In this paper, we introduce a distributed-memory parallelization of FASCIA by partitioning the graph and the dynamic programming table. We discuss a new collective communication scheme to make the dynamic programming step memory-efficient. These optimizations enable scaling to much larger networks than before. We also present a simple parallelization strategy for distributed subgraph counting on smaller networks. The new additions let us use subgraph counts as graph signatures for a large network collection, and we analyze this collection using various subgraph count-based graph analytics.",
author = "Slota, {George M.} and Kamesh Madduri",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/IPDPS.2014.50",
language = "English (US)",
isbn = "9780769552071",
series = "Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS",
publisher = "IEEE Computer Society",
pages = "405--414",
booktitle = "Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014",
address = "United States",

}

Slota, GM & Madduri, K 2014, Complex network analysis using parallel approximate motif counting. in Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014., 6877274, Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS, IEEE Computer Society, pp. 405-414, 28th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2014, Phoenix, AZ, United States, 5/19/14. https://doi.org/10.1109/IPDPS.2014.50

Complex network analysis using parallel approximate motif counting. / Slota, George M.; Madduri, Kamesh.

Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014. IEEE Computer Society, 2014. p. 405-414 6877274 (Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS).

TY - GEN

T1 - Complex network analysis using parallel approximate motif counting

AU - Slota, George M.

AU - Madduri, Kamesh

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Subgraph counting forms the basis of many complex network analysis metrics, including motif and anti-motif finding, relative graphlet frequency distance, and graphlet degree distribution agreements. Determining exact subgraph counts is computationally very expensive. In recent work, we present FASCIA, a shared-memory parallel algorithm and implementation for approximate subgraph counting. FASCIA uses a dynamic programming-based approach and is significantly faster than exhaustive enumeration, while generating high-quality approximations of subgraph counts. However, the memory usage of the dynamic programming step prohibits us from applying FASCIA to very large graphs. In this paper, we introduce a distributed-memory parallelization of FASCIA by partitioning the graph and the dynamic programming table. We discuss a new collective communication scheme to make the dynamic programming step memory-efficient. These optimizations enable scaling to much larger networks than before. We also present a simple parallelization strategy for distributed subgraph counting on smaller networks. The new additions let us use subgraph counts as graph signatures for a large network collection, and we analyze this collection using various subgraph count-based graph analytics.

UR - http://www.scopus.com/inward/record.url?scp=84906674349&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906674349&partnerID=8YFLogxK

U2 - 10.1109/IPDPS.2014.50

DO - 10.1109/IPDPS.2014.50

M3 - Conference contribution

AN - SCOPUS:84906674349

SN - 9780769552071

T3 - Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS

SP - 405

EP - 414

BT - Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014

PB - IEEE Computer Society

ER -

Slota GM, Madduri K. Complex network analysis using parallel approximate motif counting. In Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014. IEEE Computer Society. 2014. p. 405-414. 6877274. (Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS). https://doi.org/10.1109/IPDPS.2014.50