Parallel breadth-first search on distributed memory systems

Aydin Buluç, Kamesh Madduri

Research output: Chapter in Book/Report/Conference proceedingConference contribution

124 Scopus citations

Abstract

Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a levelsynchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny- Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.

Original languageEnglish (US)
Title of host publicationProceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
DOIs
StatePublished - 2011
Event2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11 - Seattle, WA, United States
Duration: Nov 12 2011Nov 18 2011

Publication series

NameProceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

Other

Other2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11
CountryUnited States
CitySeattle, WA
Period11/12/1111/18/11

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications

Fingerprint Dive into the research topics of 'Parallel breadth-first search on distributed memory systems'. Together they form a unique fingerprint.

Cite this