A global communication optimization technique based on data-flow analysis and linear algebra

Mahmut Kandemir, P. Banerjee, A. Choudhary, J. Ramanujam, N. Shenoy

Research output: Contribution to journal › Article

26 Citations (Scopus)

Abstract

Reducing communication overhead is extremely important in distributed-memory message-passing architectures. In this article, we present a technique to improve communication that considers data access patterns of the entire program. Our approach is based on a combination of traditional data-flow analysis and a linear algebra framework, and it works on structured programs with conditional statements and nested loops but without arbitrary goto statements. The distinctive features of the solution are the accuracy in keeping communication set information, support for general alignments and distributions including block-cyclic distributions, and the ability to simulate some of the previous approaches with suitable modifications. We also show how optimizations such as message vectorization, message coalescing, and redundancy elimination are supported by our framework. Experimental results on several benchmarks show that our technique is effective in reducing the number of messages (an average of 32% reduction), the volume of the data communicated (an average of 37% reduction), and the execution time (an average of 26% reduction).
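As a rough illustration of two ideas the abstract names, block-cyclic distributions and message vectorization, the following sketch counts messages for a simple shifted array read. It is not the authors' algorithm: the function names `owner` and `communication_sets`, the owner-computes assumption, and the example loop `a[i] = b[i + shift]` are all hypothetical choices made here for illustration.

```python
from collections import defaultdict


def owner(i, block, procs):
    """Processor that owns global index i under a cyclic(block) distribution."""
    return (i // block) % procs


def communication_sets(n, shift, block, procs):
    """Collect the remote elements of b needed by the loop a[i] = b[i + shift].

    Assumes (for illustration only) that a and b share the same block-cyclic
    distribution and that the owner of a[i] performs the assignment.
    Returns a dict mapping (sender, receiver) to the list of b indices sent.
    """
    sets = defaultdict(list)
    for i in range(n - shift):
        receiver = owner(i, block, procs)
        sender = owner(i + shift, block, procs)
        if sender != receiver:
            sets[(sender, receiver)].append(i + shift)
    return dict(sets)


sets = communication_sets(n=16, shift=1, block=4, procs=2)

# One message per remote element if communication stays inside the loop.
naive_messages = sum(len(indices) for indices in sets.values())

# One message per (sender, receiver) pair if the element transfers are
# hoisted out of the loop and combined, in the spirit of message vectorization.
vectorized_messages = len(sets)

print(sets)
print(naive_messages, vectorized_messages)
```

With 16 elements, a block size of 4, and 2 processors, the three element-wise transfers collapse into two combined messages. The reductions reported in the abstract come from the paper's global data-flow and linear algebra framework, which this toy count does not attempt to reproduce.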

Original language: English (US)
Pages (from-to): 1251-1297
Number of pages: 47
Journal: ACM Transactions on Programming Languages and Systems
Volume: 21
Issue number: 6
DOI: 10.1145/330643.330647
State: Published - Jan 1 1999

Fingerprint

Data flow analysis
Linear algebra
Communication
Message passing
Redundancy
Data storage equipment

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Kandemir, Mahmut ; Banerjee, P. ; Choudhary, A. ; Ramanujam, J. ; Shenoy, N. / A global communication optimization technique based on data-flow analysis and linear algebra. In: ACM Transactions on Programming Languages and Systems. 1999 ; Vol. 21, No. 6. pp. 1251-1297.
@article{b53e92c6259e439bbbd5262e841524b6,
title = "A global communication optimization technique based on data-flow analysis and linear algebra",
abstract = "Reducing communication overhead is extremely important in distributed-memory message-passing architectures. In this article, we present a technique to improve communication that considers data access patterns of the entire program. Our approach is based on a combination of traditional data-flow analysis and a linear algebra framework, and it works on structured programs with conditional statements and nested loops but without arbitrary goto statements. The distinctive features of the solution are the accuracy in keeping communication set information, support for general alignments and distributions including block-cyclic distributions, and the ability to simulate some of the previous approaches with suitable modifications. We also show how optimizations such as message vectorization, message coalescing, and redundancy elimination are supported by our framework. Experimental results on several benchmarks show that our technique is effective in reducing the number of messages (an average of 32{\%} reduction), the volume of the data communicated (an average of 37{\%} reduction), and the execution time (an average of 26{\%} reduction).",
author = "Mahmut Kandemir and P. Banerjee and A. Choudhary and J. Ramanujam and N. Shenoy",
year = "1999",
month = "1",
day = "1",
doi = "10.1145/330643.330647",
language = "English (US)",
volume = "21",
pages = "1251--1297",
journal = "ACM Transactions on Programming Languages and Systems",
issn = "0164-0925",
publisher = "Association for Computing Machinery (ACM)",
number = "6"
}

A global communication optimization technique based on data-flow analysis and linear algebra. / Kandemir, Mahmut; Banerjee, P.; Choudhary, A.; Ramanujam, J.; Shenoy, N.

In: ACM Transactions on Programming Languages and Systems, Vol. 21, No. 6, 01.01.1999, p. 1251-1297.

TY - JOUR

T1 - A global communication optimization technique based on data-flow analysis and linear algebra

AU - Kandemir, Mahmut

AU - Banerjee, P.

AU - Choudhary, A.

AU - Ramanujam, J.

AU - Shenoy, N.

PY - 1999/1/1

Y1 - 1999/1/1

N2 - Reducing communication overhead is extremely important in distributed-memory message-passing architectures. In this article, we present a technique to improve communication that considers data access patterns of the entire program. Our approach is based on a combination of traditional data-flow analysis and a linear algebra framework, and it works on structured programs with conditional statements and nested loops but without arbitrary goto statements. The distinctive features of the solution are the accuracy in keeping communication set information, support for general alignments and distributions including block-cyclic distributions, and the ability to simulate some of the previous approaches with suitable modifications. We also show how optimizations such as message vectorization, message coalescing, and redundancy elimination are supported by our framework. Experimental results on several benchmarks show that our technique is effective in reducing the number of messages (an average of 32% reduction), the volume of the data communicated (an average of 37% reduction), and the execution time (an average of 26% reduction).

UR - http://www.scopus.com/inward/record.url?scp=0000529068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0000529068&partnerID=8YFLogxK

U2 - 10.1145/330643.330647

DO - 10.1145/330643.330647

M3 - Article

VL - 21

SP - 1251

EP - 1297

JO - ACM Transactions on Programming Languages and Systems

JF - ACM Transactions on Programming Languages and Systems

SN - 0164-0925

IS - 6

ER -