SDCI HPC: Improvement: Parallel I/O Software Infrastructure for Petascale Systems

  • Beckman, Peter (CoPI)
  • Kandemir, Mahmut (CoPI)
  • Liao, Wei Keng (CoPI)
  • Ross, Robert (CoPI)
  • Choudhary, Alok (PI)

Project: Research project

Project Details

Description

Technical Merit: This project addresses the software problem for petascale parallel machines, targeting in particular scalable I/O, storage, and access to deep memory hierarchies. It proposes to improve, enhance, develop, and deploy a robust software infrastructure that provides end-to-end scalable I/O performance by capturing high-level access patterns ("intent") and passing that information through the runtime layers to enable optimizations at different levels. We propose mechanisms that allow different software layers to interact and cooperate with each other to achieve end-to-end performance objectives. Specifically, the objectives of this project are to develop, improve, and deploy (1) scalable software for end-to-end I/O performance optimizations; (2) Parallel netCDF (PnetCDF) enhancements providing statistical and data mining functions; (3) PnetCDF software optimizations using non-blocking I/O mechanisms; (4) MPI-IO caching mechanisms to optimize the performance of the I/O software stack; (5) I/O forwarding and dedicated caching mechanisms needed to use the structures of upcoming petascale systems effectively; (6) effective benchmarking and testing suites for the I/O stack; (7) an optimization assist tool that, through program analysis, can identify I/O optimization opportunities and guide a user in applying them; (8) testing that leverages the mechanisms and tools developed as part of the NMI; and (9) tutorials and tools that help application scientists incorporate these I/O stack optimizations into their production applications. We also believe that the software and techniques developed in this project will be directly applicable to, and useful in, other high-level software libraries and formats such as the Hierarchical Data Format (HDF).
Broader Impact: We will build upon our team's collective experience (which includes distributing widely used and robust HPC software systems such as ROMIO, MPICH2, PVFS, PnetCDF, and NU-MineBench) to distribute the software developed in this project for cyberinfrastructure, and thereby directly impact the scalability of applications in many domains. Through our team's active participation in multiple infrastructure centers (e.g., TeraGrid), we will deploy the software on production systems. We will also incorporate the results and lessons from this project into the tutorials on parallel computing, parallel I/O, and systems software that our team members present at most leading HPC conferences throughout the world. Through this project and summer internships, we will give students an opportunity to work with application scientists, thereby fostering interdisciplinary collaboration. This project will also support graduate students working toward advanced degrees. PI Choudhary has graduated more than 23 PhDs, many of whom have joined academia and national labs. Multiple PIs in this project have graduated several female and underrepresented PhDs, and we will continue to build on this tradition. In addition to the tutorials, we will incorporate the lessons from this project into classroom material for both undergraduate and graduate courses, as we have done in the past. Finally, we have a strong collaboration with industry in the HPC area, and we will leverage that collaboration to share the outcomes and results of this project with our industry partners.

Status: Finished
Effective start/end date: 8/1/07 → 7/31/12

Funding

  • National Science Foundation: $1,528,111.00
