NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures

Jie Zhang, David Donofrio, John Shalf, Mahmut Kandemir, Myoungsoo Jung

Research output: Contribution to journal › Conference article

8 Citations (Scopus)

Abstract

Thanks to massive parallelism in modern Graphics Processing Units (GPUs), emerging data processing applications in GPU computing exhibit ten-fold speedups compared to CPU-only systems. However, this GPU-based acceleration is limited in many cases by the significant data movement overheads and inefficient memory management for host-side storage accesses. To address these shortcomings, this paper proposes a non-volatile memory management unit (NVMMU) that reduces the file data movement overheads by directly connecting the Solid State Disk (SSD) to the GPU. We implemented our proposed NVMMU on real hardware with commercially available GPU and SSD devices by considering different types of storage interfaces and configurations. In this work, NVMMU unifies two discrete software stacks (one for the SSD and the other for the GPU) in two major ways. While a new interface provided by our NVMMU directly forwards file data between the GPU runtime library and the I/O runtime library, it supports non-volatile direct memory access (NDMA) that pairs those GPU and SSD devices via physically shared system memory blocks. This unification in turn can eliminate unnecessary user/kernel-mode switching, improve memory management, and remove data copy overheads. Our evaluation results demonstrate that NVMMU can reduce the overheads of file data movement by 95% on average, improving overall system performance by 78% compared to a conventional IOMMU approach.
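The data-path contrast the abstract draws can be sketched in a few lines of Python. This is purely an illustrative model, not the paper's code: the hop names and the assumption that each additional hop costs one copy are simplifications chosen to show why removing the user-space staging buffer shrinks both copy count and user/kernel-mode switching.

```python
# Illustrative sketch: model a file block's journey as a list of buffer hops.
# Conventional IOMMU-managed path: the block is staged in the kernel page
# cache, copied to a user-space buffer (read() syscall), then copied again
# to GPU memory by the GPU driver. NDMA-style path (as described in the
# abstract): the SSD and GPU share physical system memory blocks, so the
# user-space staging hop disappears.

def copies(path):
    """Each hop after the source implies one data copy."""
    return len(path) - 1

iommu_path = ["ssd", "kernel_page_cache", "user_buffer", "gpu_dram"]
ndma_path = ["ssd", "shared_system_memory", "gpu_dram"]

print("IOMMU-style copies:", copies(iommu_path))  # 3
print("NDMA-style copies: ", copies(ndma_path))   # 2
```

The eliminated `user_buffer` hop is also where the user/kernel-mode switches concentrate in the conventional path, which is why cutting it reduces more than just raw copy bandwidth.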

Original language: English (US)
Article number: 7429291
Pages (from-to): 13-24
Number of pages: 12
Journal: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
DOIs: 10.1109/PACT.2015.43
State: Published - Jan 1 2015
Event: 24th International Conference on Parallel Architecture and Compilation, PACT 2015 - San Francisco, United States
Duration: Oct 18 2015 - Oct 21 2015


All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture

Cite this

@article{ac9e37050b87431ead21001029c45942,
title = "NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures",
abstract = "Thanks to massive parallelism in modern Graphics Processing Units (GPUs), emerging data processing applications in GPU computing exhibit ten-fold speedups compared to CPU-only systems. However, this GPU-based acceleration is limited in many cases by the significant data movement overheads and inefficient memory management for host-side storage accesses. To address these shortcomings, this paper proposes a non-volatile memory management unit (NVMMU) that reduces the file data movement overheads by directly connecting the Solid State Disk (SSD) to the GPU. We implemented our proposed NVMMU on real hardware with commercially available GPU and SSD devices by considering different types of storage interfaces and configurations. In this work, NVMMU unifies two discrete software stacks (one for the SSD and the other for the GPU) in two major ways. While a new interface provided by our NVMMU directly forwards file data between the GPU runtime library and the I/O runtime library, it supports non-volatile direct memory access (NDMA) that pairs those GPU and SSD devices via physically shared system memory blocks. This unification in turn can eliminate unnecessary user/kernel-mode switching, improve memory management, and remove data copy overheads. Our evaluation results demonstrate that NVMMU can reduce the overheads of file data movement by 95{\%} on average, improving overall system performance by 78{\%} compared to a conventional IOMMU approach.",
author = "Jie Zhang and David Donofrio and John Shalf and Mahmut Kandemir and Myoungsoo Jung",
year = "2015",
month = "1",
day = "1",
doi = "10.1109/PACT.2015.43",
language = "English (US)",
pages = "13--24",
journal = "Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT",
issn = "1089-795X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures. / Zhang, Jie; Donofrio, David; Shalf, John; Kandemir, Mahmut; Jung, Myoungsoo.

In: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, 01.01.2015, p. 13-24.


TY - JOUR

T1 - NVMMU

T2 - A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures

AU - Zhang, Jie

AU - Donofrio, David

AU - Shalf, John

AU - Kandemir, Mahmut

AU - Jung, Myoungsoo

PY - 2015/1/1

Y1 - 2015/1/1

N2 - Thanks to massive parallelism in modern Graphics Processing Units (GPUs), emerging data processing applications in GPU computing exhibit ten-fold speedups compared to CPU-only systems. However, this GPU-based acceleration is limited in many cases by the significant data movement overheads and inefficient memory management for host-side storage accesses. To address these shortcomings, this paper proposes a non-volatile memory management unit (NVMMU) that reduces the file data movement overheads by directly connecting the Solid State Disk (SSD) to the GPU. We implemented our proposed NVMMU on real hardware with commercially available GPU and SSD devices by considering different types of storage interfaces and configurations. In this work, NVMMU unifies two discrete software stacks (one for the SSD and the other for the GPU) in two major ways. While a new interface provided by our NVMMU directly forwards file data between the GPU runtime library and the I/O runtime library, it supports non-volatile direct memory access (NDMA) that pairs those GPU and SSD devices via physically shared system memory blocks. This unification in turn can eliminate unnecessary user/kernel-mode switching, improve memory management, and remove data copy overheads. Our evaluation results demonstrate that NVMMU can reduce the overheads of file data movement by 95% on average, improving overall system performance by 78% compared to a conventional IOMMU approach.

AB - Thanks to massive parallelism in modern Graphics Processing Units (GPUs), emerging data processing applications in GPU computing exhibit ten-fold speedups compared to CPU-only systems. However, this GPU-based acceleration is limited in many cases by the significant data movement overheads and inefficient memory management for host-side storage accesses. To address these shortcomings, this paper proposes a non-volatile memory management unit (NVMMU) that reduces the file data movement overheads by directly connecting the Solid State Disk (SSD) to the GPU. We implemented our proposed NVMMU on real hardware with commercially available GPU and SSD devices by considering different types of storage interfaces and configurations. In this work, NVMMU unifies two discrete software stacks (one for the SSD and the other for the GPU) in two major ways. While a new interface provided by our NVMMU directly forwards file data between the GPU runtime library and the I/O runtime library, it supports non-volatile direct memory access (NDMA) that pairs those GPU and SSD devices via physically shared system memory blocks. This unification in turn can eliminate unnecessary user/kernel-mode switching, improve memory management, and remove data copy overheads. Our evaluation results demonstrate that NVMMU can reduce the overheads of file data movement by 95% on average, improving overall system performance by 78% compared to a conventional IOMMU approach.

UR - http://www.scopus.com/inward/record.url?scp=84975469963&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84975469963&partnerID=8YFLogxK

U2 - 10.1109/PACT.2015.43

DO - 10.1109/PACT.2015.43

M3 - Conference article

AN - SCOPUS:84975469963

SP - 13

EP - 24

JO - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

JF - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

SN - 1089-795X

M1 - 7429291

ER -