TY - GEN
T1 - On the development of a high-order, multi-GPU enabled, compressible viscous flow solver for mixed unstructured grids
AU - Castonguay, Patrice
AU - Williams, David M.
AU - Vincent, Peter E.
AU - Lopez, Manuel
AU - Jameson, Antony
PY - 2011/12/1
Y1 - 2011/12/1
N2 - This work discusses the development of a three-dimensional, high-order, compressible viscous flow solver for mixed unstructured grids that can run on multiple GPUs. The solver utilizes a range of so-called Vincent-Castonguay-Jameson-Huynh (VCJH) flux reconstruction schemes in both tensor-product and simplex elements. Such schemes are linearly stable for all orders of accuracy and encompass several well known high-order methods as special cases. Because of the high arithmetic intensity associated with VCJH schemes and their element-local nature, they are well suited for GPUs. The single-GPU solver developed in this work achieves speed-ups of up to 45x relative to a serial computation on a current generation CPU. Additionally, the multi-GPU solver scales well, and when running on 32 GPUs achieves a sustained performance of 2.8 Teraflops (double precision) for 6th-order accurate simulations with tetrahedral elements. In this paper, the techniques used to achieve this level of performance are discussed and a performance analysis is presented. To the authors' knowledge, the aforementioned flow solver is the first high-order, three-dimensional, compressible Navier-Stokes solver for mixed unstructured grids that can run on multiple GPUs.
AB - This work discusses the development of a three-dimensional, high-order, compressible viscous flow solver for mixed unstructured grids that can run on multiple GPUs. The solver utilizes a range of so-called Vincent-Castonguay-Jameson-Huynh (VCJH) flux reconstruction schemes in both tensor-product and simplex elements. Such schemes are linearly stable for all orders of accuracy and encompass several well known high-order methods as special cases. Because of the high arithmetic intensity associated with VCJH schemes and their element-local nature, they are well suited for GPUs. The single-GPU solver developed in this work achieves speed-ups of up to 45x relative to a serial computation on a current generation CPU. Additionally, the multi-GPU solver scales well, and when running on 32 GPUs achieves a sustained performance of 2.8 Teraflops (double precision) for 6th-order accurate simulations with tetrahedral elements. In this paper, the techniques used to achieve this level of performance are discussed and a performance analysis is presented. To the authors' knowledge, the aforementioned flow solver is the first high-order, three-dimensional, compressible Navier-Stokes solver for mixed unstructured grids that can run on multiple GPUs.
UR - http://www.scopus.com/inward/record.url?scp=84880646381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880646381&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781624101489
T3 - 20th AIAA Computational Fluid Dynamics Conference 2011
BT - 20th AIAA Computational Fluid Dynamics Conference 2011
T2 - 20th AIAA Computational Fluid Dynamics Conference 2011
Y2 - 27 June 2011 through 30 June 2011
ER -