Abstract
Recent advances in GPUs have opened a new opportunity for harnessing their computing power for general-purpose computing. CUDA, an extension to the C programming language, was developed for programming NVIDIA GPUs. However, programming GPUs efficiently using CUDA is tedious and error-prone even for expert programmers. The programmer has to optimize resource occupancy and manage data transfers between the host and the GPU, and across the memory system. This paper presents the basic architectural optimizations and explores their implementations in research and industry compilers. The focus of the presented review is on accelerating computational science applications, such as the class of structured grid computations (SGCs). It also discusses the mismatch between current compiler techniques and the requirements for implementing efficient iterative linear solvers, and it explores the approaches used by computational scientists to program SGCs. Finally, a set of tools with the main optimization functionalities for an integrated library is proposed to ease the process of defining complex SGC data structures and optimizing solver code using an intelligent high-level interface and domain-specific annotations.
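The transfer-management and occupancy burden described above can be illustrated with a minimal CUDA sketch of a 5-point Jacobi stencil, a typical structured grid computation. This is a hypothetical example for illustration only; the grid size, names (`N`, `jacobi_step`), and block shape are assumptions, not taken from the paper.

```cuda
// Hypothetical 5-point Jacobi stencil on an N x N structured grid.
// All identifiers here are illustrative, not from the reviewed paper.
#include <cuda_runtime.h>
#include <stdlib.h>

#define N 1024

__global__ void jacobi_step(const float *in, float *out) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;  // row index
    int j = blockIdx.x * blockDim.x + threadIdx.x;  // column index
    if (i > 0 && i < N - 1 && j > 0 && j < N - 1) {
        // Average of the four neighbors (interior points only).
        out[i * N + j] = 0.25f * (in[(i - 1) * N + j] + in[(i + 1) * N + j] +
                                  in[i * N + j - 1] + in[i * N + j + 1]);
    }
}

int main(void) {
    size_t bytes = (size_t)N * N * sizeof(float);
    float *h_grid = (float *)calloc((size_t)N * N, sizeof(float));

    // The programmer must explicitly stage data across the host/GPU boundary.
    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_grid, bytes, cudaMemcpyHostToDevice);

    // Block shape chosen by hand; occupancy tuning is the programmer's job.
    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    jacobi_step<<<grid, block>>>(d_in, d_out);

    // Results must be copied back explicitly before the host can use them.
    cudaMemcpy(h_grid, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    free(h_grid);
    return 0;
}
```

Even this small sketch shows the bookkeeping the abstract refers to: explicit allocation on both sides, two directional copies, and a manually chosen launch configuration, all of which a compiler or library could in principle generate from a higher-level description of the stencil.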
| Original language | English |
|---|---|
| Pages (from-to) | 977-1003 |
| Number of pages | 27 |
| Journal | Computing (Vienna/New York) |
| Volume | 102 |
| Issue number | 4 |
| DOIs | |
| State | Published - 1 Apr 2020 |
Bibliographical note
Publisher Copyright: © 2019, Springer-Verlag GmbH Austria, part of Springer Nature.
Keywords
- CUDA
- Kernel optimizations
- Massively parallel programming
- Scientific simulations
- Structured grid computing (SGC)
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Numerical Analysis
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics