IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism


Jungwon Kim, Seyong Lee, and Jeffrey S. Vetter, IMPACC: A Tightly Integrated MPI+OpenACC Framework Exploiting Shared Memory Parallelism, Proceedings of the ACM Symposium on High-Performance and Distributed Computing (HPDC), 2016.


Accelerator-based heterogeneous computing is gaining momentum in the High-Performance Computing arena. However, the increased complexity of the heterogeneous architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle this problem. While the abstraction provided by OpenACC offers productivity, it raises questions on both functional and performance portability. In this article, we propose HeteroIR, a high-level, architecture-independent intermediate representation, to map high-level programming models, such as OpenACC, to heterogeneous architectures. We present a compiler approach that translates OpenACC programs into HeteroIR and accelerator kernels, in order to obtain OpenACC functional portability. Then, we evaluate the performance portability obtained by OpenACC with our approach on twelve OpenACC programs on NVIDIA CUDA, AMD GCN, and Intel Xeon Phi architectures. We study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.

Read Publication