We present the CCAMP framework, which lets programmers move seamlessly between directive sets and so execute directive-based code on previously incompatible devices. We introduce two primary translation passes and show that they can generate output code in a different directive context that performs similarly to hand-coded programs. We also provide a commentary on the current status of the devices and compilers commonly used in heterogeneous programming.
Significance and Impact
Heterogeneous systems have become a staple of the HPC environment. Several directive-based solutions, such as OpenMP and OpenACC, have been developed to alleviate the challenges of programming heterogeneous systems, and these standards strive to provide a single portable programming solution across heterogeneous environments. However, in many ways this goal has yet to be realized due to device-specific implementations and different levels of language support across compilers. In this work we aim to analyze and address the different levels of optimization and compatibility between OpenACC and OpenMP programs and device compilers. We introduce the CCAMP framework, built on the OpenARC compiler, which implements language translation between OpenACC and OpenMP, with the goal of exploiting the maturity of different device-specific compilers to maximize performance for a given architecture. We show that, given a device-agnostic OpenMP or OpenACC program, CCAMP can generate code for a specific device-compiler combination, allowing programs written with one directive set to be compiled and executed on otherwise incompatible devices.
- We introduce a novel baseline directive-translation framework, allowing programmers to automatically flow between standards to utilize the maturity of single-standard compilers on different devices.
- We provide a commentary on the current status of the popular OpenACC and OpenMP compilers and their levels of support for the directive-based standards across an array of devices.
- We evaluate the effectiveness of CCAMP’s baseline translation using an array of different heterogeneous ecosystems. We demonstrate how our compiler-translated code can perform similarly to, or even better than, hand-written code, and how CCAMP can allow programmers to execute translated code in ecosystems that may not support the original source language.
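To make the idea of translating between directive sets concrete, the sketch below shows the same loop annotated once with OpenACC and once with OpenMP 4.X+ offloading directives. It is a hand-written illustration of a commonly used correspondence between the two standards, not output produced by CCAMP; the function names `saxpy_acc` and `saxpy_omp` are hypothetical, and the clause mapping shown (`copyin` to `map(to:)`, `copy` to `map(tofrom:)`) is an assumption about what a baseline translation might emit.

```c
#include <assert.h>

/* OpenACC form: offload the loop, copying x to the device and
 * copying y to the device and back. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenMP 4.X+ form of the same loop (illustrative mapping only):
 *   acc parallel loop -> omp target teams distribute parallel for
 *   copyin(...)       -> map(to: ...)
 *   copy(...)         -> map(tofrom: ...) */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Compiled without OpenACC or OpenMP support, both pragmas are ignored and the two functions run the same serial loop, which is a convenient way to check that the two annotated versions compute the same result before comparing them on a device.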
Citation and DOI:
Jacob Lambert, Seyong Lee, Allen D. Malony, and Jeffrey S. Vetter, CCAMP: OpenMP and OpenACC Interoperable Framework, Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar), in conjunction with Euro-Par19, 2019.
To evaluate the effectiveness of CCAMP’s OpenACC to OpenMP 4.X+ baseline translation pass, we ran the hand-coded OpenMP 4.X+ applications in the SPEC Accel benchmark suite without applying any transformations or optimizations, and used the resulting execution times as a baseline against which to compare the execution times of our OpenMP 4.X+ code translated from OpenACC. Figure 1 shows the results of this comparison. Each bar represents the average, across all benchmarks, of the ratio of the translated runtime to the hand-coded runtime. Values below 1 represent cases where the translated code performed better, while values above 1 represent cases where the translation pass needs further improvement to match the hand-coded performance. While the translation pass still has room for improvement on some device and compiler combinations, it performs acceptably well on many of the others.
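The metric plotted in each bar can be sketched as a short helper. This is a minimal illustration that assumes the "average" is an arithmetic mean of per-benchmark ratios (the text does not say whether an arithmetic or geometric mean was used); the function name `mean_runtime_ratio` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mean over benchmarks of (translated runtime / hand-coded runtime).
 * A result below 1 means the translated code was faster on average;
 * a result above 1 means the hand-coded version was faster.
 * Assumption: arithmetic mean of the per-benchmark ratios. */
double mean_runtime_ratio(const double *translated,
                          const double *hand_coded,
                          size_t count)
{
    assert(count > 0);
    double sum = 0.0;
    for (size_t i = 0; i < count; ++i) {
        assert(hand_coded[i] > 0.0);  /* guard against division by zero */
        sum += translated[i] / hand_coded[i];
    }
    return sum / (double)count;
}
```

One such value would be computed per device and compiler combination, with the two arrays holding the measured runtimes of each benchmark in the same order.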
Last Updated: April 6, 2021 - 12:32 pm