Highlights

We present a quantum-classical algorithm to study the dynamics of the two-spatial-site Schwinger model on IBM's quantum computers. Using rotational symmetries,…

Sets and Groups provide useful abstractions for the User to create and manipulate persistent groups of PEs for collective operations. However, existing collective operations expect active sets to…

Reliability, availability and serviceability (RAS) logs of HPC resources, when closely investigated in spatial and temporal dimensions, can provide invaluable information regarding system status,…

High performance computing (HPC) applications are affected by multiple types of errors occurring in HPC systems which hinder with their ability to make forward progress and…

The assessment of application performance is a fundamental task in high-performance computing (HPC). The OpenSHMEM Benchmark (OSB) suite is a collection…

This paper describes how the OpenACC data model is implemented in current OpenACC compilers, ranging from research compilers (OpenUH and OpenARC) to a commercial compiler (the PGI…

EMU is a novel architecture that provides scalable access to a com- mon partitioned global address space (PGAS) through a simple programming interface. The hardware…

Substantial advances in nonvolatile memory (NVM) technologies have motivated widespread integration of NVM into mobile, enterprise, and HPC systems.  Recently, considerable research has…

Computer architecture experts expect that NVM hierarchies will play a more significant role in future systems including mobile, enterprise, and HPC architectures. With this expectation in mind, we…

The frequency of hardware errors in HPC systems continues to grow as system designs evolve toward exascale. Tolerating these errors efficiently and effectively will require software-based…

Heterogeneous computing with accelerators is growing in importance in high performance computing (HPC), deep learning (DL), and other areas. Recently, application datasets have expanded beyond the…

Wavefront loops are widely used in many scientific applications, e.g., partial differential equation (PDE) solvers and sequence alignment tools. However,…

GPU performance of the lattice Boltzmann method (LBM) depends heavily on memory access patterns. When LBM is advanced with GPUs on complex computational…

Reconfigurable architectures like Field Programmable Gate Arrays (FPGAs) have been used for accelerating computations from several domains because of their unique…