Future Technologies

Highlights

Scaling off-chip bandwidth is challenging due to fundamental limitations such as fixed pin count and plateauing signaling rates. Recently, vendors have turned to 2.5D and 3D stacking to closely…

Recently, persistent data structures, like key-value stores (KVSs), which are stored in an HPC system's nonvolatile memory, provide an attractive solution for a number of emerging challenges like…

The slowdown of Moore’s law has caused an escalation in architectural diversity over the last decade, and agile development of domain-specific heterogeneous…

This paper describes how the OpenACC data model is implemented in current OpenACC compilers, ranging from research compilers (OpenUH and OpenARC) to a commercial compiler (the PGI…

EMU is a novel architecture that provides scalable access to a common partitioned global address space (PGAS) through a simple programming interface. The hardware…

Substantial advances in nonvolatile memory (NVM) technologies have motivated widespread integration of NVM into mobile, enterprise, and HPC systems. Recently, considerable research has…

Computer architecture experts expect that NVM hierarchies will play a more significant role in future systems including mobile, enterprise, and HPC architectures. With this expectation in mind, we…

The frequency of hardware errors in HPC systems continues to grow as system designs evolve toward exascale. Tolerating these errors efficiently and effectively will require software-based…

Heterogeneous computing with accelerators is growing in importance in high performance computing (HPC), deep learning (DL), and other areas. Recently, application datasets have expanded beyond the…

Wavefront loops are widely used in many scientific applications, e.g., partial differential equation (PDE) solvers and sequence alignment tools. However,…

GPU performance of the lattice Boltzmann method (LBM) depends heavily on memory access patterns. When LBM is advanced with GPUs on complex computational…

Reconfigurable architectures like Field Programmable Gate Arrays (FPGAs) have been used for accelerating computations from several domains because of their unique…

Memory technologies are under active development. Meanwhile, workloads on contemporary computing systems are increasing rapidly in size and diversity.…

When studying large-scale systems, researchers often face additional complications due to the scarcity of resources. Performing tens of thousands of fault injection…

In this study we have proposed Juggler, a new, dynamic task-based execution scheme for GPGPU applications with data dependences. Different from previous studies, Juggler implements an in-GPU…