Event

High-performance and Intelligent Scientific Data Management for Exascale Computing

Oak Ridge, TN

June 28, 2018

Lipeng Wan

Scientific Data Group

Thursday, June 28, 2018
Building 4100, C301
11:00 a.m. – 12:00 p.m.

 

Abstract:
As the era of exascale computing approaches, massive volumes of data are being collected from scientific instruments or generated by computational simulations. Managing these data is becoming an increasingly difficult task for scientific applications running on high-performance computing (HPC) systems. First, although the Input/Output (I/O) bandwidth of HPC systems has increased significantly during the past decade, the ratio of compute capability to I/O bandwidth continues to grow, which means the amount of data that HPC systems can generate is still increasing faster than I/O bandwidth. Second, because the limited I/O bandwidth is shared by many applications, each with different I/O characteristics, I/O contention is already common on current HPC systems and might become even worse in exascale computing environments. Therefore, to mitigate the impact of I/O contention and fully utilize the available I/O bandwidth, we need to develop high-performance and intelligent scientific data management systems for applications running at exascale.

In this talk, I will introduce my recent and ongoing research, which focuses mainly on making scientific data management more efficient and intelligent. First, I will present results of I/O performance studies I conducted on two of DOE's leadership-class supercomputers (Titan and Cori) and demonstrate that the I/O performance of jobs running on these systems can be highly variable due to intra- and inter-job I/O contention. Second, I will discuss how to model and predict this highly variable I/O performance on Titan and Cori by leveraging machine learning techniques. Finally, I will talk about my work on building a new metadata handling utility for the Adaptable Input/Output System (ADIOS), one of the ECP projects.

Biography

Lipeng Wan is a postdoctoral research associate in the Scientific Data Group, Computer Science and Mathematics Division, at Oak Ridge National Laboratory. He received his Ph.D. in computer science from the University of Tennessee, Knoxville, in 2016. His research interests include scientific data management, high-performance computing, parallel file systems, and I/O.