Project Status: Active
The objective of this research is to design and evaluate a new distributed data storage paradigm that unifies the traditionally distinct application views of memory- and file-based data storage into a single scalable and resilient data environment. As we move into the exascale computing era, four challenges drive the need for a unified data environment:
- exascale computing systems will contain unprecedented concurrency, both across all nodes of a system and within individual computing nodes due to the emergence of many-core processors and accelerators,
- exascale computing environments will leverage multi-tier memory and storage hierarchies that increase the complexity of managing and efficiently accessing data,
- exascale computing environments will have limited power budgets that require efficient, power-aware data movement, and
- exascale computing systems will experience a nearly constant stream of hardware faults due to the sheer number of components in the systems.
Together, these exascale challenges demand a radically new data environment that has two key qualities: (1) explicit application-level semantics for data sharing and persistence among large collections of concurrent data users, and (2) intelligent management of data placement, movement, and durability within multi-tier memory and storage systems. A unified data environment frees applications from the complexity of directly placing and moving data within multi-tier storage hierarchies, while still meeting application-prescribed requirements for data access performance, efficient data sharing within and between applications (e.g., coupled applications, analytics and co-visualization), and data durability.