Project

Hobbes – OS and Runtime Support for Application Composition

Project Status: Active

This project intends to deliver an operating system and runtime (OS/R) environment for extreme-scale scientific computing. With application composition as the fundamental driving force, we will develop the OS/R interfaces and low-level system services required to support the isolation and sharing needed to design and implement applications, as well as performance and correctness tools. Our approach will also support complex simulation and analysis workflows. A workflow’s components will likely consist of a wide range of parallel codes with different OS/R requirements. For example, a relatively complicated multi-physics workflow might incorporate data from three different types of legacy codes (MPI only, PGAS languages, and MPI with threading) and require components for analytics, visualization, uncertainty quantification, memory profiling, and performance analysis. Instead of a single unified OS/R that supports every conceivable requirement, we propose a lightweight OS/R system with the flexibility to custom-build runtimes for any particular purpose. Each component executes in its own “enclave” with a specialized runtime and isolation properties. A global runtime system provides the software required to compose applications out of a collection of enclaves, join them through secure and low-latency communication, and schedule them to avoid contention and maximize resource utilization. The benefits gained from lightweight and customizable runtimes include predictable and consistent memory and network patterns, manageable resilience properties, and measurable power and energy characteristics. These benefits simplify algorithm design and development at large scale.
Project deliverables are: (1) an OS/R stack based on the Kitten OS and the Palacios virtual machine monitor and (2) high-value, high-risk research that leverages the architecture of the base OS/R to explore issues of specific interest to exascale, e.g., virtualization, analytics, networking, energy/power, scheduling/parallelism, architecture, resilience, programming models, and tools. For more information, please visit http://xstack.sandia.gov/hobbes .

Last Updated: May 28, 2020 - 4:02 pm