Highlight

Processing Full-Scale Square Kilometre Array Data on the Summit Supercomputer

An artist rendering of the SKA’s low-frequency, cone-shaped antennas in Western Australia. Credit: SKA Project Office.

Achievement

The first-ever end-to-end workflow for processing Square Kilometre Array (SKA) data was composed and verified on the Summit supercomputer.
 

Significance and Impact

For the first time, radio astronomy data were generated and processed at 130 PFLOPS peak and 247 GB/s. The results are being used to reveal critical design factors for the next-generation radio telescopes and processing facilities.

Research Details

  • Designed and developed core components of an end-to-end data processing workflow for SKA, including the I/O sub-system using ADIOS.
  • Executed a typical full-scale SKA observation on Summit, using 4,560 nodes and 27,360 GPUs.
  • Verified the capability for processing a typical SKA observation in real time.
  • Achieved 925 GB/s pure I/O throughput for storing table-based radio astronomy data.
     

Citation and DOI

R. Wang, et al., "Processing Full-Scale Square Kilometre Array Data on the Summit Supercomputer," in 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA, US, 2020, pp. 11-22. doi: 10.1109/SC41405.2020.00006
 

Overview

This work presents a workflow for simulating and processing the full-scale low-frequency telescope data of the Square Kilometre Array (SKA) Phase 1. The SKA project will enter the construction phase soon, and once completed, it will be the world’s largest radio telescope and one of the world’s largest data generators. The authors used Summit to mimic an end-to-end SKA workflow, simulating a dataset of a typical 6-hour observation and then processing that dataset with an imaging pipeline. This workflow was deployed and run on 4,560 compute nodes, and used 27,360 GPUs to generate 2.6 PB of data. This was the first time that radio astronomical data were processed at this scale. Results show that the workflow has the capability to process one of the key SKA science cases, an Epoch of Reionization observation. This analysis also helps reveal critical design factors for the next-generation radio telescopes and the required dedicated processing facilities.
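
The figures quoted in this highlight can be cross-checked with a little arithmetic. The Python sketch below is a back-of-the-envelope check, not the authors' analysis. It assumes decimal units (1 PB = 1,000,000 GB) and treats the quoted 247 GB/s as a sustained rate over the whole 2.6 PB dataset, which the highlight does not state explicitly. The function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted in this highlight.
# Assumptions (not from the source): decimal units, and the quoted
# rates are sustained over the whole dataset.

GB_PER_PB = 1_000_000        # decimal petabyte, an assumption
SECONDS_PER_HOUR = 3600

NODES = 4_560                # compute nodes used on Summit
GPUS = 27_360                # GPUs used on Summit
DATA_PB = 2.6                # size of the simulated dataset
OBSERVATION_HOURS = 6        # length of the simulated observation
WORKFLOW_RATE_GB_S = 247     # quoted workflow data rate
PURE_IO_RATE_GB_S = 925      # quoted pure I/O throughput


def transfer_hours(data_pb: float, rate_gb_per_s: float) -> float:
    """Hours needed to move data_pb petabytes at rate_gb_per_s."""
    return data_pb * GB_PER_PB / rate_gb_per_s / SECONDS_PER_HOUR


def required_rate_gb_per_s(data_pb: float, hours: float) -> float:
    """Average rate (GB/s) needed to move data_pb petabytes in `hours`."""
    return data_pb * GB_PER_PB / (hours * SECONDS_PER_HOUR)


gpus_per_node = GPUS // NODES
hours_at_workflow_rate = transfer_hours(DATA_PB, WORKFLOW_RATE_GB_S)
hours_at_pure_io_rate = transfer_hours(DATA_PB, PURE_IO_RATE_GB_S)
real_time_rate = required_rate_gb_per_s(DATA_PB, OBSERVATION_HOURS)

if __name__ == "__main__":
    print(f"GPUs per node: {gpus_per_node}")
    print(f"Hours to move {DATA_PB} PB at {WORKFLOW_RATE_GB_S} GB/s: "
          f"{hours_at_workflow_rate:.2f}")
    print(f"Hours to move {DATA_PB} PB at {PURE_IO_RATE_GB_S} GB/s: "
          f"{hours_at_pure_io_rate:.2f}")
    print(f"Average rate to keep pace with a {OBSERVATION_HOURS}-hour "
          f"observation: {real_time_rate:.1f} GB/s")
```

Under these assumptions, the node and GPU counts work out to six GPUs per node. Moving 2.6 PB at 247 GB/s would take roughly 2.9 hours, well under the 6-hour observation, and keeping pace with the observation would need an average of only about 120 GB/s.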

The team ran Summit simulations because researchers cannot collect enough observational data to practice analyzing the SKA’s output. To study elusive radio light waves emanating from galaxies, the surroundings of black holes, and other objects of interest in outer space, the SKA will employ more than 130,000 low-frequency, cone-shaped antennas located in Western Australia and about 200 higher-frequency, dish-shaped antennas located in South Africa. To emulate the Western Australia portion, the researchers ran two models on Summit, one of the antenna array and one of the early universe, through a software simulator designed by scientists from the University of Oxford that mimics the SKA’s data collection. The team also used an optimized ADIOS-based I/O sub-system, developed at ORNL, to tackle the data I/O challenges, along with a dedicated workflow management system developed at the International Centre for Radio Astronomy Research.
 

Last Updated: January 17, 2021 - 3:57 pm