Data federation challenges in remote near-real-time fusion experiment data processing

Jong Choi, Ruonan Wang, R. Michael Churchill, Ralph Kube, Minjun Choi, Jinseop Park, Jeremy Logan, Kshitij Mehta, Greg Eisenhauer, Norbert Podhorszki, Matthew Wolf, C. S. Chang, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

Fusion energy experiments and simulations provide critical information needed to plan future fusion reactors. As next-generation devices like ITER move toward long-pulse experiments, analyses, including AI and ML, must be performed under a wide range of time and computing constraints, from near-real-time analysis, to between-shot analysis, to campaign-wide long-term analysis. However, the data volume, velocity, and variety make such analyses extremely challenging when only local computational resources are used. Researchers need the ability to compose and execute workflows spanning edge resources to large-scale high-performance computing facilities. We present Delta, a system that addresses data analysis challenges in fusion science, including AI/ML, by leveraging the ADIOS I/O library and middleware to support executing science workflows over the wide area network for near-real-time streaming. We discuss the data federation challenges in performing remote workflows, focusing on ongoing research work in (1) managing, reducing, and streaming data to minimize I/O and data movement overheads, (2) decompressing and reorganizing data for analysis, and (3) executing workflows for automated data analysis. We introduce examples of deep-learning-based data analysis for the fusion domain and demonstrate how we use Delta to construct end-to-end workflows for a fusion device in Korea, connecting to a remote DOE facility in the USA. The capability demonstrated by this project is the basis for improving the state of the art for near-real-time data federation amongst remote facilities.

Original language: English (US)
Title of host publication: Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI - 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020, Revised Selected Papers
Editors: Jeffrey Nichols, Arthur ‘Barney’ Maccabe, Suzanne Parete-Koon, Becky Verastegui, Oscar Hernandez, Theresa Ahearn
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 285-299
Number of pages: 15
ISBN (Print): 9783030633929
State: Published - 2021
Event: 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020 - Virtual, Online
Duration: Aug 26 2020 to Aug 28 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1315 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020
City: Virtual, Online
Period: 8/26/20 to 8/28/20

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics

Keywords

  • Data federation
  • Data streams
  • Fusion
  • Remote data analysis
