Abstract
The increasing complexity and volume of plasma fusion experimental data, coupled with the growing adoption of machine learning in fusion research, necessitate advanced and efficient data management solutions. We propose an open data service for fusion experiments operated by the UKAEA, designed to address the evolving needs of machine-learning-driven fusion research. Our system provides a framework to organize MAST, MAST Upgrade (MAST-U), and Joint European Torus (JET) experimental data in accordance with findability, accessibility, interoperability, and reuse (FAIR) principles, using distributed object storage for scalability and a relational database for efficient metadata indexing. In addition, it offers simplified abstractions through an application programming interface (API), facilitating data access and integration with data analysis and machine learning workflows. A performance evaluation of metrics such as data load time and throughput, across varying numbers of parallel workers, demonstrates that the data pipeline is optimized for efficient machine learning application development. Our solution enhances support for data-driven research and machine learning applications in fusion by laying the groundwork for open, FAIR-compliant fusion data, which enables cross-machine analysis, promotes international collaboration, and potentially accelerates advancements in fusion energy research.
| Original language | English (US) |
|---|---|
| Journal | IEEE Transactions on Plasma Science |
| State | Accepted/In press - 2025 |
All Science Journal Classification (ASJC) codes
- Nuclear and High Energy Physics
- Condensed Matter Physics
Keywords
- Fusion data
- Scientific data management
- Web service application programming interface (API)