An Open Data Service for Supporting Research in Machine Learning on Tokamak Data

Samuel Jackson, Saiful Khan, Nathan Cummings, James Hodson, Shaun De Witt, Stanislas Pamela, Rob Akers, Jeyan Thiyagalingam, A. Kirk, J. Adamek, R. J. Akers, S. Allan, L. Appel, F. Arese Lucini, M. Barnes, T. Barrett, N. Ben Ayed, W. Boeglin, J. Bradley, P. K. Browning, J. Brunner, P. Cahyna, S. Cardnell, M. Carr, F. Casson, M. Cecconello, C. Challis, I. T. Chapman, S. Chapman, J. Chorley, S. Conroy, N. Conway, W. A. Cooper, M. Cox, N. Crocker, B. Crowley, G. Cunningham, A. Danilov, D. Darrow, R. Dendy, D. Dickinson, W. Dorland, B. Dudson, D. Dunai, L. Easy, S. Elmore, M. Evans, T. Farley, N. Fedorczak, A. Field, G. Fishpool, I. Fitzgerald, M. Fox, S. Freethy, L. Garzotti, Y. C. Ghim, K. Gi, K. Gibson, M. Gorelenkova, W. Gracias, C. Gurl, W. Guttenfelder, C. Ham, J. Harrison, D. Harting, E. Havlickova, N. Hawkes, T. Hender, S. Henderson, E. Highcock, J. Hillesheim, B. Hnat, J. Horacek, J. Howard, D. Howell, B. Huang, K. Imada, M. Inomoto, R. Imazawa, O. Jones, K. Kadowaki, S. Kaye, D. Keeling, I. Klimek, M. Kocan, L. Kogan, M. Komm, W. Lai, J. Leddy, H. Leggate, J. Hollocombe, B. Lipschultz, S. Lisgo, Y. Q. Liu, B. Lloyd, B. Lomanowski, V. Lukin, I. Lupelli, G. Maddison, J. Madsen, J. Mailloux, R. Martin, G. McArdle, K. McClements, B. McMillan, A. Meakins, H. Meyer, C. Michael, F. Militello, J. Milnes, A. W. Morris, G. Motojima, D. Muir, G. Naylor, A. Nielsen, M. O'Brien, T. O'Gorman, M. O'Mullane, J. Olsen, J. Omotani, Y. Ono, S. Pamela, L. Pangione, F. Parra, A. Patel, W. Peebles, R. Perez, S. Pinches, L. Piron, M. Price, M. Reinke, P. Ricci, F. Riva, C. Roach, M. Romanelli, D. Ryan, S. Saarelma, A. Saveliev, R. Scannell, A. Schekochihin, S. Sharapov, R. Sharples, V. Shevchenko, K. Shinohara, S. Silburn, J. Simpson, A. Stanier, J. Storrs, H. Summers, Y. Takase, P. Tamain, H. Tanabe, H. Tanaka, K. Tani, D. Taylor, D. Thomas, N. Thomas-Davies, A. Thornton, M. Turnyanskiy, M. Valovic, R. Vann, F. Van Wyk, N. Walkden, T. Watanabe, H. Wilson, M. Wischmeier, T. Yamada, J. Young, S. Zoletnik

Research output: Contribution to journal > Article > peer-review

1 Scopus citation

Abstract

The increasing complexity and volume of plasma fusion experimental data, coupled with the growing adoption of machine learning in fusion research, necessitate advanced and efficient data management solutions. We propose an open data service for fusion experiments operated by the UKAEA, designed to address the evolving needs of machine-learning-driven fusion research. Our system provides a framework to organize MAST, MAST Upgrade (MAST-U), and Joint European Torus (JET) experimental data in accordance with findability, accessibility, interoperability, and reusability (FAIR) principles, using distributed object storage for scalability and a relational database for efficient metadata indexing. In addition, it offers simplified abstractions through an application programming interface (API), facilitating seamless data access and integration with data analysis and machine learning workflows. Performance evaluation of metrics such as data load time and throughput, across varying numbers of parallel workers, demonstrates the data pipeline's optimization for efficient machine learning application development. Our solution significantly enhances support for data-driven research and machine learning applications in fusion by laying the groundwork for open, FAIR-compliant fusion data, which enables cross-machine analysis, prompts international collaboration, and potentially accelerates advancements in fusion energy research.

Original language: English (US)
Journal: IEEE Transactions on Plasma Science
State: Accepted/In press - 2025

All Science Journal Classification (ASJC) codes

  • Nuclear and High Energy Physics
  • Condensed Matter Physics

Keywords

  • Fusion data
  • scientific data management
  • web service application programming interface (API)
