TY - JOUR
T1 - The Multimodal Universe
T2 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024
AU - The Multimodal Universe Collaboration
AU - Angeloudi, Eirini
AU - Audenaert, Jeroen
AU - Bowles, Micah
AU - Boyd, Benjamin M.
AU - Chemaly, David
AU - Cherinka, Brian
AU - Ciucă, Ioana
AU - Cranmer, Miles
AU - Do, Aaron
AU - Grayling, Matthew
AU - Hayes, Erin E.
AU - Hehir, Tom
AU - Ho, Shirley
AU - Huertas-Company, Marc
AU - Iyer, Kartheik G.
AU - Jablonska, Maja
AU - Lanusse, Francois
AU - Leung, Henry W.
AU - Mandel, Kaisey
AU - Martínez-Galarza, Juan Rafael
AU - Melchior, Peter
AU - Meyer, Lucas
AU - Parker, Liam H.
AU - Qu, Helen
AU - Shen, Jeff
AU - Smith, Michael J.
AU - Stone, Connor
AU - Walmsley, Mike
AU - Wu, John F.
N1 - Publisher Copyright: © 2024 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2024
Y1 - 2024
N2 - We present the Multimodal Universe, a large-scale multimodal dataset of scientific astronomical data, compiled specifically to facilitate machine learning research. Overall, the Multimodal Universe contains hundreds of millions of astronomical observations, constituting 100 TB of multi-channel and hyper-spectral images, spectra, and multivariate time series, as well as a wide variety of associated scientific measurements and “metadata”. In addition, we include a range of benchmark tasks representative of standard practices for machine learning methods in astrophysics. This massive dataset will enable the development of large multimodal models specifically targeted towards scientific applications. All code used to compile the Multimodal Universe and a description of how to access the data are available at https://github.com/MultimodalUniverse/MultimodalUniverse.
UR - http://www.scopus.com/inward/record.url?scp=105000557459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105000557459&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:105000557459
SN - 1049-5258
VL - 37
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 9 December 2024 through 15 December 2024
ER -