TY - JOUR
T1 - An Adaptable Seismic Data Format
AU - Krischer, Lion
AU - Smith, James A.
AU - Lei, Wenjie
AU - Lefebvre, Matthieu
AU - Ruan, Youyi
AU - de Andrade, Elliott Sales
AU - Podhorszki, Norbert
AU - Bozdağ, Ebru
AU - Tromp, Jeroen
N1 - Funding Information:
This research was partially supported by the EU-FP7 VERCE project (number 283543) and US NSF grant 1112906. We are grateful for the QUEST Initial Training Network (Marie Curie Actions, http://www.quest-itn.org) and the Computational Infrastructure for Geodynamics (CIG, https://geodynamics.org/) organization for holding a joint workshop that sparked the creation of the ASDF format. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. The authors also recognize support from the NSERC G8 Research Councils Initiative on Multilateral Research Funding and the Discovery Grant No. 487237. Additionally, we thank Chad Trabant and Tim Ahern from the Incorporated Research Institutions for Seismology (IRIS) as well as Emiliano Russo, Peter Danecek and Rodolfo Puglia for fruitful discussions and useful tips. We also thank editor Andrea Morelli and two anonymous reviewers for their thoughtful comments which helped improve the manuscript. Finally, we gratefully acknowledge conversations with HDF5 Director of Earth Science Ted Habermann and help from Mohamad Chaarawi via the HDF5 User's Forum.
Publisher Copyright:
© The Author 2015. Published by Oxford University Press on behalf of The Royal Astronomical Society.
PY - 2016/11/1
Y1 - 2016/11/1
N2 - We present ASDF, the Adaptable Seismic Data Format, a modern and practical data format for all branches of seismology and beyond. The growing volume of freely available data coupled with ever expanding computational power opens avenues to tackle larger and more complex problems. Current bottlenecks include inefficient resource usage and insufficient data organization. Properly scaling a problem requires the resolution of both these challenges, and existing data formats are no longer up to the task. ASDF stores any number of synthetic, processed or unaltered waveforms in a single file. A key improvement compared to existing formats is the inclusion of comprehensive meta information, such as event or station information, in the same file. Additionally, it is also usable for any non-waveform data, for example, cross-correlations, adjoint sources or receiver functions. Last but not least, full provenance information can be stored alongside each item of data, thereby enhancing reproducibility and accountability. Any data set in our proposed format is self-describing and can be readily exchanged with others, facilitating collaboration. The utilization of the HDF5 container format grants efficient and parallel I/O operations, integrated compression algorithms and check sums to guard against data corruption. To not reinvent the wheel and to build upon past developments, we use existing standards like QuakeML, StationXML, W3C PROV and HDF5 wherever feasible. Usability and tool support are crucial for any new format to gain acceptance. We developed mature C/Fortran and Python based APIs coupling ASDF to the widely used SPECFEM3D_GLOBE and ObsPy toolkits.
KW - Computational seismology
KW - Seismic tomography
KW - Time-series analysis
KW - Wave propagation
UR - http://www.scopus.com/inward/record.url?scp=84994718663&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994718663&partnerID=8YFLogxK
U2 - 10.1093/gji/ggw319
DO - 10.1093/gji/ggw319
M3 - Article
AN - SCOPUS:84994718663
SN - 0956-540X
VL - 207
SP - 1003
EP - 1011
JO - Geophysical Journal International
JF - Geophysical Journal International
IS - 2
ER -