Nested data structures in array frameworks

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

The need for nested data structures and combinatorial operations on arbitrary-length lists has prevented particle physicists from adopting array-based data analysis frameworks such as R, MATLAB, NumPy, and Pandas. These frameworks work well for purely rectangular tables and hypercubes, but arrays of variable-length arrays, called "jagged arrays," are outside their scope. However, jagged arrays are a fundamental feature of particle physics data, as is the need to combine their elements to search for particle decays. To bridge this gap, we developed the awkward-array library, and in this paper we present feedback from some of the first physics groups using it for their analyses. They report computational performance similar to that of analysis code written in C++, but are split on the ease of use of array syntax. In a series of four phone interviews, all users noted how different array programming is from imperative programming; some found it easier in all respects, whereas others said it was more difficult to write, yet easier to read.
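To make the terms in the abstract concrete, the following is a minimal sketch of a jagged array and a per-event combinatorial operation. It is written in plain Python and is not the awkward-array API. The flat "content plus offsets" layout, the toy data, and the names `to_jagged`, `get_event`, and `pair_sums` are assumptions chosen for illustration only.

```python
from itertools import combinations

# Hypothetical toy data: three "events" holding 3, 0, and 2 values each.
events = [[10.0, 20.0, 30.0], [], [5.0, 15.0]]


def to_jagged(nested):
    """Flatten a list of lists into (content, offsets).

    Sublist i occupies content[offsets[i]:offsets[i + 1]], so the
    variable-length structure is stored in two flat sequences.
    """
    content, offsets = [], [0]
    for sub in nested:
        content.extend(sub)
        offsets.append(len(content))
    return content, offsets


def get_event(content, offsets, i):
    """Return the values that belong to event i."""
    return content[offsets[i]:offsets[i + 1]]


def pair_sums(content, offsets):
    """For each event, sum every unordered pair of its values."""
    result = []
    for i in range(len(offsets) - 1):
        items = get_event(content, offsets, i)
        result.append([a + b for a, b in combinations(items, 2)])
    return result


content, offsets = to_jagged(events)
sums = pair_sums(content, offsets)

print(content)  # [10.0, 20.0, 30.0, 5.0, 15.0]
print(offsets)  # [0, 3, 3, 5]
print(sums)     # [[30.0, 40.0, 50.0], [], [20.0]]
```

The empty second event shows why a rectangular table is a poor fit: each event contributes a different number of values and a different number of pairs. The explicit loop in `pair_sums` is the imperative style that the interviewed users contrasted with array syntax; an array library would express the same pairing as a single operation over the whole jagged array.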

Original language: English (US)
Article number: 012053
Journal: Journal of Physics: Conference Series
Volume: 1525
Issue number: 1
DOIs
State: Published - Jul 7 2020
Event: 19th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT 2019 - Saas-Fee, Switzerland
Duration: Mar 11 2019 to Mar 15 2019

All Science Journal Classification (ASJC) codes

  • General Physics and Astronomy
