Particle physics has an ambitious and broad experimental programme for the
coming decades. This programme requires large investments in detector hardware,
either to build new facilities and experiments, or to upgrade existing ones.
Similarly, it requires commensurate investment in the R&D of software to
acquire, manage, process, and analyse the sheer amounts of data to be recorded.
In planning for the HL-LHC in particular, it is critical that all of the
collaborating stakeholders agree on the software goals and priorities, and that
the efforts complement each other. In this spirit, this white paper describes
the R&D activities required to prepare for this software upgrade.

The ATLAS detector at CERN has completed its first full year of recording
collisions at 7 TeV, resulting in billions of events and petabytes of data. At
these scales, physicists must have the capability to read only the data of
interest to their analyses, with the importance of efficient selective access
increasing as data taking continues. ATLAS has developed a sophisticated
event-level metadata infrastructure and supporting I/O framework allowing event
selections by explicit specification, by back navigation, and by selection
queries to a TAG database via an integrated web interface. These systems and
their performance have been reported on elsewhere. The ultimate success of such
a system, however, depends significantly upon the efficiency of selective event
retrieval. Supporting such retrieval can be challenging because ATLAS stores its
event data column-wise in ROOT trees, a choice motivated by compression
considerations, histogramming use cases, and other factors. For
2011 data, ATLAS will utilize new capabilities in ROOT to tune the persistent
storage layout of event data, and to significantly speed up selective event
reading. The new persistent layout strategy and its implications for I/O
performance are described in this paper.
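
To illustrate why the persistent layout matters for selective reading, the
following is a minimal toy sketch and not the ATLAS or ROOT implementation. It
assumes that each column (branch) is split into fixed-size compressed blocks
(baskets), and that fetching one event means decompressing the whole basket
that contains it. The names ToyBranch, read_selected, events_per_basket, and
the "pt" column are hypothetical, zlib merely stands in for whatever
compression is actually used, and no caching of baskets is modelled.

```python
import struct
import zlib


class ToyBranch:
    """One column of doubles, stored as compressed fixed-size baskets (toy model)."""

    def __init__(self, values, events_per_basket):
        self.events_per_basket = events_per_basket
        self.baskets = []
        for start in range(0, len(values), events_per_basket):
            chunk = values[start:start + events_per_basket]
            raw = struct.pack(f"<{len(chunk)}d", *chunk)
            self.baskets.append(zlib.compress(raw))
        # Total uncompressed bytes produced while serving reads.
        self.bytes_decompressed = 0

    def read(self, event_index):
        """Return the value for one event by decompressing its whole basket."""
        basket_index, offset = divmod(event_index, self.events_per_basket)
        raw = zlib.decompress(self.baskets[basket_index])
        self.bytes_decompressed += len(raw)
        # Each double occupies 8 bytes in the packed basket.
        return struct.unpack_from("<d", raw, offset * 8)[0]


def read_selected(branches, selected_events):
    """Read only the selected events, one dict of column values per event."""
    return [
        {name: branch.read(index) for name, branch in branches.items()}
        for index in selected_events
    ]


if __name__ == "__main__":
    values = [i * 0.5 for i in range(1000)]
    selection = [3, 500, 999]  # a sparse selection, e.g. from a metadata query
    for events_per_basket in (100, 10):
        branches = {"pt": ToyBranch(values, events_per_basket)}
        read_selected(branches, selection)
        print(events_per_basket, branches["pt"].bytes_decompressed)
```

In this toy model, the three selected events cost 2400 decompressed bytes with
100 events per basket and 240 bytes with 10 events per basket. Smaller baskets
reduce the work wasted on unselected events, although in practice they
typically compress less well, which is the kind of trade-off a layout tuning
has to balance.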