• The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuos integration system to schedule jobs on a relatively small Apache Mesos enabled cluster and how this resulted in better resource usage, higher peak performance and lower latency thanks to the dynamic scheduling capabilities of Mesos.
  • The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton University, which provides HPC resources in a university context.
  • Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
  • Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We present the results of our initial investigations into the use of ARM processors for scientific computing applications. In particular we report the results from our work with a current generation ARMv7 development board to explore ARM-specific issues regarding the software development environment, operating system, performance benchmarks and issues for porting High Energy Physics software.
  • We report on our investigations into the viability of the ARM processor and the Intel Xeon Phi co-processor for scientific computing. We describe our experience porting software to these processors and running benchmarks using real physics applications to explore the potential of these processors for production physics processing.
  • IGUANA is a generic interactive visualisation framework based on a C++ component model. It provides powerful user interface and visualisation primitives in a way that is not tied to any particular physics experiment or detector design. The article describes interactive visualisation tools built using IGUANA for the CMS and D0 experiments, as well as generic GEANT4 and GEANT3 applications. It covers features of the graphical user interfaces, 3D and 2D graphics, high-quality vector graphics output for print media, various textual, tabular and hierarchical data views, and integration with the application through control panels, a command line and different multi-threading models.
  • In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about physical locations of files, and the Grid Data Mirroring Package (GDMP) that is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit to be used in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In the paper we present the components and their interaction, our implementation and experience as well as the feedback received from our user communities. We have resolved not only issues regarding integration with other EDG service components but also many of the interoperability issues with components of our partner projects in Europe and the U.S. The paper concludes with the basic lessons learned during this operation. These conclusions provide the motivation for the architecture of the next generation of Data Management Services that will be deployed in EDG during 2003.