Particle physics has an ambitious and broad experimental programme for the
coming decades. This programme requires large investments in detector hardware,
either to build new facilities and experiments, or to upgrade existing ones.
Similarly, it requires commensurate investment in the R&D of software to
acquire, manage, process, and analyse the sheer amounts of data to be recorded.
In planning for the HL-LHC in particular, it is critical that all of the
collaborating stakeholders agree on the software goals and priorities, and that
the efforts complement each other. In this spirit, this white paper describes
the R&D activities required to prepare for this software upgrade.

The primary motivation for the uptake of virtualization has been resource
isolation, capacity management and resource customization, allowing resource
providers to consolidate their resources in virtual machines. Various
approaches have been taken to integrate virtualization into scientific Grids,
especially in the arena of High Performance Computing (HPC), to run grid jobs in
virtual machines, thus enabling better provisioning of the underlying resources
and customization of the execution environment at runtime. Despite the gains,
the virtualization layer also incurs a performance penalty, and it is not well
understood how such an overhead will impact the performance of systems
where jobs are scheduled with tight deadlines. This overhead varies with the
type of workload (memory intensive, CPU intensive or network I/O bound)
and could lead to unpredictable deadline estimation for the running
jobs in the system. In our study, we have attempted to tackle this problem by
developing an intelligent scheduling technique for virtual machines which
monitors the workload types and deadlines, and calculates the system overhead
in real time to maximize the number of jobs finishing within their agreed

Virtualization technology has enabled applications to be decoupled from the
underlying hardware, providing the benefits of portability, better control over
the execution environment, and isolation. It has been widely adopted in scientific
grids and commercial clouds. Despite its benefits, virtualization incurs
a performance penalty, which could be significant for systems dealing with
uncertainty, such as High Performance Computing (HPC) applications where jobs
have tight deadlines and depend on other jobs before they can run.
The major obstacle lies in bridging the gap between the performance requirements
of a job and the performance offered by the virtualization technology when jobs
are executed in virtual machines. In this paper, we present a novel approach
to optimize job deadlines when run in virtual machines by developing a
deadline-aware algorithm that responds to job execution delays in real time,
and dynamically optimizes jobs to meet their deadline obligations. Our
approaches borrow concepts from both signal processing and statistical
techniques, and their comparative performance results are presented later in
the paper, including the impact on the utilization rate of the hardware resources.
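
To make the idea of responding to execution delays in real time more concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes that the virtualization overhead can be summarized as a single slowdown factor, smoothed with an exponentially weighted moving average (a basic signal-processing filter), and that a job's bare-metal run time is known in advance. The names `Job`, `OverheadEstimator` and `will_meet_deadline`, as well as the smoothing parameter `alpha`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    work_seconds: float      # assumed: estimated run time without virtualization
    deadline_seconds: float  # time remaining until the job's deadline


class OverheadEstimator:
    """Hypothetical smoother: EWMA of the observed slowdown factor."""

    def __init__(self, alpha: float = 0.3, initial: float = 1.0) -> None:
        self.alpha = alpha        # weight given to the newest sample
        self.estimate = initial   # 1.0 means "no overhead observed yet"

    def update(self, observed_slowdown: float) -> float:
        # new estimate = alpha * sample + (1 - alpha) * previous estimate
        self.estimate = (self.alpha * observed_slowdown
                         + (1.0 - self.alpha) * self.estimate)
        return self.estimate


def will_meet_deadline(job: Job, estimator: OverheadEstimator) -> bool:
    # Project the run time inside a virtual machine using the current estimate.
    projected = job.work_seconds * estimator.estimate
    return projected <= job.deadline_seconds


estimator = OverheadEstimator(alpha=0.5)
estimator.update(1.2)  # a job ran 20% slower than on bare metal
estimator.update(1.4)  # a later job ran 40% slower

tight = Job("tight", work_seconds=100.0, deadline_seconds=120.0)
loose = Job("loose", work_seconds=100.0, deadline_seconds=130.0)
print(will_meet_deadline(tight, estimator), will_meet_deadline(loose, estimator))
```

A scheduler built along these lines could use the projection to decide which jobs to reprioritize or terminate early. How the paper's own approaches make that decision is not specified in this abstract.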

The primary motivation for the uptake of virtualization has been resource
isolation, capacity management and resource customization: isolation and
capacity management allow providers to isolate users from the site and control
their resource usage, while customization allows end-users to easily project
the required environment onto a variety of sites. Various approaches have been
taken to integrate virtualization with Grid technologies. In this paper, we
propose an approach that combines virtualization with existing software
infrastructure, such as Pilot Jobs, with minimum change on the part of resource

In recent years, large farms have been built using commodity hardware.
This hardware lacks components for remote and automated administration.
Products that can be retrofitted to these systems are either costly or
inherently insecure. We present a system based on serial ports and simple
machine-controlled relays. We report on experience gained by setting up a
50-machine test environment as well as current work in progress in the area.