Simulated Events Production on the Grid for the BaBar Experiment

Daniele Andreotti (Ferrara University, Italy), Armando Fella (CNAF Bologna Centre, Italy) and Eleonora Luppi (Ferrara University, Italy)
DOI: 10.4018/978-1-60566-184-1.ch022

Abstract

The BaBar experiment has used data collected since 1999 to examine the violation of charge and parity (CP) symmetry in the field of high energy physics. Event simulation for the experiment is a compute-intensive task due to the complexity of the Monte Carlo simulation implemented on the GEANT engine. The data needed as input for the simulation, stored in the ROOT format, fall into two categories: conditions data, which describe the detector status when data are recorded, and background trigger data, which provide the noise signal necessary to obtain a realistic simulation. In this chapter, the grid approach is applied to the BaBar production framework using the INFN-GRID network.

Background

The BaBar experiment (Cowan, 2007), developed at SLAC (Stanford Linear Accelerator Center), Stanford University, studies the violation of charge and parity (CP) symmetry, a well-known topic in high energy physics. In the composition of the universe the difference between matter and anti-matter is subtle, and the experiment is geared towards understanding why matter prevails over anti-matter. High-energy electrons and positrons collide 250 million times per second to create rare B mesons and anti-B mesons. Such events are recorded for further analysis.

High-speed electronics record these events, each of which requires about 30 kB of storage. Events are reconstructed from raw data and then separated ("skimmed") into approximately 200 data streams according to their physics properties. These data streams are made available as datasets for analysis and are used by 600 researchers based at 75 institutes in 10 countries. Skimming increases storage requirements, because an event may be duplicated in several streams, but each data stream can be analyzed more quickly. To date the BaBar experiment has accumulated an integrated luminosity of about 525 fb⁻¹.

Another important task, called Simulation Production (SP), simulates the experiment: events are generated with the Monte Carlo method and then reconstructed, so that real data can be compared with the theoretical model. Accurate simulation requires fast reprocessing of data so that a large number of simulated events can be distributed for analysis.

All information concerning the detector, such as calibrations and efficiencies, represents its status during data acquisition and is called condition data. This information is mandatory for describing the real state of the system during the generation of simulated events. Along with condition data, the other important input is the background triggers component, the noise recorded when data are taken, which is needed for a realistic reconstruction of simulated events. At least three times as many simulated events are needed as data events. With the traditional production system, each simulated event takes 4 seconds on a modern processor and results in 20 kB of storage.
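
As a rough worked example, the figures quoted above (three simulated events per data event, 4 seconds and 20 kB per simulated event) can be turned into a cost estimate. The sketch below is illustrative only: the function name and the sample size of one million data events are assumptions, not values from the chapter, and the factor of three is the stated minimum.

```python
# Figures quoted in the text above.
SIM_EVENTS_PER_DATA_EVENT = 3      # "at least three times as many"
CPU_SECONDS_PER_SIM_EVENT = 4      # on a single processor
STORAGE_KB_PER_SIM_EVENT = 20


def simulation_cost(data_events):
    """Return (simulated events, CPU seconds, storage in kB) for a data sample."""
    sim_events = SIM_EVENTS_PER_DATA_EVENT * data_events
    cpu_seconds = sim_events * CPU_SECONDS_PER_SIM_EVENT
    storage_kb = sim_events * STORAGE_KB_PER_SIM_EVENT
    return sim_events, cpu_seconds, storage_kb


if __name__ == "__main__":
    # Hypothetical sample size, chosen only to make the arithmetic concrete.
    sim_events, cpu_seconds, storage_kb = simulation_cost(1_000_000)
    print(f"simulated events: {sim_events}")
    print(f"CPU time: {cpu_seconds} s ({cpu_seconds / 86_400:.1f} CPU days)")
    print(f"storage: {storage_kb / 1_000_000:.1f} GB")
```

For one million data events this gives 3 million simulated events, 12 million CPU seconds (about 139 days on a single processor) and 60 GB of storage, taking 1 GB as 10^6 kB. Numbers of this size are why the work is spread over many machines.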

To speed up data access for the huge number of events produced, both types of data are stored following the ROOT (Brun and Rademakers, 1997) framework schema. ROOT represents data as objects whose attributes, such as energy, speed and trajectory, can be easily accessed through specific methods by the code in charge of analyzing them.

Key Terms in this Chapter

ROOT: A set of frameworks for data analysis that can be easily extended thanks to its object-oriented design. Data are represented as objects that can be accessed to retrieve all information needed for further computation.

LFC: LCG File Catalog. A high-performance file catalogue that addresses availability and scalability issues and stores both logical and physical file mappings.

BaBar Software: BaBar software is organized in terms of packages. A package is a self-contained piece of software intended to perform a well defined task. Some packages may not be usable on their own, requiring integration with others. A software release consists of a coherent set of packages together with the libraries and binaries created for various machine architectures.

Distributed Computing: The distributed computing paradigm envisages the execution of particular software on two or more computational systems. The software can be developed for pure parallel computation (e.g. data parallelism, task parallelism, instruction-level parallelism) or can be managed by a dependency structure such as a pipeline or farm model. In the BaBar scenario the computational tasks are performed using data parallelism on distributed computer farms (twenty farms in 5 countries) and on the LCG/gLite-based Grid in Italy and the UK.
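
Data parallelism in this sense means cutting a large request into independent chunks that can run on any available farm. A minimal sketch follows, assuming a hypothetical request size and job size; the function `split_into_jobs` and the fields it returns are illustrative and are not part of the BaBar software.

```python
def split_into_jobs(total_events, events_per_job):
    """Split a simulation request into independent jobs (data parallelism).

    Each job covers a disjoint range of events, so jobs can run on
    different machines without communicating with each other.
    """
    if total_events < 0 or events_per_job <= 0:
        raise ValueError("total_events must be >= 0 and events_per_job > 0")

    jobs = []
    first_event = 0
    while first_event < total_events:
        n_events = min(events_per_job, total_events - first_event)
        jobs.append({
            "job_id": len(jobs),
            "first_event": first_event,
            "n_events": n_events,
        })
        first_event += n_events
    return jobs


if __name__ == "__main__":
    # Hypothetical request: 10 events in jobs of at most 4 events each.
    for job in split_into_jobs(10, 4):
        print(job)
```

Because the ranges are disjoint and together cover the whole request, the outputs of the jobs can simply be collected once all of them have finished.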

Monte Carlo Method: A statistical approach suited to problems that are too complicated to solve using analytical methods; it yields increasingly accurate results as the random sampling of the problem domain is repeated. A classical example is the computation of the value of π by generating pairs of random numbers (x, y), each drawn uniformly from the interval [0, 1]. The ratio of the number of pairs that satisfy x² + y² ≤ 1 to the total number of pairs generated is an approximation of π/4. The more random pairs that are generated, the more accurate the approximation is.
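
The π example in this entry can be written in a few lines. This is a minimal sketch using Python's standard `random` module; the function name `estimate_pi` and the fixed seed are assumptions made so that the run is reproducible.

```python
import random


def estimate_pi(n_pairs, seed=0):
    """Estimate pi from n_pairs random points in the unit square."""
    rng = random.Random(seed)  # fixed seed: repeatable results
    inside = 0
    for _ in range(n_pairs):
        x = rng.random()  # uniform in [0, 1)
        y = rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    # inside / n_pairs approximates pi / 4
    return 4.0 * inside / n_pairs


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n))
```

The statistical error of the estimate shrinks roughly as one over the square root of the number of pairs, so each additional digit of accuracy costs about a hundred times more samples.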

BaBar Import/Export Tools: The main tasks the BaBar collaboration carries out at remote sites are event reconstruction, simulation production and physics data analysis. Although each task needs to access a different data type and different database metadata, a single import/export software suite is shared by all the site managers.

Grid Monitoring: The INFN-Grid infrastructure includes several monitoring systems with different purposes and granularity. Monitoring activity can focus on resource status and service availability at each site or across the grid as a whole, and can display useful data aggregated per site, service and VO.
