Towards Next Generation Provenance Systems for e-Science

Fakhri Alam Khan, Sardar Hussain, Ivan Janciak, Peter Brezany

Source Title: International Journal of Information System Modeling and Design (IJISMD) 2(3)

DOI: 10.4018/jismd.2011070102

Abstract

e-Science helps scientists automate scientific discovery processes and experiments, and promotes collaboration across organizational boundaries and disciplines. These experiments involve data discovery, knowledge discovery, integration, linking, and analysis through different software tools and activities. A scientific workflow is one technique through which such activities and processes can be interlinked, automated, and ultimately shared among the collaborating scientists. Workflows are realized by a workflow enactment engine, which interprets the process definition and interacts with the workflow participants. Since workflows are typically executed on a shared and distributed infrastructure, information on the workflow activities, the data processed, and the results generated (also known as provenance) needs to be recorded so that workflows can be reproduced and reused. A range of solutions and techniques have been suggested for the provenance of data collection and analysis; however, these predominantly depend on a particular workflow enactment engine and domain. This paper presents a taxonomy of existing provenance techniques and a novel solution named VePS (the Vienna e-Science Provenance System) for e-Science provenance collection.

Introduction

The main theme of e-Science (Schroeder, 2008) is to promote collaboration among researchers across organizational boundaries and disciplines, reducing coupling and dependencies and encouraging modular, distributed, and independent systems. This has resulted in dry-lab experiments, also known as in-silico experiments (Cavalcanti et al., 2005). Unlike wet-lab experiments, dry-lab experiments enable a researcher to plan an experiment, locate suitable activities via resource directories, combine them into a workflow, and execute it. e-Science workflows (Taylor et al., 2006) are used to specify the execution order of tasks (i.e. activities). A task may take data input, process it, and produce data output. Real-world workflows are complex and may contain several hundred activities. Scientists need their experimental activities to be recorded so that they are reusable and reproducible, similar to the annotations and logbooks used in wet-lab experiments. Workflow provenance (Khan et al., 2008) describes the service invocations made during a workflow's execution, together with information about the services, the input data, and the data produced, to help keep track of workflow activities (Simmhan et al., 2005). It not only gives insight into workflows but also enables their re-execution. Provenance of workflows includes information about the underlying infrastructure, the inputs and outputs of workflow activities, their transformations, and the context used. e-Science workflows are typically executed on a distributed and dynamic infrastructure provided by different institutions, i.e. resources may join and leave continuously. Therefore, provenance, metadata, and annotations of workflows are of paramount importance for reliable and trustworthy e-Science workflows.
There is a strong need to propose and build a provenance system that is in line with the e-Science core theme of modularity and decoupling, which ultimately means a domain- and application-independent provenance system. Key requirements for e-Science provenance systems are interoperability, domain independence, a lightweight design, visualization, and report generation. Interoperability means that an e-Science provenance system should readily work across different domains, applications, and workflow enactment engines.
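
The kind of information listed above (activities, their inputs and outputs, and the infrastructure they ran on) can be pictured as a small engine-neutral record. The following is only an illustrative sketch, not the VePS data model: the class names `ActivityRecord` and `WorkflowProvenance`, their fields, and the JSON serialization are all assumptions made for this example.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass
class ActivityRecord:
    """One workflow activity (service invocation). Hypothetical structure."""
    activity: str   # name of the invoked service or task
    inputs: dict    # input data, or references to it
    outputs: dict   # data produced by the activity
    host: str       # infrastructure resource the activity ran on
    started: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class WorkflowProvenance:
    """Ordered activity records for a single workflow run."""
    workflow_id: str
    records: list = field(default_factory=list)

    def record(self, activity, inputs, outputs, host):
        entry = ActivityRecord(activity, inputs, outputs, host)
        self.records.append(entry)
        return entry

    def to_json(self):
        # Plain JSON keeps the log readable outside any particular engine.
        return json.dumps(asdict(self), indent=2)

    def replay_plan(self):
        # Activities with their recorded inputs, in execution order:
        # the minimum needed to attempt a re-execution.
        return [(r.activity, r.inputs) for r in self.records]


prov = WorkflowProvenance("run-1")
prov.record("fetch", {"source": "catalogue-a"}, {"rows": 10}, "node-a")
prov.record("clean", {"rows": 10}, {"rows": 8}, "node-b")
print(prov.to_json())
```

Because the record says nothing about which enactment engine produced it, the same structure could in principle be filled in from any engine, which is the sense of interoperability described above.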

However, existing research and development work focuses mainly on provenance collection that is tightly coupled with workflow enactment engines and often specific to individual projects. With growing e-Science infrastructures, there is a strong need for a provenance system that works across multiple domains and enactment engines. We call such a system a loosely coupled provenance system. Portability is not the only important issue to address; the performance impact of provenance collection on the overall infrastructure matters as well. Provenance collection is an additional task on top of the core computational processing in e-Science workflows, so it should be lightweight.
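
One way to picture loose coupling is a collector that never sees an engine's native message format, only a neutral event produced by a small per-engine adapter. The sketch below is a generic adapter pattern under stated assumptions, not the VePS architecture: `ProvenanceCollector`, the two adapters, and the message fields of the two imaginary engines are all hypothetical. Buffering events and storing them in batches is one simple way to keep per-activity overhead small.

```python
from typing import Callable, Dict, List


class ProvenanceCollector:
    """Engine-agnostic collector: it only handles neutral event dicts."""

    def __init__(self, batch_size: int = 100):
        self._adapters: Dict[str, Callable[[dict], dict]] = {}
        self._buffer: List[dict] = []
        self.stored: List[dict] = []  # stand-in for a real provenance store
        self.batch_size = batch_size

    def register_adapter(self, engine: str, adapter: Callable[[dict], dict]):
        # Each adapter maps one engine's native message to the neutral form.
        self._adapters[engine] = adapter

    def observe(self, engine: str, message: dict):
        event = self._adapters[engine](message)
        event["engine"] = engine
        self._buffer.append(event)
        # Buffering keeps per-activity work small; storage happens in batches.
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        self.stored.extend(self._buffer)
        self._buffer.clear()


# Hypothetical native message formats for two imaginary engines.
def engine_a_adapter(msg: dict) -> dict:
    return {"activity": msg["task"], "inputs": msg["in"], "outputs": msg["out"]}


def engine_b_adapter(msg: dict) -> dict:
    return {
        "activity": msg["service"],
        "inputs": msg["request"],
        "outputs": msg["response"],
    }


collector = ProvenanceCollector(batch_size=2)
collector.register_adapter("engine_a", engine_a_adapter)
collector.register_adapter("engine_b", engine_b_adapter)
collector.observe("engine_a", {"task": "align", "in": ["s1"], "out": ["a1"]})
collector.observe("engine_b", {"service": "plot", "request": ["a1"], "response": ["p1"]})
print(len(collector.stored))
```

Supporting a further engine in this sketch means writing one more adapter; the collector and the stored event format stay unchanged.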

The major contribution of this paper is twofold. First, the various ways and scenarios through which provenance can be collected are discussed, and a taxonomy of existing work according to those scenarios is elaborated, based on the coupling of the provenance system to a concrete workflow enactment engine. Second, the Vienna e-Science Provenance System (VePS), which focuses on workflow enactment engine independence, domain independence, portability, and low performance overhead, is introduced together with its design, architecture, and the performance evaluation of our prototype implementation.

The rest of the paper is organized as follows. First, the concepts and terminology used in our approach are introduced, and then the taxonomy of existing solutions for a provenance system is discussed. An introduction to the VePS architecture, design, and implementation follows. Next, we detail the performance evaluation and share our experiences and observed issues. Finally, we conclude our work and outline future development directions.
