Integrating Data Management and Collaborative Sharing with Computational Science Research Processes

Kerstin Kleese van Dam, Mark James, Andrew M. Walker

Source Title: Handbook of Research on Computational Science and Engineering: Theory and Practice

DOI: 10.4018/978-1-61350-116-0.ch021

Abstract

This chapter describes the key principles and components of a good data management system, provides real-world examples of how these can be integrated with scientific research processes to enable successful data sharing, gives an outlook on future developments, and discusses lessons learned. We conclude with a short section on how to get started for those whose interest has been piqued.

Introduction

Scientific research can be characterized by its aim to make descriptive, explanatory and predictive inferences on the basis of observed or simulated information about the real world. Ideally it uses explicit, codified and public methods and rules for its data collection and analysis. Repeatability, reproducibility and transparency are seen as the main pillars of good scientific research (King, 1994). In current computational science research these aims are often difficult to achieve because of its inherent complexities and distributed nature.

Scientists today rarely engage directly with their research object, but do so via digitally captured, reduced, calibrated, analyzed, synthesized and visualized data in combination with computer simulations of the processes of interest. Advances in experimental and computational technologies have led to an exponential growth in the volumes, variety and complexity of this data (Southan, 2009; Goble, 2009), and whilst the data deluge is not found everywhere in an absolute sense, it is seen in a relative sense within most research groups. Many lack the methods, tools and infrastructure to deal effectively with the increasing volumes, complexity and geographical distribution of the relevant data. But it is not data alone that challenges the scientific community. Scientists use a much more varied and extensive array of software products to engage with their data, combined in ever more complex workflows that are executed on very different platforms, at times unknown to the user (grids or clouds). This makes it much more difficult to follow the aims of good scientific research practices in terms of repeatability, reproducibility and transparency.

Leaving the aspirational aspects of scientific investigations aside, research practice has become much more collaborative than it was even a few years ago (Jones, 2008; Guimera, 2005), and few research projects do not rely on the sharing of processes and data amongst different group members or groups to accomplish their scientific goals. The increasing complexity of scientific challenges requires more interdisciplinary and multidisciplinary information and knowledge exchange (Committee on Facilitating Interdisciplinary Research, 2004). Whilst multidisciplinary data sharing is still rare, sharing of key data sets within particular research communities has become more mainstream in a range of scientific domains such as environmental sciences or biology (Field, 2009). This is often facilitated through dedicated data centers and expert data collections. In other fields, and specifically computational sciences, working practices around the sharing of research results have, however, not changed much over the past years. Research publications are still the main sources of information exchange in the wider community. Unfortunately, publications have limitations in conveying comprehensive information on a particular subject: they are restricted in length and thus in detail, and their main purpose is to convey the scientists’ point of view rather than a comprehensive, objective representation of all facts (Shotton, 2009; de Waard, 2006; Kuhn, 1962; Latour, 1987). Publications thus provide at best a very coarse and high-level summary of the research work undertaken by the authors. The associated raw and derived data should be a rich source of supporting information, in particular if coupled with the appropriate metadata and documented scientific workflows, forming a complete research object (DeRoure, 2009).

In recognition of the desire by the research community to have access not only to the summary of a research project, but also to the underpinning data, more publishers today require their authors to share their raw and derived data by depositing it into publicly accessible archives or by providing it on request. However, recent studies have shown (Savage, 2009; Wicherts, 2006) that few authors comply with the journals’ data deposition requirements, and only enforced deposition before publication seems to provide the desired result, indicating a continued reluctance to share in-depth research results with the general research community.
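
The idea of a complete research object mentioned above (raw and derived data coupled with metadata and a documented workflow) can be made concrete with a small sketch. The following Python example is purely illustrative and is not taken from the chapter: the class names (ResearchObject, DataFile, WorkflowStep), the field names and the sample file names are all hypothetical assumptions. It records a checksum for each data item so that shared copies can be verified, and it flags derived data that no documented workflow step accounts for.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DataFile:
    """One raw or derived data item, identified by name and content checksum."""
    name: str
    kind: str      # "raw" or "derived" (hypothetical convention)
    sha256: str


@dataclass
class WorkflowStep:
    """One documented processing step linking input files to output files."""
    tool: str
    inputs: list
    outputs: list


@dataclass
class ResearchObject:
    """Hypothetical bundle of data, metadata and workflow for one study."""
    title: str
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)
    workflow: list = field(default_factory=list)

    def add_file(self, name, kind, content):
        """Register a data item and store the SHA-256 digest of its bytes."""
        digest = hashlib.sha256(content).hexdigest()
        self.files.append(DataFile(name, kind, digest))
        return digest

    def undocumented_derived_files(self):
        """Return derived files that no workflow step lists as an output."""
        produced = {out for step in self.workflow for out in step.outputs}
        return [f.name for f in self.files
                if f.kind == "derived" and f.name not in produced]

    def to_json(self):
        """Serialize the whole object so it can be deposited or shared."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Usage with made-up file names and contents.
ro = ResearchObject(
    title="Example simulation study",
    metadata={"creator": "A. Researcher", "method": "simulation"},
)
ro.add_file("run1.out", "raw", b"energy=-1.0\n")
ro.add_file("summary.csv", "derived", b"run,energy\n1,-1.0\n")
ro.add_file("plot.dat", "derived", b"1 -1.0\n")
ro.workflow.append(
    WorkflowStep(tool="summarise.py", inputs=["run1.out"], outputs=["summary.csv"])
)

print(ro.undocumented_derived_files())  # ['plot.dat']
```

In this sketch the check for undocumented derived files stands in for the transparency concern raised in the text: a derived data set whose origin is not recorded cannot easily be reproduced by others. A real data management system would hold far richer metadata and provenance than this.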

Key Terms in this Chapter

Semantic Technologies: Encoding of meaning separately from data files, content files, and application codes to enable machines as well as people to understand, share and reason with them at execution time.

Data Management: All disciplines related to managing data as a valuable resource. Here we refer in particular to Scientific Data Management: the management of storage, access, usage, lifecycle, content and meaning for scientific data.

High Performance Computing: The use of parallel processing for running advanced application programs efficiently, reliably and quickly on leadership-class computer systems.

Collaborative: The term refers here both to working practices and supporting tools that allow and further the joint working of researchers with common interests that are often in geographically distributed locations.

Data Curation: The preservation and management of scientific data specifically for continuous reuse by identified, dedicated user groups.

Metadata: Data about data.

Data Lifecycle Management: Effective management and exploitation of data from its creation until it is obsolete.
