Data Sharing in CSCR: Towards In-Depth Long Term Collaboration

Christophe Reffay (Ecole Normale Supérieure de Cachan, France & Ecole Normale Supérieure de Lyon, France), Gregory Dyke (Carnegie Mellon University, USA) and Marie-Laure Betbeder (Université de Franche-Comté, France)
DOI: 10.4018/978-1-4666-0125-3.ch006


In this chapter, the authors show the importance of data in the research process and the potential benefit for communities to share research data. Although most of their references are taken from the fields of Computer Supported Collaborative Learning and Intelligent Tutoring Systems, they claim that their argument applies to any other field studying complex situations that need to be analyzed by different disciplines, methods, and instruments. The authors point out the evolution of scientific publication, especially its openness and the variety of its emerging forms. This leads them to propose corpora as boundary objects for various communities in the scientific sphere. Data release being itself a complex problem, the authors use the Mulce experience to show how sharable data can be built and made available. Once corpora are considered available, they discuss the potential of their reuse for multiple analyses or derivation. They focus on analytic representations and their combination with initial data or complementary analytic representations by presenting a tool named Tatiana. Finally, the authors propose their vision of data sharing in a world where scientists would use social network applications.
Chapter Preview


In the research process, data is crucial and often hard to collect. Researchers spend a lot of time designing studies and collecting, transforming, analyzing, or interpreting data. Once a local research team has analyzed the data and reported it in a publication, the data is often lost and cannot be reused by anyone. This means that other researchers have no access to the original data to deepen their understanding by replicating an analysis or by comparing their own results, obtained on the same data with a slightly different analysis method.

In this chapter, we survey the state of the art in data sharing among research communities and, in particular, report the results of the Mulce project. This project’s main results are the design and creation of a data structure and a corresponding platform to share learning and teaching corpora. These results give the community a way to access, share, analyze and visualize learning and teaching corpora.

This work has been motivated by the lack of impact of research results in the real world of online learning. In CSCL (Computer Supported Collaborative Learning) research, for example, a very wide range of indicators of collaboration have been designed and prototyped in a particular context, but almost none of them are reused in other situations or contexts. We argue in this chapter that our research community should be able to widen the validity of its results by sharing data, tools and analyses performed with these tools.

In their work on the coding and counting analysis methodology, Rourke, Anderson, Garrison, and Archer (2001) pointed out the weakness of our research domain: replicability, reliability and objectivity need to be improved in our work. The main idea of research collaboration is well expressed by Chan et al. (2006) in the following terms:

“There is urgent need of putting together complementary strengths and contexts and combining our insights as rapidly as possible to make a greater impact and further elevate our research quality at the same time. Research generally has had a small voice in national educational outcomes; we can speak louder if we speak together.” (Chan et al., 2006)

Considering e-Research as an efficient way to meet and collaborate, this chapter suggests that e-collaboration could provide emerging communities with tools and virtual places to actually share their data, analyses and results in order to improve their theories, knowledge and tools. Although the focus of our work is on CSCL, we argue that this proposal is not limited to this domain or even to its contributing disciplines, and that the core ideas and benefits of our proposal can be extrapolated to other fields of research.

In the remainder of the chapter, we first examine current trends in scientific publication and the central role played by data in the scientific process. We then highlight the particular problem posed by data collection and replication in learning-related research and examine the state of the art for data sharing within this context. The Mulce proposal for constructing and sharing learning and teaching corpora is presented in detail, followed by the Tatiana framework for creating and re-using analytic representations. We conclude by outlining our vision of data sharing within the learning sciences field and describing how other fields can draw upon our experience to construct data and analysis sharing models of their own.
