Crowdsourced Knowledge Acquisition: Towards Hybrid-Genre Workflows

Marta Sabou (Department of New Media Technology, MODUL University Vienna, Vienna, Austria), Arno Scharl (Department of New Media Technology, MODUL University Vienna, Vienna, Austria) and Michael Föls (Research Institute for Computational Methods, Vienna University of Economics and Business, Vienna, Austria)
Copyright: © 2013 | Pages: 28
DOI: 10.4018/ijswis.2013070102


Novel social media collaboration platforms, such as games with a purpose and mechanised labour marketplaces, are increasingly used for enlisting large populations of non-experts in crowdsourced knowledge acquisition processes. Climate Quiz uses this paradigm for acquiring environmental domain knowledge from non-experts. The game’s usage statistics and the quality of the produced data show that Climate Quiz attracted a large number of players, but that noisy input data and task complexity led to low player engagement, suboptimal task throughput and suboptimal data quality. To address these limitations, the authors propose embedding the game into a hybrid-genre workflow, which supplements the game with a set of tasks outsourced to micro-workers, thus leveraging the complementary nature of games with a purpose and mechanised labour platforms. Experimental evaluations suggest that such workflows are feasible and have positive effects on the game’s enjoyment level and the quality of its output.

1. Introduction

The difficulty of acquiring, representing and maintaining an intelligent system’s knowledge base has been termed the knowledge acquisition bottleneck in Artificial Intelligence (AI) research (Feigenbaum, 1977). More than 30 years later, this problem continues to affect not only AI but also the Semantic Web, where the goal of building an intelligent layer over the World Wide Web (Berners-Lee et al., 2001) is hampered by the lack of Web-scale knowledge resources, both in terms of domain models (i.e., ontologies) and instance annotations. Recent years have seen a tremendous increase in openly available formal knowledge resources on the Web thanks to the linked data movement (Heath & Bizer, 2011), in particular regarding information on the instance level. Terminological knowledge, however, is still scarce, especially in novel (or less popular) domains.

The social web enables new forms of collaborative knowledge creation that can help overcome the knowledge acquisition bottleneck of the Semantic Web. Social media platforms make it possible to involve large and diverse populations of users in the knowledge acquisition process, in one of the following two ways. A first set of approaches piggybacks on the data created as part of other Web systems to derive useful knowledge assets. For example, folksonomy induction algorithms extract knowledge from folksonomies derived from social tagging systems such as Flickr (Strohmaier et al., 2012). Doan et al. (2011) consider that such approaches use an implicit crowdsourcing strategy to acquire their data. In contrast, a second set of approaches subscribes to an explicit crowdsourcing strategy by building a dedicated application for acquiring knowledge through large-scale social participation. For example, traditional knowledge creation tools have been extended to enable collective knowledge creation, including the Protégé ontology editor (Tudorache et al., 2013) and the GATE linguistic annotation toolkit (Bontcheva et al., 2013). While these extensions primarily support the collaborative and distributed work of knowledge experts, a growing trend is to let large populations of non-experts create knowledge through novel social media collaboration platforms such as games or mechanised labour platforms.

Climate Quiz is an example of such an approach: it is a game with a purpose deployed on Facebook that facilitates the creation of knowledge in the environmental domain by a large population of non-experts (Scharl et al., 2012). Our evaluation of the game, detailed in Section 4, showed that while it has attracted a high number of players, the heterogeneous domain relevance of its input data hampers player engagement, leads to short play times and affects the quality of the output. To overcome these limitations, we propose embedding the game into a hybrid-genre workflow, which splits the complex problem of knowledge acquisition into tasks performed both by players and micro-workers. This workflow exploits the complementary strengths and limitations of games and mechanised labour platforms to improve gaming experience and output data quality. Our experiments show that such workflows are indeed feasible, although future work is needed to fine-tune the synchronisation and task management across the two genres. This paper makes the following contributions:

  • Section 2 presents a survey of knowledge acquisition through crowdsourcing, concluding with recent trends in the field and a comparison of strengths and limitations of different crowdsourcing genres.

  • Climate Quiz Evaluation. Extending the earlier presentation of this game (Scharl et al., 2012), this paper provides an in-depth evaluation of the game, including a comparison with the evaluation details reported for previous games that target (linguistic) knowledge acquisition (Section 4).

  • Implementation and evaluation of hybrid-genre workflows. We propose a novel approach to workflow integration and exemplify its implementation with Climate Quiz. We show experimentally that such hybrid-genre workflows are feasible and that they can improve results compared to single-genre approaches (Section 5).
