1. Introduction
The difficulty of acquiring, representing and maintaining an intelligent system’s knowledge base has been termed the knowledge acquisition bottleneck in Artificial Intelligence (AI) research (Feigenbaum, 1977). More than 30 years later, this problem continues to affect not only AI but also the Semantic Web, where the goal of building an intelligent layer over the World Wide Web (Berners-Lee et al., 2001) is hampered by the lack of Web-scale knowledge resources, both in terms of domain models (i.e., ontologies) and instance annotations. Recent years have seen a tremendous increase in openly available formal knowledge resources on the Web thanks to the linked data movement (Heath & Bizer, 2011), in particular regarding instance-level information. Terminological knowledge, however, is still scarce, especially in novel (or less popular) domains.
The social web enables new forms of collaborative knowledge creation that can help overcome the knowledge acquisition bottleneck of the Semantic Web. Social media platforms facilitate involving large and diverse populations of users in the knowledge acquisition process in one of two ways. A first set of approaches piggybacks on the data created as part of other Web systems to derive useful knowledge assets. For example, folksonomy induction algorithms extract knowledge from folksonomies derived from social tagging systems such as Flickr (Strohmaier et al., 2012). Doan et al. (2011) consider that such approaches use an implicit crowdsourcing strategy to acquire their data. In contrast, a second set of approaches subscribes to an explicit crowdsourcing strategy by building a dedicated application for acquiring knowledge through large-scale social participation. For example, traditional knowledge creation tools have been extended to enable collective knowledge creation, including the Protégé ontology editor (Tudorache et al., 2013) and the GATE linguistic annotation toolkit (Bontcheva et al., 2013). While these extensions primarily support the collaborative and distributed work of knowledge experts, a growing trend is to allow large populations of non-experts to create knowledge through novel social media collaboration platforms such as games or mechanised labour platforms.
Climate Quiz (apps.facebook.com/climate-quiz) is an example of such an approach: it is a game with a purpose deployed on Facebook that facilitates the creation of knowledge in the environmental domain by a large population of non-experts (Scharl et al., 2012). Our evaluation of the game, detailed in Section 4, showed that, while it has attracted a high number of players, the heterogeneous domain relevance of its input data hampers player engagement, leads to short play times and affects the quality of the output. To overcome these limitations, we propose embedding the game into a hybrid-genre workflow, which splits the complex problem of knowledge acquisition into tasks performed both by players and by micro-workers. This workflow exploits the complementary strengths and weaknesses of games and mechanised labour platforms to improve the gaming experience and the quality of the output data. Our experiments show that such workflows are indeed feasible, although the synchronisation and task management across the two genres remain to be fine-tuned in future work. This paper makes the following contributions:
- Section 2 presents a survey of knowledge acquisition through crowdsourcing, concluding with recent trends in the field and a comparison of the strengths and limitations of different crowdsourcing genres.
- Climate Quiz Evaluation. As an extension of the earlier presentation of this game in Scharl et al. (2012), this paper provides an in-depth evaluation of the game, including a comparison with the evaluation details of previous games that target (linguistic) knowledge acquisition (Section 4).
- Implementation and evaluation of hybrid-genre workflows. We propose a novel approach to workflow integration and exemplify its implementation with Climate Quiz. We show experimentally that such hybrid-genre workflows are feasible and that they can improve results compared to single-genre approaches (Section 5).