Big Data Analyses in the Community of Inquiry and Educational Research Spheres

DOI: 10.4018/978-1-5225-5161-4.ch003


Chapter 3 builds on the previous chapters and provides a summary of big data-style research within the Community of Inquiry scholarly literature, as well as examples from educational research broadly. This chapter also connects to the broader topics of machine learning, data analytics, learning analytics, and educational data mining. Constructs from the Community of Inquiry are integrated into this synthesis and overview. Unfortunately, only a fraction of the studies in educational research today exhibit the tell-tale signs of big data: data volume and variety, and new environments or instrumented sources of larger data, often with emerging tools and platforms critical to the analysis of the resulting datasets. A list of additional readings is provided.
Chapter Preview


The 2003 book Moneyball: The Art of Winning an Unfair Game and its 2011 film adaptation (De Luca, Horovitz, Pitt, & Miller, 2011) tell the story of the early use of analytics to assist the Oakland Athletics baseball team. The team’s general manager at that time, Billy Beane, hoped to compete in a league where his team had fewer financial resources than the competing franchises. Beane believed that the data frequently used to score players (stolen bases, runs batted in) were left over from earlier years of the game and offered less insight into the modern game. The team’s successes and subsequent story fueled a revolution in the way that baseball teams are constructed and managed, based on big data and analytics.

Is the field of educational research well positioned for the current era of big data, machine learning, and artificial intelligence? One can hardly pick up a press report, scholarly review, or research journal without encountering these themes. Indeed, AI has become the dominant future orientation of major tech giants such as Google, Apple, Microsoft, and Amazon. Large-scale analyses of data streams such as social media use or cellphone location data regularly feed models used not only by business but also by government and non-profit customers. Emerging start-ups offer platforms to analyze streams of on- and off-line activity and media (e.g., Matroid), and affective computing is finding commercialization in companies such as Affectiva (and many others).

Where does this leave us as educational researchers? Are our research environments producing sets or streams of data for such large-scale and modern analysis techniques? Are our classrooms and experimental spaces sufficiently instrumented? Have “wearables,” natural language processing, screen-based learning and affect detection brought us to exponentially new levels of data collection, analysis, and insight? As with so many questions of this sort, the answer will almost certainly be represented in the William Gibson quote: yes, the future is here, but it is very unevenly distributed.

Any attempt to define “big data” in today’s age is a fool’s errand. While many have used, and later adapted, the 2001 “3 Vs” definition (volume, variety, velocity) attributed to Laney (2012), more recent developments in enterprise-scale data capture, distributed and cloud storage, and data-intensive analysis and machine learning have strained that original construct. A 2014 attempt to collect definitions of the “big data” phrase and concept by the School of Information at Berkeley (datascience@berkeley, 2014) yielded more than 100 viable alternatives, with focuses varying from practical measures of data size to broader definitions encompassing the wealth of new analysis directions that are now possible in a data-rich age.

For the purposes of this work, a practical definition is, however, required. One appealing and focused definition is that of Manyika et al. (2011): “Big Data describes data that is fundamentally too big and moves too fast, thus exceeding the processing capacity of conventional database systems. It also covers innovative techniques and technologies to capture, store, distribute, manage and analyze larger sized data sets with diverse structures.”

The work of Prinsloo, Archer, Barnes, Chetty, & Van Zyl (2015) provides a similar working definition, relaying that big data has the following features:

...huge volume, consisting of terabytes or petabytes of data; high velocity, being created in or near real time; extensive variety, both structured and unstructured; exhaustive in scope, striving to capture entire populations of systems; fine-grained resolution, aiming at maximum detail, while being indexical in identification; relational with common fields that enable the conjoining of different data-sets; flexible, with traits of extensionality (easily adding new fields) and scalability (the potential to expand rapidly). (p. 2)
