Teaching Visualisation in the Age of Big Data: Adopting Old Approaches to Address New Challenges

Belinda A. Chiera (University of South Australia, Australia) and Malgorzata W. Korolkiewicz (University of South Australia, Australia)
Copyright: © 2017 | Pages: 23
DOI: 10.4018/978-1-5225-2512-7.ch005

Abstract

Technological advances have made ever more data available, a phenomenon known as Big Data. The volume of Big Data is of the order of zettabytes, offering the promise of valuable insights, with visualisation the key to unlocking them. However, the size and variety of Big Data pose significant challenges. The fundamental principles behind tried-and-tested methods for visualising data are still as relevant as ever, although the emphasis necessarily shifts to why visualisation is being attempted. This chapter outlines the use of graph semiotics to build data visualisations for exploration and decision-making, and the formulation of elementary-, intermediate- and overall-level analytical questions. The public scanner database Dominick's Finer Foods, consisting of approximately 98 million observations, is used as a demonstrative case study. Common Big Data analytic tools (SAS, R and Python) are used to produce visualisations, and exemplars of student work based on the outlined visualisation approach are presented.

Introduction

Recent technological advances have led to data collection at a rate never seen before, from sources such as climate sensors, transaction records, social media and videos, to name a few. With the advent of Big Data come insights at a previously unseen depth and breadth of detail. Operational decisions are increasingly based on data rather than experience or intuition (McAfee & Brynjolfsson, 2012) and, more broadly, a shift in perspective is under way on the relationship between data and knowledge generation (Ekbia et al., 2015).

Big Data is typically defined in terms of its Variety, Velocity and Volume. Variety refers to expanding the concept of data to include unstructured sources such as text, audio, video or click streams. Velocity is the speed at which data arrive and how frequently data change. Volume is the size of the data, which for Big Data runs to the order of petabytes ($1000^{5}$ bytes) through to zettabytes ($1000^{7}$ bytes).
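To give a sense of the scale involved, the petabyte and zettabyte figures quoted above can be read as powers of 1000 in the decimal (SI) unit ladder. The following is a minimal Python sketch of that arithmetic; the names `SI_UNITS` and `bytes_in` are illustrative and are not taken from the chapter.

```python
# Decimal (SI) byte units: each step up the ladder multiplies by 1000.
# SI_UNITS and bytes_in are hypothetical names used only for illustration.
SI_UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
            "petabyte", "exabyte", "zettabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one SI unit, e.g. petabyte -> 1000**5."""
    power = SI_UNITS.index(unit) + 1  # kilobyte is 1000**1
    return 1000 ** power


petabyte = bytes_in("petabyte")     # 1000**5 bytes
zettabyte = bytes_in("zettabyte")   # 1000**7 bytes

# A zettabyte is 1000**2 petabytes, i.e. one million petabytes.
ratio = zettabyte // petabyte
print(f"1 petabyte  = {petabyte:,} bytes")
print(f"1 zettabyte = {zettabyte:,} bytes")
print(f"1 zettabyte = {ratio:,} petabytes")
```

The range quoted for Volume therefore spans six orders of magnitude.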

Visualisation is a potentially valuable way to make sense of Big Data, to uncover features, trends or patterns to produce actionable analysis and provide deeper insight (SAS Institute Inc., 2014). There is an increased focus on visualisation over formal data analysis, partly due to the proliferation of easy-to-use web-based visualisation tools (e.g., Tableau Online), and partly due to an added emphasis on the power of well-designed visualisations and the demand for a new type of Big Data ‘dataviz’ analyst (McCosker & Wilken, 2014). However, the opaque character of large data sets makes it difficult to describe in a systematic way how to effectively translate Big Data into visual or other kinds of knowledge. Furthermore, a ‘black box’ approach to generating data visualisations puts data analysts at risk of producing ornate and visually pleasing graphics that are otherwise useless.

The use of visualisation as a tool for data exploration and/or decision-making is not a new phenomenon. Data visualisation has long been an important component of data analysis, whether the intent is data exploration or part of a model-building exercise. However, the challenges underlying the visualisation of Big Data are still relatively new; often the choice is between simple graphics that use a simple palette of colours to distinguish information and overly complicated but aesthetically pleasing graphics, which may obfuscate and distort key relationships between variables. Previous work in the literature suggests a tendency to approach Big Data by repeating analytical behaviours typically reserved for smaller, purpose-built data sets (e.g., Gelper, Wilms, & Croux, 2015; Toro-González, McCluskey, & Mittelhammer, 2014; Huang, Fildes, & Soopramanien, 2014). There appears, however, to be less emphasis on the exploration of Big Data itself to formulate questions that drive analysis.

In this chapter, a portable framework is proposed to arm aspiring visual analysts with tools to decide what is and is not useful when visualising Big Data. The target audience is working professionals seeking to expand or formalise their data analytics skills through postgraduate university study. Based on the authors’ teaching experience in this context, introducing a framework for visualising Big Data through real-life case studies is advocated, instead of presenting students with an encyclopaedia of methods and graphical displays.
