An Exploration of the Definition of Data Literacy in the Academic and Public Domains

There is no single agreed-upon definition of data literacy because expectations of what it means to be data literate vary across contexts. The lack of agreement on a definition of data literacy across contexts is therefore to be expected. However, definitions are important. Definitions embody our understanding of concepts and are the foundation for the operationalization of concepts. The work reported in this chapter is motivated by the observation that, despite no shortage of university graduates, organizations are struggling to find data-literate talent. There is an apparent disconnect between data literacy as taught in the academic domain and data literacy as expected by businesses in the public domain. An exploration of definitions of data literacy in the academic and public domains is undertaken to gain insight into why the disconnect exists. A thematic analysis and comparison of definitions in the two domains were conducted. The differences identified provide some broad directions for developing data literacy capabilities in students that better fulfil the needs of business organizations.


INTRODUCTION
The digital transformation of economies worldwide is well underway. Data literacy is one of the most important components of the digital literacy necessary to "realize the full potential of digital infrastructure" and "excel the economic growth…[data literacy] is a skill useful not only for daily use but also for work-related tasks" (Damuri et al., 2022, p. 33). These are the views expressed in the "G20 Toolkit for Measuring Digital Skills and Digital Literacy: Framework and Approach" (Damuri et al., 2022). Similar views are expressed in the UNESCO 2018 Digital Literacy Global Framework and the Digital Literacy Index by the Indonesian Communications and Information Ministry 2020 (Damuri et al., 2022). Despite the acknowledged importance of data literacy, the need for data literacy capabilities in organizations is not being met. Alarmingly, only 21% of more than 9000 employees surveyed by Accenture were confident of their data literacy skills, and "many developing countries are struggling to improve their data literacy skills" (Damuri et al., 2022, p. 33).
Data literacy is imperative for organizations, but organizations are grappling with a widening data literacy gap (Forbes Councils, 2019). In a highly competitive, globalized economy, organizations depend on data for decision-making, and for businesses to take advantage of new business intelligence techniques in machine learning and AI, they must develop a strong data literacy culture (Johnson, 2019). According to Johnson (2019), around 50% of organizations lack the necessary "AI and data skills to achieve business value". The growing urgency to develop data literacy is documented widely in the literature (e.g., Grillenberger & Romeike, 2018; Gummer & Mandinach, 2015; Kjelvik & Schultheis, 2019; Ridsdale et al., 2015; Wolff et al., 2016).
Data literacy is unquestionably important. But what is data literacy? Definitions of data literacy are plentiful, but there is no single agreed-upon definition. Bhargava et al. mentioned "Despite data literacy's growing popularity as a much-needed "bottom-up" solution, data literacy is ill-defined or ambiguous at best" (Bhargava et al., 2015). The authors recognize that understanding of data literacy is necessarily varied. Hence the authors' aim is not to find or present a singular definition of data literacy. Rather, the aim is to explore the various conceptualizations of data literacy in two domains: the academic domain (academic peer-reviewed journals) and the public domain (industry, organizational sites, and non-academic blogs/Wikis etc.). The line of argument that provides a rationale for exploring the definitions of data literacy in the academic and public domains is as follows:
1. Definitions are important because they represent our conceptual understanding of a topic and provide the necessary common language for analysis and discussion (Podsakoff, MacKenzie, & Podsakoff, 2016).
2. One source of talent to help organizations fill the data-literate talent gap must be university graduates (Winterberry, 2018; New Vantage Partners, 2019; Pothier, 2019; Panetta, 2021). But, despite no shortage of graduates, organizations are unable to find the data-literate talent they require.
3. There is therefore an apparent gap between the data literacy expectations/needs of organizations and the data literacy capabilities that graduates develop during their university education (Bersin & Zao-Sanders, 2020; Forbes Councils, 2019).
4. Since definitions are important (because they represent conceptual understanding, which in turn forms a basis for action), an exploration and comparison of data literacy definitions in the academic and public domains provides some insight into why the disconnect between university education and business exists. These insights can provide a launching point for improving educational approaches to bring the data literacy education of graduates closer to the expectations of organisations.
To reveal conceptualizations of what it means to be data literate in academia and in business, we undertook a search of definitions in the academic and public domains. A thematic analysis of the definitions of data literacy that were found was then undertaken. This enabled the comparison of different perspectives on what it means to be data literate in the public and in the academic domain.

METHODOLOGY
A descriptive literature review was undertaken to collect data literacy definitions from academic papers and from a public point of view. A thematic analysis of the definitions was done with the help of NVivo.
To collate data literacy definitions in the academic domain, we searched electronic databases including Google Scholar, ScienceDirect, ResearchGate, Elsevier and the Association of College and Research Libraries (ACRL) from 2000 to 2022. Based on our observation, before 2000 data literacy was not recognized as independent of information literacy. Keywords used for determining related research were a combination of terms including data literacy + definition, competencies, frameworks, and practices. Included in the search were academic journals, peer-reviewed papers, conferences, business reports, white papers, books, and curriculum documents. The reference lists of the most frequently cited articles were examined and used to find additional papers. Papers about other literacies such as digital literacy and statistical literacy appeared in the search results but were excluded since they are focused on mathematics and applications for data analysis. The search yielded 17 highly relevant papers, each of which proposed a definition of data literacy. From each research paper, the following information was recorded in an Excel spreadsheet: author(s), year of publication, country, type of publication, research field, discipline, and data literacy definition.
To collate data literacy definitions in the public domain, we undertook a broad Google search using the keywords data literacy + definition, current state, review, competencies, frameworks, and practices. A Google search gives a good snapshot of the way that data literacy is defined in the public domain. This made it possible to compare academic perspectives on data literacy with the organizational interpretation of data literacy. The first 10 pages of results were investigated, since beyond these there was either repetition or insufficient relevance. Although much literature related to data literacy was found, for the purpose of analyzing definitions of data literacy, we selected only the academic domain and public domain literature which explicitly stated a definition of data literacy. Each definition of data literacy was then analyzed to identify the data literacy skills and competencies present in the definition. This enabled a comparison of the data literacy skills and competencies present in the academic domain and those found in the public domain.
To facilitate thematic analysis of the data literacy definitions, Ridsdale et al.'s (2015) framework was used. The categories in Ridsdale et al.'s (2015) framework were used to classify the constituent skills and competencies present in each definition. It was used as a 'categorizing' framework because it provides one of the most comprehensive lists of data literacy competencies, skills, and abilities across different domains. Ridsdale et al. generated their categorization of data literacy competence by reviewing a wide range of literature, including academic literature, grey literature, and blog posts, to identify data literacy competencies, skills, and abilities, as well as teaching practices for undergraduate students. Hence, by considering different perspectives on data literacy, Ridsdale et al. (2015) identified 23 competencies (Figure 1) and 64 tasks/skills, which provide a suitable reference for categorizing and comparing different data literacy definitions (Grillenberger and Romeike, 2018).
NVivo was utilised to conduct a thematic analysis of the definitions collected from academic and public domain literature. NVivo allows for the thematic categorization of data based on keywords (Dollah et al., 2022). It also helps to distinguish the relationships among the most frequent ideas, which is the goal of the current research. The two groups of data literacy definitions, those from the academic perspective and those from the public domain, were imported separately into NVivo. After codes were extracted from all definitions of data literacy, Ridsdale's (2015) framework was used, and elements of the definitions were categorized under five main headings:
• Introduction to data.
• Data collection.
• Data management.
• Data evaluation.
• Data application.
These categorizations facilitated comparison between the two groups of data literacy definitions.
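The categorization step described above was carried out in NVivo. As a rough illustration only, the same idea can be sketched in Python: each definition is scanned for keywords, and each keyword hit is counted under a category heading. The category names follow the headings and knowledge areas mentioned in this chapter; the keyword lists and the function name `categorize` are hypothetical assumptions made for this sketch and are not taken from Ridsdale et al. (2015) or from the authors' NVivo project.

```python
import re
from collections import Counter

# Hypothetical keyword stems per category (illustrative assumptions only;
# not the codes used in the study and not Ridsdale et al.'s competency list).
CATEGORY_KEYWORDS = {
    "Introduction to data": ["understand", "read", "awareness"],
    "Data collection": ["find", "collect", "identify", "access"],
    "Data management": ["organise", "manage", "clean", "handle"],
    "Data evaluation": ["analyse", "interpret", "evaluate", "visualize",
                        "critique", "assess"],
    "Data application": ["decision", "communicate", "argument", "ethical"],
}


def categorize(definition: str) -> Counter:
    """Count, per category, how many keyword stems occur in a definition."""
    # Split the definition into lowercase alphabetic tokens.
    tokens = re.findall(r"[a-z]+", definition.lower())
    counts = Counter()
    for category, stems in CATEGORY_KEYWORDS.items():
        for stem in stems:
            # A stem counts once if any token starts with it.
            if any(token.startswith(stem) for token in tokens):
                counts[category] += 1
    return counts


# Definition quoted in this chapter (Deahl, 2014).
deahl = ("The ability to understand, find, collect, interpret, visualize, "
         "and support arguments using quantitative and qualitative data.")
print(categorize(deahl))
```

Running the same function over the academic group and the public-domain group separately would give two sets of category counts that can be compared side by side. A keyword match is much cruder than manual coding, so the counts should be read as a starting point for review rather than as a result.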

DATA LITERACY AND ITS DEFINITION: SOME OBSERVATIONS FROM THE LITERATURE
In the process of searching literature in the public and academic domains to collate definitions of data literacy, two observations were made: (1) the concept of data literacy has evolved from other literacies, and (2) data literacy is a complex concept with varied, and sometimes poorly defined, meanings.

Historical Evolution of Data Literacy and Its Definition
The literature highlighted that the use of the term "data literacy" is relatively recent. The term "data literacy" has emerged from two other literacies: information literacy and statistical literacy (Schield, 2004). Data literacy, information literacy, and statistical literacy, while interrelated, should be distinguished from each other clearly. From the perspective of critical thinking, data literacy is broader than information literacy and statistical literacy and can be considered the foundation of the other two literacies (Schield, 2004) (Figure 2). Statistical literacy is the ability to read and interpret data and to study the use of statistics as evidence in arguments, and it is the first data-related literacy introduced at educational levels (Schield, 1998, 1999). Statistical literacy is closely related to traditional statistics. Both traverse topics including descriptive statistics, models, probability and statistical inference, generalization, predictions, and explanations (Schield, 1998, 1999, 2004).
Information literacy is a set of abilities requiring individuals to "recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information" (Wanner, 2015). According to Wanner (2015), an information-literate individual can: 1. Determine the extent of information needed, 2. Access the needed information effectively and efficiently, 3. Evaluate the information and its sources critically, 4. Incorporate selected information into one's knowledge base, 5. Use information effectively to accomplish a specific purpose, and 6. Understand the economic, legal, and social issues surrounding the use of information, and access and use information ethically and legally.
The emergence of data literacy as a concept and a field of inquiry can be seen to parallel the increasing proliferation of information and communication technologies in business organisations. In the academic domain, researchers have discussed data literacy since 2000. It wasn't until 2010 that researchers started to investigate the role of data literacy in different aspects of individuals' lives, such as how data literacy can help individuals to make strategic decisions at work (Carlson, 2011; Carlson, 2013; Carlson, 2015; Ridsdale, 2015; Fontichiaro and Oehrli, 2016). Organizations, for their part, started to consider data literacy an important factor for success from 2005 (Figure 3).
There is clear growth in the volume of literature focusing specifically on data literacy. For example, Carlson et al. (2015), Gummer and Mandinach (2015), Ridsdale et al. (2015), Wolff et al. (2016), Grillenberger and Romeike (2018), and Kjelvik and Schultheis (2019) each sought to define data literacy and to develop a conceptual framework for it. Evidently, new technology and social demands are driving the formation of data literacy as a distinct type of "literacy" and as a field of organizational and educational research. For instance, in Forbes Magazine it was noted that: The growing prevalence of technology such as automation, robotics, artificial intelligence (AI) and machine learning means "data" is becoming a universal language across all industries. However, not enough people currently speak this language. In fact, as our collective volume of data increase, so too does our data literacy gap. (Forbes Councils, 2019)

There Is a Variety of Definitions
The historical perspective provides a useful understanding of where the concept of data literacy came from.
Recently, many efforts have been made to define data literacy and related competencies. Current approaches share a hierarchical definition involving identifying, understanding, operating on, and using data (Mandinach & Gummer, 2013; Prado & Marzal, 2013; Wolff et al., 2016). However, while some focus on understanding and operating on the data, others focus on putting the data into action to support a reasoned argument (Deahl, 2014). One evident issue in existing conceptualisations of data literacy is an over-emphasis on technical requirements and little regard for the competencies related to the application and use of data (Bhargava et al., 2015).
One global definition of data literacy may not be possible, nor even useful, given the variation in data usage among different contexts: as Panetta (2021) of Gartner noted, data literacy is deeply contextual. But there is a need to define data literacy and create competency frameworks for individual disciplines based on the job market and its requirements (Mandinach & Gummer, 2016; Pothier, 2019; Qin & D'Ignazio, 2010b). Lack of agreement on the definition of data literacy between different players in each discipline is causing differing expectations and dissatisfaction with outcomes.

The Definitions of Data Literacy Are Very Broad, and the Term Is Often Used Without Being Defined
Data literacy is a difficult term to define. It can be used to encompass many things related to data, and therefore it can lose meaning. Treating data literacy as everything that is data-related will not be helpful. At the same time, coming to an agreement about what data literacy is and what competencies it involves is difficult. The reason for this is two-fold. Firstly, data literacy is an emerging literacy, and we are still grappling to understand exactly what it encompasses, especially since data literacy may have different interpretations in different contexts. Secondly, as technologies change, different types of data and decision-making requirements also arise, making the identification of data literacy competencies a dynamic exercise.

The Definition Changes Over Time and With Context
The idea of data literacy continually evolves as new technologies and data are made available. Panetta states that "data literacy is an underlying component of digital dexterity, an employee's ability and desire to use existing and emerging technology to drive better business outcomes" (Panetta, 2019, p. 1), which shows the close relationship between data literacy and technology. As if it were not difficult enough to deal with multiple disciplines and contexts, data literacy is also subject to the constant change and evolution of technologies and of data themselves. Due to the prevalence of data and analytics capabilities, including artificial intelligence, it seems that the understanding of data literacy is in a constant state of change (Panetta, 2019).

DATA LITERACY DEFINITIONS IN THE ACADEMIC AND PUBLIC DOMAIN
The analysis of data literacy definitions is facilitated by first providing a summary of key features of the data literacy definitions found in academic papers and in the public domain. The findings from the thematic analysis are visually presented using charts and word clouds.

Summary of Data Literacy Definitions from Academic Papers
As is evident from Table 1, Schield (2004) and Prado and Marzal (2013) considered data literacy an essential component of information literacy. Carlson et al. (2011) mainly paid attention to data presentation and considered data literacy a part of data information literacy. Vahey et al. (2012), Mandinach and Gummer (2013, 2016), Deahl (2014), and Wolff et al. (2016) emphasized the data-driven decision-making that data literacy competencies facilitate. Finally, Ridsdale et al. (2015) considered critical thinking to be the goal of data literacy education.
In the field of education, the most common definition of data literacy was from one of five articles by Mandinach and Gummer (2015). They defined data literacy as: "The ability to understand and use data effectively to inform decisions. It is composed of a specific skill set and knowledge base that enables educators to transform data into information and ultimately into actionable knowledge. These skills include knowing how to identify, collect, organise, analyse, summarise, and prioritise data. They also include how to develop hypotheses, identify problems, interpret the data, and determine, plan, implement, and monitor courses of action". Prado and Marzal (2013) contributed to the advancement of data literacy by proposing a set of core competencies and contents that can serve as a framework of reference for its inclusion in libraries' information literacy programs. Deahl (2014) proposed a definition of "data literacy" and situated the concept within the landscape of new media literacies. Schneider (2013) described a pragmatic approach for the mediation and the teaching of research data literacy, i.e. those dimensions of information literacy that are dedicated to the creation, management, and reuse of research data. Wolff et al. (2016) explored the different perspectives offered on both data and statistical literacy and then investigated to what extent these address the data literacy needs of citizens in today's society. They considered existing approaches to teaching data literacy in schools to identify how data literacy is interpreted in practice. Ridsdale et al. (2015) conducted a systematic review to determine data literacy competencies, skills, and abilities, as well as teaching practices for undergraduate students. They defined detailed competencies for data literacy in key knowledge areas around data collection, data management, data evaluation, and data application.

Summary of Data Literacy Definitions From Public Domain
As is evident from Table 2, in the public domain data literacy has been defined by educational entities, by industrial organizations, and in reports of empirical research. The Harvard Business Review (HBR), Tableau, and Data Journalism defined data literacy from an educational perspective. The focus of these definitions is on the process of transforming data into actionable instructional knowledge. Industrial organizations such as ODI (Open Data Institute), Quanthub (a data skills platform), and Zeenea considered data literacy to be the ability to think critically about data and apply data in a purposeful manner within a given job role. Another important factor in these definitions is the ability to communicate data. In summary, data literacy from the organizational point of view is the ability to think about data critically and transform it into actionable knowledge to help decision makers in businesses. Data-literate staff embrace and use data in all that they do and communicate data as information.

Data Literacy Definitions Comparison
To give a clear view of the similarities and differences between the two perspectives on data literacy, a word cloud was created for each of the two groups of data literacy definitions. The word clouds show the most frequently repeated words in the definitions in red and the next most frequently repeated words in bold black.
The most repeated words in Figure 4 are data, interpret, information, ability, and use, whereas the most repeated words in Figure 5 are data, information, knowledge, and understanding. Comparing Figures 4 and 5, we notice some words which are bold in one but not in the other. In the academic perspective, words such as education, scientists, question, answer, develop, evaluate, and perspective are emphasized. But in the public domain, the focus is more on executive words such as actionable, practice, instructional, techniques, and employees. The players in the academic perspective are educators and scientists, whereas in the organizational definitions they are employees.
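The word clouds rest on simple word-frequency counts within each group of definitions. The following is a minimal sketch of that counting step in Python, not the procedure used in the study (which relied on NVivo). The two example definitions are quoted from this chapter; the stop-word list and the function name `word_counts` are assumptions introduced for illustration.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "to", "and", "of", "a", "as", "in", "is", "it", "using"}


def word_counts(definitions):
    """Return word frequencies over a list of definition strings."""
    counts = Counter()
    for definition in definitions:
        for token in re.findall(r"[a-z]+", definition.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts


# One definition from each group, as quoted in this chapter.
academic = [
    "The ability to understand, find, collect, interpret, visualize, and "
    "support arguments using quantitative and qualitative data."
]
public = [
    "Data literacy is the ability to read, understand, create, and "
    "communicate data as information."
]

academic_counts = word_counts(academic)
public_counts = word_counts(public)

# Words present in both groups, and words unique to each group.
shared = set(academic_counts) & set(public_counts)
only_academic = set(academic_counts) - set(public_counts)
only_public = set(public_counts) - set(academic_counts)

print(sorted(shared))
print(public_counts.most_common(3))
```

With the full sets of academic and public-domain definitions as input, the most frequent words per group and the words unique to each group give the same kind of contrast that the word clouds show visually.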
Codes generated from the data literacy definitions have been categorized based on Ridsdale's framework (Figure 1). Figure 6 illustrates the codes extracted from the data literacy definitions in the academic domain, and Figure 7 illustrates the codes extracted from the data literacy definitions in the public domain. Both graphs show which categories have been emphasized more by placing them in a higher position. The numbers in the graphs show how many codes exist in each of these categories. The codes shown in red are those that are also red in the word clouds. The codes in bold black are also repeated more often than other codes in the definitions.
Except for data evaluation, which contains the most repeated codes in both groups of definitions, the ranking of the other categories differs between Figure 6 and Figure 7.
In academia the focus is more on technology for creating information from data. In business the focus is more on people using data for decision making. The perspective of academic researchers on data literacy is focused on the process of collecting, analysing, understanding, and evaluating data and converting it to information. From the organizational point of view, the main goal of data literacy is to transform data into information, and information into actionable knowledge or wisdom, to help decision makers think critically and make strategic decisions based on data from the real world.
If we now frame these differences using Nelson's (1989) data to wisdom continuum, then the nature of the difference between academic and organisational perspectives on data literacy becomes abundantly clear (Figure 8).
In the public domain, the wisdom element is stronger than the other elements of data, information, and knowledge. In business, critical thinking and having the wisdom to predict the future based on previous knowledge and experience are more important than other skills.
Organisations are investing heavily in technology but still cannot create data centricity in their businesses. The key for companies to achieve data literacy is paying enough attention to developing useful data literacy skills amongst workers (NewVantage Partners, 2019; Pothier, 2019; Winterberry, 2018). In the academic domain the emphasis is on the technical elements of converting raw data to information, and there is apparently little consideration of engaging with data to think about it critically in everyday life. This academic perspective likely shapes the data education of graduates. In the public domain, data application, data evaluation, and the role of people have been considered, but less attention has been given to understanding what data means in context. Given the importance of data awareness to the success of data literacy in business (Qin & D'Ignazio, 2010), a lack of awareness of data literacy in businesses can impede the move to a data literacy culture in organizations.

Definitions listed in Table 1:
The component of information literacy that enables individuals to access, interpret, critically assess, manage, handle and ethically use data. From that perspective, information literacy and data literacy form part of a continuum, a gradual process of scientific-investigative education that begins in school, is perfected and becomes specialized in higher education and forms part of individuals' skill set throughout their lifetime.
Schneider, Research Data Literacy (General): A new sub-discipline within research data management that emerges from the need to educate students and scientists of all disciplines and to train information scientists from library and information science to do so.
Deahl, Data Literacy (New Media Literacies): The ability to understand, find, collect, interpret, visualize, and support arguments using quantitative and qualitative data.
(General): The ability to ask and answer real-world questions from large and small data sets through an inquiry process, with consideration of ethical use of data. It is based on core practical and creative skills, with the ability to extend knowledge of specialist data handling skills according to goals. These include the abilities to select, clean, analyse, visualise, critique, and interpret data, as well as to communicate stories from data and to use data as part of decision making.

CONCLUSION
Within the past decade, in parallel with academic efforts to define data literacy (Carlson et al., 2011; Gummer and Mandinach, 2015; Ridsdale et al., 2015), businesses have also sought to determine what it means for a person, business, or society to be data literate (Inverarity et al., 2022). A comparison between academic and public data literacy definitions demonstrated a difference in the two groups' understanding of data literacy. Businesses focus on thinking about data critically and converting the data to actionable knowledge. Businesses' goal in applying data is critical decision making. In contrast, accessing, collecting, managing, analysing, and using data ethically are the important competencies of data literacy in the academic domain.
There is a discrepancy between the capabilities of graduates and the needs of businesses. Graduates' data literacy skills are not meeting organizations' expectations (Forbes Councils, 2019). The exploration and comparison of data literacy definitions in the public and academic domains points to one possible factor contributing to this apparent discrepancy. It highlights an inconsistency between the academic and public understanding of data literacy: in the academic domain, the emphasis is on technology, but in the public domain the emphasis is on human decision making and how data can be used to facilitate it. This finding provides some initial direction for critical reflection on what is being taught to students in universities and raises awareness of the need to better understand the needs of industry as a foundation for the design of more effective data literacy education in university programs. Furthermore, given the contextual nature of data literacy, there is a need for further research to explore differences in data literacy requirements based on the needs of different disciplines and application contexts. It is hoped that the simple analysis presented in this paper is a stimulus for critical reflection on data literacy education and helps pave the way for educating graduates who are well equipped with the capabilities required by various business organizations in various roles and can therefore be more easily absorbed into the job market.

Figure 2. The relationship between statistical literacy, information literacy, and data literacy from the critical thinking perspective (Schield, 2004)

Figure 3. Number of data literacy definitions provided, by year

Figure 7. Data literacy definitions' codes in the public domain

Table 1. Key data literacy definitions from the academic perspective
Data Literacy (Education): The ability to understand and use data effectively to inform decisions. It is composed of a specific skill set and knowledge base that enables educators to transform data into information and ultimately into actionable knowledge. These skills include knowing how to identify, collect, organise, analyse, summarise and prioritise data. They also include how to develop hypotheses, identify problems, interpret the data, and determine, plan, implement, and monitor courses of action.

Table 2. Data literacy definitions from the public perspective
Data literacy is the ability to read, understand, create, and communicate data as information. Much like literacy as a general concept, data literacy focuses on the competencies involved in working with data. It is, however, not similar to the ability to read text since it requires certain skills involving reading and understanding data.
Educational: The first step in data literacy is the ability to communicate, write and read about data in context. Then, employees, not just data scientists, need to critically assess the data, find meaning in the numbers and glean actionable business insights from it.