Literature Review
The data quality literature has long discussed the importance of quality (Juran & Godfrey, 1999; Wand & Wang, 1996; Wang, Storey, & Firth, 1995). Decisions made on the basis of corrupt or inferior data will be skewed, with potentially costly consequences (Baltzan & Phillips, 2009; Fisher, Chengalur-Smith, & Ballou, 2003). As Baltzan and Phillips (2009) observe, “decisions are only as good as the quality of [data breach] information used to make the decisions.”
Researchers have devoted much energy to investigating how to evaluate information for quality. One of the most prominent such scholars, Richard Wang, professor and Director of the MIT Information Quality Program, has written several seminal papers on the subject. In one such paper, Wang and Strong (1996) develop a conceptual framework designed to capture “the aspects of data quality that are important to consumers” (Wang & Strong, 1996, p. 5). The framework conceives of data quality as comprising four dimensions. The first dimension refers to intrinsic factors of the data itself, such as its accuracy, objectivity, believability, and reputation, all of which speak to the data’s quality in its own right. The second dimension refers to contextual factors: data quality “must be considered within the context of the task at hand” (Wang & Strong, 1996, p. 6), and contextual factors include value-added, relevance, timeliness, completeness, and appropriate amount of data. Third, the representational dimension includes aspects related to the data’s format (e.g., whether it offers a concise and consistent representation) and meaning (e.g., its interpretability and the ease with which it can be understood). The fourth dimension is accessibility: data must be kept secure while remaining accessible. This four-dimensional model is widely accepted by other scholars in the data quality field (Bovee, Srivastava, & Mak, 2003; Strong, Lee, & Wang, 1997).
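To make the taxonomy concrete, the four dimensions and the attributes the review names under each can be sketched as a simple lookup structure. This is purely an illustrative sketch, not code from Wang and Strong (1996); the mapping and the helper function `dimension_of` are assumptions introduced here for exposition.

```python
# Illustrative sketch of Wang & Strong's (1996) four data-quality
# dimensions, organized as a mapping from dimension to its attributes.
DATA_QUALITY_DIMENSIONS = {
    "intrinsic": [
        "accuracy", "objectivity", "believability", "reputation",
    ],
    "contextual": [
        "value-added", "relevance", "timeliness",
        "completeness", "appropriate amount of data",
    ],
    "representational": [
        "concise representation", "consistent representation",
        "interpretability", "ease of understanding",
    ],
    "accessibility": [
        "accessibility", "access security",
    ],
}


def dimension_of(attribute):
    """Return the dimension an attribute falls under, or None if unlisted."""
    for dimension, attributes in DATA_QUALITY_DIMENSIONS.items():
        if attribute in attributes:
            return dimension
    return None


print(dimension_of("timeliness"))  # contextual
```

A structure like this lets an assessment instrument group quality questions by dimension, mirroring how the framework is applied in later survey-based studies.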