Data Quality on the Internet

Vincent Cho (Hong Kong Polytechnic University, Hong Kong)
DOI: 10.4018/978-1-60566-378-4.ch011

This chapter reviews studies of data quality on the Internet and proposes suggestions for improving existing Internet resources. The chapter is organized as follows. First, definitions of data quality are presented. Next, the reasons for poor data quality are reviewed. Frameworks and assessment methods from the past literature are then examined, and finally some recommendations are highlighted.
Chapter Preview


The notion of data quality has been widely investigated in the literature. It is often defined as “fitness for use”, i.e., the ability of a data collection to meet a user's requirements for decision making on a given task. Both theoretical and experimental results indicate that data quality is a multi-dimensional concept (Ballou, 1998; Redman, 1996; Wang and Strong, 1996; Wand and Wang, 1996). Zmud (1978) derived the dimensions of data quality empirically, using factor analysis to examine the dimensionality of the information construct. Four dimensions emerged: quality of information, relevancy of information, quality of format, and quality of meaning. On theoretical grounds, Wand and Wang (1996) identified four dimensions of intrinsic data quality: completeness, lack of ambiguity, meaningfulness, and correctness. These dimensions are said to apply across different applications and tasks. Huh et al. (1990) and Fox et al. (1994) identified four dimensions of data quality: accuracy, completeness, consistency, and currency.
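
To make two of these dimensions concrete, the following is a minimal Python sketch of how completeness and currency might be scored for a small data collection. It is not taken from any of the cited studies: the sample records, the field names, the function names, and the simple ratio formulas are all assumptions chosen for illustration, and other definitions of these dimensions are equally defensible.

```python
from datetime import date

# Hypothetical records for illustration only; None marks a missing value.
records = [
    {"name": "Alice", "email": "alice@example.com", "updated": date(2024, 1, 10)},
    {"name": "Bob", "email": None, "updated": date(2020, 6, 1)},
    {"name": None, "email": "carol@example.com", "updated": date(2023, 12, 5)},
]


def completeness(rows, fields):
    """Fraction of the requested field values that are present (not None).

    Assumed definition: filled cells divided by expected cells.
    """
    total = len(rows) * len(fields)
    filled = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return filled / total if total else 1.0


def currency(rows, as_of, max_age_days):
    """Fraction of rows last updated within max_age_days of the as_of date.

    Assumed definition: a row counts as current if it is not older than
    the chosen threshold.
    """
    if not rows:
        return 1.0
    fresh = sum(1 for row in rows if (as_of - row["updated"]).days <= max_age_days)
    return fresh / len(rows)


completeness_score = completeness(records, ["name", "email"])
currency_score = currency(records, as_of=date(2024, 2, 1), max_age_days=365)
print(f"completeness = {completeness_score:.2f}, currency = {currency_score:.2f}")
```

Scores such as these are only meaningful relative to a task: in keeping with the “fitness for use” definition, the fields that must be filled and the acceptable age threshold would be set by the user's requirements rather than fixed in advance.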

In an early discussion of the quality of information systems, Davis and Olson (1985) identified three aspects of quality that refer to characteristics of data: accuracy, precision, and completeness. Lee et al. (2002) and Strong et al. (1997) grouped the quality of information from an information system into four categories: intrinsic, contextual, representational, and accessibility information quality. Intrinsic information quality implies that information has quality in its own right. Contextual information quality highlights the requirement that information quality be considered within the context of the task at hand; to add value, information must be relevant, timely, complete, and appropriate in amount. Representational and accessibility information quality emphasize the importance of the computer systems that store and provide access to information. That is, the system must present information so that it is interpretable, easy to understand, easy to manipulate, and represented concisely and consistently, and the system must be accessible but secure.
