Privacy-preserving data mining (PPDM) is one of the newest trends in privacy and security research. It is driven by one of the major policy issues of the information era: the right to privacy. This chapter describes the foundations for further research in PPDM on the Web. In particular, we describe the problems we face in defining what information is private in data mining. We then describe the basis of PPDM, including its historical roots, a discussion of how privacy can be violated in data mining, and a definition of privacy preservation in data mining based on users’ personal information and information concerning their collective activities. Subsequently, we introduce a taxonomy of the existing PPDM techniques and discuss how these techniques apply to Web-based applications. Finally, we suggest some privacy requirements that are related to industrial initiatives and point to some technical challenges as future research trends in PPDM on the Web.
The Basis of Privacy-Preserving Data Mining
Historical Roots
The debate on PPDM has received special attention as data mining has been widely adopted by public and private organizations. We have witnessed three major landmarks that characterize the progress and success of this new research area: the conceptive landmark, the deployment landmark, and the prospective landmark. We describe these landmarks as follows:
The conceptive landmark characterizes the period in which central figures in the community, such as O’Leary (1991, 1995), Piatetsky-Shapiro (1995), and others (Klösgen, 1995; Clifton & Marks, 1996), investigated the success of knowledge discovery and some of the important areas where it can conflict with privacy concerns. The key finding was that knowledge discovery can pose new threats to informational privacy and information security if it is not done or used properly.
The deployment landmark is the current period, in which an increasing number of PPDM techniques have been developed and published in refereed conferences. The information available today is spread over countless papers and conference proceedings. The results achieved in recent years are promising and suggest that PPDM will achieve the goals that have been set for it.