1. Introduction
The rapid advance of information technology gives historical data great value and allows it to be reused efficiently (Wu et al., 2005). Limited human capability, however, makes very large stores of information less useful: manual inspection and interpretation are not feasible, since no one can fully understand and benefit from massive databases. The sheer size of historical data is the main factor limiting the benefit that can be expected from it. Typically, these resources also include irrelevant features and noise. Limited resources and learning speed are important factors affecting generalization performance, and growing computational complexity may lead to intractable behavior.

Knowledge discovery (KD) is the process of extracting hidden patterns and clusters from massive stored data; it concerns the elicitation of buried knowledge from massive databases (Liao et al., 2011; Han & Kamber, 2011). It comprises an ensemble of sub-processes that extract the required knowledge efficiently and thereby benefit from this flood of information. Data preprocessing, feature selection, data mining, and evaluation are the main sub-processes of KD.

Data preprocessing is the first step of the KD process. It concerns data preparation, cleaning, discretization, and removal of outliers, as well as the handling of inconsistency. Some preprocessing techniques aim at understanding the distribution of the data, such as measures of central tendency and dispersion; these descriptive data summaries can be presented graphically, for example as bar charts or histograms, which is very helpful for visual inspection of the data (Kotsiantis, 2007). Others, like discretization, prepare the data so that it is easier to understand and manage. Discretization adopts the concept of hierarchy generation and tends to improve the data mining process by reducing the number of values to be handled.

Feature selection is the process of specifying a minimal feature subset containing the most relevant features, those that contribute best to the classification process according to an evaluation criterion (Ladha & Deepa, 2011; Chen et al., 2011; Lavrac, 1999). Zhao et al. (2008) recommended applying feature selection in pattern recognition, machine learning, and data mining, motivated by the considerable support it provides to the subsequent steps of knowledge extraction. Feature selection reduces the dimensionality of the feature space and the computational cost, which drastically decreases storage requirements and running time. Removing noise increases the speed of training and improves predictive accuracy; the retained data are the most relevant, which makes the extracted knowledge easier to understand and facilitates visualization (Escolano et al., 2009; Tsai, 2009).

Concerning the evaluation step, Rough Set (RS) theory has many applications, including the medical domain (Hassanien et al., 2009). Rough set theory, introduced by Pawlak in the early 1980s (Pawlak, 1982), provides a novel approach to knowledge description and the approximation of sets: objects are classified through an approximation space. RS outperforms many other techniques in several respects. First, it needs no external parameters, which gives it an advantage over many other techniques.
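To make the approximation idea concrete, the following minimal Python sketch computes the lower and upper approximations of a target concept under the indiscernibility relation induced by a set of attributes. The toy patient table, attribute names, and target concept are illustrative assumptions, not data from this paper.

```python
from collections import defaultdict

def approximations(objects, attrs, target):
    """Lower/upper approximations of `target` under the
    indiscernibility relation induced by `attrs`.

    objects : dict mapping object id -> dict of attribute values
    attrs   : attributes used to partition the universe
    target  : set of object ids (the concept to approximate)
    """
    # Group objects into equivalence classes: two objects are
    # indiscernible if they agree on every attribute in `attrs`.
    classes = defaultdict(set)
    for oid, row in objects.items():
        classes[tuple(row[a] for a in attrs)].add(oid)

    lower, upper = set(), set()
    for eq_class in classes.values():
        if eq_class <= target:   # wholly inside the concept
            lower |= eq_class
        if eq_class & target:    # overlaps the concept
            upper |= eq_class
    return lower, upper

# Toy patient table (hypothetical): two symptoms per patient.
patients = {
    1: {"fever": "yes", "cough": "yes"},
    2: {"fever": "yes", "cough": "yes"},
    3: {"fever": "yes", "cough": "no"},
    4: {"fever": "no",  "cough": "no"},
}
flu = {1, 3}  # objects known to belong to the concept "flu"
low, up = approximations(patients, ["fever", "cough"], flu)
print(low)  # {3}         -- certainly flu, given these attributes
print(up)   # {1, 2, 3}   -- possibly flu
```

The boundary region `up - low` contains the objects the attributes cannot decide; a concept is rough precisely when this boundary is non-empty.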
Second, it can ascertain the completeness of the data for the classification task, which is especially valuable when information sources are limited or expensive (Pawlak, 1991; Polkowski, 2003; Yu & Liu, 2004). Rough set methods can also classify unknown data based on already gained knowledge, and they can determine whether the available data are sufficient to extract a minimal set of features for classification. The reduct is an important concept in rough set theory, and data reduction is one of its main applications in pattern recognition and data mining.

In this context, this paper proposes an integrated model that combines genetic algorithms for feature selection with rough sets for the classification step of a prediction problem in the biomedical domain. Medical historical data hold buried value: stored patient data can serve as a significant prediction source for unknown cases. The high dimensionality of medical data calls for computational support to analyze and classify it. Computer Aided Diagnosis (CAD) systems offer tremendous aid to the health care field, in areas such as medical diagnosis, medical imaging, computer vision, genomics, and medical computer translation, and they play an important role in the physician's interpretation (Cios & Moore, 2002; Kononenko, 2001). Therefore, the proposed research investigates the efficiency of the combined approach (EI-GA-RS) in improving classification accuracy for disease diagnosis. Four data sets were tested to ascertain the reliability and efficiency of the proposed system.
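As an illustration of how such a hybrid might be wired together, the sketch below uses the rough-set dependency degree as the fitness of a genetic algorithm searching over binary feature masks. The weighting `alpha`, the GA parameters, and all function names are assumptions made for illustration; this is a minimal sketch of a GA-plus-rough-set wrapper in general, not the paper's exact EI-GA-RS formulation.

```python
import random
from collections import defaultdict

def dependency_degree(rows, feature_idx, labels):
    """Rough-set dependency gamma_B(D): the fraction of objects whose
    equivalence class over the chosen features is pure in the decision
    attribute, i.e. lies in the positive region."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[j] for j in feature_idx)].append(i)
    pos = sum(len(idx) for idx in classes.values()
              if len({labels[i] for i in idx}) == 1)
    return pos / len(rows)

def fitness(chromosome, rows, labels, alpha=0.9):
    """Reward rough-set dependency, penalize subset size (alpha is an
    assumed trade-off weight, not a value from the paper)."""
    chosen = [j for j, bit in enumerate(chromosome) if bit]
    if not chosen:
        return 0.0
    gamma = dependency_degree(rows, chosen, labels)
    return alpha * gamma + (1 - alpha) * (1 - len(chosen) / len(chromosome))

def evolve(rows, labels, n_feats, pop=30, gens=40, cx=0.8, mut=0.02, seed=0):
    """Minimal GA over binary feature masks: elitism, truncation
    selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    popn = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(popn, key=lambda c: fitness(c, rows, labels),
                        reverse=True)
        nxt = scored[:2]                             # keep the two best
        while len(nxt) < pop:
            a, b = rng.sample(scored[:pop // 2], 2)  # parents from top half
            if rng.random() < cx:
                pt = rng.randrange(1, n_feats)       # one-point crossover
                child = a[:pt] + b[pt:]
            else:
                child = a[:]
            child = [bit ^ (rng.random() < mut) for bit in child]
            nxt.append(child)
        popn = nxt
    return max(popn, key=lambda c: fitness(c, rows, labels))

# Toy usage: 3 discretized features, binary decision (hypothetical data).
rows = [(1, 0, 2), (1, 1, 2), (0, 0, 1), (0, 1, 1), (1, 0, 1)]
labels = ["sick", "sick", "well", "well", "sick"]
print(evolve(rows, labels, n_feats=3))  # a small mask with full dependency,
                                        # e.g. [1, 0, 0]
```

A wrapper of this kind trades the exhaustive cost of reduct search for the stochastic search of the GA, while the `alpha` weight balances classification quality against subset size.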