1. Introduction
Computational biology (CB) is a dynamic and significant scientific field for automated or artificial analysis and processing methods, and rough set-based analysis permits specific operations across multiple datasets. Traditionally, biological research has handled data with systematic, regular formulas. Over the last decade, technology has advanced to handle large volumes of data quickly, and computational tools now process biological data rapidly and successfully. In 1995, the first detailed genome segments were sequenced, and many other analyses have since been accomplished (Fleischmann et al., 1995; Berman et al., 2000). Accurate microarray data analysis is one of the fundamental supports, along with many other informatics analyses. Parallel DNA sequence processing is another significant contribution (Schena et al., 1995; Duggan et al., 1999).
Classifier techniques can help solve some challenging real-world problems. In a classifier system, a harmonious combination of multiple techniques is used to build an efficient solution for a particular data classification task. One classification approach that has recently attracted researchers' attention is meta-learning, or meta-classifiers. Meta-learning refers to handling a set of base predictors for a given classification task and then integrating their outputs using an integration technique. Such combinations appear in the literature under different names, such as decision combination, classifier ensembles, classifier fusion, consensus aggregation, and hybrid methods (Kuncheva, 2002; Dasarathy, 1994).
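The integration step described above can be sketched with plain majority voting. This is only an illustration: the three threshold predictors, their cut-off values, and the function names are hypothetical and are not taken from the cited works, and majority voting is just one of many possible integration techniques.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical base predictors: each maps one numeric feature to a label.
# The thresholds are arbitrary values chosen for illustration.
def threshold_low(x: float) -> str:
    return "pos" if x > 0.3 else "neg"

def threshold_mid(x: float) -> str:
    return "pos" if x > 0.5 else "neg"

def threshold_high(x: float) -> str:
    return "pos" if x > 0.7 else "neg"

BASE_PREDICTORS = [threshold_low, threshold_mid, threshold_high]

def majority_vote(predictors: Sequence[Callable[[float], str]], x: float) -> str:
    """Integrate the base predictions by returning the most frequent label."""
    votes = [predict(x) for predict in predictors]
    return Counter(votes).most_common(1)[0][0]

if __name__ == "__main__":
    for value in (0.4, 0.6):
        print(value, majority_vote(BASE_PREDICTORS, value))
```

An odd number of base predictors is used here so that a two-class vote cannot tie; a weighted vote or a trained combiner could replace `majority_vote` without changing the base predictors.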
Prioritization assigns priority values to interlinked homologous datasets. These values are governed by priority association rules. The main purpose of the priority association rule and the rough data set is to improve on the performance of a single classifier. Different classifiers usually make different predictions on the same sample of data. This is due to their diversity, and many research works have illustrated the sets of samples misclassified by different classifiers by using multiple sets of classifiers. The techniques used to develop a priority association table can be divided into two categories: classifier disturbance and sample disturbance. The first approach exploits the instability of the base classifiers. Such classifiers, for example neural networks, random forests, and decision trees, are very sensitive to their initialization parameters. The second approach either trains the classifiers on different sample subsets or trains them in different feature subspaces.
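As a rough sketch of the sample disturbance category, the snippet below draws a different sample subset and a different feature subspace for each base classifier. Sampling with replacement (bootstrap) and uniform random feature selection are assumptions made for illustration; the text does not specify how the subsets are drawn, and all names and data are hypothetical.

```python
import random
from typing import List, Sequence

def bootstrap_sample(samples: Sequence[list], rng: random.Random) -> List[list]:
    """Draw len(samples) items with replacement (one disturbed training set)."""
    return [rng.choice(samples) for _ in samples]

def random_subspace(n_features: int, k: int, rng: random.Random) -> List[int]:
    """Pick k distinct feature indices (one feature subspace)."""
    return sorted(rng.sample(range(n_features), k))

def project(sample: Sequence[float], feature_idx: Sequence[int]) -> List[float]:
    """Keep only the selected features of one sample."""
    return [sample[i] for i in feature_idx]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    # Hypothetical toy data: four samples with three features each.
    data = [[0.1, 0.9, 0.3], [0.4, 0.2, 0.8], [0.7, 0.5, 0.6], [0.3, 0.3, 0.1]]
    for member in range(3):  # one disturbed view of the data per base classifier
        subset = bootstrap_sample(data, rng)
        features = random_subspace(n_features=3, k=2, rng=rng)
        view = [project(row, features) for row in subset]
        print(member, features, view)
```

Each `view` would then be used to train one base classifier, so that the members of the ensemble differ because their training data differ.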
As with any classification problem, document classification comprises two stages: a feature extraction stage and a decision stage that actually assigns documents to classes based on the extracted features. Several feature extractors have been proposed (Jia et al., 2015), but by far the most popular have been variants of the term-frequency vector. A wealth of data mining and machine learning techniques have then been applied to, or developed for, document classification. These include the naive Bayes classifier, the k-nearest neighbor classifier, the Apriori algorithm, neural networks, decision trees, logistic regression, and most recently support vector machines (SVM) (Joachims, 2001). While these advances have been significant both conceptually and in terms of classification accuracy, it is evident that document classification methods have largely focused on intelligent mining of the data in the documents.
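The two stages can be illustrated with a minimal sketch: a term-frequency feature extractor followed by a nearest-neighbor decision stage. Whitespace tokenization, cosine similarity, the use of a single neighbor, and the toy training documents are all assumptions chosen for brevity, not the method of any cited work.

```python
import math
from collections import Counter
from typing import List, Tuple

def term_frequency(text: str) -> Counter:
    """Feature extraction: count each lower-cased, whitespace-separated term."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(document: str, labelled: List[Tuple[str, str]]) -> str:
    """Decision stage: return the label of the most similar training document."""
    query = term_frequency(document)
    _, label = max(labelled, key=lambda pair: cosine(query, term_frequency(pair[0])))
    return label

# Hypothetical labelled documents, used only to exercise the sketch.
TRAINING = [
    ("gene sequence genome analysis", "biology"),
    ("stock market price analysis", "finance"),
]

if __name__ == "__main__":
    print(classify("genome sequence data", TRAINING))
    print(classify("market price trend", TRAINING))
```

Either stage can be swapped independently: the same term-frequency vectors could feed a naive Bayes classifier or an SVM in place of the nearest-neighbor rule.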