InfoScipedia
A Free Service of IGI Global Publishing House

What is K-fold Validation

Big Data Analytics for Sustainable Computing
A method of testing classification accuracy in which the dataset is split into k subsets. In each iteration, k-1 of the subsets are used for model training while the remaining subset is retained for testing model performance, so that over the k iterations each subset serves as the test set exactly once.
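The splitting scheme in this definition can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited chapter: the function names `k_fold_indices` and `k_fold_accuracy`, the `fit`/`predict` callables, the shuffling step, and the `seed` parameter are all assumptions introduced here.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists, one pair per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once so the folds are random (an assumption, not part of the definition).
    random.Random(seed).shuffle(indices)
    # Spread any remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]                # the one held-out subset
        train_idx = indices[:start] + indices[start + size:]  # the other k-1 subsets
        yield train_idx, test_idx
        start += size


def k_fold_accuracy(X, y, k, fit, predict, seed=0):
    """Average test accuracy over the k iterations.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns a
    label; both are hypothetical callables supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k
```

Averaging the per-fold accuracies is one common way to summarize the k test results. In practice a library routine such as scikit-learn's `KFold` would typically replace the hand-written splitter.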
Published in Chapter:
Feature Selection Algorithm Using Relative Odds for Data Mining Classification
Donald Douglas Atsa'am (Department of Mathematics, Statistics and Computer Science, University of Agriculture, Makurdi, Nigeria)
Copyright: © 2020 | Pages: 26
DOI: 10.4018/978-1-5225-9750-6.ch005
Abstract
A filter feature selection algorithm is developed and its performance tested. In the initial step, the algorithm dichotomizes the dataset then separately computes the association between each predictor and the class variable using relative odds (odds ratios). The value of the odds ratios becomes the importance ranking of the corresponding explanatory variable in determining the output. Logistic regression classification is deployed to test the performance of the new algorithm in comparison with three existing feature selection algorithms: the Fisher index, Pearson's correlation, and the varImp function. A number of experimental datasets are employed, and in most cases, the subsets selected by the new algorithm produced models with higher classification accuracy than the subsets suggested by the existing feature selection algorithms. Therefore, the proposed algorithm is a reliable alternative in filter feature selection for binary classification problems.
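The ranking step described in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the chapter's algorithm: the abstract does not say how the dataset is dichotomized or how zero cell counts are handled, so the median cut-off, the 0.5 continuity correction, and the names `odds_ratio` and `rank_features` are all hypothetical choices made here.

```python
from statistics import median


def odds_ratio(feature, labels):
    """Odds ratio of a binary feature against a binary class label.

    Every cell of the 2x2 table starts at 0.5 (a continuity correction,
    assumed here) so that empty cells do not cause division by zero.
    """
    a = b = c = d = 0.5
    for f, y in zip(feature, labels):
        if f and y:
            a += 1      # feature present, class 1
        elif f:
            b += 1      # feature present, class 0
        elif y:
            c += 1      # feature absent, class 1
        else:
            d += 1      # feature absent, class 0
    return (a * d) / (b * c)


def rank_features(columns, labels):
    """Return feature names ordered from highest to lowest odds ratio."""
    scores = {}
    for name, values in columns.items():
        cut = median(values)                 # assumed dichotomization rule
        binary = [v > cut for v in values]
        scores[name] = odds_ratio(binary, labels)
    return sorted(scores, key=scores.get, reverse=True)


columns = {
    "strong": [1, 2, 3, 4, 5, 6, 7, 8],  # perfectly separates the classes
    "noise": [1, 8, 2, 7, 3, 6, 4, 5],   # unrelated to the class
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]
ranking = rank_features(columns, labels)  # "strong" ranks ahead of "noise"
```

Ranking by the raw odds ratio places inverse associations (ratios below 1) at the bottom. Whether the chapter treats them that way is not stated in the abstract.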