Employee Classification in Reward Allocation Using ML Algorithms

Parakramaweera Sunil Dharmapala
Copyright: © 2023 | Pages: 18
DOI: 10.4018/978-1-7998-9220-5.ch186

Abstract

This work discusses an application of machine learning algorithms to predicting employee categories in reward allocation, based on input features derived from survey responses. The results reported in this chapter rest primarily on the beliefs and perceptions of the survey respondents about four categories of employees, namely performer, needy, starter, and senior. The authors considered two classification models, a full model with 10 input features and a reduced model with seven input features. The results show that the reduced model performed better than the full model, indicating that three qualitative input features bear no relevance to predicting the employee categories. For both models, the optimizable ensemble and the optimizable SVM were selected as the best machine learning classifiers, based on accuracy rates and AUC scores. Finally, applying the reduced model to out-of-sample observations, the predicted employee categories correctly matched the actual categories.
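
As a rough illustration of the modeling setup described above, the following sketch compares a full (10-feature) and a reduced (7-feature) classifier using cross-validated accuracy. It is a minimal, hypothetical example: the data are synthetic, the retained feature indices are assumed, and scikit-learn's SVC stands in for the chapter's optimizable SVM and ensemble classifiers.

```python
# Hypothetical sketch: full (10-feature) vs. reduced (7-feature) model comparison.
# Synthetic data only; scikit-learn's SVC stands in for the chapter's classifiers.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 10))      # 10 survey-derived input features (synthetic)
y = rng.integers(0, 4, size=200)         # 4 classes: performer, needy, starter, senior

reduced_idx = list(range(7))             # assumed indices of the 7 retained features
X_reduced = X_full[:, reduced_idx]

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
for name, X in [("full", X_full), ("reduced", X_reduced)]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{name:8s} mean CV accuracy = {scores.mean():.3f}")
```
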
Chapter Preview

Literature Review

This section reviews past research studies relevant to ‘reward allocation’ in groups and organizations with a wide spectrum of backgrounds.

Key Terms in this Chapter

Unsupervised Learning (UL): There is no target output variable in UL, nor are the data labeled as in SL. UL methods exhibit self-organization, capturing patterns as probability densities.

Supervised Learning (SL): The machine learning task of learning a function that maps an input (usually a vector) to an output, based on input-output pairs of observations in the training data set.

Accuracy Rate: In classification learning, a measure of the prediction accuracy of a machine learning method, obtained by plotting a ‘confusion matrix’ with predicted classes as columns and true classes as rows. On-diagonal elements count the correctly predicted classes and off-diagonal elements count the falsely predicted classes. The percentage of correctly predicted classes out of the total number of predictions is the accuracy rate.
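
A minimal sketch of this calculation is shown below; the confusion-matrix counts for the four employee categories are invented for illustration only.

```python
# Minimal sketch: accuracy rate from a confusion matrix
# (rows = true classes, columns = predicted classes). Counts are illustrative.
import numpy as np

cm = np.array([
    [18, 1, 0, 1],   # performer
    [ 2, 15, 1, 0],  # needy
    [ 0, 1, 14, 2],  # starter
    [ 1, 0, 2, 17],  # senior
])

accuracy = np.trace(cm) / cm.sum()   # on-diagonal counts / total predictions
print(f"accuracy rate = {accuracy:.3f}")
```
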

Cross-Validation: A statistical technique used in machine learning to overcome the overfitting problem. The full data set is partitioned into ‘training data’ and ‘validation and testing data’ by creating multiple folds through random sampling. The number of folds determines how many times the full data set is shuffled and partitioned, and that many models are built to determine the pattern of the function.
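
The sketch below illustrates a standard k-fold cross-validation loop with scikit-learn; the data and classifier are placeholders, not the chapter's actual survey data or models.

```python
# Illustrative k-fold cross-validation; synthetic data and a placeholder classifier.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 7))
y = rng.integers(0, 4, size=120)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
fold_scores = []
for train_idx, test_idx in kf.split(X):
    model = SVC().fit(X[train_idx], y[train_idx])               # fit on the training fold
    fold_scores.append(model.score(X[test_idx], y[test_idx]))   # score on the held-out fold
print("per-fold accuracy:", [round(s, 3) for s in fold_scores])
```
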

Reward Allocation: In industrial and organizational psychology, ‘organizational citizenship behavior’ (OCB) is a person's voluntary commitment within an organization that is not part of his or her contractual tasks. Organ (1988) defines OCB as “individual behavior that is discretionary, not directly or explicitly recognized by the formal reward system, and that in the aggregate promotes the effective functioning of the organization”. Reward allocation is designed to promote OCB in an organization.

Histogram: A frequency bar graph that plots the allotted amounts on the horizontal axis and the number of responses on the vertical axis, depicting the allocation favored by respondents.
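
A toy sketch of such a plot follows; the allotted amounts are invented and do not come from the chapter's survey.

```python
# Toy sketch: allotted reward amounts on the x-axis, number of responses on the y-axis.
import matplotlib.pyplot as plt

allotted_amounts = [10, 20, 20, 25, 30, 30, 30, 40, 40, 50]  # hypothetical allocations
plt.hist(allotted_amounts, bins=5, edgecolor="black")
plt.xlabel("Allotted amount")
plt.ylabel("Number of responses")
plt.title("Reward allocation favored by respondents")
plt.show()
```
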

Overfitting: In machine learning, the fitting of a model that corresponds too closely to the training data set and may therefore fail to fit validation and testing data. An overfitted model is a statistical model that contains more parameters than can be justified by the sample data and therefore fails to predict out-of-sample data reliably.

Area Under ROC Curve (AUC): The ROC curve, or receiver operating characteristic curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier. ROC space is defined by the FPR (false positive rate) and TPR (true positive rate) as the x and y axes, respectively, depicting the trade-off between truly predicted and falsely predicted classes. The space is bounded by the unit square, [0, 1] on the x-axis and [0, 1] on the y-axis. The AUC is the area under the ROC curve, which rises from the 45° line connecting (0,0) and (1,1) toward the top-left corner (0,1); a curve reaching (0,1) would cover the entire unit square. An AUC closer to 1, with the curve approaching (0,1), indicates a stronger classification method.
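
A small sketch of computing an ROC curve and its AUC with scikit-learn is given below; the labels and scores are synthetic, purely for illustration.

```python
# Illustrative ROC/AUC computation for a binary classifier; synthetic labels and scores.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.3, 0.6, 0.5])

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # FPR on x-axis, TPR on y-axis
auc = roc_auc_score(y_true, y_score)                # area under the ROC curve
print("AUC =", round(auc, 3))                       # closer to 1.0 => stronger classifier
```
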

Classification Learning: Under supervised learning, the output target is a categorical variable representing the categories/classes/attributes that the inputs map into, once the pattern of the function has been determined from the training data.
