Performance Enhancement of Outlier Removal Using Extreme Value Analysis-Based Mahalonobis Distance

Joy Christy A, Umamakeswari A
ISBN13: 9781799824916|ISBN10: 1799824918|ISBN13 Softcover: 9781799824923|EISBN13: 9781799824930
DOI: 10.4018/978-1-7998-2491-6.ch014
Cite Chapter

MLA

A, Joy Christy, and Umamakeswari A. "Performance Enhancement of Outlier Removal Using Extreme Value Analysis-Based Mahalonobis Distance." Handling Priority Inversion in Time-Constrained Distributed Databases, edited by Udai Shanker and Sarvesh Pandey, IGI Global, 2020, pp. 240-252. https://doi.org/10.4018/978-1-7998-2491-6.ch014

APA

A, J. C. & A, U. (2020). Performance Enhancement of Outlier Removal Using Extreme Value Analysis-Based Mahalonobis Distance. In U. Shanker & S. Pandey (Eds.), Handling Priority Inversion in Time-Constrained Distributed Databases (pp. 240-252). IGI Global. https://doi.org/10.4018/978-1-7998-2491-6.ch014

Chicago

A, Joy Christy, and Umamakeswari A. "Performance Enhancement of Outlier Removal Using Extreme Value Analysis-Based Mahalonobis Distance." In Handling Priority Inversion in Time-Constrained Distributed Databases, edited by Udai Shanker and Sarvesh Pandey, 240-252. Hershey, PA: IGI Global, 2020. https://doi.org/10.4018/978-1-7998-2491-6.ch014

Abstract

Outlier detection is a part of data analytics that helps users find discrepancies in working machines by applying an outlier detection algorithm to data captured at fixed intervals. An outlier is a data point whose properties differ from those of other points because of some external or internal force. Outliers can be detected by clustering the data points, so optimal clustering is important. Identifying groups or clusters of data within a population or sample is a problem that arises frequently in statistics. The most widely used procedure for identifying clusters in a set of observations is k-means with Euclidean distance. Euclidean distance, however, is not efficient at finding anomalies in multivariate space. This chapter uses the k-means algorithm with the Mahalanobis distance metric to capture the variance structure of the clusters, then applies an extreme value analysis (EVA) algorithm to detect outliers: rare items, events, or observations that raise suspicion by differing from the majority of the data.
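
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the abstract does not give the algorithmic details, so the function names (`kmeans_mahalanobis`, `extreme_value_flags`), the per-cluster covariance update, the regularisation term `reg`, the synthetic data, and the upper-tail quantile rule used as a stand-in for the EVA step are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_sq(X, center, cov_inv):
    """Squared Mahalanobis distance from each row of X to `center`."""
    diff = X - center
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.maximum(d2, 0.0)  # guard against tiny negative round-off

def kmeans_mahalanobis(X, k, n_iter=20, reg=1e-6, seed=0):
    """k-means-style clustering with a per-cluster Mahalanobis metric.

    Assumption: each cluster keeps its own covariance estimate, which is
    refreshed after every assignment step. `reg` keeps it invertible.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    cov_invs = [np.eye(d) for _ in range(k)]  # first pass is Euclidean

    def all_distances():
        return np.stack(
            [mahalanobis_sq(X, centers[j], cov_invs[j]) for j in range(k)],
            axis=1,
        )

    for _ in range(n_iter):
        labels = all_distances().argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > d:  # need enough points for a covariance
                centers[j] = members.mean(axis=0)
                cov = np.cov(members, rowvar=False) + reg * np.eye(d)
                cov_invs[j] = np.linalg.inv(cov)

    d2 = all_distances()
    labels = d2.argmin(axis=1)
    return labels, np.sqrt(d2[np.arange(n), labels])

def extreme_value_flags(dist, q=0.95):
    """Hypothetical stand-in for the EVA step: flag the upper tail of the
    distance distribution. The chapter's actual rule may differ."""
    return dist > np.quantile(dist, q)

# Synthetic data (an assumption): two Gaussian clusters plus three far points.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(10.0, 1.0, size=(100, 2)),
    np.array([[5.0, 40.0], [-30.0, 5.0], [40.0, -30.0]]),
])
labels, dist = kmeans_mahalanobis(X, k=2)
flags = extreme_value_flags(dist)
print("flagged indices:", np.flatnonzero(flags))
```

Because the covariance of each cluster rescales the distance, a point that is far along a direction in which the cluster varies little receives a larger score than under plain Euclidean distance. The quantile threshold always flags a fixed fraction of points; a fitted tail model would be a closer analogue of extreme value analysis but is beyond this sketch.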