AI for Health-Related Data Modeling: DCN Application Analysis

Na Cheng
Copyright: © 2022 | Pages: 11
DOI: 10.4018/IJISMD.300780

Abstract

Modeling health-related data from Data Centers (DC) benefits health monitoring, disease prevention, and healthcare research. However, health-related data is large in volume, high-dimensional, and non-normalized, which makes direct analysis difficult, so the data must be preprocessed before modeling. This paper focuses on the features of health-related data and studies outlier detection during data preprocessing. We propose an improved outlier detection algorithm for health-related data. The experimental results show that the proposed algorithm has a shorter running time and detects more outliers than three baselines. In addition, a local importance based random forest feature selection algorithm is proposed to measure the importance of each feature. The experimental results indicate that the proposed algorithm can select an optimal feature subset for health-related data.

1. Introduction

In recent years, emerging technologies such as cloud computing, big data, and the Internet of Things (IoT) (Joshi et al., 2021; Muniswamaiah et al., 2019) have caused the volume of data to grow day by day. The rapid growth of web applications, including search engines, online shopping, and cloud computing, places severe demands on the underlying infrastructure in terms of computing, storage, and networking. To meet the storage and processing needs of large amounts of data, the Data Center (DC) has become an indispensable information platform, responsible for the management and maintenance of massive computing and storage systems. Internet companies such as Microsoft, Google, Amazon, Facebook, and Alibaba have built high-performance data centers around the world. These data centers connect servers and network switches over a network to meet the needs of high-speed computing and massive storage more conveniently. The Data Center Network (DCN) plays a crucial role in the data center by connecting all of its resources together (Chen et al., 2021).

Machine Learning (ML) is a very successful approach within Artificial Intelligence (AI) (Di Mitri et al., 2017; Phellan et al., 2021). It lies at the core of AI and is a form of AI in which the computer learns how to complete a task by itself. ML can help machines make the right decisions and take smart actions in real time without human intervention. Two common classes of ML models are supervised learning models and unsupervised learning models. ML has been around for some time and has grown rapidly in recent years. In the future, ML is expected to be one of the best solutions for analyzing large amounts of data. If handled right, ML could change the way humans live more than any technology that has ever existed.

As more and more people connect to the Internet, the data held in DCs keeps increasing; this paper is concerned with the health-related portion of that data. Health has always been part of our whole way of life, and every part of life relies on good health. Living a healthy lifestyle can help prevent chronic diseases and long-term illnesses. Accordingly, the main contributions of this paper are summarized as follows. (i) By combining voting strategy based global outlier detection with K-means based nearest-furthest neighbors search, an improved outlier detection algorithm for health-related data is proposed. (ii) A local importance based random forest feature selection algorithm is proposed to measure the importance of each feature.
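
To make the idea of voting-based outlier detection in contribution (i) more concrete, the following is a minimal sketch in Python. It is not the algorithm proposed in this paper, whose details are given in Section 3; it only illustrates the general principle that several simple detectors each cast a vote and a value is flagged when enough of them agree. The three detectors (z-score, interquartile range, and median absolute deviation), their thresholds, the function names, and the sample heart-rate readings are all assumptions chosen for illustration.

```python
import statistics


def zscore_votes(values, threshold=3.0):
    """Vote for values whose z-score exceeds the (assumed) threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]


def iqr_votes(values, k=1.5):
    """Vote for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def mad_votes(values, threshold=3.5):
    """Vote using the modified z-score based on the median absolute deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def vote_outliers(values, min_votes=2):
    """Return indices of values flagged by at least `min_votes` detectors."""
    ballots = [zscore_votes(values), iqr_votes(values), mad_votes(values)]
    return [i for i, votes in enumerate(zip(*ballots)) if sum(votes) >= min_votes]


if __name__ == "__main__":
    # Hypothetical resting heart-rate readings; the last one is implausible.
    readings = [72, 75, 71, 74, 73, 76, 70, 74, 72, 190]
    print(vote_outliers(readings))
```

On this small sample the z-score detector alone does not flag the last reading, because a single extreme value inflates the standard deviation, while the two median-based detectors do. Requiring agreement between detectors is one simple way to offset the weaknesses of any single rule.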

The remainder of this paper is organized as follows. Section 2 reviews the related work. In Section 3, two algorithms are proposed for data preprocessing. The experimental results are shown in Section 4. Section 5 concludes this paper.
