Usage of Big Data Prediction Techniques for Predictive Analysis in HIV/AIDS

Chinmayee Mohapatra (KIIT University, India), Biswaranjan Acharya (KIIT University, India), Siddhath Swarup Rautaray (KIIT University, India) and Manjusha Pandey (KIIT University, India)
Copyright: © 2018 |Pages: 27
DOI: 10.4018/978-1-5225-3203-3.ch003


The term big data refers to data that exceeds the processing or analysis capacity of existing database management systems. Existing DBMSs cannot handle big data because of its large volume, high velocity, uncertain veracity, heterogeneous variety, and non-atomic values. Nowadays, healthcare plays a vital role in everyone's life. It has become a very large and open platform for all kinds of research work that does not put human life at risk. Many types of disease are found all over the world, but among them, AIDS (acquired immunodeficiency syndrome) spreads quickly and can easily turn life to death. Many studies are under way to create drugs to cure this deadly disease, but so far there has been no success. In cases such as this, big data is applied to obtain better results, which will have a positive impact on society.
Chapter Preview

Big Data

Nowadays, data arrives in very large volumes and at very high speed, which is difficult for traditional databases to handle; big data emerged to address this. Big data has many characteristics, but nine V's are most commonly considered: volume, velocity, veracity, variety, value, validity, volatility, variability, and visualization.

Figure 1. Characteristics of big data

  • Volume indicates the amount (Hammer et al., 2008) of data, which may be in gigabytes, terabytes, or more.

  • Velocity describes how fast data are created and collected.

  • Validity indicates how long the data remain valid and how suitable the data are for the computation at hand.

  • Veracity refers to the truthfulness of the data.

  • Variety refers to the types of data present, which may be structured (Oweis et al., 2015), semi-structured, or unstructured. Considering each in turn: structured data includes databases, semi-structured data includes XML (Uddin et al., 2014) files, and unstructured data comes into the picture with social media such as Facebook and Twitter.

  • Value concerns how valuable the data are for the computation, and also the cost that must be spent to store them. For example, storing data in low-cost storage (Toshniwal et al., 2015) may make the data difficult to access later, and sometimes the data may be lost forever, so value should be managed with care.

  • Volatility refers to how long data need to be retained and remain available to all users for computation (DeRoos et al., 2014). For example, many online shopping companies do not want to keep records of a customer's purchase beyond the one-year warranty period, so after one year they delete those data from their databases.

  • Variability indicates that the velocity of data may vary, peaking at some times and slowing at others (Patel et al., 2012). Along with the varying velocity, data flows may be highly inconsistent, which makes the data challenging to manage, especially when unstructured data is involved.

  • Visualization is the process that helps users understand the meaning of different data values quickly and accurately.
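The variety characteristic above can be sketched in code. The following is a minimal, illustrative Python example (the function name `classify_variety` and the heuristics are assumptions, not from the chapter) that labels a raw data sample as structured, semi-structured, or unstructured, treating XML/JSON as semi-structured and uniform comma-delimited rows as structured, per the bullet's examples:

```python
import json
import xml.etree.ElementTree as ET

def classify_variety(sample: str) -> str:
    """Crudely label a raw data sample by its variety (illustrative heuristic)."""
    # JSON and XML carry tags/keys but no fixed relational schema: semi-structured.
    try:
        json.loads(sample)
        return "semi-structured"
    except ValueError:
        pass
    try:
        ET.fromstring(sample)
        return "semi-structured"
    except ET.ParseError:
        pass
    # Uniform comma-delimited rows suggest a table, as in a relational database.
    rows = [r for r in sample.strip().splitlines() if r]
    if len(rows) > 1 and rows[0].count(",") > 0 and len({r.count(",") for r in rows}) == 1:
        return "structured"
    # Free text, e.g. a social-media post, falls through as unstructured.
    return "unstructured"

print(classify_variety("id,age\n1,34\n2,29"))              # structured
print(classify_variety("<patient><id>1</id></patient>"))   # semi-structured
print(classify_variety("Patient reports ongoing fatigue")) # unstructured
```

A real pipeline would rely on declared formats or schema registries rather than content sniffing; this sketch only makes the three categories from the variety bullet concrete.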
