Popularity Prediction of Video Content Over Cloud-Based CDN Using End User Interest

Rohit Kumar Gupta, Shabbir Kurabadwala, Pradeep Kumar Tiwari, Ankit Mundra
Copyright: © 2022 | Pages: 13
DOI: 10.4018/IJSI.301227

Abstract

It is often believed that more is better, but this does not hold for data. Online data is growing so quickly that handling it has become difficult. As demand for fast, uninterrupted access to data grows, content delivery networks (CDNs) have become popular. However, it has become difficult to store all content on CDN servers. This paper aims to optimize one aspect of a CDN's cached data: video content. We propose a push-based caching approach that identifies the videos that are popular in a given region in order to improve the end user's quality of experience. A semi-supervised machine learning approach has been implemented to classify videos as having low, medium, or high popularity. Popularity prediction research has gained momentum recently. However, little of that work is based on significant video parameters that are available in advance. The experimental results show good accuracy, justifying the selection of parameters and the processing associated with them.

Introduction

A Content Delivery Network (CDN) is a geographically distributed network of server nodes used to deliver resources (typically static content such as images and JavaScript) faster and with lower latency. Over the years, CDNs have gained immense popularity. With the development of cloud computing technology and related utilities, it has become easy and efficient to rent bandwidth and storage resources and to manage CDNs over the cloud. The petabytes of user-generated content traffic moving through CDNs show that the content catalogue is enormous, so it is extremely important to identify popular content and cache it on these servers.

The basic building blocks of a CDN are CDN PoPs (Points of Presence), caching servers, and storage (SSD/HDD and RAM).

  • CDN PoPs are strategically placed data centres that are responsible for transmitting data within their geographic region. Their principal aim is to reduce the total round-trip time by moving content closer to the user's location. Each PoP is normally equipped with several caching servers.

  • CDN caching servers deliver the cached files. They shorten a website's load time and reduce overall bandwidth utilization. Each caching server contains RAM and several storage drives.

  • Solid State Drives (SSDs), Hard Disk Drives (HDDs), and RAM are used to store the cached files. More frequently requested files are hosted on faster storage. RAM is the fastest of the three, followed by SSDs and then HDDs.
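The idea that more frequently requested files are hosted on faster storage can be illustrated with a minimal sketch. It is not taken from the paper: the function name `assign_tiers`, the slot counts, and the request log are hypothetical, and the sketch assumes the usual speed ordering in which RAM is faster than an SSD, which in turn is faster than an HDD.

```python
from collections import Counter


def assign_tiers(request_log, ram_slots, ssd_slots):
    """Map each requested file to 'RAM', 'SSD', or 'HDD' by request count.

    Hypothetical helper: the most requested files fill the RAM slots first,
    the next most requested fill the SSD slots, and the rest go to HDD.
    """
    counts = Counter(request_log)
    # Most requested first; ties are broken by file name so the result is deterministic.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    tiers = {}
    for rank, name in enumerate(ranked):
        if rank < ram_slots:
            tiers[name] = "RAM"
        elif rank < ram_slots + ssd_slots:
            tiers[name] = "SSD"
        else:
            tiers[name] = "HDD"
    return tiers


# Made-up request log for one caching server.
log = ["a.mp4", "b.mp4", "a.mp4", "c.mp4", "a.mp4", "b.mp4", "d.mp4"]
tiers = assign_tiers(log, ram_slots=1, ssd_slots=2)
print(tiers)
```

A real caching server would weigh file sizes and eviction policy as well; the sketch ranks by request count only.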

Predicting video popularity is of great importance, as it leads to a better user experience and faster response times. The main focus of this paper is to use machine learning algorithms to predict video popularity in advance so that the resources of CDN caching servers are used efficiently.

Prediction is the process of forecasting an outcome for new data after an algorithm has been trained on a dataset. The machine learning algorithm estimates the most likely value of an unknown variable for each tuple in the dataset. In this paper, video popularity is predicted from features that are available at the time of video upload, and the prediction is re-evaluated continuously to capture dynamic changes in popularity. As mentioned earlier, this is accomplished with a semi-supervised approach. The k-means clustering algorithm was used to assign labels to the dataset (the dataset as obtained did not contain labels), and three algorithms, namely Naive Bayes, K-Nearest Neighbour, and Support Vector Machine, were used to predict popularity after further pre-processing and vectorization of some features. The detailed stepwise process is explained later in the paper.
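The two-stage idea described above (cluster to obtain labels, then train a classifier on those labels) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the data is made up, clustering on the logarithm of the view count is an assumption, the two upload-time features (log of channel subscribers, number of tags) are hypothetical, and only K-Nearest Neighbour is shown as a stand-in for the three classifiers named in the text.

```python
import math


def kmeans_1d(values, k=3, iters=100):
    """Plain Lloyd's k-means on scalars; returns (centroids, cluster index per value)."""
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics (min ... max).
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assign = []
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(sorted(set(votes)), key=votes.count)


# Made-up records: (log10 of channel subscribers, number of tags, view count).
videos = [
    (2.0, 3, 120), (2.3, 4, 300), (2.1, 2, 90),
    (4.0, 8, 15000), (4.2, 9, 22000), (3.9, 7, 11000),
    (6.0, 15, 2500000), (6.2, 14, 4000000), (5.9, 16, 1800000),
]

# Stage 1: cluster on log10(views) and name the clusters by centroid order.
log_views = [math.log10(views) for _, _, views in videos]
centroids, assign = kmeans_1d(log_views, k=3)
order = sorted(range(3), key=lambda c: centroids[c])
names = {order[0]: "low", order[1]: "medium", order[2]: "high"}
labels = [names[a] for a in assign]

# Stage 2: train on upload-time features only, using the cluster-derived labels.
features = [(subs, tags) for subs, tags, _ in videos]
prediction = knn_predict(features, labels, query=(4.1, 8))
print(labels)
print(prediction)
```

The view count is used only to derive labels; the classifier sees just the features that would be known when a video is uploaded, which is what allows a prediction before any views exist.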

The remainder of this paper is organized as follows: Section II reviews related work, Section III explains the methodology and implementation of the proposed approach, Section IV analyses the results, and Section V concludes the work (Al-Abbasi et al., 2019; Wang et al., 2018).
