Exploring Video Sharing Websites Content with Machine Learning

Nan Zhao, Löic Baud, Patrick Bellot

Source Title: International Journal of Distributed Systems and Technologies (IJDST) 5(4)

DOI: 10.4018/ijdst.2014100103

Abstract

This article studies the characteristics of content on video sharing websites. A better understanding of online video content can help to analyse Internet users' behaviour and improve video sharing services. We improved an existing graph-sampling algorithm so that it is better adapted to sampling video sharing websites. We define a new category system that can be applied to many different video sharing websites for content analysis. We also use machine learning to re-classify content according to the newly defined category system. The classification success rate reaches about 90%. From the analysis of the classified content, we find that the content category distribution is not constant over time and that cultural goods content now accounts for about 70% of all sampled videos.

1. Introduction

Video sharing is a type of web service that allows people to upload, share, distribute or store video content on the Internet. The content can range from a short clip to a full film. The service normally generates an embed code for each uploaded video, which lets users share their video content in many ways, such as by mail, on a blog or on a social network. In the last decade, video sharing has become one of the most active web services and has brought a large increase in traffic volume over the Internet, according to the ipoque study (Schulze & Mochalski, 2009). As ISPs increase bandwidth, Internet users get better online video performance. Hence, rather than downloading video content, Internet users prefer to watch it immediately on video sharing websites. Meanwhile, video sharing services also provide a large amount of storage for video clips, free of charge or for a very low fee.

In recent years, therefore, video sharing services have drawn a lot of interest from Internet researchers. Several studies have examined particular video sharing websites, covering traffic characteristics (Gill et al., 2007) and other properties (Cha et al.; Halvey & Keane, 2007; Cheng et al., 2008; Kaiser, 2012; Mitra et al., 2011). These first studies are very important because they give the first insights into Internet traffic exchange and into the consumption of video sharing services by Internet users. However, the sampling algorithms used in those studies can be biased towards popular videos, and as a consequence their results may also be biased. What is more, there are few in-depth studies of the content types and the distribution of shared video content on those websites. In this paper, we therefore present our recent study of video sharing websites. Our study mainly concerns video content characteristics and is based on a video-sampling algorithm different from those used in the existing studies. We try to find out what kinds of videos are uploaded to video sharing websites, how those videos are consumed by Internet users, and how video duration and view counts are distributed. The results could be helpful for managing content resources on video sharing websites and for improving video sharing services.

Our study consists of two parts. The first part was carried out from January to May 2013. In this part, we collected videos from two video sharing websites, YouTube and DailyMotion, to analyse the video content distribution and other video characteristics on the two websites. The second part was carried out from March to May 2014. In this part, we applied machine learning to newly collected videos from YouTube. We aim to find an efficient machine learning classification algorithm for classifying large quantities of videos on video sharing websites. The highlights of our work can be summarized as follows:

• We use a graph-sampling algorithm based on Random Walk and on the suggested videos supplied by the video sharing websites, which to some extent reduces the bias caused by popular videos and by correlations between videos.
• We then define a new category system whose categories are sufficiently uncorrelated and independent, and which can replace the default category system provided by the video sharing websites. This new system can easily be adopted by video sharing websites. With this category system, it becomes easier to compare content category distribution and popularity among different video sharing websites, which helps in understanding users' behaviour on video sharing.
• We use the Weka software for machine learning and choose the J48 algorithm to build decision trees. With ten decision trees, each corresponding to a different category, we succeed in classifying 91.60% of the sampled videos. The average classification accuracy of the ten decision trees is about 95%.
• Finally, comparing the results from the two parts of the work, we find that the proportions of content categories changed considerably over the one-year period. There are more cultural goods on YouTube in 2014 than in 2013. However, the distribution of content popularity did not change much. The popularity of most content increased, especially Series content, which became the second most popular category among all content.
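
To make the first highlight more concrete, the following is a minimal sketch of random-walk sampling over a graph of suggested videos. It is not the sampling algorithm of this article, whose details are not given in this section. The function name `random_walk_sample`, the `thin` parameter (keeping only every few visited videos to weaken the correlation between consecutive samples), the restart at dead ends, and the toy graph are all assumptions made for illustration.

```python
import random


def random_walk_sample(suggestions, seed_video, n_samples, thin=5, rng=None):
    """Collect n_samples video ids by walking a suggested-video graph.

    suggestions: dict mapping a video id to the list of its suggested video ids
    seed_video:  id of the video where the walk starts
    thin:        keep only every thin-th visited video (assumed decorrelation step)
    """
    rng = rng or random.Random()
    current = seed_video
    samples = []
    steps = 0
    while len(samples) < n_samples:
        neighbours = suggestions.get(current, [])
        if neighbours:
            # Follow one suggested video chosen uniformly at random.
            current = rng.choice(neighbours)
        else:
            # Dead end: restart from the seed (an assumption of this sketch).
            current = seed_video
        steps += 1
        if steps % thin == 0:
            samples.append(current)
    return samples


# Hypothetical toy graph; a real one would be built from each site's suggestions.
toy_graph = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}
sample = random_walk_sample(toy_graph, "a", n_samples=10, thin=3, rng=random.Random(0))
print(sample)
```

A plain random walk like this one still visits highly connected videos more often, so a practical sampler would need an additional correction step to reduce the bias towards popular videos.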
