Scaling of Streaming Data Using Machine Learning Algorithms


Önder Aykurt, Zeynep Orman
DOI: 10.4018/978-1-6684-6015-3.ch008

Abstract

Today, data is generated continuously by millions of sources that send in records simultaneously, in sizes ranging from small to large. The rapid growth of data in velocity, volume, value, variety, and veracity has presented big challenges for businesses of all types. This type of data is called streaming data. Streaming data comes from a wide variety of sources, such as mobile application notifications, e-commerce purchases, sensors in transportation vehicles, information from social applications, and IoT sensors. This data must be processed sequentially and incrementally, record by record, and is used for a wide variety of analytics, including correlations, filtering, and sampling. Information derived from such analysis gives visibility into many aspects of a business, such as customer activity, website clicks, and the geo-location of devices. There has been great interest in developing systems for processing continuous data streams. This chapter aims to design a scalable system that can instantly analyze such data using machine learning algorithms.

Introduction

The volume of data currently produced by various activities has never been so large, and it is generated at an ever-increasing speed. As technology develops day by day, its place and importance in our lives grow. Developing technology has improved the interaction of many devices with each other and with people, and a large amount of data emerges from this interaction. Data generated in real time is valuable as soon as it arrives and supports decision-making. Such data, which is sequential by nature and arrives in different sizes and at irregular intervals, is defined as streaming data. Streaming data may lose value, or be lost entirely, if it is not processed immediately. Therefore, it is crucial to develop scalable systems that continuously receive and analyze unstructured data. Streaming datasets have different properties from static data. For a streaming algorithm, processing time is more critical than it is for an algorithm that processes static data, because streaming data is valuable as soon as it arrives in the system and must be processed and evaluated quickly. For example, data arriving in a financial application should be evaluated at that very moment for transaction security. A model designed over static data is permanent, whereas a model over streaming data can update itself according to the data it receives. Since the future data size cannot be predicted, the model must adapt to a time-varying data flow. The time available for processing and evaluating streaming data is more limited than for static data, and shortening the evaluation period is critical to the value of the data in the application where streaming data is used.
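The record-by-record model updating described above can be sketched with a simple online learner. The perceptron below is an illustrative, minimal example (not the chapter's actual system): each record is seen once, used to update the model, and then discarded, so the model adapts to a time-varying stream without revisiting old data.

```python
# Minimal sketch of incremental (record-by-record) learning on a stream,
# assuming a simple online perceptron; class and method names are illustrative.

class OnlinePerceptron:
    """Binary classifier updated one record at a time as streaming data arrives."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # weights, adapted as the stream evolves
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate

    def predict(self, x):
        s = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s >= 0 else -1

    def partial_fit(self, x, y):
        # Update only when the current record is misclassified: the model
        # adjusts to the time-varying flow without storing past records.
        if self.predict(x) != y:
            for i, xi in enumerate(x):
                self.w[i] += self.lr * y * xi
            self.b += self.lr * y

# Each (features, label) record arrives, is processed once, and is discarded.
model = OnlinePerceptron(n_features=2)
stream = [([2.0, 1.0], 1), ([-1.5, -0.5], -1), ([1.0, 2.0], 1), ([-2.0, -1.0], -1)]
for x, y in stream:
    model.partial_fit(x, y)
```

The same pattern underlies the `partial_fit` interface of incremental learners in libraries such as scikit-learn, which could replace the hand-written perceptron in a real system.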

The setup and continuity of a system that can receive and process streaming data is essential for resource usage. For a good design that is easy to operate, fixed data-analysis hardware alone is insufficient; additional resources must be acquired on demand as the state of the system changes, which also has financial implications (Nittel, 2015). Rather than storing the data and working on stored records, applications built on streaming data put the inferences drawn from the stream model in the foreground. For such applications, the velocity of the data is vital even when individual records are small. Volume refers to the fact that the total size of the data is unknown and potentially unbounded; scanning large volumes of streaming data across the entire storage or over long time intervals has a negative impact on system performance. Velocity refers to the rate at which data arrives in a given period, and often each record can be processed only once. Since streaming data may change over time, if the algorithm runs more than once, the model needs to be updated. Veracity refers to the reliability of the data and whether it needs review. Streaming data is often heterogeneous, and many different types of data are processed together; the concept of variety describes this feature of streaming data.
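The single-pass, bounded-memory constraint described above (unknown total volume, each record seen once) is classically handled by reservoir sampling. The sketch below is an illustrative example, not part of the chapter's system: it keeps a uniform random sample of fixed size `k` from a stream of unknown length while touching each record exactly once.

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Keep a uniform random sample of k records from a stream of unknown
    total size, in a single pass and with O(k) memory."""
    reservoir = []
    for i, record in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Replace an existing record with probability k / (i + 1),
            # which keeps the sample uniform over everything seen so far.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = record
    return reservoir

# Memory stays at k records even though the stream has a million.
sample = reservoir_sample(range(1_000_000), k=10)
```

Because the algorithm never needs the total stream length in advance, it fits the "volume is unknown" property of streaming data directly.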

Fields such as social media applications, e-commerce, mobile applications, the Internet of Things, operations tracking systems, and advertising are examples of streaming data sources (Kolajo et al., 2019). With the development of e-commerce and web applications, web analytics has gained an increasingly important role. Big data processing tools are used to analyze data such as the number of visitors to a website, the relationships between the products examined, and the user profile, making it possible to collect and process this data in real time. Physically monitored operations tracking systems are one of the main sources of streaming data. Essentially, the metrics that affect the overall performance of individual computer systems are monitored: a large amount of data is processed and recorded, such as the status of disk drives, processor load and performance, network usage, storage unit performance, and access times. Monitoring these systems is important for overall system performance and for identifying potential problems. Advertising is one of the most critical areas where data is produced and evaluated in real time. Metrics such as purchases and ad clicks in different environments, together with real-time bidding systems, offer the opportunity to reach the right customer group at the right time. The data produced for this purpose is collected and processed against metrics determined from the system, and the valuable output is used in new recommendations.
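Operations-tracking metrics such as processor load are typically evaluated over a sliding window of recent readings rather than over the whole history. The sketch below is a hypothetical, minimal monitor (names are illustrative): it maintains a rolling average of the last few readings in constant memory, which is how a potential problem (e.g., sustained high load) can be flagged from a stream.

```python
from collections import deque

class SlidingWindowMonitor:
    """Rolling average of a monitored metric (e.g. processor load) over the
    last `size` readings; old readings drop out automatically."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # deque evicts the oldest when full
        self.total = 0.0

    def add(self, value):
        if len(self.window) == self.window.maxlen:
            # Subtract the reading that maxlen is about to evict.
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value

    def average(self):
        return self.total / len(self.window) if self.window else 0.0

# Readings arrive one at a time; only the last 3 are kept.
mon = SlidingWindowMonitor(size=3)
for load in [0.2, 0.4, 0.6, 0.8]:
    mon.add(load)
# The window now holds the three most recent readings: 0.4, 0.6, 0.8.
```

A real monitoring pipeline would attach an alert threshold to `average()` or track several such windows per metric; the constant-memory window is the key idea.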
