
Pattern Recognition for Large-Scale Data Processing

Amir Basirat, Asad I. Khan, Heinz W. Schmidt

Source Title: Big Data: Concepts, Methodologies, Tools, and Applications

DOI: 10.4018/978-1-4666-9840-6.ch043

Abstract

One of the main challenges for large-scale computer clouds dealing with massive real-time data is coping with the rate at which unprocessed data accumulates. Transforming big data into valuable information requires a fundamental rethink of how future data management models are developed on the Internet. Unlike existing relational schemes, pattern-matching approaches can analyze data in ways similar to how the brain links information. Such interactions, when implemented in voluminous data clouds, can assist in finding overarching relations in complex and highly distributed data sets. In this chapter, a different perspective on data recognition is considered. Rather than looking at conventional approaches, such as statistical computations and deterministic learning schemes, this chapter focuses on a distributed processing approach for scalable data recognition and processing.

Introduction

Recent advancements in computing technology and data analysis have made it possible to generate enormous volumes of highly complex data, which call for a paradigm shift in computing architecture and large-scale data processing approaches. Jim Gray, a distinguished database researcher and manager of Microsoft Research's eScience group, called the shift a “fourth paradigm”. The first three paradigms were defined as experimental, theoretical and, more recently, computational science (Hey, Tansley & Tolle, 2009). Gray argued that the only solution to this outgrowth of big data, commonly known as the data deluge, is to develop a new set of computing tools to process and analyze the data flood, because existing computer architectures are becoming increasingly incapable of dealing with data-intensive tasks owing to the constantly growing latency gap between multi-core CPUs and mechanical hard disks (Gray, Bell & Szalay, 2006). With emerging interest in leveraging the massive amounts of data available in open sources such as the Web to solve long-standing information retrieval problems, the question remains how to effectively incorporate and efficiently exploit immense data sets. This question brings to the forefront a crucial need for high levels of scalability in the world of big data, reinforcing Moore’s Law of exponential increases in computing power and solid-state memory (Moore, 2000), in which it is stated that:

The complexity for minimum component costs has increased at a rate of roughly a factor of two per year... Certainly over the short term this rate can be expected to continue, if not to increase (pg. 57).

Although this originally referred to the transistor count within a processor, the effect of the law seems to apply in almost all areas of computing, including data generation and analysis. The implications of Moore's Law are quite profound, as it is one of the few stable rulers we have today; in other words, it is a sort of technological barometer (Malone, 1996):

It very clearly tells you that if you take the information processing power you have today and multiply by two, that will be what your competition will be doing 18 months from now. And that is where you too will have to be (pg. 6).

This outgrowth of big data has significant implications for the development of computing applications. According to Anderson (2011), the chief editor of Wired magazine:

Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age. The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to - well, at petabytes we ran out of organizational analogies (pg. 769).

As human beings, our brains could be viewed as large-scale distributed and interconnected networks of sensory systems and memories. Observing, recognizing and recalling what we have seen account for a significant portion of the activities conducted within these large-scale networks. Provided that an optimal solution is found for the scalability problem, the Internet could provide levels of interconnectivity and complexity that resemble those of the human brain. Harnessing the massive potential embodied within these distributed networks of interconnected high-performance machines may provide recognition and processing capabilities for large-scale and highly complex data.
