Hybrid Neural Architecture for Intelligent Recommender System Classification Unit Design

Emmanuel Buabin
DOI: 10.4018/978-1-4666-2542-6.ch010

Abstract

The objective is the design of an intelligent recommender system classification unit using hybrid neural techniques. In particular, a neuroscience-based hybrid neural model by Buabin (2011a) is introduced, explained, and examined for its potential in real-world text document classification on the ModApte version of the Reuters news text corpus. The described neuroscience model (termed Hy-RNC) is fully integrated with a novel boosting algorithm to improve text document classification. Hy-RNC outperforms existing works and opens up a new research direction in machine learning. The main contribution of this book chapter is a step-by-step approach to modeling the hybrid system from its underlying concepts: boosting algorithms, recurrent neural networks, and hybrid neural systems. Results attained in the experiments show impressive performance by the hybrid neural classifier, even with a minimal number of neurons in its constituent structures.
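To make the wiring of such a system concrete, the sketch below shows a generic AdaBoost-style boosting loop in Python with an interchangeable weak learner. It is an illustrative assumption, not Buabin's Hy-RNC: the weighted decision stump merely stands in for the recurrent neural classifier that the chapter integrates with the boosting algorithm, and all names (train_stump, adaboost, etc.) are hypothetical.

    import numpy as np

    def train_stump(X, y, w):
        """Weighted decision stump; a stand-in for a small recurrent network."""
        best = None
        for j in range(X.shape[1]):
            for thresh in np.unique(X[:, j]):
                for sign in (1.0, -1.0):
                    pred = np.where(X[:, j] > thresh, sign, -sign)
                    err = np.sum(w * (pred != y))
                    if best is None or err < best[0]:
                        best = (err, (j, thresh, sign))
        return best[1], best[0]

    def stump_predict(model, X):
        j, thresh, sign = model
        return np.where(X[:, j] > thresh, sign, -sign)

    def adaboost(X, y, rounds=10):
        """Binary AdaBoost; labels y must be coded as -1/+1."""
        n = len(y)
        w = np.full(n, 1.0 / n)                  # start with uniform sample weights
        models, alphas = [], []
        for _ in range(rounds):
            model, err = train_stump(X, y, w)    # fit weak learner on weighted data
            err = max(err, 1e-10)                # guard against division by zero
            alpha = 0.5 * np.log((1.0 - err) / err)
            pred = stump_predict(model, X)
            w *= np.exp(-alpha * y * pred)       # up-weight misclassified documents
            w /= w.sum()
            models.append(model)
            alphas.append(alpha)
        return models, alphas

    def boosted_predict(models, alphas, X):
        score = sum(a * stump_predict(m, X) for m, a in zip(models, alphas))
        return np.sign(score)

Replacing the stump routines with a small recurrent network trained on the weighted sample would give the boosted recurrent-classifier pattern that the chapter develops in detail.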

Introduction

From point-of-sale systems, through telecommunications data warehousing systems, to space observation systems, computers have given us a fair idea of the amount of data humans generate each day. Bramer (2008) indicates that:

  • 1.

    The current NASA Earth observation satellites generate a terabyte (i.e. 10⁹ bytes) of data each day. The amount of data generated is far greater than the total amount of data ever transmitted by all previous observation satellites,

  • 2.

    The Human Genome Project stores thousands of bytes for each of several billion genetic bases, and

  • 3.

    As far back as 1990, the US Census collected over a million bytes of data.

The above-mentioned examples suggest an explosion of structured and unstructured data across computer systems in all areas of human endeavour. Researchers estimate that close to eighty percent (80%) of an institution’s information lies in the form of unstructured data: emails, text files, graphs, etc. This means that, to measure business performance over time, business entities need to take unstructured data into consideration in order to arrive at meaningful and sound management decisions.

The introduction of the World Wide Web has also increased the volume of unstructured data stored by businesses and individuals. With every computer-literate person a potential author on the Web, new domains have been registered and corresponding website information added or updated to increase readership and viewership. News giants such as CNN and BBC have taken great advantage of the Internet to disseminate news to a wider reading and viewing public, transmitting “breaking news” stories, for example, within minutes for the public to access. Governments, public and private institutions, and individuals have also joined in the massive dissemination of huge amounts of unstructured data by publishing pages about business, jobs, goods, and services. Users have often resorted to text-based publications mainly because

  • 1.

    They are easy to deploy,

  • 2.

    They have a low runtime impact on servers and therefore load faster,

  • 3.

    They use fewer hardware resources such as server disk space, and

  • 4.

    Much more information can be stored and published compared to video, images, etc.

For this reason, the Internet has become one of the largest repositories of unstructured data in the world. On the one hand (i.e. human comprehension), Internet data has overwhelmed humans and become more of a “curse” than a “blessing.” On the other hand, the same repository (i.e. the Internet) has become a “blessing” because of the wealth of information it contains.

Many data mining systems have been built over the past few decades. To be specific, multi-class and single-class text-based classifiers have been built to ascertain their potential on real-world news text corpora (Billsus & Pazzani, 1999b; Kroha & Baeza-Yates, 2005; Shaikh, et al., 2006; Garfield & Wermter, 2006, 2003a, 2003b; Wermter, et al., 2000b, 1999; Arevian, 2007; Arevian & Pachev, 2007). Text understanders (Hahn & Schnattinger, 1998), text categorizers (Taeho, 2009; Saha, 2010), text summarizers (Evans, et al., 2004; McKeown, 2002; Takeda & Takasu, 2007; Radev, 2001; Chang-Shing, et al., 2005; McKeown, et al., 1995; Mingrong, et al., 2008; Wasson, 1998; Xindong, et al., 2010), Web classifiers (Jingwen, et al., 2009; Meijuan, et al., 2009; Meijuan & Jingwen, 2009b), and recommender systems (Chiang, et al., 2004; Wei, et al., 2009; Tintarev & Masthoff, 2006; Felden & Chamoni, 2007; Aciar, et al., 2006; Zhang, 2008; Zhang & Jiao, 2007) have also been built to support text-based research, e.g. in the news domain. The rationale behind the use of news text corpora stems from their inherent non-stationary features and their ability to present challenges similar to those of the Internet. Therefore, a text-based classifier that is successful on news text corpora should be equally competitive on non-stationary platforms such as Really Simple Syndication (RSS) feeds.
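Because the chapter's experiments use the ModApte version of the Reuters news text corpus, a brief, hedged illustration of how that split can be obtained is given below. The chapter itself does not prescribe any particular tooling; NLTK is simply one convenient source whose bundled Reuters-21578 corpus already encodes the ModApte training/test partition in its file identifiers.

    import nltk
    from nltk.corpus import reuters

    nltk.download("reuters", quiet=True)  # fetch the corpus on first use

    # File ids such as "training/9865" and "test/14826" encode the ModApte split.
    train_ids = [f for f in reuters.fileids() if f.startswith("training/")]
    test_ids = [f for f in reuters.fileids() if f.startswith("test/")]

    train_texts = [reuters.raw(f) for f in train_ids]           # raw news stories
    train_topics = [reuters.categories(f) for f in train_ids]   # multi-label topic sets

    print(len(train_ids), "training and", len(test_ids), "test documents,",
          len(reuters.categories()), "topic categories")

Any equivalent loader for Reuters-21578 would serve the same purpose; only the training/test partition matters for comparability with prior work.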
