Effectiveness of Normalization Over Processing of Textual Data Using Hybrid Approach Sentiment Analysis

Sukhnandan Kaur Johal, Rajni Mohana

Source Title: International Journal of Grid and High Performance Computing (IJGHPC) 12(3)

DOI: 10.4018/IJGHPC.2020070103

Abstract

Various natural language processing tasks are carried out to feed into computerized decision support systems. Among these, sentiment analysis is gaining more attention. The majority of sentiment analysis relies on social media content. This web content is highly un-normalized in nature, which hinders the performance of decision support systems. To enhance performance, the data must be processed efficiently. This article proposes a novel method for normalizing web data during the pre-processing phase, with the aim of getting better results for different natural language processing tasks. This research applies the technique to data for sentiment analysis. The performance of different learning models is analysed using precision, recall, F-measure, and fallout for normalized and un-normalized sentiment analysis. Results show that after normalization, some documents shift their polarity, e.g. from negative to positive. Experimental results show that normalized data processing outperforms un-normalized data processing, with better accuracy.

1. Introduction

Natural language processing is a field of computational linguistics and artificial intelligence. It is the key to unlocking various decisions from narrative web content. The automation of decision support systems relies heavily on the performance of natural language processors. Data is available on the web in various forms, such as text, audio, video, or pictures. Due to the arbitrary nature of language, this data is unstructured. Processing unstructured data reduces the efficiency of a decision support system: it may hinder the performance of the sentiment analyzer and thereby affect the decision support system. As shown in Figure 1, data is first collected from various social sites for the automation of decision support systems. The data is then pre-processed to obtain structured content, which includes removing redundant content, cleaning, and normalization. Later, various language processing tasks are carried out. Depending on the requirement, the results of the language processor are filtered for the automation of the decision support system. In this work, the result of the sentiment analyzer (SA) is considered.

Figure 1. Automation of decision support system

The proliferation of web data, primarily as a communication medium, has given rise to unstructured content in the form of posts, blogs, reviews, etc. This web data is a rich indicator of people’s reactions to any entity. The analysis of these reactions is termed sentiment analysis in the field of natural language processing.

Classifying this web data into predefined categories, i.e. positive, negative, or neutral, is the task of the sentiment analyzer. The web content is usually raw data, which the sentiment analyzer takes as input. To reduce performance degradation, it is necessary to pre-process the data efficiently. Given the importance of minimizing human intervention in sentiment analysis and of getting better results, systematized and efficient mechanisms are the need of the hour. Normalization is the basic task for handling performance degradation in various natural language processing tasks. In the past, the term "normalize" meant simply putting content into a well-structured format. These days it has a broader meaning in natural language processing: it includes handling slang, spelling correction, finding missing words, cleaning the text, etc. In this manuscript, the presented system design and algorithm are used to handle unstructured or noisy data for sentiment analysis.
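To make the broader sense of normalization concrete, the following minimal Python sketch applies three of the steps named above: cleaning, slang replacement, and emoticon replacement. It is an illustration only, not the article's algorithm. The `SLANG` and `EMOTICONS` tables, the `normalize` function, and the rule for squeezing elongated words are all assumptions made for this example.

```python
import re

# Hypothetical lookup tables for illustration; the article's own
# slang and emoticon dictionaries are not shown in this preview.
SLANG = {"u": "you", "r": "are", "gr8": "great", "luv": "love", "thx": "thanks"}
EMOTICONS = {":)": "happy", ":(": "sad"}


def normalize(text: str) -> str:
    """Lowercase, clean, and replace slang and emoticons token by token."""
    normalized = []
    for token in text.lower().split():
        # Emoticons are matched before punctuation is stripped,
        # because they consist of punctuation characters.
        if token in EMOTICONS:
            normalized.append(EMOTICONS[token])
            continue
        # Cleaning: drop leading and trailing punctuation.
        word = token.strip(".,!?;:\"'")
        # Cleaning: squeeze runs of three or more identical characters to two.
        word = re.sub(r"(.)\1{2,}", r"\1\1", word)
        if word:
            # Slang replacement falls back to the word itself.
            normalized.append(SLANG.get(word, word))
    return " ".join(normalized)


print(normalize("U r gr8 :)"))
print(normalize("Thx, luv it!!!"))
```

A dictionary lookup of this kind only covers slang that has been listed in advance, which is one reason a corpus-based component can be useful alongside it.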

1.1. Motivation and Contribution

The most important source of texts is undoubtedly the Web. Web content is full of unstructured text and slang. The motivation behind our work is to process semantically correct and methodologically useful content for sentiment analysis. Finding the meaning of, or a replacement for, each slang term is the key concern of the work presented. It is a general methodology that can be embedded into various natural language applications to enhance their performance.

The proposed technique is generic in nature. It can be applied to the pre-processing of any textual data for a language processing task, which helps enhance the performance of the automatic decision support system. The hybrid system for sentiment analysis comprises two modules: corpus-based and dictionary-based. The corpus-based approach is characterized by the maximum likelihood ratio along with point-wise mutual information for normalization. The dictionary-based approach consists of a crossword dictionary for slang and emoticons. The development of the hybrid system stems from the failure of any single technique to achieve a satisfactory level of accuracy in sentiment analysis.
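Point-wise mutual information (PMI) scores how much more often two words co-occur than they would if they were independent: PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below computes document-level PMI from a toy corpus. It shows the general measure only, under assumptions of our own: the `pmi_table` function, the choice of whole documents as the co-occurrence window, and the sample sentences are all hypothetical, and the likelihood-ratio part of the corpus-based module is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return document-level PMI (log base 2) for every co-occurring word pair."""
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        # Each word counts at most once per document.
        words = set(doc.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))

    table = {}
    for pair, joint in pair_counts.items():
        a, b = tuple(pair)
        p_joint = joint / n_docs
        p_a = word_counts[a] / n_docs
        p_b = word_counts[b] / n_docs
        table[pair] = math.log2(p_joint / (p_a * p_b))
    return table


# Toy corpus, invented for illustration.
corpus = ["good movie", "good movie", "bad plot", "good plot"]
scores = pmi_table(corpus)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(sorted(pair), round(score, 3))
```

A positive score means the pair appears together more often than chance would predict. A corpus-based normalizer could plausibly use such scores to judge which candidate replacement for a noisy token best fits its context, although how the article combines PMI with the likelihood ratio is not described in this preview.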

The paper is structured as follows. Section 2 reviews state-of-the-art algorithms for normalization and summarizes the work of various researchers in the same field. It is followed by the design and algorithm of the proposed hybrid method for handling un-normalized data in Sections 3 and 4. The experimental results and evaluation of the system are presented in Sections 5 and 6. Lastly, the conclusion is presented in Section 7.
