
A Hybridized GA-Based Feature Selection for Text Sentiment Analysis

Gyananjaya Tripathy, Aakanksha Sharaff

Source Title: Encyclopedia of Data Science and Machine Learning

DOI: 10.4018/978-1-7998-9220-5.ch112


Abstract

Recent research has described the effectiveness of various sentiment classification techniques, ranging from simple lexicon-based methods to more complex machine learning techniques. The authors develop an integrated framework that bridges the gap between dictionary-based methods and machine learning methods to achieve better accuracy and more flexibility. To solve the scalability problem that arises as the feature set grows, a hybrid genetic algorithm (GA)-based dimensionality reduction method is proposed. With this novel approach, the authors reduce the size of the feature set while reaching a remarkable level of accuracy. The proposed feature reduction method is compared with two widely used feature reduction algorithms, one based on principal component analysis and one on singular value decomposition. In addition, the proposed sentiment analysis model is evaluated on other metrics, including precision, recall, F1 score, and feature size.

Introduction

Advances in internet technology have changed how society lives, and the current generation has adapted its lifestyle accordingly. Social forums are commonly used to share helpful information and new ideas for advertising and service improvement. Social platforms are monitored from various perspectives. These include compiling business marketing strategies for products and promotional services, observing harmful actions to detect and reduce cyber-attacks, and sentiment analysis to analyze human responses and feedback (Saberi & Saad, 2017). Sentiment analysis, often referred to as opinion mining, extracts and classifies sentiments from text using Natural Language Processing (NLP), mathematical, or Machine Learning (ML) methods. ML methods use various approaches and a dataset on which a model can be trained to distinguish and identify sentiments (Fiok et al., 2021). Researchers have studied sentiment analysis widely over the past few years, and many methods have been developed and tested. The most common approach is ML, which requires a robust dataset to train on and learn the relationship between various aspects and sentiments.

Sentiment analysis assesses written or spoken language to determine whether the expression is negative, positive, or neutral, and to what extent. Current market analysis tools can handle large volumes of customer feedback reliably and accurately. Collectively, sentiment analysis captures customers' opinions on various topics, including procurement, the provision of services, or the presentation of promotions (Alsaeedi & Khan, 2019). Sentiment analysis is often applied to reviews. Reviews can be taken from various sources and for various purposes, such as product reviews, political reviews, and community reviews. When feedback is collected from customers about a product, questions such as the following arise: Is the product usable? Is this product satisfactory? Is this product worth the money? Useful information can always be drawn from positive or negative feedback (Birjali et al., 2021). Sentiments need to be learned from these practical answers. Semantic orientation estimates the subjectivity and opinions expressed in text data. Rule-based analysis searches for particular words in a text and categorizes them as positive or negative.
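The rule-based approach described above can be illustrated with a minimal sketch. The two word lists below are hypothetical and far smaller than any real sentiment lexicon, and the scoring rule (positive hits minus negative hits) is an assumption chosen for simplicity, not the method used in the chapter.

```python
# Hypothetical mini-lexicon for illustration only; real lexicons are much larger.
POSITIVE = {"good", "great", "excellent", "satisfactory", "usable", "worth"}
NEGATIVE = {"bad", "poor", "terrible", "broken", "useless", "waste"}


def rule_based_sentiment(text: str) -> str:
    """Label text by counting lexicon hits: positive hits minus negative hits."""
    # Lowercase each token and strip surrounding punctuation before lookup.
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(rule_based_sentiment("This product is great and worth the money!"))
```

A lexicon of this kind needs no training data, but it cannot learn new vocabulary or context, which is the gap the machine learning side of a hybrid framework is meant to cover.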

This chapter proposes a hybrid sentiment analysis process based on an Amazon review dataset. The dataset contains several responses and is evenly split between positive and negative labels. The authors have developed an integrated novel algorithm based on the Genetic Algorithm (GA) to minimize the feature set (Iqbal et al., 2019). Iqbal et al. (2019) explained a feature selection method using GA that evaluates the fitness value with a sentiment score, whereas in the proposed model the fitness of each solution is evaluated using the accuracy score of each feature subset. A Support Vector Machine (SVM) (Preeti et al., 2020) is used to check the validity of the words with respect to the label to find an effective solution. This evolutionary process of selecting the right features improves accuracy while increasing scalability. The customized method yields a 45% reduced feature set with better accuracy. To demonstrate the feasibility of the proposed method, the authors conducted a detailed comparison with other reduction strategies, namely Principal Component Analysis (PCA) and Singular Value Decomposition (SVD). Against these two baselines, the proposed model provides up to 14.5% higher accuracy than PCA and 16.2% higher accuracy than SVD when the Naïve Bayes learner is combined with these feature reduction strategies. In terms of the number of features across all three feature reduction strategies, the proposed method gives a 13% better result than PCA and a 10% better result than SVD. With a smaller feature set, the proposed system outperforms the other two algorithms.
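The idea of scoring a feature subset by the accuracy it yields can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `subset_accuracy`, the toy term-count data, and the nearest-centroid classifier are all hypothetical stand-ins. The chapter reports using SVM and Naïve Bayes; a simpler classifier is substituted here only to keep the example dependency-free.

```python
from typing import List, Sequence


def centroid(rows: List[List[float]]) -> List[float]:
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def subset_accuracy(mask: Sequence[int], X_train, y_train, X_test, y_test) -> float:
    """Fitness of one feature subset: test accuracy using only the masked-in features."""
    keep = [i for i, bit in enumerate(mask) if bit]
    if not keep:
        return 0.0  # an empty subset cannot classify anything

    def project(row):
        return [row[i] for i in keep]

    # Stand-in classifier (assumption): nearest class centroid, not the chapter's SVM.
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [project(x) for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = centroid(rows)

    def predict(row):
        p = project(row)
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

    correct = sum(predict(x) == y for x, y in zip(X_test, y_test))
    return correct / len(y_test)


# Hypothetical term counts for the features ["great", "bad", "the"]; label 1 = positive.
X_train = [[2, 0, 3], [1, 0, 1], [0, 2, 3], [0, 1, 1]]
y_train = [1, 1, 0, 0]
X_test = [[1, 0, 2], [0, 1, 2]]
y_test = [1, 0]

# Compare a subset of sentiment-bearing words against a subset with only a stop word.
print(subset_accuracy([1, 1, 0], X_train, y_train, X_test, y_test))
print(subset_accuracy([0, 0, 1], X_train, y_train, X_test, y_test))
```

In this toy data the uninformative feature "the" cannot separate the classes, so a search driven by this fitness would tend to favor masks that drop it.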

The main contributions of the proposed work are as follows:

Key Terms in this Chapter

Chromosome: A set of parameters that represents a candidate solution to the problem the Genetic Algorithm is trying to solve.

Mutation: A genetic operator that alters one or more genes in a chromosome from their original state. As a result, the solution may change completely from the previous one.

Classification: The technique used to separate categorical values on the basis of their positivity and negativity.

Crossover: A genetic operator used to vary chromosomes from one generation to the next. By doing so, high-quality offspring can be obtained.

Feature Optimization: A technique for dimensionality reduction. As an optimized form of feature reduction, it selects only the features that have the most impact on the target variable.

Fitness Calculation: The core part of the algorithm. It is an objective function used to find the optimal solution, and its value is used to select new parents for mating.

Population: A set of candidate solutions (chromosomes) that converges toward the best solution to the problem over a number of iterations.
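The terms above can be tied together in a short sketch of a generic genetic algorithm over bit-mask chromosomes (1 = keep a feature, 0 = drop it). Everything here is an illustrative assumption rather than the chapter's algorithm: the function names, single-point crossover, bit-flip mutation, truncation selection, the default parameters, and the toy fitness that simply counts agreement with a hypothetical target mask.

```python
import random
from typing import Callable, List

Chromosome = List[int]  # one gene per feature: 1 = keep, 0 = drop


def init_population(size: int, n_features: int, rng: random.Random) -> List[Chromosome]:
    """Population: a set of randomly initialized candidate solutions."""
    return [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(size)]


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    """Single-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(c: Chromosome, rate: float, rng: random.Random) -> Chromosome:
    """Bit-flip mutation: each gene flips independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in c]


def evolve(fitness: Callable[[Chromosome], float], n_features: int,
           pop_size: int = 20, generations: int = 30,
           rate: float = 0.05, seed: int = 0) -> Chromosome:
    """Run the GA and return the fittest chromosome of the final population."""
    rng = random.Random(seed)
    population = init_population(pop_size, n_features, rng)
    for _ in range(generations):
        # Fitness calculation ranks the population; the top half become parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children  # parents survive, so the best is never lost
    return max(population, key=fitness)


# Toy fitness (assumption): number of genes that agree with a hypothetical target mask.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_fitness(c: Chromosome) -> int:
    return sum(g == t for g, t in zip(c, TARGET))


best = evolve(toy_fitness, n_features=len(TARGET))
print(best, toy_fitness(best))
```

For feature selection, the toy fitness would be replaced by a function that trains a classifier on the masked-in features and returns its accuracy.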
