Named Entity Recognition for Code Mixed Social Media Sentences

Yashvardhan Sharma, Rupal Bhargava, Bapiraju Vamsi Tadikonda

Source Title: International Journal of Software Science and Computational Intelligence (IJSSCI) 13(2)

DOI: 10.4018/IJSSCI.2021040102

Abstract

With the growth of internet applications and social media platforms, informal text communication has increased. People from different regions tend to mix their regional language with English in social media text. This trend is now seen in many multilingual nations and is commonly known as code mixing: multiple languages are used within a single statement. Named entity recognition (NER) is a well-researched problem in natural language processing (NLP), but present NER systems tend to perform poorly on code-mixed text. This paper proposes three approaches to improve named entity recognizers for handling code-mixing. The first approach is based on machine learning techniques such as support vector machines and tree-based classifiers. The second approach is based on neural networks, and the third approach uses a long short-term memory (LSTM) architecture.

Introduction

Named Entities are names of well-known places, persons, artifacts, etc. For example, in the sentence "I love staying in Manhattan", the word "Manhattan" is a Named Entity (NE) because it is the name of a well-known location. Identifying the Named Entities in a sentence is the task of a Named Entity recognizer. Named Entity Recognition is important in applications such as sentiment analysis, where it helps remove words with no sentiment attached, and in Document Retrieval, where Named Entities help narrow down the results. For English, many systems have been designed to tackle the problem, prominent ones being the Stanford NER tagger (Finkel, Grenager & Manning, 2005) and the Illinois NER tagger (Ratinov & Roth, 2009). Named Entity Recognition is also widely used in Question Answering systems, Coreference Resolution, Query Labeling (Bhargava, Y. Sharma, S. Sharma & Baid, 2015), Sentiment Analysis (Bhargava, Y. Sharma & S. Sharma, 2016), Question Classification (Bhargava, Khandelwal, Bhatia & Sharma, 2016), etc.
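The Manhattan example can be made concrete with a small sketch. It assumes a BIO tagging scheme (B- begins an entity, I- continues it, O marks a non-entity token), which is a common convention for NER output but is not stated in this preview; the function name `extract_entities` and the tag label `LOCATION` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, entity type) pairs from BIO-tagged tokens.

    Assumes the BIO convention; an I- tag that does not continue an open
    entity of the same type is treated like O and ignored.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close any entity that is still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # O tag (or a stray I- tag): close any open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hand-assigned tags for the sentence used in the text.
tokens = ["I", "love", "staying", "in", "Manhattan"]
tags = ["O", "O", "O", "O", "B-LOCATION"]
print(extract_entities(tokens, tags))  # [('Manhattan', 'LOCATION')]
```

A recognizer's job is to predict the `tags` list; the sketch only shows how such a prediction maps back to entity spans.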

With the rapid growth in the use of online networking websites such as Facebook, Twitter and Instagram, written sentences have tended to become more informal. The informal sentences found on social media differ in characteristic ways from region to region. In countries where English is the major language, an informal sentence generally contains shortcuts for specific words, acronyms, emoticons and hashtags. In countries where English is not the major language, a further problem arises in addition to these tokens: words from the native language are used in a sentence alongside English words. This tendency is called code-mixing. Code mixing lets users express their opinions or feelings without the boundaries of a single language, and it is generally observed when the writer is more comfortable expressing certain phrases of a sentence in their native language. For example, consider the sentence "You have that DJ wala look". Here the word "wala" is Hindi, while the other words are English. Generally, the syntax and grammar of code-mixed sentences differ from those of a native English statement. When the language becomes informal, existing linguistic tools become less reliable because the grammar rules change. This causes monolingual NER taggers to perform poorly on social media texts.

Multilingual code mixing, in which a sentence contains more than two languages, can also occur. For example: "You have that DJ wala look kani, konchem hairstyle change chesi unte thoda acha hoga". This sentence mixes three languages, namely English, Telugu and Hindi. To deal with a bilingual code-mixed sentence, one needs to consider the grammar patterns of both languages along with the hashtags, emoticons and acronyms. With multilingual code-mixed sentences, one needs to deal with the patterns observed in all of the languages along with these informal tokens.
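To illustrate what a multilingual code-mixed sentence looks like at the token level, the sketch below labels each token of the example with a language code and counts the languages present. It is only an illustration, not the method proposed in the paper: `TOY_LEXICON`, `tag_languages` and the per-word language assignments are assumptions made by hand for this one sentence, and any token missing from the lexicon is simply assumed to be English.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical hand-built lexicon covering only the example sentence.
# "hi" = Hindi, "te" = Telugu; the assignments are illustrative assumptions.
TOY_LEXICON: Dict[str, str] = {
    "wala": "hi", "thoda": "hi", "acha": "hi", "hoga": "hi",
    "kani": "te", "konchem": "te", "chesi": "te", "unte": "te",
}


def tag_languages(sentence: str, default: str = "en") -> List[Tuple[str, str]]:
    """Split on whitespace, strip simple punctuation, and look up each token.

    Tokens not found in the toy lexicon fall back to `default` (English).
    """
    tokens = [t.strip(",.!?").lower() for t in sentence.split()]
    return [(t, TOY_LEXICON.get(t, default)) for t in tokens if t]


sentence = (
    "You have that DJ wala look kani, konchem hairstyle "
    "change chesi unte thoda acha hoga"
)
tagged = tag_languages(sentence)
counts = Counter(lang for _, lang in tagged)

print(tagged)
print(dict(counts))     # language code -> number of tokens
print(len(counts) > 2)  # True when more than two languages are present
```

A real system cannot rely on a fixed lexicon, since romanized spellings of native-language words vary widely in social media text; the sketch only shows the kind of per-token language information a code-mixed tagger has to cope with.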

Analysing a user's social media texts can reveal many things, such as the person's state of mind or their opinion on a certain issue, event or product. For example, consider a hypothetical ad campaign for a well-known brand on social media platforms. To analyse the response to the campaign, the company needs tools that can analyse the text written on those platforms. At present, there are tools for standard English and other monolingual texts, and there are tools for handling monolingual informal sentences in some of the most widely spoken languages. Unfortunately, few tools exist for handling multilingual or code-mixed sentences.

In this paper, the problem of Named Entity Recognition is tackled for Hindi-English code-mixed text. The rest of the paper is organized as follows. First, the challenges specific to social media and code-mixed texts are identified. Second, related work from the previous few years is reviewed as background. The proposed methodology section then presents three approaches to NER. Next, the data set provided by the CMEE-IL 2016 Task Organizers (Rao & Devi, 2016) is analysed using five major categorical tags. Lastly, the experiments conducted, the results obtained and the error analysis of the proposed method are presented, followed by the conclusion and future scope.
