Experimenting Language Identification for Sentiment Analysis of English Punjabi Code Mixed Social Media Text

Neetika Bansal, Vishal Goyal, Simpel Rani
Copyright: © 2020 | Volume: 12 | Issue: 1 | Pages: 11
ISSN: 1937-9633 | EISSN: 1937-9641 | EISBN13: 9781799805656 | DOI: 10.4018/IJEA.2020010105
Cite Article

MLA

Bansal, Neetika, et al. "Experimenting Language Identification for Sentiment Analysis of English Punjabi Code Mixed Social Media Text." IJEA vol.12, no.1 2020: pp.52-62. http://doi.org/10.4018/IJEA.2020010105

APA

Bansal, N., Goyal, V., & Rani, S. (2020). Experimenting Language Identification for Sentiment Analysis of English Punjabi Code Mixed Social Media Text. International Journal of E-Adoption (IJEA), 12(1), 52-62. http://doi.org/10.4018/IJEA.2020010105

Chicago

Bansal, Neetika, Vishal Goyal, and Simpel Rani. "Experimenting Language Identification for Sentiment Analysis of English Punjabi Code Mixed Social Media Text," International Journal of E-Adoption (IJEA) 12, no.1: 52-62. http://doi.org/10.4018/IJEA.2020010105

Abstract

People do not always write in Unicode; rather, they mix multiple languages. Processing code-mixed data is challenging because of its linguistic complexity, and noisy text makes language identification harder still. The dataset used in this article contains Facebook and Twitter messages collected through the Facebook Graph API and the Twitter API. Models are trained on the annotated English-Punjabi code-mixed dataset using a pipeline that combines a Dictionary Vectorizer with an N-gram approach and some additional features. Logistic Regression, Decision Tree, and Gaussian Naïve Bayes classifiers are used to perform language identification at the word level. The results show that Logistic Regression performs best, with an accuracy of 86.63% and an F1 measure of 0.88. The success of machine learning approaches depends on the quality of labeled corpora.
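
The abstract does not list the exact features used, so the following is only a minimal sketch of how word-level features for language identification might be built as dictionaries with character n-grams. The function names (`char_ngrams`, `word_features`), the feature keys, the boundary markers, and the sample tokens are all hypothetical and are not taken from the article. Dictionaries of this shape are the kind of input that scikit-learn's `DictVectorizer` accepts before a classifier such as Logistic Regression is fitted.

```python
def char_ngrams(word, n_min=2, n_max=3):
    """Return character n-grams of a word, with ^ and $ marking its boundaries.

    The boundary markers and the default n-gram range are illustrative
    assumptions, not settings reported in the article.
    """
    padded = f"^{word.lower()}$"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def word_features(tokens, i):
    """Build a feature dictionary for the token at position i.

    Hypothetical feature set: the lowercased word, simple surface flags,
    neighbouring words as context, and character n-gram counts.
    """
    word = tokens[i]
    feats = {
        "lower": word.lower(),
        "length": len(word),
        "is_digit": int(word.isdigit()),
        "has_non_alnum": int(any(not ch.isalnum() for ch in word)),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }
    # Count each character n-gram; repeated n-grams accumulate.
    for gram in char_ngrams(word):
        key = f"ng={gram}"
        feats[key] = feats.get(key, 0) + 1
    return feats


if __name__ == "__main__":
    # Made-up romanized Punjabi/English sentence, for illustration only.
    sentence = ["main", "kal", "movie", "dekhi"]
    for idx in range(len(sentence)):
        print(sentence[idx], word_features(sentence, idx))
```

Each token yields one dictionary, so a sentence becomes a list of dictionaries paired with one language label per word. Keeping the features as named keys makes it easy to add or drop a feature without touching the vectorizing or classification steps.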
