Reference Hub

This research has been cited in:

Conference: Online User Reviews Investigation Towards Madura Island Tourism using Latent Semantic Analysis. 2022 IEEE 8th Information Technology International Seminar (ITIS). DOI: 10.1109/ITIS57155.2022.10010297
Article: Computational linguistics based text emotion analysis using enhanced beetle antenna search with deep learning during COVID-19 pandemic. PeerJ Computer Science. DOI: 10.7717/peerj-cs.1714
Article: Optimized machine learning model discourse analysis. Education and Information Technologies. DOI: 10.1007/s10639-024-12515-3
Conference: Impact of Preprocessing on Twitter Based Covid-19 Vaccination Text Data by Classification Techniques. 2022 International Conference on Applied Artificial Intelligence and Computing (ICAAIC). DOI: 10.1109/ICAAIC53929.2022.9792768

Text Mining and Pre-Processing Methods for Social Media Data Extraction and Processing

Santoshi Kumari

Source Title: Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media

ISBN13: 9781799895947|ISBN10: 1799895947|ISBN13 Softcover: 9781799895954|EISBN13: 9781799895961

DOI: 10.4018/978-1-7998-9594-7.ch002

Cite Chapter

MLA

Kumari, Santoshi. "Text Mining and Pre-Processing Methods for Social Media Data Extraction and Processing." Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media, edited by Pantea Keikhosrokiani and Moussa Pourya Asl, IGI Global, 2022, pp. 22-53. https://doi.org/10.4018/978-1-7998-9594-7.ch002

APA

Kumari, S. (2022). Text Mining and Pre-Processing Methods for Social Media Data Extraction and Processing. In P. Keikhosrokiani & M. Pourya Asl (Eds.), Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media (pp. 22-53). IGI Global. https://doi.org/10.4018/978-1-7998-9594-7.ch002

Chicago

Kumari, Santoshi. "Text Mining and Pre-Processing Methods for Social Media Data Extraction and Processing." In Handbook of Research on Opinion Mining and Text Analytics on Literary Works and Social Media, edited by Pantea Keikhosrokiani and Moussa Pourya Asl, 22-53. Hershey, PA: IGI Global, 2022. https://doi.org/10.4018/978-1-7998-9594-7.ch002

Abstract

A huge amount of unstructured data is generated on social media platforms such as Twitter. The volume of tweets and the velocity with which they are generated on various topics present extensive challenges for data analytics and processing techniques. The linguistic flexibility with which tweets are written presents further challenges for preprocessing and natural language processing (NLP) tasks. To address these challenges, this chapter aims to select, modify, and apply information retrieval and preprocessing steps for retrieving, storing, organizing, and cleaning real-time, large-scale unstructured Twitter data. The work focuses on reviewing previous research and applying suitable preprocessing methods to improve data quality by removing unessential data. It is also observed that the Twitter APIs, together with access tokens, provide easy access to real-time tweets. Preprocessing methods are fundamental steps in text analytics and NLP tasks for processing unstructured data. Suitable preprocessing methods such as tokenization, stop-word removal, stemming, and lemmatization are analyzed and applied to normalize the extracted Twitter data.
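
The preprocessing steps named in the abstract (tokenization, stop-word removal, stemming, and lemmatization) can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the stop-word list, the lemma lookup table, the suffix rules, and all function names are assumptions chosen for illustration, and the stemmer is a toy suffix stripper rather than an established algorithm such as Porter's.

```python
import re

# Illustrative stop-word list (an assumption; real pipelines use larger lists).
# "rt" is included here as a Twitter-specific retweet marker.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on",
              "for", "with", "this", "that", "it", "rt"}

# Tiny lookup table standing in for a dictionary-based lemmatizer
# (hypothetical entries, for illustration only).
LEMMAS = {"was": "be", "were": "be", "better": "good", "mice": "mouse"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean(tweet: str) -> str:
    """Lowercase the tweet and drop URLs, @mentions, and the '#' of hashtags."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return text.replace("#", " ")


def tokenize(text: str) -> list:
    """Split cleaned text into runs of letters, digits, and apostrophes."""
    return TOKEN_RE.findall(text)


def remove_stop_words(tokens: list) -> list:
    """Keep only tokens that are not in the illustrative stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token: str) -> str:
    """Toy suffix stripper (not the Porter algorithm): removes one suffix
    if at least three characters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(token: str) -> str:
    """Use the lemma table when the token is listed, else fall back to stemming."""
    return LEMMAS.get(token) or naive_stem(token)


def preprocess(tweet: str) -> list:
    """Run the full sketch: clean, tokenize, remove stop words, normalize."""
    tokens = remove_stop_words(tokenize(clean(tweet)))
    return [normalize(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("RT @user: Loving the new vaccines! https://t.co/abc #COVID19"))
```

In practice, a library such as NLTK or spaCy would typically replace the toy stemmer and lemma table; the sketch only shows the order in which the steps are commonly chained.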
