1. Introduction
Thanks to the rapid development of information technology, news articles, websites, emails, and digital libraries are all available as electronic text documents. To manage such vast amounts of data, text classification (TC) has become an essential tool for locating and categorizing text content. In text classification, unlabeled documents are assigned to one or more predefined categories based on their content (Harish & Revanasiddappa, 2017). TC has been readily adapted to many applications, including spam detection (Crawford et al., 2015), document categorization (Jiang et al., 2016), sentiment analysis (Bakshi et al., 2016), email classification (Nikhath et al., 2016), text summarization (Jo, 2017), and so on. Improving TC accuracy remains difficult for researchers because texts inherently contain many highly sparse terms and skewed distributions. Consequently, feature selection (FS) is a key component of text categorization. Choosing the best features for TC becomes even more difficult when the documents in a collection belong to multiple categories and the classes are unevenly distributed.