Knowledge-Infused Text Classification for the Biomedical Domain

Sonika Malik, Sarika Jain
Copyright: © 2022 | Pages: 15
DOI: 10.4018/IJISMD.306635

Abstract

Extracting knowledge from unstructured text and then classifying it has gained importance since the data explosion on the web. Traditional text classification approaches are ubiquitous, but combining semantic knowledge representation with statistical techniques can be more promising. The developed method constructs neural networks to speed up and improve ontology-based classification. This paper compares the results of ontology-based text classification with those of traditional classification based on an artificial neural network (ANN), using metrics such as accuracy and precision. The experimental analysis shows that the proposed approach performs substantially better than conventional text classification under the experimental procedure followed. The authors also ran tests comparing the proposed model with a recent study; the proposed model achieved higher accuracy and F1 score across experiments performed with different numbers of hidden layers and neurons.

1. Introduction

Classifying a disease from the symptoms provided as input is a challenging task, which machine learning algorithms can simplify. Text classification has enormous potential, but it remains difficult when done manually (Korde & Mahender, 2012). Text mining has various applications, such as classification (supervised, semi-supervised, and unsupervised), sentiment analysis, and document filtering. Machine learning and natural language processing approaches work together to find patterns in documents and classify them automatically.

Artificial intelligence (AI) is expected to generate hundreds of billions of dollars in economic value. Although the technology has become part of daily life, many people remain suspicious of it, largely because AI approaches work like black boxes and produce their outputs in ways that are not transparent. Meanwhile, several sectors have adopted knowledge graphs (KGs) as a useful tool for data administration, processing, and enrichment (Jain, 2021). Beyond this, KGs are increasingly being recognized as the foundation of AI systems that provide explainable AI through the design concept known as “Human in the Loop” (HITL). The promise of AI is to automatically derive patterns and rules from large datasets using machine learning techniques such as deep learning. In many circumstances, this makes categorization tasks much easier to perform. However, while machine learning algorithms can collect knowledge from historical data, they are unable to produce new results using that knowledge. Without explanation there is no trust: explainability ensures that trustworthy agents in the system can understand and defend the AI agent's decisions. Semantic AI combines symbolic AI and statistical AI. It incorporates methods such as machine learning, information analysis, semantic web mining, and text mining, and it blends neural networks and semantic reasoning with AI techniques. Using this framework, AI-based systems can be built more quickly and efficiently (Malik & Jain, n.d.).

A traditional way of representing documents is the Bag of Words (BOW) model. It records the terms that occur in a sentence or document together with their frequencies. Because each document is represented as a vector of term frequencies over the lexicon, the approach is also known as the Vector Space Model (VSM). This representation ignores the semantic relationships between words (Salton & Yang, 1973). It also discards word order, because it keeps only the frequency information attached to each term.
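To make the representation concrete, the following is a minimal sketch of a Bag of Words vectorizer, assuming whitespace tokenization and lowercase folding. The function names `build_vocabulary` and `bow_vector` and the two sample sentences are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of the distinct lowercase tokens across all documents."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def bow_vector(document, vocabulary):
    """Map one document to a vector of term frequencies over the vocabulary."""
    counts = Counter(document.lower().split())
    # Counter returns 0 for terms that do not occur in the document.
    return [counts[term] for term in vocabulary]


# Two illustrative sentences with the same words in a different order.
docs = ["fever causes headache", "headache causes fever"]
vocab = build_vocabulary(docs)
vectors = [bow_vector(doc, vocab) for doc in docs]

print(vocab)
print(vectors)
```

Both sentences receive the same vector even though they state opposite causal directions, which illustrates how the representation loses word order.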

The task of a text classifier, on the other hand, is to assign textual documents to predetermined categories, on the assumption that each class consists of documents with similar content, usually discussing a topic that differs from those of the other classes. When texts are represented in a vector space, the vectors tend to be sparse because of the high dimensionality (Altınel & Ganiz, 2018). This is a significant challenge, especially when there are many class labels but insufficient training data for each of them. In real-world applications, obtaining high-quality labelled training data is frequently prohibitively expensive. An accurate text classifier that can make use of semantic information is therefore needed.
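The sparsity problem can be illustrated with a small sketch that builds a document-term matrix and measures the fraction of zero entries. The helper names `document_term_matrix` and `sparsity` and the three sample notes are hypothetical, chosen only for illustration.

```python
def document_term_matrix(documents):
    """Build a term-frequency matrix with one row per document."""
    vocabulary = sorted({tok for doc in documents for tok in doc.lower().split()})
    index = {term: i for i, term in enumerate(vocabulary)}
    matrix = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for tok in doc.lower().split():
            row[index[tok]] += 1
        matrix.append(row)
    return vocabulary, matrix


def sparsity(matrix):
    """Return the fraction of cells in the matrix that are zero."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells) if cells else 0.0


# Illustrative notes that share no vocabulary with one another.
docs = ["patient reports fever", "chronic cough persists", "severe joint pain"]
vocab, dtm = document_term_matrix(docs)
zero_fraction = sparsity(dtm)

print(len(vocab), round(zero_fraction, 3))
```

Even with three short documents, two thirds of the cells are zero. In this toy setting the vocabulary grows with every new document while each document still uses only a few terms, so the zero fraction rises as the collection grows.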

There are a variety of techniques for classifying documents based on their semantics. Semantic text classification relies on the meaning of words and on hidden semantic relationships between words and documents. It has several advantages over standard text categorization:

  • Both implicit and explicit relationships between words are taken into account.

  • Word-to-document correlations are extracted and used.

  • Keyword representations can be generated for existing classes.

  • Classification accuracy is improved.

  • Synonymy and polysemy can be handled, which traditional text classification algorithms cannot manage because they do not take semantic links between words into account.
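The synonymy point can be sketched with a toy example in which surface terms are mapped to canonical concepts before documents are compared. The `CONCEPT_OF` lexicon below is a hypothetical stand-in for a domain ontology and is not the authors' resource. Jaccard overlap is used here only as a simple similarity measure, and polysemy is not addressed in this sketch.

```python
# Hypothetical concept lexicon: each surface term maps to one canonical concept.
# A real system would draw these mappings from a domain ontology instead.
CONCEPT_OF = {
    "pyrexia": "fever",
    "fever": "fever",
    "cephalalgia": "headache",
    "headache": "headache",
}


def to_concepts(text):
    """Replace each token with its canonical concept when the lexicon knows it."""
    return [CONCEPT_OF.get(tok, tok) for tok in text.lower().split()]


def jaccard(a, b):
    """Jaccard overlap between two token lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


note_a = "patient reports pyrexia and cephalalgia"
note_b = "patient reports fever and headache"

# Purely lexical comparison: the synonyms count as different terms.
raw_overlap = jaccard(note_a.lower().split(), note_b.lower().split())
# Comparison after mapping synonyms to shared concepts.
concept_overlap = jaccard(to_concepts(note_a), to_concepts(note_b))

print(round(raw_overlap, 3), concept_overlap)
```

The two notes describe the same symptoms, yet the lexical overlap is only 3/7. After concept mapping the overlap is 1.0.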
