Extract Clinical Lab Tests From Electronic Hospital Records Through Featured Transformer Model

Lucy M. Lu, Richard S. Segall
DOI: 10.4018/IJPHIMT.336529

Abstract

Natural language, as a rich source of information, has served as the foundation for product reviews, demographic trend analysis, and domain-specific knowledge bases. The challenge in extracting entities from text is that free text is so sparse that missing features are common, which leaves the training process incomplete. Building on the attention mechanism in deep learning architectures, the authors propose a featured transformer model (FTM) that adds category information to the inputs to overcome the missing-feature issue. When the attention mechanism performs Markov-like updates in the deep learning architecture, the importance of a category reflects how frequently it connects to other entities and categories, and it is compatible with the importance of the entity in decision-making. The authors evaluate the performance of FTM and compare it with several other machine learning models; FTM overcomes the missing-feature issue and outperforms the other models.

1. Introduction

Natural language is the most common means of conveying information among different people, organizations, and subjects; text has therefore become a rich source of information. However, with the development of information technology, the volume of documentation has grown beyond what humans can handle manually. We need to exploit the capacity of computers to automatically understand documents and to group them into categories, topics, and subjects so that the total amount of information becomes manageable for humans.

Text mining can be applied to question answering, spam detection, semantic analysis, news categorization, and content classification, to name a few uses. Depending on the application, text data can come from different sources, such as web pages, emails, chats, social media, tickets, product descriptions, invoices, insurance claims, user reviews, and so on. Due to the unstructured nature of the medium, it is challenging and time-consuming to extract information from text.

However, many open issues and challenges remain in text mining, such as synonyms, long-range dependencies, and multiple interacting features. Beyond these challenges from the perspective of semantic analysis, there are also issues of data quality, such as missing features, scarce samples, and imbalanced classes. When such issues can be solved only through data modeling, not through additional data collection, problem-specific data processing must be built into the algorithms to overcome the limits of data quality. Another challenge is that the integration of domain knowledge can play an important role in text mining: domain knowledge can help speed up text processing and increase the precision of the results. Domain-specific knowledge extraction requires semantic analysis to extract the associations between objects or concepts in the documentation, and making such semantic analysis efficient and scalable remains challenging.

Feature engineering is an important step in machine learning. Typical text classification uses machine learning to perform natural language processing (NLP) and to assign labels or tags to textual units such as terms, sentences, paragraphs, documents, and queries. Machine learning-based methods normally perform classification in two steps: first selecting the features of interest, then feeding those features into a classifier to make predictions. Because the feature set acts as a shortcut to the context in the training set, it needs to be complete so that the trained model can retrieve results from new data; when the feature set is incomplete, the trained model can retrieve only partial results. Unlike traditional machine learning, deep learning trains word embeddings as the starting point of classification, and feature engineering happens during model training as feature weights are adjusted. However, the feature set still needs to be complete.
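The two-step pipeline can be made concrete with a short sketch. The example below uses scikit-learn's TfidfVectorizer and LogisticRegression as generic stand-ins for the feature selector and the classifier; the toy clinical snippets and labels are hypothetical, not data from this study. Note how a document containing only unseen words reduces to an empty feature vector, which is exactly the incomplete-feature problem described above.

```python
# A minimal sketch of two-step text classification: features first, classifier second.
# The documents and labels are hypothetical toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = [
    "hemoglobin a1c measured at 6.1 percent",
    "patient scheduled for follow-up visit",
    "serum creatinine within normal range",
    "discharge summary dictated by attending",
]
labels = [1, 0, 1, 0]  # 1 = mentions a lab test, 0 = does not

# Step 1: feature engineering -- map raw text onto a fixed feature set.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# Step 2: feed the features into a classifier.
clf = LogisticRegression().fit(X, labels)

# Words never seen in training are silently dropped at transform time,
# so this vector is all zeros: the trained model can give only a partial answer.
X_new = vectorizer.transform(["troponin level elevated"])
print(clf.predict(X_new))
```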

In terms of data modeling, text classification methods fall into two categories: those based on maximum likelihood and those based on minimum energy. Maximum likelihood-based methods include Naive Bayes and Support Vector Machines (SVM); energy-based methods include the Hidden Markov Model (HMM) and Conditional Random Fields (CRF). The two categories differ not only in technical detail but also in how many language patterns they can model. Maximum likelihood-based methods normally treat words as independent tokens and use a Bag of Words (BoW) representation to build the sample set. Minimum energy-based methods can fit models not only to individual words but also to the associations between words, which makes semantic analysis possible. Deep learning is a separate architecture that trains word embeddings, with the classification layer as the last layer of the architecture.
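A small illustration of the BoW independence assumption, using scikit-learn's CountVectorizer with two made-up sentences: because BoW discards word order, the two sentences below, which report opposite findings, receive identical feature vectors, so a maximum likelihood classifier such as Naive Bayes cannot distinguish them. Sequence models such as HMM or CRF, which score associations between neighboring words, can.

```python
# BoW treats words as independent tokens, so word order is lost.
# The two example sentences are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
X = vectorizer.fit_transform([
    "glucose high sodium normal",
    "sodium high glucose normal",
])

# Same multiset of tokens -> identical BoW vectors, despite opposite meanings.
print(np.array_equal(X[0].toarray(), X[1].toarray()))  # True
```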

Deep learning architectures are built upon neural networks. Neural approaches have the advantage of overcoming the limitations of manual feature engineering. Word embeddings convert input texts into an importance vector in which some words carry higher significance and others lower; words with higher significance can thus contribute more to the classification process, and words with lower significance contribute less. Reducing the number of dimensions is optional, but once the word embeddings are built, the feature engineering is done.
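As a rough sketch of how such an importance vector can arise, the snippet below implements plain scaled dot-product self-attention over random embeddings in NumPy. It illustrates the general mechanism only, not the FTM architecture itself; all shapes and values are arbitrary.

```python
# A minimal sketch: importance weights reweighting word embeddings.
# Random embeddings stand in for trained ones; this is not the FTM model.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))   # 5 words, 8-dimensional embeddings

# Scaled dot-product self-attention scores, softmax-normalized per row.
scores = embeddings @ embeddings.T / np.sqrt(embeddings.shape[1])
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each word's new representation is an importance-weighted mixture:
# higher-weight words contribute more to the contextual representation.
contextual = weights @ embeddings
print(contextual.shape)  # (5, 8)
```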
