Concept Attribute Labeling and Context-Aware Named Entity Recognition in Electronic Health Records

Alexandra Pomares-Quimbaya, Rafael A. Gonzalez, Oscar Mauricio Muñoz Velandia, Angel Alberto Garcia Peña, Julián Camilo Daza Rodríguez, Alejandro Sierra Múnera, Cyril Labbé

Source Title: Data Analytics in Medicine: Concepts, Methodologies, Tools, and Applications

DOI: 10.4018/978-1-7998-1204-3.ch017

Abstract

Extracting valuable knowledge from Electronic Health Records (EHR) is a challenging task because they combine structured and unstructured data, including codified fields, images and test results. Narrative text in particular contains a variety of notes that are diverse in language and detail and full of ad hoc terminology, including acronyms and jargon. This is especially challenging in non-English EHR, where there is a dearth of annotated corpora or trained case sets. This paper proposes an approach for named entity recognition (NER) and concept attribute labeling for EHR that takes into consideration the contextual words around the entity of interest to determine its sense. The approach composes three different NER methods and analyzes the context (neighboring words) using an ensemble classification model. This helps to disambiguate NER results and to label each concept as confirmed, negated, speculative, pending or antecedent. Results show an improvement in recall and a limited impact on precision for the NER process.

Chapter Preview

Introduction

Electronic health records (EHR) constitute an important resource not just for tracing single patient histories but also for population studies with clinical or administrative purposes. The nature of EHR, however, presents multiple challenges for doing so. Physicians often complain that EHR systems are oriented towards information storage but lack the ability to support information extraction as well, mainly as a consequence of the unstructured nature of the information (Menasalvas, Rodriguez-Gonzalez, Costumero, Ambit, & Gonzalo, 2016). The extraction task involves a combination of structured and unstructured data, including codified clinical classifications, images, test results and narrative text, among others. This paper focuses on the recognition of pre-established entities of interest in EHR extracted from clinical systems. This falls within the task of named entity recognition (NER), which is responsible for extracting entities and the relationships between them within a specific domain, typically relying on dictionaries, ontologies and thesauri to do so (Menasalvas et al., 2016). As such, this paper proposes an approach to NER that is aimed at improving precision and recall by combining different NER methods.
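To make the idea of combining NER methods concrete, the following is a minimal sketch and not the composition method used in the chapter, whose details are not given in this preview. It assumes each method is a function that returns character spans, and it merges their outputs by counting votes. The names `Span`, `dictionary_recognizer`, `combine` and `min_votes` are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class Span(NamedTuple):
    """A recognized entity: character offsets plus an entity label."""
    start: int
    end: int
    label: str


# Assumption: every NER method is modeled as a function from text to spans.
Recognizer = Callable[[str], List[Span]]


def dictionary_recognizer(terms: Dict[str, str]) -> Recognizer:
    """Build a toy dictionary-lookup recognizer (hypothetical stand-in for a
    real NER method). `terms` maps lowercase surface forms to labels."""
    def recognize(text: str) -> List[Span]:
        lowered = text.lower()
        spans = []
        for term, label in terms.items():
            pos = lowered.find(term)
            while pos != -1:
                spans.append(Span(pos, pos + len(term), label))
                pos = lowered.find(term, pos + 1)
        return spans
    return recognize


def combine(text: str, recognizers: Iterable[Recognizer],
            min_votes: int = 1) -> List[Span]:
    """Merge the outputs of several recognizers.

    min_votes=1 keeps the union of all spans (favors recall);
    a higher value keeps only spans that several methods agree on
    (favors precision).
    """
    votes: Dict[Span, int] = {}
    for recognize in recognizers:
        # set() ensures one recognizer cannot vote twice for the same span.
        for span in set(recognize(text)):
            votes[span] = votes.get(span, 0) + 1
    return sorted(span for span, count in votes.items() if count >= min_votes)
```

With `min_votes=1` the result is the union of all methods' spans, which tends to raise recall at some cost in precision; raising `min_votes` trades in the opposite direction. A real system would also need to reconcile overlapping spans with different boundaries, which this sketch does not attempt.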

There is a large body of work on methods and tools to process biomedical text in general (Menasalvas et al., 2016). However, as pointed out by Leaman et al. (Leaman, Khare, & Lu, 2015), biomedical texts are a highly codified result, edited for clarity and intended for a large audience, while clinical narrative texts contained in EHR are written by healthcare professionals about a single patient and are aimed at colleagues or themselves. This implies a variety of notes that are diverse in language and detail and that use ad hoc terminology, including acronyms and jargon, far from being highly codified and standard. In practice, this results in EHR systems that are country-, hospital- and even service-dependent (Menasalvas et al., 2016). In addition, EHR are often filled in under time pressure and with low motivation, because doing so takes time away from actual patient care or education. As a result, EHR narrative text usually suffers from low quality, reflected in variable semantics, structure without formal sentences, missing punctuation, missing expected words, misspellings, and heterogeneous styles and jargon (Menasalvas et al., 2016). Moreover, independently of the motivation or resulting quality, clinical language per se implies an additional series of challenges, including term variability, ambiguity and complexity, lack of fine-grained classifications, results followed by units or dosages, incomplete syntactic components in sentences, and limited data availability (Dong, Qian, Guan, Huang, Yu, & Yang, 2016). In NER terms, ambiguity is one of the biggest challenges, because concepts of interest are frequently hypothetical, negated or include temporal relationships (Menasalvas et al., 2016). As such, many existing natural language processing (NLP) approaches become ineffective or insufficient for clinical narrative text.
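The ambiguity described above can be illustrated with a small context-window labeler. This is only a sketch under stated assumptions: it uses hand-written trigger words rather than the ensemble classification model mentioned in the abstract, and the Spanish trigger lists, the `TRIGGERS` table, the `label_attribute` function and the default window of three tokens are all hypothetical choices made for illustration. The five labels themselves come from the abstract.

```python
import re
from typing import Dict, Set

# Assumption: illustrative Spanish trigger words, not a validated lexicon.
# Dict order sets priority when several trigger types occur in one window.
TRIGGERS: Dict[str, Set[str]] = {
    "negated": {"no", "sin", "niega", "descarta"},
    "speculative": {"posible", "probable", "sospecha"},
    "pending": {"pendiente"},
    "antecedent": {"antecedente", "antecedentes"},
}


def label_attribute(text: str, start: int, end: int, window: int = 3) -> str:
    """Label the entity at text[start:end] from its neighboring words.

    Looks at up to `window` tokens on each side of the entity and returns
    the first label whose trigger set intersects that context, or
    "confirmed" when no trigger is found.
    """
    before = re.findall(r"\w+", text[:start].lower())[-window:]
    after = re.findall(r"\w+", text[end:].lower())[:window]
    context = set(before) | set(after)
    for label, triggers in TRIGGERS.items():
        if context & triggers:
            return label
    return "confirmed"
```

For example, the entity "fiebre" in "paciente niega fiebre" would be labeled negated because "niega" falls inside its window. A fixed symmetric window is a crude choice: a trigger that belongs to a neighboring entity can leak into the context, which is one reason a trained classifier over context features is a more plausible design for real clinical text.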

Moreover, despite numerous NER proposals for EHR, the vast majority are limited to medical text written in English (Menasalvas et al., 2016). NER relies on dictionaries, and several are already available, including the Unified Medical Language System (UMLS), the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT) and the International Classification of Diseases (ICD); these are also in English, with either no translations or limited versions in other languages. In the context of this paper, the EHR belong to a Spanish-speaking hospital, and it has already been recognized that for event extraction from EHR in Spanish, the lack of annotated corpora is perhaps the main difficulty (Casillas, Pérez, Oronoz, Gojenola, & Santiso, 2016). Although there are ways to avoid a language-specific annotated corpus through supervised machine learning, training the models is costly and using them in complex language sets runs into major performance issues (Dong et al., 2016). In addition, the choice of inference algorithms and the need to manage heterogeneous medical fields further complicate medical NER (Casillas et al., 2016; Dong et al., 2016).
