Augmenting Medical Decision Making With Text-Based Search of Teaching File Repositories and Medical Ontologies: Text-Based Search of Radiology Teaching Files

Priya Deshpande, Alexander Rasin, Eli T. Brown, Jacob Furst, Steven M. Montner, Samuel G. Armato III, Daniela S. Raicu
Copyright: © 2018 |Pages: 26
DOI: 10.4018/IJKDB.2018070102

Abstract

Teaching files are widely used by radiologists in the diagnostic process and for student education. Most hospitals maintain an active collection of teaching files for internal purposes, but many teaching files are also publicly available online, some linked to secondary sources. However, public sources offer very limited (and ad hoc) search capabilities. Building on their previous work on data integration and text-based search, the authors extended their Integrated Radiology Image Search engine (IRIS 1.1) with a new medical ontology, SNOMED CT, and the ICD10 dictionary. IRIS 1.1 integrates public data sources and applies query expansion with exact and partial matches to find relevant teaching files. On a set of 28 representative queries from multiple sources, the search engine finds more relevant teaching cases than other publicly available search engines.

1. Introduction

A radiology teaching file repository is a collection of important cases for teaching and clinical follow-up, together with references that help clarify the classification of diseases (Dashevsky et al., 2015). All teaching files share a similar general structure, but significant variations exist, even within the same data source. Teaching files may include information such as patient history, findings, diagnosis, differential diagnosis, and images related to clinical reports. Teaching files can be categorized into three types: (1) personal teaching files that are meant for the general use of the teaching file owner, (2) shared in-house teaching files in which the owner makes the teaching file content available for viewing within their institution, and (3) public teaching files built on a shared model but with more comprehensive content that may undergo a formal review before publication (De-Arteaga et al., 2015).

A recent national survey assessing the role and desired features of radiology teaching files found that, among the 396 respondents from 115 institutions, 89% use some form of teaching file, of whom 76% keep a personal teaching file containing a variety of media and 67% use a shared in-house teaching file, while 83 institutions had paid subscriptions to a public teaching file repository (Dashevsky et al., 2015). Public teaching file solutions have become increasingly popular, providing users with instant access to thousands of cases (although with inconsistent data) (Seitz et al., 2003), sometimes for a fee. Although these public and commercial solutions are available, most do not permit users to (1) easily submit personal cases to their libraries, (2) perform efficient querying, categorization and search for particular cases, (3) simulate basic PACS (picture archiving and communication system) functionality, or (4) enable self-directed and assessed learning. Each of these is an important teaching file repository feature, as identified by at least 50% of the survey respondents (Dashevsky et al., 2015).

Therefore, as a first step toward organizing and extracting medical knowledge from large teaching file repositories, we have (1) developed a database schema for teaching file integration and a framework for a radiology image search engine and (2) evaluated the framework on the Radiological Society of North America Medical Imaging Resource Community (RSNA MIRC) (2018) and MyPacs (Group, 2018) repositories indexed using the Radiology Lexicon (RadLex) (RSNA, 2018). We normalized all data sources and augmented the integration process with data cleaning and validation to account for different format representations. Many data sources include noisy entries; for example, different teaching files, even when stored in the same data repository, do not use the same text category names. For the teaching files that did not come indexed by RadLex (as is done for MIRC), we annotated all imported data with RadLex terms.
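
The category-name normalization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the alias table, the canonical category names, and the function names are all hypothetical, chosen only to show how differently labeled sections might be mapped onto one schema.

```python
# Hypothetical alias table. The category names actually used by the
# repositories are not listed in the text, so these are illustrative only.
CATEGORY_ALIASES = {
    "history": "history",
    "patient history": "history",
    "clinical history": "history",
    "finding": "findings",
    "findings": "findings",
    "dx": "diagnosis",
    "diagnosis": "diagnosis",
    "ddx": "differential_diagnosis",
    "differential diagnosis": "differential_diagnosis",
}


def normalize_category(raw_name):
    """Map a raw section label to a canonical category, or None if unknown."""
    key = " ".join(raw_name.lower().replace(":", " ").split())
    return CATEGORY_ALIASES.get(key)


def normalize_teaching_file(raw_sections):
    """Return a dict keyed by canonical category with whitespace-cleaned text.

    Sections with unrecognized labels are kept under "other" so that no
    text is silently dropped; empty sections are skipped.
    """
    normalized = {}
    for raw_name, text in raw_sections.items():
        category = normalize_category(raw_name) or "other"
        cleaned = " ".join(text.split())
        if not cleaned:
            continue
        if category in normalized:
            normalized[category] += " " + cleaned
        else:
            normalized[category] = cleaned
    return normalized
```

For example, a file whose sections are labeled "Patient History:" and "DX" would be stored under `history` and `diagnosis`, so that later queries can address the same category across sources.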

In this paper, we propose an extension of the data repository indexing by integrating Unified Medical Language System (UMLS) Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) (2017) using UMLS MetamorphoSys (SNOMED, 2017) and show that this extension improves search results, particularly for queries that originally retrieved few teaching files. To evaluate and quantify search results, we propose a new evaluation criterion, developed in consultation with medical experts, that measures the accuracy of search results based on where the search term appears among the different categories of teaching file text. Based on domain knowledge and surveys, we found that findings and diagnosis are the most relevant search categories in teaching files. Our original search engine, Integrated Radiology Image Search (IRIS) (Deshpande, 2017), is referred to as IRIS 1.0, and the proposed improvement as IRIS 1.1.
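
To make the idea of ontology-based query expansion and category-aware scoring concrete, the following Python sketch may help. It rests on several assumptions: the synonym table is a toy stand-in for SNOMED CT or RadLex lookups, the category weights and the half credit for partial matches are invented for illustration, and the scoring rule is not the evaluation criterion proposed in the paper, whose details are not given in this excerpt.

```python
import re

# Toy stand-in for an ontology lookup (assumption, not real SNOMED CT data).
SYNONYMS = {"heart attack": ["myocardial infarction"]}

# Hypothetical weights: findings and diagnosis count more than other
# categories, reflecting only the ranking stated in the text.
CATEGORY_WEIGHTS = {
    "findings": 2,
    "diagnosis": 2,
    "history": 1,
    "differential_diagnosis": 1,
}


def expand_query(query):
    """Return the normalized query followed by any known synonyms."""
    q = " ".join(query.lower().split())
    return [q] + SYNONYMS.get(q, [])


def match_kind(term, text):
    """Classify a match as "exact" (whole phrase), "partial", or None."""
    text = text.lower()
    if re.search(r"\b" + re.escape(term) + r"\b", text):
        return "exact"
    tokens = set(re.findall(r"[a-z0-9]+", text))
    words = term.split()
    # Partial match: a multi-word term shares at least one whole word.
    if len(words) > 1 and any(w in tokens for w in words):
        return "partial"
    return None


def score(query, teaching_file):
    """Sum category weights, giving half credit for partial matches."""
    terms = expand_query(query)
    total = 0.0
    for category, text in teaching_file.items():
        weight = CATEGORY_WEIGHTS.get(category, 0)
        kinds = {match_kind(term, text) for term in terms}
        if "exact" in kinds:
            total += weight
        elif "partial" in kinds:
            total += 0.5 * weight
    return total
```

In this sketch, a query for "heart attack" still scores a case whose diagnosis reads "Acute myocardial infarction", because the expanded term matches exactly in a highly weighted category. Without expansion, the same case would not be retrieved at all, which is the kind of low-recall query the extension is meant to help.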
