Automated Generation of SNOMED CT Subsets from Clinical Guidelines

Carlos Rodríguez-Solano (University of Alcalá, Spain), Leonardo Lezcano (University of Alcalá, Spain) and Miguel-Ángel Sicilia (University of Alcalá, Spain)
DOI: 10.4018/978-1-4666-3667-5.ch013


Recently, there has been a growing body of literature on how the large SNOMED CT (SCT) terminology could be implemented and used in different clinical settings. Its complexity and size are a major impediment to coding clinical information in practical applications. Therefore, it is sometimes necessary to define subsets for various use cases and specific audiences. Subsets are clusters of SNOMED CT terms that share a specified common characteristic. The automated generation of subsets from clinical document corpora has been proposed elsewhere, but it still requires a collection of documents that is representative of the targeted domain. In this chapter the authors extend the research described in Rodríguez-Solano, Cáceres, and Sicilia (2011), where clinical guidelines’ glossaries are used as seed terminologies to automatically generate subsets by traversing SNOMED relationships. In the current research, further results have been obtained by considering additional clinical guidelines, and quantitative analysis of the SNOMED CT subsets generated with the proposed techniques has allowed them to be evaluated.
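
The seed-and-traverse idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the concept names, the relationship triples, the function names, and the `max_depth` cut-off are all assumptions made for illustration, and the toy data is not real SNOMED CT content. Seed terms (as if taken from a guideline glossary) are expanded breadth-first along outgoing relationships until the depth limit is reached.

```python
from collections import deque

# Hypothetical toy data: (source, relationship type, destination) triples.
# Names are made up for illustration and are not real SNOMED CT identifiers.
RELATIONSHIPS = [
    ("asthma", "is_a", "respiratory_disorder"),
    ("asthma", "finding_site", "airway_structure"),
    ("respiratory_disorder", "is_a", "disorder"),
    ("airway_structure", "is_a", "body_structure"),
    ("fracture", "is_a", "injury"),
]


def build_adjacency(triples):
    """Map each source concept to the set of concepts it points to."""
    adjacency = {}
    for source, _relationship, destination in triples:
        adjacency.setdefault(source, set()).add(destination)
    return adjacency


def generate_subset(seeds, triples, max_depth=1):
    """Return the seeds plus every concept reachable within max_depth hops."""
    adjacency = build_adjacency(triples)
    subset = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        concept, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the assumed depth limit
        for neighbour in adjacency.get(concept, ()):
            if neighbour not in subset:
                subset.add(neighbour)
                queue.append((neighbour, depth + 1))
    return subset


if __name__ == "__main__":
    # Seed terms as they might be extracted from a guideline glossary (assumed).
    print(sorted(generate_subset({"asthma"}, RELATIONSHIPS, max_depth=2)))
```

In this sketch the depth limit controls how far the subset grows from the glossary terms; which relationship types to follow and how deep to go are design choices that the abstract does not specify.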
Chapter Preview


Semantic interoperability between heterogeneous healthcare systems requires improvements in the precision of meaning and understandability during the exchange of information. For Electronic Health Record (EHR) systems in particular, it is essential to enable the consistent use of clinical terminologies.

SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) is a compendium of three knowledge sources in the biomedical sciences; it may also be viewed as a comprehensive thesaurus and ontology of biomedical concepts. One of them is the UMLS Metathesaurus, a large, multi-purpose and multilingual vocabulary database that currently comprises more than 1.5 million biomedical terms from over 100 sources.

Although SNOMED CT (SCT) has gained adoption in recent years, its practical application for coding clinical information is, as previously mentioned, hampered by its complexity and size. It contains over 300,000 concepts in the latest release and over 1 million relationships between these concepts. The large scale of SNOMED CT is therefore a major barrier to its progress, while there is evidence that only a small fraction of its content is being used (Stroetmann et al., 2009).

Because of the very wide range of concepts, and the rate of change of those concepts (Spackman, 2005), dealing with the entire set of terms contained in SNOMED CT is difficult. In particular, experiments have shown that clinicians perform disappointingly when they use SNOMED browsers to code clinical documents (Chiang et al., 2006). Moreover, these ontologies, which sometimes contain more than a hundred thousand concepts (terms), are hard to maintain, as changes can affect large parts of the model.
