1. Introduction
An ontology, in the domain of computer science, is defined as “a possibly (complete or incomplete) consensus agreement about a domain conceptualization” (Guarino, 2001). Ontologies are commonly developed as engineering artifacts and are employed in various applications. Currently, the domain of health and biomedical informatics is greatly influenced by the capabilities of ontologies in representing knowledge. For example, biomedical ontologies, which capture the structure and semantics of medical knowledge, are employed for various activities including semantic tagging, medical data integration, and easing semantic interoperability issues. In the past two decades, various standard ontologies in the form of biomedical terminologies, taxonomies, and vocabularies have been designed, developed, and deployed. For instance, the International Classification of Diseases (ICD)1 hierarchically organizes medical concepts in domains such as diseases, symptoms, injuries, and procedures. Logical Observation Identifiers Names and Codes (LOINC)2 is a standard for identifying medical laboratory test observations. Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT)3 is a computer-readable ontology of medical terms and instances providing codes, terms, synonyms, and definitions covering medical domains such as diseases, findings, procedures, microorganisms, and substances.
The success of employing these ontological standards for uniform semantic interpretation and for easing medical semantic data interoperability has, in turn, resulted in interoperability issues between the ontological standards themselves. This is due to the unique way knowledge is structured and expressed in each biomedical ontology, and to the different approaches taken by these standards. To address these interoperability issues between multiple standards and to continue to effectively capture and convey medical knowledge, the National Library of Medicine initiated the Unified Medical Language System (UMLS) (Bodenreider, 2004; Browne, Divita, & McCray, 2003). The UMLS can be viewed as a compendium of multiple standard medical ontologies (e.g., ICD, LOINC, SNOMED-CT, and NCBI) that also provides mappings between the aggregated concepts. This mapping structure can be exploited to navigate between the standards and implicitly mitigate the interoperability issues between them. The UMLS comprises three components: the Semantic Network (SN), the Metathesaurus (META), and the Specialist Lexicon (SL). UMLS-SN is a structure connecting defined semantic types with semantic relationships. UMLS-META comprises millions of medical instances aggregated from the aforementioned medical standards and forms the base of the UMLS. Finally, the Specialist Lexicon contains syntactic, morphological, and orthographic data for commonly used medical vocabulary.
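The idea of navigating between standards through shared concepts can be illustrated with a minimal sketch. It assumes a simplified layout in which each row links one concept identifier to one code in one source vocabulary, and in which each code belongs to exactly one concept. The row format, the function names, and all identifiers below are hypothetical placeholders chosen for illustration; they are not real UMLS, ICD, LOINC, or SNOMED-CT codes, and this is not the UMLS distribution format.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, source_vocabulary, source_code).
# All identifiers are made-up placeholders, not real codes.
ROWS = [
    ("CONCEPT-A", "ICD", "icd-001"),
    ("CONCEPT-A", "SNOMED-CT", "sct-101"),
    ("CONCEPT-B", "ICD", "icd-002"),
    ("CONCEPT-B", "LOINC", "loinc-201"),
]


def build_index(rows):
    """Index rows both ways: (source, code) -> concept and concept -> {source: [codes]}."""
    to_concept = {}
    to_codes = defaultdict(lambda: defaultdict(list))
    for concept_id, source, code in rows:
        to_concept[(source, code)] = concept_id
        to_codes[concept_id][source].append(code)
    return to_concept, to_codes


def translate(rows, source, code, target):
    """Return the codes in `target` that share a concept with (source, code)."""
    to_concept, to_codes = build_index(rows)
    concept_id = to_concept.get((source, code))
    if concept_id is None:
        return []  # the code is unknown in this toy table
    return list(to_codes[concept_id].get(target, []))


print(translate(ROWS, "ICD", "icd-001", "SNOMED-CT"))
```

In this toy table, a code from one vocabulary is translated by first resolving it to its concept and then listing that concept's codes in the target vocabulary; an empty list means no mapping is recorded.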
The UMLS-SN classifies the medical instances in the UMLS-META. For example, medical instances such as Coughing, Redness of Eye, and Wheezing are of semantic type Finding. Myocardial Infarction and Diabetes are of semantic type Disease or Syndrome. Hematologic Tests and Urinalysis are of semantic type Laboratory Procedures. These semantic types are connected using semantic relationships such as “co-occurs_with” (e.g., Finding co-occurs with Finding), “associated_with” (e.g., Finding associated with Pathologic Function), and so on. In addition to classifying the medical instances in UMLS-META, the UMLS-SN also provides additional semantics, thus acting as a meta-structure or meta-model for UMLS-META (Section 3 elaborates on the structure of the UMLS). The UMLS-META has been successfully employed in various applications such as biomedical natural language processing (Savova et al., 2010), ontological alignment (Jimenez-Ruiz, Grau, & Horrocks, 2012), clinical decision support systems (Detmer, Barnett, & Hersh, 1997), and EHR applications (Plaza & Díaz, 2010), to name a few. However, the success of employing UMLS-SN knowledge in health and biomedical informatics applications (e.g., in clinical environments, private hospitals, and research facilities) beyond the UMLS itself is jeopardized by several key issues.
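The meta-model role of the UMLS-SN described above can be sketched with a small example: instances carry a semantic type, and relationships are declared between types rather than between instances. The instance names, semantic types, and relationships are the ones mentioned in the text; the dictionary layout and the function name `allowed_relations` are assumptions made for illustration, and the sketch is deliberately flat (it ignores any hierarchy among semantic types).

```python
# Semantic type of each instance, using the examples given in the text.
INSTANCE_TYPE = {
    "Coughing": "Finding",
    "Redness of Eye": "Finding",
    "Wheezing": "Finding",
    "Myocardial Infarction": "Disease or Syndrome",
    "Diabetes": "Disease or Syndrome",
    "Hematologic Tests": "Laboratory Procedures",
    "Urinalysis": "Laboratory Procedures",
}

# Type-level relationships: (subject type, relationship, object type).
TYPE_RELATIONS = {
    ("Finding", "co-occurs_with", "Finding"),
    ("Finding", "associated_with", "Pathologic Function"),
}


def allowed_relations(subject, obj):
    """List the relationships the type-level network permits between two instances."""
    s_type = INSTANCE_TYPE.get(subject)
    o_type = INSTANCE_TYPE.get(obj)
    return sorted(
        rel for (s, rel, o) in TYPE_RELATIONS if s == s_type and o == o_type
    )


print(allowed_relations("Coughing", "Wheezing"))
```

Because the relationships are stated once at the type level, any pair of instances can be checked against them without listing instance-level links, which is the sense in which the semantic network acts as a meta-model for the instance data.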