Expanding Terms with Medical Ontologies to Improve a Multi-Label Text Categorization System
M. Teresa Martín-Valdivia (University of Jaén, Spain), Arturo Montejo-Ráez (University of Jaén, Spain), M. C. Díaz-Galiano (University of Jaén, Spain), José M. Perea Ortega (University of Jaén, Spain) and L. Alfonso Ureña-López (University of Jaén, Spain)
Copyright: © 2009
This chapter argues for integrating clinical knowledge extracted from medical ontologies in order to improve a Multi-Label Text Categorization (MLTC) system for the medical domain. The approach is based on semantic enrichment, integrating knowledge into different biomedical collections. Specifically, the authors expand terms from these collections using the UMLS (Unified Medical Language System) Metathesaurus, a resource that includes several medical ontologies. They worked with two rather different medical collections: first, the CCHMC collection (Cincinnati Children’s Hospital Medical Centre) from the Department of Radiology, and second, the widely used OHSUMED collection. The results show that using the medical ontologies improves system performance.
Our main objective is to improve an MLTC system by automatically integrating external knowledge from ontologies. Ontologies have been used for several natural language processing tasks, such as automatic summarisation (Chiang et al., 2006), text annotation (Carr et al., 2001) and word sense disambiguation (Martín-Valdivia et al., 2007), among others.
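As a rough illustration of what expanding terms with ontology knowledge could look like, the following minimal Python sketch appends related terms to a tokenized document before it would be passed to a classifier. It is only a sketch under stated assumptions: the `TOY_SYNONYMS` dictionary and the `expand_terms` function are hypothetical stand-ins introduced here, not the UMLS interface and not the authors' actual implementation.

```python
from typing import Dict, List

# Hypothetical toy mapping standing in for an ontology lookup.
# A real system would query a resource such as the UMLS Metathesaurus.
TOY_SYNONYMS: Dict[str, List[str]] = {
    "cough": ["tussis"],
    "fever": ["pyrexia"],
}


def expand_terms(text: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the lowercased tokens of `text` followed by any related terms."""
    tokens = text.lower().split()
    expanded = list(tokens)  # keep the original tokens in order
    for token in tokens:
        for related in synonyms.get(token, []):
            if related not in expanded:  # avoid adding a term twice
                expanded.append(related)
    return expanded


if __name__ == "__main__":
    print(expand_terms("Fever and cough", TOY_SYNONYMS))
```

The expanded token list keeps the original words and adds the related terms at the end, so a downstream categorizer sees both the surface form and the ontology-derived vocabulary.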