Extracting Definitional Contexts in Spanish Through the Identification of Hyponymy-Hyperonymy Relations


Olga Acosta, Gerardo Sierra, César Aguilar
DOI: 10.4018/978-1-4666-8690-8.ch003


The automatic extraction of hyponymy-hyperonymy relations from text corpora is an important task in Natural Language Processing. This chapter proposes a method for automatically extracting a set of hyponym-hyperonym pairs from a medical corpus in Spanish, where the pairs are expressed in analytical definitions. This kind of definition is composed of a term (the hyponym), a genus term (the hyperonym), and one or more differentiae, that is, a set of particular features proper to the defined term, e.g.: conjunctivitis is an infection of the conjunctiva of the eye. Definitions are obtained from definitional contexts, and sequences of term and genus term are then extracted from them. Next, the most frequent hyperonyms are used to filter relevant definitions. Additionally, using a bootstrapping technique, new hyponym candidates are extracted from the corpus, based on the previously detected set of hyponyms and hyperonyms.
Chapter Preview


The automatic extraction of lexical relations from text corpora is one of the current interests in artificial intelligence (AI), particularly in the area of natural language processing (NLP). One of the most exploited lexical relations is hyponymy-hyperonymy. According to Murphy (2003), in artificial intelligence, hyponymy-hyperonymy enables inference mechanisms in terms of entailments, such that a statement entails another statement that includes one of the word’s hyperonyms (A cat stole my food entails An animal stole my food). This relation is also implicit in analytical definitions, where a term (hyponym) is defined by means of a genus (hyperonym) plus one or more differentiae (Conjunctivitis is an infection of the conjunctiva of the eye). Finally, in grammatical terms, selectional restrictions, for example on the object of a verb, can be phrased in terms of a hyperonym, and hyponyms of that word can then also be selected as potential objects: I need a beverage (beverage can be coffee, tea, juice, and so on).

There are several reasons for automatically extracting lexical relations. On the one hand, current technologies have enabled the accumulation of huge amounts of text. This has increased the need to obtain useful information from these text sources while saving time and effort. On the other hand, NLP applications such as text summarization, information retrieval, and information extraction can benefit from new approaches to the automatic extraction of useful information in order to improve their performance.

To reach this goal, we propose a method that takes advantage of hyponym-hyperonym pairs extracted from candidate analytical definitions found in a specialized corpus, considering the association established between the defined term and its genus. We obtain these analytical definitions from definitional contexts (DCs) in Spanish, based on the methodology developed by Sierra et al. (2008). Once these DCs are identified, we extract the term and the genus. Then, we use the subset of most frequent hyperonyms in a bootstrapping step to find candidate hyponyms in the same specialized corpus (Acosta et al. 2011). In this phase, relational adjectives are used as relevant features linked with a hyperonym.
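
As a rough illustration of the steps described above, the following Python sketch extracts (term, genus) pairs from simple Spanish copular definitions, ranks the genus terms by frequency, and then collects "genus + following word" sequences as hyponym candidates. It is not the chapter's implementation: the regular expression, the stopword list, and the function names (extract_pair, frequent_genus_terms, candidate_hyponyms) are assumptions made for this example, and the stopword filter is only a crude stand-in for the part-of-speech information that would be needed to identify relational adjectives.

```python
import re
from collections import Counter

# Assumed pattern for this sketch only: "<term> es/son un/una <genus> ...".
# Real definitional contexts use many more verbal patterns than this one.
DEFINITION = re.compile(
    r"^(?:(?:el|la|los|las)\s+)?(?P<term>[a-záéíóúüñ ]+?)\s+(?:es|son)\s+"
    r"(?:un|una|unos|unas)\s+(?P<genus>[a-záéíóúüñ]+)",
    re.IGNORECASE,
)

# Small illustrative stopword list (an assumption, not from the chapter).
STOPWORDS = {"de", "del", "la", "el", "los", "las", "que", "en", "y", "a", "por", "con"}


def extract_pair(sentence):
    """Return (term, genus) if the sentence looks like an analytical definition."""
    match = DEFINITION.match(sentence.strip())
    if match is None:
        return None
    return match.group("term").lower(), match.group("genus").lower()


def frequent_genus_terms(sentences, top_n):
    """Extract all (term, genus) pairs and the top_n most frequent genus terms."""
    pairs = [p for p in map(extract_pair, sentences) if p is not None]
    counts = Counter(genus for _, genus in pairs)
    return pairs, [genus for genus, _ in counts.most_common(top_n)]


def candidate_hyponyms(text, genus_terms):
    """Collect 'genus + next word' sequences, skipping stopwords.

    A real system would keep only relational adjectives, which requires
    part-of-speech tagging; the stopword filter here is a simplification.
    """
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    found = set()
    for first, second in zip(tokens, tokens[1:]):
        if first in genus_terms and second not in STOPWORDS:
            found.add(f"{first} {second}")
    return sorted(found)
```

Under these assumptions, a frequent genus such as "inflamación" would act as a seed, and a sequence like "inflamación pulmonar" found elsewhere in the corpus would be proposed as a hyponym candidate.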
