Semantic Methods for Textual Entailment


Andrew Neel, Max H. Garzon
DOI: 10.4018/978-1-60960-741-8.ch028

Abstract

The problem of recognizing textual entailment (RTE) has been recently addressed using syntactic and lexical models with some success. Here, a new approach is taken to apply world knowledge in much the same way as humans, but captured in large semantic graphs such as WordNet. We show that semantic graphs made of synsets and selected relationships between them enable fairly simple methods that provide very competitive performance. First, assuming a solution to word sense disambiguation, we report on the performance of these methods in four basic areas: information retrieval (IR), information extraction (IE), question answering (QA), and multi-document summarization (SUM), as described using benchmark datasets designed to test the entailment problem in the 2006 Recognizing Textual Entailment (RTE-2) challenge. We then show how the same methods yield a solution to word sense disambiguation, which combined with the previous solution, yields a fully automated solution with about the same performance. We then evaluate this solution on two subsequent RTE Challenge datasets. Finally, we evaluate the contribution of WordNet to provide world knowledge. We conclude that the protocol itself works well at solving entailment given a quality source of world knowledge, but WordNet is not able to provide enough information to resolve entailment with this inclusion protocol.

Introduction

The task of recognizing textual entailment (RTE), as defined by Dagan et al. (2005) and Bar-Haim et al. (2006), is that of determining whether the meaning of one text (the hypothesis) can be entailed, or inferred, from another text (simply, the text) by a human reader. It differs from purely logical inference because world knowledge is required to assess the hypothesis. While typical instances of RTE are easy for humans, most instances are difficult for computers.
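
For concreteness, an RTE instance can be thought of as a text/hypothesis pair together with a gold entailment label. The following Python sketch is purely illustrative (the class name and fields are not part of the RTE-2 distribution format) and uses the example pair discussed later in this introduction:

    from dataclasses import dataclass

    @dataclass
    class EntailmentPair:
        text: str        # the entailing passage (the "text")
        hypothesis: str  # the candidate entailed statement
        entails: bool    # gold label supplied with the benchmark pair

    pair = EntailmentPair(
        text="John Smith received a belt.",
        hypothesis="John Smith received a strip of leather to fasten around his waist.",
        entails=True,
    )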

The significance of finding a quality solution is high. Automatic solutions would substantially improve computer systems' capabilities on key tasks that require recognizing textual entailment. Consider the task of automatic tutoring (Graesser et al. 2000; Graesser, Hu, and McNamara 2005), where a student provides answers in natural language to open-ended questions asked by a tutoring system. Here, student answers must be evaluated against a number of known high-quality answers. Similarly, consider writing a program for a computer to automatically summarize the textual knowledge of several documents (the multi-document summarization problem). Here, it may be desirable to remove sentences from the text that can be inferred from another sentence. A third example is information retrieval (IR). Here, the goal is to find documents that are semantically similar in response to a query. Thus, the task for RTE is to evaluate the semantic closeness, or relatedness, of two documents and return a match only when the document entails the query.

One challenge related to RTE is Word Sense Disambiguation (WSD). WSD applies knowledge of the world to match words and phrases to meanings, herein called synsets. As with RTE, WSD is performed easily by humans but very poorly by computers. Thus, WSD also requires a protocol for disambiguating words and phrases into synsets automatically (i.e., without human assistance).

Humans implicitly disambiguate words by matching the word in context to meanings and experiences stored in memory. For humans, context and experience serve as world knowledge. Consider the following entailment instance: the text “John Smith received a belt.” entails the hypothesis “John Smith received a strip of leather to fasten around his waist.” In this example, “belt” may have the meaning of “a strip of leather to fasten around the waist,” “a strip of leather with machine-gun ammo attached,” or “a strong punch.” A human may recall the full set of potential meanings, but experience will quickly identify “a strip of leather to fasten around the waist” as the specific and proper meaning. Resolving words or phrases to a list of synsets (i.e., concepts or meanings) is relatively easy. However, no automated solution has captured human experience sufficiently well to choose the appropriate meaning. Therefore, the crux of this issue is finding a representation of human world knowledge and experience in a model that allows computers to perform the same function with a success comparable to humans.
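
The easy half of this process, listing the candidate synsets for a surface word such as “belt,” can be done directly against WordNet; the hard half is choosing the intended sense. A minimal sketch, assuming WordNet is accessed through NLTK's wordnet interface (the exact senses returned depend on the WordNet version installed):

    # Requires: pip install nltk, then nltk.download("wordnet") once.
    from nltk.corpus import wordnet as wn

    # Enumerate every noun synset (candidate meaning) for the word "belt".
    for synset in wn.synsets("belt", pos=wn.NOUN):
        print(synset.name(), "-", synset.definition())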

Humans are very good at solving both entailment and WSD because they are able to relate words and phrases to what the speaker means in the context of prior knowledge of the real world. This paper presents a solution for entailment that can be implemented easily on digital computer systems. Arguably, the closest digital equivalent to a human’s experience with word relationships is WordNet (Kaplan and Shubert, 2001; Clark et al., 2005). Here, the fundamental construct is not a word but an abstract semantic concept, called a synset (synonymous set). Each synset may be expressed by different words, and, conversely, the same word may represent different synsets. As the name implies, the concepts of WordNet are inter-connected to provide a network of relationships between synsets.
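
This many-to-many relationship between words and synsets, and the links between synsets, can be inspected directly. A minimal sketch using NLTK's WordNet interface, taking an arbitrary first sense purely for illustration:

    from nltk.corpus import wordnet as wn

    sense = wn.synsets("belt", pos=wn.NOUN)[0]   # one sense of "belt", chosen arbitrarily
    print(sense.lemma_names())   # the different words that can express this synset
    print(sense.hypernyms())     # more general synsets it is connected to
    print(sense.hyponyms())      # more specific synsets it is connected to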
