A Lexico-Syntactic-Semantic Approach to Recognizing Textual Entailment

Rohini Basak (Department of Information Technology, Jadavpur University, India), Sudip Kumar Naskar (Computer Science and Engineering, Jadavpur University, India) and Alexander Gelbukh (Center for Computing Research, Instituto Politécnico Nacional, Mexico)
DOI: 10.4018/978-1-7998-3038-2.ch010

Abstract

Given two textual fragments, called a text and a hypothesis, recognizing textual entailment (RTE) is the task of automatically deciding whether the meaning of the second fragment (the hypothesis) logically follows from the meaning of the first fragment (the text). The chapter presents a method for RTE based on lexical similarity, dependency relations, and semantic similarity. In this method, called LSS-RTE, each of the two fragments is converted to a dependency graph, and the two resulting graph structures are compared using dependency triple matching rules, which were compiled from a detailed analysis of various RTE development datasets. Experimental results show 60.5%, 64.4%, 62.8%, and 61.5% accuracy on the well-known RTE1, RTE2, RTE3, and RTE4 datasets, respectively, for the two-way classification task, and 54.3% accuracy for the three-way classification task on the RTE4 dataset.
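
As a rough illustration of what comparing two dependency structures by their triples can look like, the sketch below represents each fragment as a set of (head, relation, dependent) triples and measures how many hypothesis triples also occur in the text. It is a minimal sketch under stated assumptions, not the LSS-RTE rule set: the sentence pair, the hand-written triples, the relation labels, and the function name `match_ratio` are all hypothetical, and only exact matches are counted.

```python
# Illustrative only: triples are written by hand, not produced by a parser,
# and determiners are omitted for brevity.
# Assumed T: "The company sold the newspaper to two brothers."
# Assumed H: "The company sold the newspaper."

def match_ratio(source_triples, target_triples):
    """Fraction of target triples that occur verbatim among the source triples."""
    if not target_triples:
        return 0.0
    return len(target_triples & source_triples) / len(target_triples)

text_triples = {
    ("sold", "nsubj", "company"),
    ("sold", "dobj", "newspaper"),
    ("sold", "prep_to", "brothers"),
    ("brothers", "num", "two"),
}
hypothesis_triples = {
    ("sold", "nsubj", "company"),
    ("sold", "dobj", "newspaper"),
}

print(match_ratio(text_triples, hypothesis_triples))  # H triples found in T
print(match_ratio(hypothesis_triples, text_triples))  # T triples found in H
```

In this toy pair every hypothesis triple is found in the text, while only half of the text triples are found in the hypothesis. A practical system would need matching rules that tolerate lexical and structural variation rather than exact equality.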

Introduction

The task of recognizing textual entailment (RTE) takes a pair of text fragments as input and determines whether the meaning of one fragment (called the hypothesis, H) can be derived from the meaning of the other fragment (called the text, T), that is, whether T logically entails H. Because the RTE task is relevant to many application areas of natural language processing (NLP), such as information extraction, information retrieval, question answering, summarization, paraphrase acquisition, machine translation, and reading comprehension, the PASCAL Network of Excellence, funded by the European Union, has organized a number of RTE challenges and has released several RTE datasets, such as RTE1 to RTE4 (Dagan et al., 2005; Bar-Haim et al., 2006; Giampiccolo et al., 2007; Giampiccolo et al., 2008).

Textual entailment (TE) is a directional relation within a T–H pair: entailment may hold from T to H but not from H to T, as in the following example from the RTE1 dataset:

  • Example 1 (Entailment).

  • T: The Daily Telegraph, most prized asset in Lord Conrad Black’s crumbling media empire, has been sold to Britain’s Barclay twins.

  • H: Daily telegraph is sold.

Here, the meaning of H can be derived from that of T. However, T contains additional information beyond what H states, so T cannot be inferred from H.
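
The asymmetry can be made concrete with a deliberately crude proxy: the fraction of one fragment's word types that also occur in the other. This is only a toy lexical-overlap sketch applied to Example 1, not the method proposed in the chapter; the helper names `tokens` and `coverage` are hypothetical.

```python
import re

def tokens(sentence):
    """Lowercase a sentence and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def coverage(source, target):
    """Fraction of the target's word types that also appear in the source."""
    target_tokens = tokens(target)
    if not target_tokens:
        return 0.0
    return len(target_tokens & tokens(source)) / len(target_tokens)

T = ("The Daily Telegraph, most prized asset in Lord Conrad Black's "
     "crumbling media empire, has been sold to Britain's Barclay twins.")
H = "Daily telegraph is sold."

print(coverage(T, H))  # share of H's words that T contains
print(coverage(H, T))  # share of T's words that H contains
```

Three of the four word types in H occur in T, giving a coverage of 0.75, whereas H covers only a small share of T. Word overlap alone does not establish entailment, but the imbalance mirrors the fact that the relation runs from T to H and not the other way.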

Typically, RTE is a 2-way classification task, in which each T–H pair is labeled as YES (entailment) or NO (no entailment). The fourth PASCAL RTE challenge extended this to a 3-way task, in which the no-entailment cases are further divided according to whether T and H contradict each other. In the RTE4 dataset, T–H pairs are labeled as Entailment, Contradiction, or Unknown:

  • Example 2 (Contradiction).

  • T: As German voters go to the polls on Sunday, unemployment will be a key issue. Despite tough labour market reforms, the number of unemployed has risen to 5m. And Germany's jobless are getting despondent.

  • H: Germany's jobless rate decreases.

Here, not only does T fail to entail H, but T and H express contradictory information.

  • Example 3 (Unknown).

  • T: Proposals to extend the Dubai Metro to neighbouring Ajman are currently being discussed. The plans, still in the early stages, would be welcome news for investors who own properties in Ajman.

  • H: Dubai Metro will be expanded.

Here, T only presents the event (the expansion of the Dubai Metro) as a possibility and does not assert that it will happen. Therefore, H is not entailed by T, but H does not contradict T either.
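
The relation between the two labeling schemes can be written down directly: Entailment corresponds to YES, while both Contradiction and Unknown fall under NO. The snippet below is a minimal sketch of that mapping; the enum `ThreeWay` and the function `to_two_way` are hypothetical names introduced for illustration and are not part of any RTE dataset tooling.

```python
from enum import Enum

class ThreeWay(Enum):
    """The three labels used in the RTE4 3-way task."""
    ENTAILMENT = "Entailment"
    CONTRADICTION = "Contradiction"
    UNKNOWN = "Unknown"

def to_two_way(label):
    """Collapse a 3-way label to the 2-way YES/NO scheme."""
    return "YES" if label is ThreeWay.ENTAILMENT else "NO"

# Labels of the three examples discussed above.
example_labels = {
    "Example 1": ThreeWay.ENTAILMENT,
    "Example 2": ThreeWay.CONTRADICTION,
    "Example 3": ThreeWay.UNKNOWN,
}

for name, label in example_labels.items():
    print(name, label.value, to_two_way(label))
```

The mapping only works in one direction: a 2-way NO label does not say whether the pair is a Contradiction or Unknown, which is why the 3-way task is the harder one.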

Key Terms in this Chapter

Text Entailment: A process that aims to determine whether the meaning of one text fragment can be inferred from the meaning of another text fragment.

Parsing: Breaking down a sentence into its component parts so that the meaning of the sentence can be understood.

Text Similarity: A measure that indicates how similar (usually, in meaning) two given texts are to each other.

Semantics: The meaning and interpretation of words, signs, or sentence structures; it largely determines reading comprehension.

Syntax: The rules that govern the ways in which words combine to form phrases, clauses, and sentences. It determines the proper order of words in a phrase or sentence.

Lexical Property: A property related to words or the vocabulary of a language.
