A Yes/No Answer Generator Based on Sentiment-Word Scores in Biomedical Question Answering


Mourad Sarrouti, Said Ouatik El Alaoui
DOI: 10.4018/978-1-7998-1204-3.ch005

Abstract

Background and Objective: Yes/no question answering (QA) in the open domain is a long-standing challenge that has been widely studied over the last decades. However, it still requires further effort in the biomedical domain. Yes/no QA aims at answering yes/no questions, which seek a clear “yes” or “no” answer. In this paper, we present a novel yes/no answer generator based on sentiment-word scores in biomedical QA. Methods: In the proposed method, we first use Stanford CoreNLP to tokenize and part-of-speech tag all passages relevant to a given yes/no question. We then assign a sentiment score based on SentiWordNet to each word of the passages. Finally, the decision between the answers “yes” and “no” is based on the resulting sentiment-passages score: “yes” for a positive final sentiment-passages score and “no” for a negative one. Results: Experimental evaluations performed on BioASQ collections show that the proposed method is more effective than the current state-of-the-art method, significantly outperforming it by an average of 15.68% in terms of accuracy.
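The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon below is a hypothetical toy stand-in for SentiWordNet, a regular expression replaces Stanford CoreNLP tokenization, part-of-speech tagging is omitted, and treating a score of exactly zero as "no" is an assumption, since the abstract only specifies the positive and negative cases. The names `TOY_LEXICON`, `passage_score`, and `answer_yes_no` are invented for illustration.

```python
import re

# Hypothetical toy lexicon: word -> (positive score, negative score).
# These values are invented for illustration and are NOT SentiWordNet entries.
TOY_LEXICON = {
    "effective": (0.75, 0.0),
    "improves": (0.5, 0.0),
    "associated": (0.25, 0.0),
    "fails": (0.0, 0.625),
    "ineffective": (0.0, 0.75),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (stand-in tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def passage_score(passage, lexicon=TOY_LEXICON):
    """Sum (positive - negative) over all tokens; unknown words contribute 0."""
    total = 0.0
    for token in tokenize(passage):
        pos, neg = lexicon.get(token, (0.0, 0.0))
        total += pos - neg
    return total


def answer_yes_no(passages, lexicon=TOY_LEXICON):
    """Return "yes" for a positive overall score, otherwise "no".

    Assumption: a score of exactly zero falls back to "no".
    """
    score = sum(passage_score(p, lexicon) for p in passages)
    return "yes" if score > 0 else "no"
```

For example, `answer_yes_no(["The drug is effective and improves survival."])` sums 0.75 and 0.5 and returns "yes", while a passage containing only "fails" and "ineffective" from the toy lexicon yields a negative score and returns "no".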

Introduction

The large volume of literature in the biomedical domain makes it difficult for information seekers, even within their own field of interest, to find the information they need (McDermid, Kristjanson, & Spry, 2010). The most widely used tools for accessing biomedical information are information retrieval (IR) systems, such as PubMed, which gives access to the MEDLINE biomedical bibliographic database (Hristovski, Dinevski, Kastrin, & Rindflesch, 2015). However, finding short, precise, and sufficient answers is a challenging task for classical IR systems (Wren, 2011). In addition, users of classical IR systems often have to bear the burden of studying and filtering the citations returned for their queries in order to find the precise information they were looking for. Therefore, there is growing interest in biomedical question answering systems, which aim to minimize searching and browsing time while maximizing the usefulness of the retrieved knowledge (Bauer & Berleant, 2012). Question answering (QA) is regarded as a sophisticated form of IR characterized by information needs that are expressed as natural language statements or questions (Wren, 2011). It aims at providing inquirers with specific pieces of information as an answer by automatically analyzing thousands of articles, ideally within a few seconds. Typically, an automated QA system consists of three main processing phases, which can be studied and developed independently (Athenikos & Han, 2010; Cao et al., 2010; Neves & Leser, 2015): (1) question processing, (2) document processing, and (3) answer processing. Figure 1 illustrates the generic architecture of a biomedical QA system.

An input biomedical question is first handed to the question processing phase, which consists of the following components: (a) question analysis, which extracts useful information such as biomedical entity names and semantic relationships; (b) question classification, which identifies the answer format and the topic (Cao et al., 2010; Patrick & Li, 2012; Roberts et al., 2014; Lopes et al., 2014; Sarrouti et al., 2015); and (c) query formulation, which constructs an IR-style query by transforming the question into a canonical form. The output of this phase is an appropriate query, which is used as input to the second phase, document processing. An IR system is normally used to retrieve the relevant documents (Sarrouti & Alaoui, 2016). Passages are then extracted from these documents; they serve as answer candidates and as input to the last phase, answer processing. In this phase, the system uses an appropriate answer extraction algorithm to estimate the quality of the candidate answers. Finally, the top-ranked candidate answers and the raw texts from which they were extracted are shown to the user (Sarrouti & Ouatik, 2017).
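The three phases described above can be sketched as a small pipeline. This is only an illustrative toy under strong assumptions: stopword removal stands in for question analysis and query formulation, exact term matching stands in for the IR system, and a term-overlap count stands in for the answer extraction algorithm. All function names, the stopword list, and the `Candidate` type are hypothetical and do not come from the chapter.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    passage: str
    score: float


# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"is", "the", "a", "an", "of", "in", "for", "does", "are", "with"}


def process_question(question):
    """Phase 1 (toy): turn the question into a list of query terms."""
    tokens = question.lower().rstrip("?").split()
    return [t for t in tokens if t not in STOPWORDS]


def process_documents(query, corpus):
    """Phase 2 (toy): keep passages that contain at least one query term."""
    return [p for p in corpus if any(t in p.lower().split() for t in query)]


def process_answers(query, passages):
    """Phase 3 (toy): score each passage by matched query terms and rank."""
    candidates = [
        Candidate(p, float(sum(t in p.lower().split() for t in query)))
        for p in passages
    ]
    # sorted() is stable, so ties keep their retrieval order.
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def answer(question, corpus):
    """Chain the three phases: question -> query -> passages -> ranked candidates."""
    query = process_question(question)
    passages = process_documents(query, corpus)
    return process_answers(query, passages)
```

Because each phase only consumes the previous phase's output, any one of them can be replaced (for example, swapping the term matcher for a real search engine) without touching the others, which mirrors the point that the phases can be studied and developed independently.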

Figure 1. A typical biomedical question answering system architecture
