Introduction
Information retrieval (IR) is a broad field that covers the extraction of specific information from a pool of information resources. Today, users expect direct answers to the queries they post in search engines. Question Answering (QA) systems address this need by returning passages as answers. QA systems are a boon to the teaching and learning community because they provide short answers instead of long documents. The general architecture of QA systems is shown in Figure 1. A few works have focused on customizing QA systems to facilitate e-learning. In closed-domain QA systems, exact answers to questions have been obtained through extensive use of Natural Language Processing (NLP) techniques. Other works relied on course contents, Frequently Asked Questions (FAQs), and ratings to cross-verify the essence of questions using a recommender system. Leema and Gulzar (2018) proposed a system that generates course recommendations for students on learning platforms based on a query-classification technique.
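The generic QA pipeline referenced above (question processing, passage retrieval, answer selection) can be sketched minimally as follows. The passages, the question, and the term-overlap scoring are illustrative assumptions, not the architecture of any cited system:

```python
# A minimal sketch of a generic QA pipeline stage: rank candidate
# passages by term overlap with the question. Real systems replace
# this with learned retrieval and answer-extraction components.

def tokenize(text):
    return text.lower().split()

def retrieve(question, passages, top_k=1):
    """Rank passages by the number of terms they share with the question."""
    q_terms = set(tokenize(question))
    scored = [(len(q_terms & set(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:top_k] if score > 0]

passages = [
    "Information retrieval extracts information from document collections.",
    "Photosynthesis converts light energy into chemical energy.",
]
answers = retrieve("what does information retrieval do", passages)
```

Here the first passage is returned because it shares the terms "information" and "retrieval" with the question; the second shares none.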
Factoid queries are based on simple facts or definitions. Over the decades, extensive research has been conducted on improving the quality of search results for factoid queries. Recent research in question answering has turned to the extraction of answers to non-factoid queries.
Non-factoid queries contain lengthy sentences, and answering them requires considering multiple facets. Answer retrieval for a non-factoid query poses the following major challenges: the answers should cover multiple aspects of the query; the answers may span multiple passages; and the answers may not contain the exact terms of the query, which makes identifying the correct answer a critical task.
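The last challenge above can be made concrete with a small sketch: a relevant answer may share no surface terms with the query, so exact-match scoring fails where synonym-aware scoring succeeds. The tiny synonym table is a made-up stand-in for a real lexical resource such as WordNet:

```python
# Illustrates vocabulary mismatch in non-factoid QA: exact term overlap
# misses a relevant answer that synonym expansion can recover.
# SYNONYMS is a toy, hand-written table for illustration only.

SYNONYMS = {"car": {"automobile", "vehicle"}, "fix": {"repair"}}

def expand(terms):
    """Add known synonyms to a set of query terms."""
    out = set(terms)
    for t in terms:
        out |= SYNONYMS.get(t, set())
    return out

def overlap_scores(query, answer):
    q = set(query.lower().split())
    a = set(answer.lower().split())
    return len(q & a), len(expand(q) & a)

exact, expanded = overlap_scores("fix broken car",
                                 "the automobile needs repair")
```

The exact overlap is zero even though the answer is relevant, while expansion matches "car" to "automobile" and "fix" to "repair".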
Figure 1. General Architecture for Question Answering Systems

Deep learning nowadays not only dominates image-processing applications but has also swept into text-mining applications. As a consequence, a technique called zero-shot learning emerged, in which a machine can predict the correct class for unseen data. Many researchers have implemented zero-shot learning for image-processing applications (Xie et al., 2019; Fu et al., 2018b; Liu et al., 2018; Xiong et al., 2016; Gavves et al., 2015), and only a few works have used it for text processing (Artetxe & Schwenk, 2019; Zhang et al., 2019; Fu et al., 2018a; Yazdani & Henderson, 2015). This paper focuses on the implementation of zero-shot learning for text processing, especially for non-factoid question answering, and then summarizes the appropriate answers using the summarization techniques adopted in Ha et al. (2018) and Cao et al. (2017). This model could be incorporated into teaching and learning platforms such as Massive Open Online Courses (MOOCs).
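The core idea of zero-shot text classification described above can be sketched as follows: instead of training on labelled examples of each class, a text is compared against natural-language descriptions of candidate labels, so classes never seen in training can still be predicted. The bag-of-words vectors and label descriptions below are illustrative assumptions; practical systems use learned sentence embeddings:

```python
# A minimal sketch of zero-shot text classification: score a text
# against a description of each candidate label and pick the closest.
# Bag-of-words cosine similarity stands in for a learned embedding.

from collections import Counter
from math import sqrt

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

LABEL_DESCRIPTIONS = {  # classes with no training examples at all
    "sports": "a question about sports games teams and players",
    "cooking": "a question about cooking recipes food and ingredients",
}

def zero_shot_classify(text):
    return max(LABEL_DESCRIPTIONS,
               key=lambda lbl: cosine(bow(text), bow(LABEL_DESCRIPTIONS[lbl])))

label = zero_shot_classify("which ingredients go into this recipe")
```

The query is assigned to "cooking" purely through its similarity to that label's description, with no per-class training data.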
Theoretical Framework
In search of answers to non-factoid queries, many researchers have experimented with various NLP and machine-learning techniques applied with neural networks and with probabilistic and algebraic models. The neural network acts as an intelligent negotiator that selects appropriate answers and determines whether an answer is relevant to a specific query. As an advancement of neural networks, deep learning came into widespread use. Answer extraction focuses on the use of Community Question Answering (CQA) sites and other external knowledge bases. CQA is chosen for this task because its answers are written directly by human beings, so their reliability tends to be good. Weber et al. (2012) emphasized mining tips from Yahoo! Answers and used those tips to create tail answers. Keikha et al. (2014) focused on creating a collection of questions and passage-level answers using TREC GOV2 queries and documents.
Deep learning approaches yield better answers for non-factoid QA than traditional IR methods. At the time, the Convolutional Neural Network (CNN) was found to be the best architecture for feature extraction. Far beyond classification, CNNs can also perform other NLP tasks such as document summarization, QA, and sentiment analysis (Kim, 2014). In QA, Yih et al. (2014) evaluated the semantic likeness between a query and records in a Knowledge Base (KB) to determine supporting facts while answering a query. Subsequently, Dong et al. (2015) proposed a Multi-Column CNN (MCCNN) that could examine and recognize many facets of a query and craft its representations.
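The feature extraction that CNN text models such as Kim (2014) perform can be illustrated in miniature: a filter slides over windows of consecutive word embeddings, and max-pooling keeps the strongest response as a feature. The 2-dimensional embeddings and filter weights below are made up for illustration; trained models learn both:

```python
# A toy 1-D convolution over word vectors, the core operation of
# CNN-based text models: concatenate each bigram's embeddings,
# apply a filter, and max-pool over positions.

EMBED = {  # hypothetical 2-d word embeddings
    "what": [0.1, 0.0], "is": [0.0, 0.1],
    "information": [0.9, 0.2], "retrieval": [0.8, 0.3],
}

FILTER = [0.5, 0.5, 0.5, 0.5]  # one filter spanning a bigram (2 words x 2 dims)

def conv_max_pool(tokens):
    """Slide the bigram filter over the sentence; keep the max response."""
    responses = []
    for i in range(len(tokens) - 1):
        window = EMBED[tokens[i]] + EMBED[tokens[i + 1]]  # concatenate vectors
        responses.append(sum(w * x for w, x in zip(FILTER, window)))
    return max(responses)

feature = conv_max_pool(["what", "is", "information", "retrieval"])
```

The strongest response here comes from the "information retrieval" bigram, showing how max-pooling surfaces the most salient n-gram regardless of its position in the sentence.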