Natural language understanding and assessment is a subset of natural language processing (NLP). The primary purpose of natural language understanding algorithms is to convert written or spoken human language into representations that can be manipulated by computer programs. Complex learning environments such as intelligent tutoring systems (ITSs) often depend on natural language understanding for fast and accurate interpretation of human language so that the system can respond intelligently in natural language. These ITSs function by interpreting the meaning of student input, assessing the extent to which it manifests learning, and generating suitable feedback to the learner. To be effective, such systems must be fast enough to operate in real time. Delays in feedback caused by computational processing risk frustrating the user and lowering engagement with the system. At the same time, accurate assessment of student input is critical because inaccurate feedback can compromise learning and lower the student’s motivation and metacognitive awareness of the learning goals of the system (Millis et al., 2007). As such, student input in ITSs requires an assessment approach that is fast enough to operate in real time but accurate enough to provide appropriate evaluation.
One of the ways in which ITSs with natural language understanding verify student input is through matching. In some cases, the match is between the user input and a pre-selected stored answer to a question, solution to a problem, misconception, or other form of benchmark response. In other cases, the system evaluates the degree to which the student input varies from a complex representation or a dynamically computed structure. The computation of matches and similarity metrics is limited by the fidelity and flexibility of the computational linguistics modules.
The major challenge with assessing natural language input is that it is relatively unconstrained and rarely conforms to the brittle rules on which computation of spelling, syntax, and semantics depends (McCarthy et al., 2007). Researchers who have developed tutorial dialogue systems in natural language have explored the accuracy of matching students’ written input to targeted knowledge. Examples of these systems are AutoTutor and Why-Atlas, which tutor students on Newtonian physics (Graesser, Olney, Haynes, & Chipman, 2005; VanLehn, Graesser, et al., 2007), and the iSTART system, which helps students read text at deeper levels (McNamara, Levinstein, & Boonthum, 2004). Systems such as these have typically relied on statistical representations, such as latent semantic analysis (LSA; Landauer, McNamara, Dennis, & Kintsch, 2007) and content word overlap metrics (McNamara, Boonthum, et al., 2007). Such statistical and word overlap algorithms have been highly successful. However, over short dialogue exchanges (such as those in ITSs), the accuracy of interpretation can be seriously compromised without a deeper level of lexico-syntactic textual assessment (McCarthy et al., 2007). One such lexico-syntactic approach, entailment evaluation, is presented in this chapter. The approach incorporates deeper natural language processing solutions for ITSs with natural language exchanges while remaining sufficiently fast to provide real-time assessment of user input.
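To make the limitation of word overlap concrete, the following is a minimal sketch of a content word overlap score, not the metric used by any of the systems cited above. The function names, the tiny stop-word list, and the example sentences are all assumptions introduced for illustration.

```python
import re

# Hypothetical, deliberately small stop-word list (an assumption for this
# sketch); a real system would use a much fuller list.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on",
              "and", "that", "it", "by"}


def content_words(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def overlap_score(student_input, expectation):
    """Fraction of the expectation's content words found in the input."""
    expected = content_words(expectation)
    if not expected:
        return 0.0  # nothing to match against
    return len(expected & content_words(student_input)) / len(expected)


# A partial answer covers half of the expectation's content words.
partial = overlap_score("The force is zero",
                        "The net force on the object is zero")

# Word order is ignored, so a reversed meaning still scores a full match.
reversed_meaning = overlap_score("The man bit the dog",
                                 "The dog bit the man")
```

Here `partial` is 0.5 and `reversed_meaning` is 1.0. The second result shows why, over short exchanges, a score based only on shared words can misjudge an answer whose syntax changes its meaning.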
Entailment evaluations help assess the appropriateness of student responses during ITS exchanges. Entailment can be distinguished from three similar terms (implicature, paraphrase, and elaboration), all of which are also important for assessment in ITS environments (McCarthy et al., 2007).
The term entailment is often associated with the highly similar concept of implicature. The distinction is that entailment is reserved for linguistic-based inferences that are closely tied to explicit words, syntactic constructions, and formal semantics, as opposed to the knowledge-based implied referents and references, for which the term implicature is more appropriate (McCarthy et al., 2007). Implicature corresponds to the controlled knowledge-based elaborative inferences defined by Kintsch (1993) or to knowledge-based inferences defined in the inference taxonomies in discourse psychology (Graesser, Singer, & Trabasso, 1994).
Key Terms in this Chapter
Expectation: A stored (generally ideal) answer to a problem, against which input is evaluated; concept used in ITSs.
Natural Language Understanding and Assessment: An NLP subset focusing on evaluating natural language input in intelligent tutoring systems.
Intelligent Tutoring System: An interactive, feedback-based computer system designed to help students learn various topics.
Syntactic Parsing: The process of discovering the underlying structure of sentences.
Dependency: A binary relation between two words in a sentence, whose label indicates the syntactic relation between the two words.
Graph Subsumption: A specific example of graph isomorphism. Isomorphism exists when two graphs are equivalent. Subsumption can be viewed as subgraph isomorphism.
Natural Language Processing: The science of capturing the meaning of human language in computational representations and algorithms.
Entailment: The task of deciding whether one text fragment logically or semantically implies another text fragment.
Latent Semantic Analysis: A statistical technique for human language understanding based on words that co-occur in documents of large corpora.
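The Dependency and Graph Subsumption entries can be illustrated together with a minimal sketch. It assumes each sentence is already available as a set of (head, label, dependent) triples; the triples below are written by hand rather than produced by a parser, and the labels `nsubj` and `dobj` as well as the function name are assumptions made for illustration. Because the nodes are identified by their words, the subgraph check reduces here to set inclusion, which is a simplification of general subgraph isomorphism.

```python
def subsumption_score(text_deps, hypothesis_deps):
    """Fraction of the hypothesis's dependency triples found in the text.

    A score of 1.0 means the text graph subsumes the hypothesis graph,
    that is, every labeled relation in the hypothesis also occurs in
    the text.
    """
    hypothesis = set(hypothesis_deps)
    if not hypothesis:
        return 1.0  # an empty graph is trivially subsumed
    return len(hypothesis & set(text_deps)) / len(hypothesis)


# Hand-written dependencies for "The dog bit the man" (determiners omitted).
text = {("bit", "nsubj", "dog"), ("bit", "dobj", "man")}

# Hypothesis "The dog bit": its single relation is contained in the text.
contained = {("bit", "nsubj", "dog")}

# Hypothesis "The man bit the dog": same words, but the relations differ.
swapped = {("bit", "nsubj", "man"), ("bit", "dobj", "dog")}

contained_score = subsumption_score(text, contained)
swapped_score = subsumption_score(text, swapped)
```

Here `contained_score` is 1.0 and `swapped_score` is 0.0: the two sentences share every word, yet none of the labeled relations match, which is the kind of lexico-syntactic distinction that dependency-based comparison can capture.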