Analyzing Process Data from Problem-Solving Items with N-Grams: Insights from a Computer-Based Large-Scale Assessment

Qiwei He (Educational Testing Service, USA) and Matthias von Davier (Educational Testing Service, USA)
DOI: 10.4018/978-1-4666-9441-5.ch029

This chapter draws on process data recorded in a computer-based large-scale program, the Programme for the International Assessment of Adult Competencies (PIAAC), to address how sequences of actions recorded in problem-solving tasks are related to task performance. The purpose of this study is twofold: first, to extract and detect robust sequential action patterns that are associated with success or failure on a problem-solving item, and second, to compare the extracted sequence patterns among selected countries. Motivated by the methodologies of natural language processing and text mining, we applied feature selection models to the process data at a variety of aggregate levels and evaluated the different methodologies in terms of the predictive power of the evidence extracted from the process data. Action sequence patterns were found to differ significantly between performance groups and to be consistent across countries. This study also demonstrated that the process data were useful in detecting missing data and potential mistakes in item development.
Chapter Preview


Computer-based assessments (CBAs) are used for more than increasing construct validity (e.g., Sireci & Zenisky, 2006) and improving test design (e.g., van der Linden, 2005). They also provide new insights into behavioral processes related to task completion that cannot be easily observed using paper-based instruments (Goldhammer, Naumann, & Keßel, 2013). In CBAs, a variety of timing and process data accompanies test performance data, so much more information is available than whether a response was correct or incorrect. Analyses of these types of data are necessarily much more involved than those typically performed on traditional tests.

This study draws on process data from log files recorded in a computer-based large-scale program, the Programme for the International Assessment of Adult Competencies (PIAAC; cf. Schleicher, 2008), to address the question of how sequences of actions recorded in problem-solving tasks are related to task performance. As shown in the following, by analyzing the process data produced by test takers in different performance groups, we were able to obtain insights into how these action sequences are associated with different ways of cognitive processing and to identify key actions that lead to success or failure. These results can help test developers, psychometricians, and instructors better understand what distinguishes successful from unsuccessful test takers and may eventually contribute to improved task and assessment design.

Problem-Solving Items in PIAAC

Large-scale survey assessments of skills and knowledge targeting student and adult populations have often been at the forefront of innovations in test design and the use of analytic methodologies (Rutkowski, Gonzalez, von Davier, & Zhou, 2014; von Davier, Sinharay, Oranje, & Beaton, 2006; von Davier & Sinharay, 2014). PIAAC is no exception. PIAAC, whose first results were released by the Organisation for Economic Co-operation and Development (OECD) in October 2013, is an assessment of literacy skills among adult populations that provides important comparative information to government leaders and policy makers worldwide.

Of significance here, it is the first international household survey of skills predominantly collected using information and communications technology (ICT). The use of computers as the delivery platform enables data collection not just on whether respondents are able to solve the tasks but how they approach the solution and time their efforts.

Three constructs are measured in PIAAC: literacy and numeracy, which are both available in computer-based and paper-based modes, and problem solving in technology-rich environments (PSTRE), which involves more interactive item types and is available only on computer. The PSTRE items that we focus on in this study are used to assess the skills required to solve problems for personal, work, and civic purposes by setting up appropriate goals and plans, and accessing and making use of information through computers and networks (OECD, 2009).

The construct behind the PSTRE items describes skillful use of ICT as collecting and evaluating information for communicating and performing practical tasks such as organizing a social activity, deciding between alternative offers, or judging the risks of medical treatments (OECD, 2009). To give a response in the simulated computer environments that form the PSTRE tasks, participants are required to click buttons or links, select from dropdown menus, drag and drop, copy and paste, and so on. Two example items are shown in Figure 1 and in Figures 2-4. The layout of the first item (Figure 1) is rather simple. To solve this problem, test takers are required to fill in the box by performing calculations based on the table provided. The second item (Figures 2-4) is more complex as three environments are involved in a Web searching task. In this item, test takers are required to access and evaluate information in the context of a simulated job search. As shown in the item directions, located on the left side of the screen, test takers must find and bookmark one or more sites that do not require users to register or pay a fee.

Figure 1. An example item with a single environment used in the PIAAC PSTRE test (OECD, 2012)

Figure 2. An example item with multiple environments used in the PIAAC PSTRE test (OECD, 2012). This is the opening screen of a job search task (see other screens in Figures 3 and 4).

Figure 4. Second page of the website shown in Figure 3. Details on fees and registration are located in the directions for the form.

Key Terms in this Chapter

Frequency: Term (action) frequency captures how salient an action is within a sequence. Sequence frequency refers to the number of sequences that contain a certain action.

Sequential Pattern Mining: A type of data mining related to finding statistically relevant patterns between data examples where values are delivered in a sequence.

N-Gram: In computational linguistics and probability, a contiguous sequence of n items from a given sequence of text or speech. Items can be phonemes, syllables, letters, words, or actions, depending on the application.

Problem Solving in Technology-Rich Environments (PSTRE): In PIAAC, the ability to use technology to solve problems and accomplish complex tasks in a contextualized setting (buying concert tickets, organizing multiple work-group meetings using web-based tools, etc.). PSTRE items generally involve interactive item types and are available only on computer.

Sequence Feature Selection: A process to select the informative (“good”) features from a set of potential features that will be further used in data analytic tasks.

Text Mining: The process of deriving information from text using data analytic techniques.

Computer-Based Assessment (CBA): Assessment built around the use of a computer (or smart phone or tablet) to collect response data.

Process Data: Data (often stored in log files) that captures respondents’ interactions with the computer separated into discrete (typically time-stamped) actions while working on a computer-based task.

Natural Language Processing (NLP): A field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages.
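
To make the terms N-Gram and Frequency concrete, the following minimal Python sketch extracts action n-grams from process data and counts them in both senses defined above. It is an illustration only, not the procedure used in the chapter: the action labels (START, CLICK_LINK, BOOKMARK, and so on), the respondent identifiers, and the function names are all hypothetical, and real log files would first need to be parsed into ordered lists of actions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

NGram = Tuple[str, ...]


def ngrams(actions: List[str], n: int) -> List[NGram]:
    """Return every contiguous run of n actions, in order of occurrence."""
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]


def term_frequency(actions: List[str], n: int) -> Counter:
    """Count how often each n-gram occurs within a single sequence."""
    return Counter(ngrams(actions, n))


def sequence_frequency(sequences: Iterable[List[str]], n: int) -> Counter:
    """Count how many sequences contain each n-gram at least once."""
    counts: Counter = Counter()
    for actions in sequences:
        # set() ensures a sequence contributes at most 1 per n-gram
        counts.update(set(ngrams(actions, n)))
    return counts


# Hypothetical action sequences for three respondents (assumed labels).
logs: Dict[str, List[str]] = {
    "P1": ["START", "CLICK_LINK", "BOOKMARK", "NEXT"],
    "P2": ["START", "CLICK_LINK", "BACK", "CLICK_LINK", "BOOKMARK", "NEXT"],
    "P3": ["START", "NEXT"],
}

unigram_tf_p2 = term_frequency(logs["P2"], 1)
bigram_sf = sequence_frequency(logs.values(), 2)

print(unigram_tf_p2[("CLICK_LINK",)])         # occurrences within P2
print(bigram_sf[("CLICK_LINK", "BOOKMARK")])  # sequences containing the bigram
```

In this sketch, term frequency is computed per respondent and sequence frequency across respondents, mirroring the distinction between term and document frequency in text mining. Counts of this kind are the sort of input a sequence feature selection step could use to compare performance groups.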