A Case Study on Data Quality, Privacy, and Entity Resolution

William Decker (University of Arkansas – Little Rock, USA), Fan Liu (University of Arkansas – Little Rock, USA), John Talburt (University of Arkansas – Little Rock, USA), Pei Wang (University of Arkansas – Little Rock, USA) and Ningning Wu (University of Arkansas – Little Rock, USA)
Copyright: © 2014 | Pages: 22
DOI: 10.4018/978-1-4666-4892-0.ch004

This chapter presents ongoing research, conducted in collaboration between the University of Arkansas at Little Rock and the Arkansas Department of Education, to develop an entity resolution and identity management system. The work follows a multi-phase approach: data-quality analysis, selection of entity-identity attributes for entity resolution, development of a truth set, and implementation and benchmarking of an entity-resolution rule set using the open-source entity-resolution system OYSTER. The research is the first known effort of its kind to evaluate privacy-enhancing entity-resolution rule sets in a state education agency.
Chapter Preview


Public elementary and secondary schools had 49.4 million students enrolled in the 2009-10 school year (Institute of Education Sciences, 2011). In the same year, approximately 21.5 million students were enrolled in public post-secondary schools. Institutions receiving public funds from the U.S. Department of Education (USDOE) are subject to the Family Educational Rights and Privacy Act (FERPA) (20 U.S.C. § 1232g; 34 CFR Part 99). FERPA is administered by the USDOE's Family Policy Compliance Office.

Under the precept that better decisions require better information, significant resources were provided to develop databases that house educational data and give educators data-driven decision-making capabilities. This is evidenced by the USDOE’s award of four rounds of statewide longitudinal data system (SLDS) grants to U.S. states in 2006, 2007, and 2009 (two rounds were issued in 2009). Forty-one states and the District of Columbia received at least one grant to build a statewide longitudinal data system (Institute of Education Sciences, 2011). These systems were built to “efficiently and accurately manage, analyze, and use education data, including individual student records” (Institute of Education Sciences, 2011).
