Integrating Semi-Open Data in a Criminal Judicial Setting

Mortaza S. Bargh, Sunil Choenni, Ronald F. Meijer
DOI: 10.4018/978-1-5225-0717-8.ch007


Judiciary systems comprise various partner organizations (e.g., police, public prosecutor, courts, and rehabilitation centres) that collaboratively resolve criminal cases. These partner organizations have their own data administration and management systems, which are set up and operated separately and are barely integrated. This chapter explains the approach of the authors' organization for integrating the data sets of the Dutch judiciary systems, and for opening the data integration outcomes to the public and/or to specific groups. These outcomes (e.g., data sets and reports) are meant to provide useful insights into the partner organizations and their performance, individually and collectively. Such data opening efforts do not comply with all Open Data requirements, mainly due to the quality, (privacy) sensitivity and interoperability issues of the raw data. Nevertheless, since these initiatives aim at delivering some benefits of Open Data, the chapter introduces the new paradigm of Semi-Open Data to acknowledge such data opening initiatives.
Chapter Preview


In recent years we have observed a surge in Open Data movements. The term Open Data refers to data that third parties can reuse and redistribute at no cost (ODef, 2015). The idea of Open Data has gained wide popularity for several good reasons. For a government, Open Data could be a means to strengthen democracy. By opening up its data, a government provides transparency into its operations, meets regulatory compliance, and increases public participation and collaboration. Open Data is also expected to boost economic development and innovation. Commercial companies and start-ups can exploit the data to develop new applications and services such as recommender apps (Choenni, Bargh, Roepan, & Meijer, 2015) and to create new types of professions such as data journalists and fact checkers. Furthermore, businesses and individuals can use Open Data to support their strategic and personal decisions. For example, individuals can use neighborhood statistics when buying houses or when planting trees and plants in polluted areas. Scientists can reuse the data for reviewing scientific results as well as for their own evidence-based research.

Data opening, however, appears to be a tedious and almost infeasible process for many organizations. The literature identifies several impediments to Open Data; see, for example, (Zuiderwijk, Janssen, Choenni, Meijer, & Sheikh Alibaks, 2012; CoA report, 2015). These obstacles range from privacy concerns to technical issues. In this chapter, the authors argue that acting according to the current definition of Open Data (ODef, 2015) is too challenging for the judicial field. For example, three typical challenges are privacy, legacy and interoperability of judicial data. First, opening judicial data ‘as it is’ often contravenes privacy laws and regulations because such data pertain to real-life persons and encompass privacy-sensitive attributes like names, birth dates, crime types, and sentences. Even if these sensitive items are removed or anonymized, they may still be inferred by combining several chunks of information. As an example of such inference, suppose that a school decides to publish the average exam marks of its male and female students. If we know that only two female students took part in an exam and if we are able to derive from social media that one of these female students got a 9, then the result of the other female student can be deduced, because the two marks must sum to twice the published average.
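The arithmetic behind the school example can be made concrete with a minimal sketch. The function name `infer_remaining_mark` and the published average of 7.5 are hypothetical values chosen for illustration; the source gives only the group size (two) and the known mark (9).

```python
def infer_remaining_mark(published_average, group_size, known_marks):
    """Recover the one unknown mark in a group from its published average.

    Works only when all marks but one are already known, which is the
    situation described in the school example.
    """
    if len(known_marks) != group_size - 1:
        raise ValueError("exactly one mark in the group must be unknown")
    # The average times the group size gives the sum of all marks;
    # subtracting the known marks leaves the unknown one.
    return published_average * group_size - sum(known_marks)


# Hypothetical figures: the school publishes 7.5 as the average for its
# two female students, and social media reveals that one of them got a 9.
other_mark = infer_remaining_mark(7.5, 2, [9.0])
print(other_mark)  # 6.0
```

The smaller the group behind an aggregate, the less outside knowledge is needed for this kind of inference.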

Opening judicial data as raw as possible, as required in Open Data (OD101, 2015; Wonderlich, 2010), may put data analysts on the wrong track due to the legacy nature of judicial data. The semantics of judicial data may evolve over time since rules and regulations are subject to frequent changes. For example, crimes are grouped into categories; the well-known ones include violent crime, theft, burglary, drugs offence, drunk driving, vandalism, and economic crime (Moolenaar, Choenni, & Leeuw, 2007). Today's crime categories might differ from those of ten years ago. For example, ten years ago Internet-related crimes were assigned to the category of ‘others’ or ‘economic crime’, while today there is a call for a new category ‘cybercrime’. If a new category is added to a categorization system or an existing category is deleted from it, then the objects need to be redistributed among the new set of categories. For a proper interpretation of the results obtained by analyzing or querying such data, it is of vital importance to know how the semantics of the data have evolved over time. Suppose that someone is interested in knowing how the category economic crime has developed over the last ten years, during which the objects have been redistributed among the categories due to the introduction of the new category of cybercrime. This redistribution may cause a trend break. To interpret such a break properly, one should therefore know how and when the redistribution was carried out.
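The trend break caused by a redistribution can be illustrated with a small sketch. All counts, years, and function names below are hypothetical, not taken from the chapter. The sketch also assumes, for simplicity, that every case in the new category ‘cybercrime’ would previously have been counted as ‘economic crime’.

```python
# Hypothetical yearly case counts; 'cybercrime' is introduced in 2014.
counts = {
    2012: {"economic crime": 100},
    2013: {"economic crime": 104},
    2014: {"economic crime": 70, "cybercrime": 38},
    2015: {"economic crime": 72, "cybercrime": 40},
}


def raw_series(counts, category):
    """Yearly counts for one category, exactly as recorded."""
    return {year: cats.get(category, 0) for year, cats in counts.items()}


def harmonized_series(counts, categories):
    """Yearly counts summed over categories that were once a single category."""
    return {
        year: sum(cats.get(c, 0) for c in categories)
        for year, cats in counts.items()
    }


def largest_drop(series):
    """Return (year, change) for the most negative year-on-year change."""
    years = sorted(series)
    changes = {y: series[y] - series[prev] for prev, y in zip(years, years[1:])}
    year = min(changes, key=changes.get)
    return year, changes[year]


raw = raw_series(counts, "economic crime")
merged = harmonized_series(counts, ["economic crime", "cybercrime"])

print(largest_drop(raw))  # (2014, -34): an apparent collapse in economic crime
print(merged)             # {2012: 100, 2013: 104, 2014: 108, 2015: 112}
```

In this made-up data the raw series shows a sharp drop in 2014, while the harmonized series keeps rising steadily. Without knowing how and when the redistribution happened, an analyst could mistake the drop for a real decline.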
