A Critical Data Ethics Analysis of Algorithmic Bias and the Mining/Scraping of Heirs' Property Records

DOI: 10.4018/979-8-3693-1762-4.ch015


The data and research ethics surrounding artificial intelligence (AI), machine learning (ML), and data mining/scraping (DMS) have been widely discussed within scholarship and among regulatory bodies. Concurrently, scholarship has continued to examine heirs' property land rights and land dispossession within the Gullah Geechee community along the Gullah Geechee Cultural Heritage Corridor (GGCHC). This chapter presents the results of a critical data ethics analysis of the risks of data brokerage, algorithmic bias, and the use of DMS by dominant groups external to the GGCHC, and of the ensuing privacy implications, discrimination, and ongoing land dispossession of heirs' property owners. Findings indicate a gap in documented research on heirs' property records, yet algorithmic bias affecting the Gullah Geechee was evident. Further research is needed to better understand the data privacy protections needed for heirs' property records, along with ongoing scrutiny of local versus federal data privacy policy specific to those records.
Chapter Preview


The data ethics and privacy challenges surrounding artificial intelligence (AI), machine learning (ML), and data mining/scraping (DMS) have been well documented within scholarship and among regulatory bodies (Dobbs & Gaither, 2023; Throne, 2022; Strobel & Shokri, 2022). Recently, social media companies have faced unprecedented fines and other penalties for these invasions and data extractions, and the U.S. Congress is considering proliferation protections for the use of AI (Throne, 2022). A subset of the scholarship has focused explicitly on the destructive nature of algorithmic bias, digital discrimination, and threats to data privacy (Strobel & Shokri, 2022). Opportunities exist for the design of “robust algorithms that are also accurate and privacy preserving” (Strobel & Shokri, 2022, p. 49), yet algorithms designed to extract data for predatory or nefarious purposes may not consider fairness, equity, or even data privacy.

For example, Jackson et al. (2019) stressed the need for data science to consider the long history of social injustices against vulnerable populations:

Another lesson learned is that where data on vulnerable populations exists, partnering with data scientists derived from those vulnerable populations can help to disentangle an algorithm’s inferential ability from a manifesting of implicit bias in data collection. Data science must include vulnerable populations in the research design, analysis and inference of data findings in order to make interpretations that are valuable and meaningful to those populations. Whether focused on social science, biomedical applications or preventing the harvesting of large scale genomic data from vulnerable populations with no clear reciprocal benefit to them, the inclusion of these diverse population and perspectives can improve data science. (p. 7)

Specifically, LaPointe and Yale (2022) noted the unique challenges of underrepresented minority property owners who may be displaced by tax delinquency.

While scientists, researchers, and human protections professionals may understand and comply with the policy-established need to consider these ethical aspects of DMS for research purposes, others who wish to use these technologies for financial gain may prefer the continued lack of policy or even prioritize data access over data privacy. However, as Simshaw (2022) reported, the limited digitization of heirs’ property records may have impeded AI, ML, and DMS use. This data ethics analysis explored the existing literature for any reported use of DMS among heirs’ property records.

Specifically, the social injustices, biases, discrimination, and land dispossession experienced by heirs’ property (tenancy in common) owners along the Gullah Geechee Cultural Heritage Corridor (GGCHC) have been well documented. In prior work, the chapter author, with others (Throne, 2020; Versey & Throne, 2021), used critical inquiry and intersectionality to address the heirs’ property challenges along the GGCHC and, specifically, the contemporary challenges facing the multiple heirs to common land that originated when the original owner died intestate and the property was passed down from generation to generation outside of probate. Heirs’ property exists across states, regions, and indigenous populations, including Appalachia, Native Americans, Hispanic residents of Texas, and rural African Americans (Bailey & Thomson, 2022; Simshaw, 2022). Frankly, for the GGCHC heirs’ property owners, the financial interests of often-White property developers and land speculators have been reported as predatory and unscrupulous in the methods used to acquire these often Black-owned generational properties along the GGCHC (Bailey & Thomson, 2022).

Key Terms in this Chapter

Data Brokers: Data brokers are typically companies or organizations that handle the exchange of much of the internet data content, including internet companies, advertisers, retailers, trade associations, ad-tech groups, data analytics firms, and credit agencies (Reviglio, 2022).

Algorithmic Bias: Algorithmic bias is the discrimination caused by algorithmic decision-making that occurs when one group is unfairly or arbitrarily disadvantaged over another (Kim & Cho, 2022).

Algorithmic Neutrality: Neutrality is the unconditional absence of bias, and numerous scientists have noted the impossibility and illusion of algorithmic neutrality (Kozlowski et al., 2022b; Phillips-Brown, 2023). Algorithmic neutrality cannot exist when the training data is human data, as bias is intrinsically human, and human bias cannot be eliminated, only reduced.

Algorithmic Fairness: Algorithmic fairness is the intentional examination of models for fair and equitable algorithms to reduce bias. To construct a fair algorithmic model, “three criteria need to be considered: fairness, expressiveness, and utility. Fairness can be evaluated by the three measures introduced by how the model is treated fairly without bias between groups. Expressiveness is how the value after applying the method of processing data expresses the information of the original data. It can be evaluated by the performance obtained from the various classifiers. Utility is an evaluation of tasks that the AI model must perform” (Kim & Cho, 2022, p. 2).

Data Mining/Scraping: While data mining and data scraping are often used synonymously across disciplines, data mining typically refers to the gathering of large datasets for analysis, while data scraping refers to the process of gathering the large datasets themselves. Due to the overlap in meaning, the two terms are treated as concomitant for the purposes of this chapter.

Algorithmic Discrimination: Algorithmic discrimination occurs amidst algorithmic bias when algorithmic decision-making allows one group to be unfairly or arbitrarily disadvantaged over another (Kim & Cho, 2022).
