In today's digital, computer-based ecosystem, governments use data extraction mechanisms that citizens barely recognize. Among the data extracted from institutional e-platforms, computer-based transactions play an important role in this process. This study aims to shed light on the data mining techniques used by governments and public institutions, identify which are most commonly used, and expose the privacy risks they may pose to citizens. The chapter is based on a systematic literature review built on two main keywords: “big data” and “government.” It intends to answer the following research questions: What are the key techniques used by governments to extract data? Might these tools pose risks to citizens?
Introduction
In recent years, the use of large amounts of data has boosted the digital era in several areas (Al-Sai & Abualigah, 2017), such as corporations and governments, where the population has witnessed the mass collection of its data (Zhou et al., 2014). Big data has therefore become a key factor and discipline not only in business but also in public administration (Chen & Hsieh, 2014), where it is used to develop and enhance different initiatives (Wang et al., 2016).
Data can become a brand-new opportunity for governments, as this asset can be used to offer citizen-centric services (Chen & Hsieh, 2014), reduce waste (LaBrie et al., 2018), fight terrorism (Calo, 2016) and corruption (Von Haldenwang, 2004), or even implement protocols for health or social crises such as the COVID-19 pandemic (Saura et al., 2021). However, some authors argue that government data-centric projects do not benefit society even though they have a citizen-centric aim (Gómez, 2015; Huffine, 2015; Lu et al., 2012).
Likewise, governments based on the use of big data techniques, also known as e-governments (Esteves & Joseph, 2008), promote collaboration, productivity, efficiency, and transparency (Von Haldenwang, 2004; McNeal; Zhang & Chen, 2010; Morabito, 2015). In this way, the use of data in the public sector is shifting the paradigm toward a digital and innovative era in which organizational structures are more flexible, thus facilitating processes for users and reducing costs (Ebrahim & Irani, 2005).
The era of big data has boosted the volume, complexity, and growth of the data generated. Governments collect data from users as part of their public functions, which require citizens to share their information. Nevertheless, public administrations do not value citizens’ data at all (Privacilla, 2001), nor how it can be used to predict and forecast events or even how people will behave (Saura et al., 2021a).
Data collection, gathering, or mining involves users’ personal information. In this context, surveillance capitalism is part of a new data-driven era in which everyday interaction with smart devices generates large amounts of data that are collected and processed by organizations (Zuboff, 2015). These data extraction techniques are unknown to users because the data is normally gathered in hidden ways (Andrejevic, 2014).
Governments have been collecting data for a very long time (Amankwah-Amoah, 2015), and their collection and acquisition techniques have improved over the decades, while the use of mobile devices has grown exponentially in recent years. Data collectors are thus working out how to efficiently gather, extract, interpret, and store these large amounts of data (Zhou et al., 2014).
In recent years, open government data initiatives have been developed because they help prevent corruption, offer better services to citizens, and increase confidence in governments’ accountability (Nikiforova & McBride, 2021). Nevertheless, they may entail data risks if security rules are not well established and designed (Bonina & Eaton, 2020).
Closely related to data-based governments, smart cities offer a way to manage energy consumption, sustainability, and economic areas, among others, in order to improve people’s everyday lives. In these data-driven cities, information and communication technologies, artificial intelligence, and machine learning play a key role in their development, as these technologies help manage, collect, and process large amounts of data (Ulla et al., 2020). Accordingly, studies of both IoT and smart cities suggest that security and resilience are critical factors, since different devices are interconnected and share sensitive user information (Abdul Ahad et al., 2020).
Additionally, there is a need to develop new strategies, techniques, and tools for data extraction, since industries such as finance and insurance are exploiting their clients’ data to improve their decision-making and are transferring this information to third parties. Moreover, the different purposes of each sector and organization imply different extraction techniques, affecting how, what, and why data is extracted (Sadowski, 2019).