Community Issues in American Metropolitan Cities: A Data Mining Case Study

Community Issues in American Metropolitan Cities: A Data Mining Case Study

Brooke Sullivan (California State University, Fullerton, CA, USA) and Sinjini Mitra (California State University, Fullerton, CA, USA)
Copyright: © 2014 |Pages: 17
DOI: 10.4018/jcit.2014010103
OnDemand PDF Download:
No Current Special Offers


The city of San Francisco in California has 826,000 residents and is growing slowly compared to other large cities in the western United States, facing concerns such as an aging population and flight of families to nearby suburbs. This case study investigates the social and demographic factors that are causing this phenomenon based on data that were collected by San Francisco's city controller's office in its annual survey to residents. By using data analytics, we can predict which residents are likely to move away, and this help us infer which factors of city life and city services contribute to a resident's decision to leave the city. Results of this research indicate that factors like public transportation services, public schools, and personal finances are significant in this regard, which can potentially help the city of San Francisco to prioritize its resources in order to better retain its locals.
Article Preview


Cities are sometimes thought of as the “closest, most intimate form of government” (Funkhouser, 2013). Cities are therefore expected to have a greater ability to understand and satisfy residents’ needs for public goods and services (Montalvo, 2009). For metropolitan cities, some of the important services include urban transportation systems such as subways and trains, and parks which break up the urban atmosphere. However, without tools that can interpret and explain residents’ perceptions of their city, it can be difficult for city officials to judge which issues should be addressed first. Towards this end, cities often conduct surveys among their residents to seek their opinions about certain specific city services with the goal of determining which group of people are likely to move away. Once the survey results are obtained, the main challenge is to analyze the data using suitable methods and interpret them in order to gain valuable insights. The latter will help city officials to make important decisions about allocating resources towards specific improvement projects that may help retain more residents in the area.

Many statistical and analytical tools are available today for analyzing survey data. The term, “analytics,” is the most current term for pattern recognition and knowledge discovery in large datasets (Duval, 2012). This can be done through data mining, predictive analytics, and statistics (Gartner, n.d.). In businesses, analytics tools are often used to predict which customers are at risk to stop buying, identify segments of customers, forecast business outcomes, and ultimately lead to business decisions. One form of business analytics is data mining, which refers to finding patterns in large amounts of data (Sharda, Delen, & Turban, 2013; Azzalini & Scarpa, 2012; Pearce, 2011). Data mining incorporates concepts from different fields, including mathematics, statistics, artificial intelligence, and database management (Sharda, Delen, & Turban, 2013). There are several industry buzz words that are going around currently and hence are worth discussing, including business intelligence, big data, and business analytics. “Business intelligence” refers to using collected data to present results through queries, reports and data mining (Gittlen, 2012). Another one is “Big data”, which relates to the phenomenon that due to cheaper data storage and faster processing, organizations gather data at huge rates that previously could not have been comprehended (Barton & Court, 2012; AWS, n.d.). “Business analytics” is a term similar to business intelligence, but is more powerful because it does not rely on historical data (Gittlen, 2012). Analytics and data mining have existed under different names since the 1980s (University of North Carolina, 2003). Analysts predict that this field will continue to develop, and employers currently face a shortage of talent with skills in this field (Violino, 2012). Some popular examples of analytics are the recommender systems of Netflix and Amazon. Amazon has been using data mining tools since the late 1990s to create lists of bestselling books for special populations of people, such as those at the same workplace (Streitfield, 1999). Today, Amazon recommends products to shoppers based on associations in the purchase patterns of products (Furnas, 2012). Techniques from analytics and data mining are also present in the fields of healthcare, scientific research, banking, and retailing (Suh, 2012).

Complete Article List

Search this Journal:
Volume 24: 1 Issue (2022): Forthcoming, Available for Pre-Order
Volume 23: 4 Issues (2021): 3 Released, 1 Forthcoming
Volume 22: 4 Issues (2020)
Volume 21: 4 Issues (2019)
Volume 20: 4 Issues (2018)
Volume 19: 4 Issues (2017)
Volume 18: 4 Issues (2016)
Volume 17: 4 Issues (2015)
Volume 16: 4 Issues (2014)
Volume 15: 4 Issues (2013)
Volume 14: 4 Issues (2012)
Volume 13: 4 Issues (2011)
Volume 12: 4 Issues (2010)
Volume 11: 4 Issues (2009)
Volume 10: 4 Issues (2008)
Volume 9: 4 Issues (2007)
Volume 8: 4 Issues (2006)
Volume 7: 4 Issues (2005)
Volume 6: 1 Issue (2004)
Volume 5: 1 Issue (2003)
Volume 4: 1 Issue (2002)
Volume 3: 1 Issue (2001)
Volume 2: 1 Issue (2000)
Volume 1: 1 Issue (1999)
View Complete Journal Contents Listing