Senses of “Selfie” Around the World From Web Search Patterns Over Extended Time

Shalin Hai-Jew (Kansas State University, USA)
Copyright: © 2018 |Pages: 47
DOI: 10.4018/978-1-5225-3373-3.ch008


When social phenomena and practices go viral, like “selfies,” they instantiate in different locations around the world in different ways based on cultural differences, technological affordances, and other factors. When people search for “selfie” on Google Search, they have different things in mind as well. On Google Correlate, it is possible to identify the top correlating search terms whose time patterns match those of the seeding search term. Based on these big data, these search term correlates (associated over extended time) provide a sense of the “group mind” around a particular topic.
Chapter Preview


Selfies have undeniably become a global phenomenon, with people creating and posting selfies from many countries across the continents. The types of selfies posted, though, vary with local cultures, and the broad senses of what selfies are vary in part with local practices. (A perusal of selfie images from a web browser search shows some similarities but also some major differences.) One way to observe some of these senses or gists empirically is to use the online tool Google Correlate. Google Correlate, released in 2011, enables users to enter a seeding search term (anything expressible in UTF-8, which covers all languages expressible on the Web, and not limited to one-grams but including higher-order n-grams) in order to find other search terms whose search frequency time series vary similarly (with a high r, or Pearson’s correlation coefficient) from 2003 to the present. To return the top 10 (or 20) most correlated search terms, Google Correlate compares the seeding search term’s time series with millions of candidate time series and keeps the most highly correlated matches.
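The ranking step described above can be sketched in a few lines. This is a minimal illustration, not Google Correlate's implementation: the function names `pearson_r` and `top_correlates`, the term labels, and the weekly counts are all made up for the example, and treating a flat series as uncorrelated is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        # Assumption for this sketch: a flat series counts as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_correlates(seed, candidates, k=10):
    """Return the k (term, r) pairs with the highest r against the seed."""
    scored = [(term, pearson_r(seed, series)) for term, series in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical search-frequency series (invented numbers, six time steps).
seed = [1, 2, 4, 8, 16, 32]
candidates = {
    "term_a": [2, 4, 8, 16, 32, 64],   # same shape as the seed, larger volume
    "term_b": [32, 16, 8, 4, 2, 1],    # opposite trend
    "term_c": [5, 5, 6, 5, 5, 6],      # weakly related
}

ranking = top_correlates(seed, candidates, k=2)
for term, r in ranking:
    print(f"{term}: r = {r:.3f}")
```

Because Pearson's r ignores scale, `term_a` ranks first even though its raw counts are twice those of the seed; only the shape of the series over time matters.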

These big data, based on logs of user searches, are anonymized and aggregated, so it is not possible to use this tool to trace search patterns to a person; the data may be identified only to a time span and to select locations. The comparison occurs between the seeding term’s time series and “the frequency time series for every query in our database” (Google Correlate Tutorial, 2011, p. 1). The search frequencies are normalized against the average frequency for each search term over time (either weekly or monthly) and expressed as standard deviations from the mean, which makes search terms with very different search volumes comparable.
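The normalization described here, expressing each frequency as a number of standard deviations from that term's own mean, can be illustrated as follows. This is a sketch under stated assumptions: the helper name `standardize` and the counts are hypothetical, and the population standard deviation is used because the source does not say which variant Google Correlate applies.

```python
import math


def standardize(series):
    """Express each value as standard deviations from the series mean."""
    n = len(series)
    mean = sum(series) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        return [0.0] * n  # a flat series has no deviation to report
    return [(v - mean) / std for v in series]


# Two invented terms with the same temporal shape but very different volumes.
low_volume = [10, 20, 30, 20, 10]
high_volume = [1000, 2000, 3000, 2000, 1000]

z_low = standardize(low_volume)
z_high = standardize(high_volume)
print([round(z, 3) for z in z_low])
print([round(z, 3) for z in z_high])
```

After standardizing, the two series are identical, which is the sense in which the normalization allows comparison across search terms with different volumes.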

One interesting angle is that web searches were initially thought of, by many users, as something private done in their own homes or offices. Over time, though, it became clear not only that search data are individualistic and identifiable to a person with a few data points but also that such data are public, albeit in a de-identified or anonymized way (Stephens-Davidowitz, 2017). The tool is designed to return high-precision results. Google scientists, in their white paper introducing the tool, wrote:

To further improve the precision, we take the top one thousand series from the database returned by our approximate search system (the first pass) and reorder those by doing exact correlation computation (the second pass). By combining asymmetric hashes and reordering, the system is able to achieve more than 99% precision for the top result at about 100 requests per second on O(100) machines, which is orders of magnitude faster than exact search (Mohebbi, Vanderkam, Kodysh, Schonberger, Choi, & Kumar, 2011, p. 6).
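The two-pass idea in the passage above (a cheap approximate search to build a shortlist, then exact correlation to reorder it) can be sketched as follows. This does not implement asymmetric hashing; as a plainly named substitute, the first pass here correlates block-averaged (downsampled) copies of the series. The function names, parameters, and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Exact Pearson correlation; flat series are scored 0.0 (assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(var_x * var_y)


def coarse(series, step):
    """Stand-in for the approximate pass: average each block of `step` points."""
    blocks = [series[i:i + step] for i in range(0, len(series), step)]
    return [sum(block) / len(block) for block in blocks]


def two_pass_search(seed, candidates, shortlist_size=1000, k=10, step=4):
    """Shortlist by coarse correlation, then reorder by exact correlation."""
    seed_coarse = coarse(seed, step)
    # First pass: rank every candidate using the cheap approximation.
    by_approx = sorted(
        candidates,
        key=lambda term: pearson_r(seed_coarse, coarse(candidates[term], step)),
        reverse=True,
    )
    shortlist = by_approx[:shortlist_size]
    # Second pass: exact correlation on the shortlist only.
    exact = [(term, pearson_r(seed, candidates[term])) for term in shortlist]
    exact.sort(key=lambda pair: pair[1], reverse=True)
    return exact[:k]


# Invented series, eight time steps each.
seed = [1, 3, 2, 5, 4, 7, 6, 9]
candidates = {
    "close": [2, 6, 4, 10, 8, 14, 12, 18],
    "smooth": [1, 2, 3, 4, 5, 6, 7, 8],
    "inverse": [9, 6, 7, 4, 5, 2, 3, 1],
    "flat_noise": [4, 4, 5, 4, 4, 5, 4, 4],
}

results = two_pass_search(seed, candidates, shortlist_size=2, k=2, step=2)
print(results)
```

The design point is that the expensive exact computation runs only on the shortlist, so the cost of the second pass is independent of the size of the full candidate set.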

Based on the intuition that people generally search for terms when they have a particular question or need at a particular time, correlations in web search frequencies over extended time may indicate underlying in-world phenomena that explain the relatedness of the time-correlated search terms. On this tool, it is possible to click on “Show more” and acquire not only the top 10 correlations but the next 10 as well.

Google Correlate enables access to big data, in the range of billions of searches, and it enables access to “natural experiments” based on people’s search behaviors globally (with filters for 50 countries and even greater spatial granularity, down to U.S. state levels). Google Correlate is a follow-on tool from Google Trends:
