An Overview (and Criticism) of Methods to Detect Fake Content Online

Antonio Badia (University of Louisville, USA)
Copyright: © 2020 |Pages: 9
DOI: 10.4018/978-1-5225-9715-5.ch072


The recent controversy over ‘fake news’ reminds us of one of the main problems on the web today: the use of social media and other outlets to distribute false and misleading content. The impact of this problem is significant. This article discusses the issue of fake content on the web. First, it defines the problem and shows that, in many cases, it is surprisingly hard to establish whether a piece of news is untrue. It distinguishes the issue of fake content from that of hate/offensive speech (while the two are related, the issues involved differ somewhat). It then surveys proposed solutions to the problem of fake content detection, both algorithmic and human. On the algorithmic side, it focuses on work on classifiers. The chapter shows that most algorithmic approaches have significant issues, which has led to reliance on the human approach in spite of its known limitations (subjectivity, difficulty scaling). Finally, it closes with a discussion of potential future work.
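To make the classifier-based approach mentioned above concrete, the following is a minimal sketch of a bag-of-words naive Bayes text classifier, a common baseline in this literature. It is not taken from the chapter; the training examples, labels, and function names are invented for illustration, and a real system would use far larger labeled corpora and richer features.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs.
    Returns per-class word counts and per-class document totals."""
    counts = {}
    totals = Counter()
    for text, label in docs:
        counts.setdefault(label, Counter()).update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Return the label with the highest naive Bayes log-score
    for the given text, using Laplace (add-one) smoothing."""
    vocab = {w for wc in counts.values() for w in wc}
    n_docs = sum(totals.values())
    best, best_score = None, float("-inf")
    for label, wc in counts.items():
        score = math.log(totals[label] / n_docs)  # class prior
        denom = sum(wc.values()) + len(vocab)     # smoothed denominator
        for w in text.lower().split():
            score += math.log((wc[w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical toy corpus, purely for illustration.
docs = [
    ("shocking miracle cure doctors hate", "fake"),
    ("miracle cure shocking secret revealed", "fake"),
    ("senate passes budget bill today", "real"),
    ("city council approves new budget", "real"),
]
counts, totals = train(docs)
print(classify("shocking miracle secret", counts, totals))  # → fake
```

Even this toy example hints at the limitations the chapter discusses: the classifier learns surface vocabulary, not truth, so it can only flag content that *resembles* previously labeled fakes.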
Chapter Preview

Background: Defining Fake Content

One of the most challenging aspects of research on fake content is the difficulty of defining the concept. There is considerable disagreement among authors; while the idea of being ‘true’ or ‘false’ has strong intuitive appeal, there is a lack of widely shared formal definitions. A considerable amount of work does not formally define ‘fake’ or ‘false’ (or, equivalently, ‘true’ or ‘truth’); thus, ad hoc definitions are used in many cases (Shu et al., 2017). For instance, a Stanford study on misinformation builds a set of fake content by combining articles from the PolitiFact and Buzzfeed websites with two previous academic articles, and classifies as fake news any post from a short list of sites that are ‘well known to be providers of false information’ (Allcott et al., 2019). To make the issue even more complicated, there are a number of related concepts (fake reviews, clickbait, rumors, hate speech, cognitive hacking) that tend to get conflated; that is why this article uses the more neutral label ‘fake content’ (Tandoc et al., 2017).

An area that has looked in depth at the problem of true or authentic information is that of Intelligence studies; this area provides a starting point for defining false news (Hansen, 2017). Based on this work, we can distinguish the following aspects:

Key Terms in this Chapter

Fake Reviews: A product review produced with the goal of artificially improving (or damaging) a product's ratings.

Disinformation: False content that is deliberately fabricated and distributed.

Misinformation: False content that is the product of error (i.e., whose originator or distributor may not be aware that the content is not truthful).

Clickbait: A web article with an attention-grabbing headline designed to make users click on the link, though the content is often only tenuously connected to the headline.

Crowdsourcing: The practice of recruiting a (large) group of people (the ‘crowd’) to accomplish a certain task, usually a repetitive one that does not require special training but must be carried out over a large amount of data.

Hate Speech: Speech that attacks a group (or, sometimes, a person) based on categories like race, sex, religion, origin or disability. Its goal is to incite prejudice and spread bigoted views.

Fake News: False content that tries to appear as coming from a traditional news media outlet.
