Introduction
With the proliferation of web services, it is imperative to minimize spamming and security risks. Spammers use software to automate the creation of fake online accounts, which serve as a source of spam and illegal activities or simply waste the resources of the website. As more services are offered online and people spend more hours online, spammers aim to exploit these resources to their advantage. Nowadays spammers try to exploit not only email but also blogs, forums, and wikis. The most common and disruptive form of spamming is done through software that automatically creates new user accounts, which are subsequently used to send spam and execute malicious activities. Such activities not only annoy end users but also consume precious computing and communication resources, which may disrupt business activities. Other disruptive automated script attacks include denial of service and the abuse of ticket and event registration, recommendation and rating systems, and online voting.
This problem has bred a new field of study called Human Interactive Proofs (HIP), which aims to mitigate this risk. In this context, a test called CAPTCHA has been introduced to identify whether the end user is a human or computer software (von Ahn, Blum, & Langford, the CAPTCHA Project homepage; von Ahn, Blum, & Langford, 2004). The term CAPTCHA stands for “Completely Automated Public Turing Test to tell Computers and Humans Apart”; CAPTCHAs were first developed by researchers at Carnegie Mellon University in 2000. The seemingly easy task of distinguishing between a human and a bot is in fact one of the classic and most intriguing problems in computer science. Accurately recognizing human users is essential in fighting spam and abuse of online services.
A typical CAPTCHA is a distorted image that contains English words or a group of digits and characters of Latin script (see Figure 1). In online registration forms, and often when adding content to online forums, social networks, and wikis, users are asked to type the distorted characters, which are usually displayed within an image at the end of the form. The user’s entry is compared to the actual, intended characters; if they match, the user is allowed to continue through registration. The CAPTCHA approach capitalizes on an inherent weakness of computer programs in deciphering distorted text within images, a task known as OCR (Optical Character Recognition), whereas humans can easily read such text.
Figure 1. An example of character-based CAPTCHA taken from Yahoo! and Live email registration services
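The comparison step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular service: the function names `normalize` and `is_correct` are hypothetical, and the choice to ignore letter case and surrounding whitespace is an assumption made here for the example.

```python
import hmac


def normalize(text: str) -> str:
    # Assumption: the check ignores letter case and surrounding whitespace.
    return text.strip().upper()


def is_correct(user_entry: str, intended: str) -> bool:
    # Compare the typed characters with the intended ones.
    # hmac.compare_digest runs in time independent of where the
    # two values differ, which avoids leaking partial matches.
    return hmac.compare_digest(
        normalize(user_entry).encode("utf-8"),
        normalize(intended).encode("utf-8"),
    )


if __name__ == "__main__":
    print(is_correct(" x7kq ", "X7KQ"))  # entry matches after normalization
    print(is_correct("X7KO", "X7KQ"))    # one character differs
```

In a real deployment the intended characters would be kept on the server and never sent to the client in readable form; only the distorted image is shown to the user.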
The main purpose of CAPTCHA systems is to distinguish humans from software robots by providing challenges that are easily solved by humans but are too difficult for computers. The existence of an effective CAPTCHA does not imply that no software can be built to solve it with a reasonable success rate, but rather that the development and computational cost of such a tool would be too high to be practical. The goal is to make the cost of building and using software to break a CAPTCHA higher than the cost of using a human. All CAPTCHA systems must satisfy three basic properties:
1. Must be easy for humans to solve,
2. Must be difficult for software robots to solve, and
3. Must be supported by a large and dynamic set of test cases that a computer cannot know in advance. The set should be easy to generate and grade. The purpose of the large and dynamic set is to prevent an attacker from generating, ahead of time, the answers to all of the possible tests.
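To make the third property concrete, the following Python sketch draws a random challenge string and counts how many distinct challenges exist for a given length. The alphabet (uppercase Latin letters and digits), the default length of 6, and the names `ALPHABET`, `new_challenge`, and `challenge_space` are assumptions chosen for illustration; rendering the string as a distorted image is omitted.

```python
import secrets
import string

# Assumed alphabet: 26 uppercase Latin letters plus 10 digits (36 symbols).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    # Draw each character from a cryptographically strong random source
    # so that upcoming challenges cannot be predicted.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def challenge_space(length: int = 6) -> int:
    # Number of distinct challenge strings of the given length.
    return len(ALPHABET) ** length


if __name__ == "__main__":
    print(new_challenge())     # a fresh random string on every run
    print(challenge_space(6))  # 36**6 = 2176782336
```

Under these assumptions a six-character challenge already has more than two billion possible values, and because the distortion applied to each image can also vary, an attacker cannot realistically store an answer for every test in advance.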
Nowadays, CAPTCHA is used to protect many types of internet services such as free email services, blogs, social networks, and even online banking. Thanks to the effectiveness of CAPTCHA, major free email providers such as Yahoo!, MSN, and Gmail are able to prevent spammers from creating millions of free accounts that would be used to send spam emails. As evidence of CAPTCHA’s effectiveness, when the MSN Hotmail service deployed its first CAPTCHA, Hotmail registrations suddenly dropped by 19% (von Ahn, Maurer, McMillen, Abraham, & Blum, 2008). CAPTCHA systems are also known to be effective against automated programs that try to bias online voting and rating services.