Script Familiarity and Its Effect on CAPTCHA Usability: An Experiment with Arab Participants

Ashraf Khalil (College of Engineering and Computer Science, Abu Dhabi University, UAE), Salam Abdallah (College of Business Administration, Abu Dhabi University, UAE), Soha Ahmed (College of Engineering and Computer Science, Abu Dhabi University, UAE) and Hassan Hajjdiab (College of Engineering and Computer Science, Abu Dhabi University, UAE)
Copyright: © 2012 | Pages: 14
DOI: 10.4018/jwp.2012040105

Abstract

Many web-based services such as email, search engines, and polling sites are being abused by spammers via computer programs known as bots. This problem has bred a new research area called Human Interactive Proofs (HIP) and a testing device called CAPTCHA, which aims to protect services from malevolent attacks by distinguishing bots from human users. In the past decade, researchers have focused on developing robust and safe HIP systems but have barely evaluated their usability. To begin to fill this gap, the authors report the results of a user study conducted to determine the extent to which English language proficiency affects CAPTCHA usability for users whose native language is not English. The results showed a significant effect of participants’ English language proficiency level on the time a participant takes to solve a CAPTCHA, which appears to be related to multiple usability issues, including satisfaction and efficiency. However, the authors found that English language proficiency level does not affect the number of errors made while entering CAPTCHA or reCAPTCHA. These results have numerous implications that may inform future CAPTCHA design.
Introduction

With the proliferation of web services, it is imperative to minimize spamming and security risks. Spammers use software to automate the creation of fake online accounts, which serve as a source of spam and illegal activities or simply waste the website's resources. As more services move online and people spend more hours online, spammers seek to exploit these resources to their advantage. Spammers now target not only email but also blogs, forums, and wikis. The most common and disruptive form of spamming relies on software that automatically creates new user accounts, which are subsequently used to send spam and carry out malicious activities. Such activities not only annoy end users but also consume valuable computing and communication resources, which may disrupt business activities. Automated scripts are also used for denial-of-service attacks and to abuse ticket and event registration, recommendation and rating systems, and online voting.

This problem has bred a new field of study called Human Interactive Proofs (HIP), which aims to mitigate this risk. In this context, a testing device called CAPTCHA has been introduced to identify whether the end user is a human or a computer program (von Ahn, Blum, & Langford, the CAPTCHA Project homepage; von Ahn, Blum, & Langford, 2004). The term CAPTCHA stands for “Completely Automated Public Turing Test to tell Computers and Humans Apart” and was first developed by researchers at Carnegie Mellon University in 2000. The seemingly easy task of distinguishing between human and bot is in fact one of the most classic and intriguing problems in computer science. Accurately recognizing human users is essential in fighting spam and abuse of online services.

A typical CAPTCHA is a distorted image that contains English words or a group of digits and characters in Latin script (see Figure 1). In online registration forms, and often when adding content to online forums, social networks, and wikis, users are asked to type the distorted characters, which are usually displayed within an image at the end of the form. The user’s entry is compared to the actual, intended characters; if they match, the user is allowed to continue through registration. The CAPTCHA approach capitalizes on an inherent weakness of computer programs at Optical Character Recognition (OCR) of distorted text, whereas humans can easily read distorted text within images.

Figure 1. An example of character-based CAPTCHA taken from Yahoo! and Live email registration services
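
The generate-and-compare flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheme used by any particular provider: the alphabet, the default length, the case-insensitive comparison, and the function names `new_challenge` and `grade` are all hypothetical, and the image distortion step is left out entirely.

```python
import hmac
import secrets
import string

# Assumed alphabet: uppercase Latin letters and digits (not from the article).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    """Return a random string that a server would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade(expected: str, entry: str) -> bool:
    """Compare the user's entry to the intended characters.

    Assumption: surrounding whitespace and letter case are ignored.
    """
    normalized = entry.strip().upper()
    # compare_digest is a constant-time comparison from the standard library.
    return hmac.compare_digest(expected.encode(), normalized.encode())


if __name__ == "__main__":
    challenge = new_challenge()
    print("challenge:", challenge)
    print("correct entry accepted:", grade(challenge, challenge.lower()))
```

In a real deployment the expected string would stay on the server and only the distorted image would reach the user.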

The main purpose of CAPTCHA systems is to distinguish humans from software robots by providing challenges that are easily solved by humans but are too difficult for computers. The existence of an effective CAPTCHA does not imply that no software can be built to solve it with a reasonable success rate, but rather that building such a tool would be too expensive in terms of development and computational requirements to be practical. The goal is to make the cost of building and using software to break a CAPTCHA higher than the cost of using a human. All CAPTCHA systems must satisfy three basic properties:

  1. Must be easy for humans to solve,

  2. Must be difficult for software robots to solve, and

  3. Must be supported by a large and dynamic set of test cases that it is not possible for a computer to know in advance. The set should be easy to generate and grade. The goal of the large and dynamic set is to prevent the risk of an attacker generating all possible answers to all of the possible tests.
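
To give a rough sense of what "large" means in the third property, the following Python sketch counts the possible challenges for a random-string CAPTCHA. It assumes every character is drawn independently and uniformly from a fixed alphabet, which is an idealization; a CAPTCHA built from dictionary words would have a much smaller set. The function names are hypothetical.

```python
def challenge_space(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


def random_guess_probability(alphabet_size: int, length: int) -> float:
    """Chance that one blind, uniformly random guess matches the challenge."""
    return 1 / challenge_space(alphabet_size, length)


if __name__ == "__main__":
    # Assumed example: 26 Latin letters plus 10 digits, 6 characters long.
    print(challenge_space(36, 6))
    print(random_guess_probability(36, 6))
```

Under these assumptions, 36 symbols and 6 characters give 36^6 = 2,176,782,336 possible challenges, so precomputing every answer or guessing blindly is impractical even before image distortion is considered.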

Nowadays, CAPTCHA is used to protect many types of internet services such as free email services, blogs, social networks, and even online banking. Thanks to the effectiveness of CAPTCHA, major free email providers such as Yahoo!, MSN, and Gmail are able to prevent spammers from creating millions of free accounts that would be used to send spam emails. As evidence of CAPTCHA’s effectiveness, when the MSN Hotmail service deployed its first CAPTCHA, Hotmail registrations suddenly dropped by 19% (von Ahn, Maurer, McMillen, Abraham, & Blum, 2008). CAPTCHA systems are also known to be effective against automated programs that try to bias online voting and rating services.
