Design and Construction of Distributed JavaScript Parsing System

Bo Shen (School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China), Wei Huang (School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China) and Xiaodi Li (School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China)
DOI: 10.4018/IJITN.2014100101

Abstract

With the rapid development of Internet technology, JavaScript (JS), one of the most representative and powerful scripting languages, is becoming increasingly popular with developers and users. However, JS programming is more complex than conventional static technologies. In search engines and information acquisition, it is very difficult to obtain the information hidden in script code. In this paper, the authors design a distributed system for parsing the JS code embedded in HTML files and retrieving the underlying information. The authors describe how to extract JS code from HTML files and parse it. They also introduce a task scheduling algorithm for the JS parsing system that employs Hadoop distributed computing technology. The experimental results indicate that the proposed algorithm and system achieve reasonable task scheduling efficiency and parse JS code rapidly.
Article Preview

Technology Analysis

Extracting the information and data hidden in HTML pages by JS code is useful for search engines, data collection, recommendation systems, public opinion analysis, and so on. Achieving this efficiently is complex and difficult, and it involves several kinds of technologies.
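
As an illustration of the first step, separating JS code from the surrounding HTML, the following is a minimal sketch in Python using only the standard library's `html.parser` module. It is not the authors' implementation; the class name `ScriptExtractor` and the function `extract_scripts` are hypothetical names chosen for this example. The sketch only collects inline script bodies and external script URLs, and does not execute or parse the JS itself.

```python
from html.parser import HTMLParser


class ScriptExtractor(HTMLParser):
    """Hypothetical helper: collects inline <script> bodies and script URLs."""

    def __init__(self):
        super().__init__()
        self.inline_scripts = []  # source text of inline <script> blocks
        self.external_srcs = []   # values of <script src="..."> attributes
        self._in_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.external_srcs.append(src)
            self._in_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            code = "".join(self._buffer).strip()
            if code:
                self.inline_scripts.append(code)
            self._in_script = False


def extract_scripts(html):
    """Return (inline_scripts, external_srcs) found in an HTML string."""
    parser = ScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.inline_scripts, parser.external_srcs
```

A call such as `extract_scripts(page_source)` returns two lists: the inline code that a downstream JS parser could consume, and the external script URLs that would still need to be fetched. In a distributed setting, each extracted script could become one unit of work for the scheduler, although how the paper partitions tasks is not stated in this preview.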
