Design and Construction of Distributed JavaScript Parsing System

Bo Shen, Wei Huang, Xiaodi Li
DOI: 10.4018/IJITN.2014100101

Abstract

With the rapid development of Internet technology, JavaScript (JS), one of the most representative and powerful scripting languages, is becoming increasingly popular among developers and users. However, JS programming is more complex than conventional static web technology. In search engines and information acquisition, it is very difficult to obtain the information hidden in script code. In this paper, the authors design a distributed system for parsing the JS code embedded in HTML files and retrieving the underlying information. The authors describe how to extract JS code from an HTML file and parse it. They also introduce a task scheduling algorithm for the JS parsing system based on Hadoop distributed computing technology. The experimental results indicate that the proposed algorithm and system achieve reasonable task scheduling efficiency and parse JS code rapidly.

Technology Analysis

Extracting the information and data that JS code hides in an HTML page is useful for search engines, data collection, recommendation systems, public opinion analysis, and similar applications. Doing this efficiently is complex and difficult, and it involves several kinds of technologies.
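
As a rough illustration of the first step mentioned in the abstract, pulling JS code out of an HTML file, the following minimal sketch uses Python's standard `html.parser` module. It is not the authors' implementation: the class name `ScriptExtractor` and the function `extract_scripts` are hypothetical names chosen for this example, and the sketch only collects script text and external script URLs without parsing or executing the JS.

```python
from html.parser import HTMLParser


class ScriptExtractor(HTMLParser):
    """Hypothetical helper: collects inline <script> bodies and external script URLs."""

    def __init__(self):
        super().__init__()
        self.inline_scripts = []  # source text of inline scripts
        self.external_srcs = []   # values of <script src="..."> attributes
        self._in_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.external_srcs.append(src)
            self._in_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            code = "".join(self._buffer).strip()
            if code:
                self.inline_scripts.append(code)
            self._in_script = False


def extract_scripts(html):
    """Return (inline_scripts, external_srcs) found in an HTML string."""
    parser = ScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.inline_scripts, parser.external_srcs
```

In a distributed setting such as the one the paper describes, a function of this kind could run independently on each page, with the extracted scripts handed to a separate JS parsing stage; how the authors actually divide this work is not shown in the preview.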
