MLA

Mangai, J. Alamelu, et al. "Web Page Classification Using MDAWkNN." Encyclopedia of Business Analytics and Optimization, edited by John Wang, IGI Global Scientific Publishing, 2014, pp. 2685-2695. https://doi.org/10.4018/978-1-4666-5202-6.ch239

APA

Mangai, J. A., Kumar, V. S., & Ramesh, K. (2014). Web Page Classification Using MDAWkNN. In J. Wang (Ed.), Encyclopedia of Business Analytics and Optimization (pp. 2685-2695). IGI Global Scientific Publishing. https://doi.org/10.4018/978-1-4666-5202-6.ch239

Chicago

Mangai, J. Alamelu, V. Santhosh Kumar, and Karthik Ramesh. "Web Page Classification Using MDAWkNN." In Encyclopedia of Business Analytics and Optimization, edited by John Wang, 2685-2695. Hershey, PA: IGI Global Scientific Publishing, 2014. https://doi.org/10.4018/978-1-4666-5202-6.ch239

Web Page Classification Using MDAWkNN

J. Alamelu Mangai (Birla Institute of Technology and Science Pilani, Dubai, India), V. Santhosh Kumar (Birla Institute of Technology and Science Pilani, Dubai, India), and Karthik Ramesh (Birla Institute of Technology and Science Pilani, Dubai, India)

Source Title: Encyclopedia of Business Analytics and Optimization

DOI: 10.4018/978-1-4666-5202-6.ch239

Chapter Preview

Background

Many approaches to automatic Web page classification have been proposed in the literature over the years. Without preprocessed data, there are no quality mining results. Because Web pages are high-dimensional and contain noisy information, they need to be properly preprocessed; otherwise, the learning time and complexity of the classifiers increase. Feature selection is one way of addressing the curse of dimensionality in content-based Web page classifiers. Web page classification has been improved by selecting features through various methods, as in (Indra Devi, Rajaraman, & Selvakuberan, 2008; Han, Lim, & Alhashmi, 2010; Selamat & Omata, 2004; Chen, Ming, & Chang, 2009; Wakaki, Itakura, & Tamura, 2004; Jensen & Shen, 2006; Peng, Ming, & Wang, 2008; Farhoodi, Yari, & Mahmoudi, 2009; Xu & Wang, 2011).
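To make the idea of feature selection concrete, the following is a minimal sketch of one simple filter, document-frequency thresholding. It is not the method of this chapter or of any of the cited works; the function name `select_by_document_frequency`, the thresholds `min_df` and `max_df_ratio`, and the toy documents are all assumptions made for illustration.

```python
def select_by_document_frequency(docs, min_df=2, max_df_ratio=0.9):
    """Keep terms that are neither too rare nor too common.

    docs: list of token lists, one per Web page (hypothetical input format).
    Returns the kept terms in sorted order.
    """
    n_docs = len(docs)
    df = {}
    for tokens in docs:
        for term in set(tokens):  # count each term once per document
            df[term] = df.get(term, 0) + 1
    return sorted(
        term
        for term, count in df.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    )


# Hypothetical toy corpus: four tokenized pages.
docs = [
    ["the", "match", "score", "goal"],
    ["the", "stock", "market", "price"],
    ["the", "goal", "score", "team"],
    ["the", "market", "price", "bank"],
]

selected = select_by_document_frequency(docs)
```

In this toy corpus, a term that occurs in every page (such as "the") carries no class information, and a term that occurs in a single page is too rare to generalize from, so both kinds are dropped and the feature space shrinks before any classifier is trained.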

Key Terms in this Chapter

K-Nearest Neighbor Classification: A data mining algorithm that classifies a given set of data into pre-defined classes. It is an example of supervised learning.

Pre-Processing: The transformation of raw data into a format suitable for the mining task. It is done prior to the mining task and is significant because, without quality data, there are no quality mining results.

Feature Selection: The process of identifying the features that are most significant to the mining task. It is part of pre-processing and is one of the ways to overcome the curse of dimensionality. Features that are redundant or irrelevant to the mining task are eliminated.

Machine Learning: A branch of artificial intelligence that deals with the study and construction of intelligent systems by analyzing data. Data mining has its roots in machine learning.

Web Page Classification: A process of assigning labels to Web pages based on the kind of content they have.

Curse of Dimensionality: When Big Data with a very large number of dimensions is processed, many of the objects appear sparse after pre-processing. They are also dissimilar in many ways, which prevents common data organization strategies from being efficient. This problem, faced by the statistics community, is known as the curse of dimensionality.
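Of the terms above, k-nearest neighbor classification lends itself to a short illustration. The sketch below shows plain, unweighted kNN with Euclidean distance and majority voting. It is not the MDAWkNN variant named in the chapter title, whose details are not given in this preview; the function name `knn_classify` and the toy feature vectors and labels are assumptions.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    train: list of (feature_vector, label) pairs (hypothetical input format).
    query: feature vector with the same length as the training vectors.
    """
    # Sort training examples by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two term-frequency features per Web page.
train = [
    ((5.0, 0.0), "sports"),
    ((4.0, 1.0), "sports"),
    ((0.0, 5.0), "finance"),
    ((1.0, 4.0), "finance"),
]

predicted = knn_classify(train, (4.5, 0.5), k=3)
```

Here the two closest training pages are labeled "sports" and the third closest is labeled "finance", so the vote assigns the query page to "sports". Because every prediction computes a distance over all features, reducing the number of features through feature selection directly lowers the cost of classification.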
