A Two-Dimensional Webpage Classification Model

Shih-Ting Yang, Chia-Wei Huang
Copyright: © 2017 | Pages: 32
DOI: 10.4018/IJDWM.2017040102


Most webpage classification mechanisms do not consider the perspective of the webpage's author or the display characteristics of the page (such as color and graphic layout). This paper therefore develops a Two-dimensional Webpage Classification model that analyzes a webpage's textual information and display characteristics from the perspectives of both webpage users and designers. The model consists of three modules: the Webpage Block Distribution Analysis (WBDA) module, the Webpage Emotion Category Determination (WECD) module, and the Webpage Specialty Category Determination (WSCD) module. First, the WBDA module takes user and designer habits into account (such as the web browsing movement and the writing perspective of the webpage) by combining eye movement tracking with tag-region judgment to determine the critical blocks and information of the webpage. Second, the WECD module acquires the webpage's color codes, calculates the major colors of the webpage, and from them determines its emotion category. Third, the WSCD module analyzes the webpage's textual information, integrating keyword acquisition technology, to identify its specialty category. Together, these steps yield the two-dimensional category of the webpage. In addition, this paper develops a web-based system for case verification to confirm the feasibility of the methodology. The verification results show, first, that for webpage emotion category judgment, when 128 webpage files are imported into the system for training, the respondents' emotion evaluation score rises above Level 5 and the system recommendation success rate rises to 75.78%. Second, for specialty category determination, when the system uses 1010 to 1120 webpage files for training, system performance rises above 80%. Hence, the developed system performs well in both emotion category and specialty category determination. In short, this paper proposes a Two-dimensional Webpage Classification methodology that classifies webpages by their information content and by their effect on the emotions of demanders, so that webpage providers can use the resulting two-dimensional information to supply webpages suited to those demanders.

1. Introduction

Because the Internet is pervasive and convenient, people have become accustomed to acquiring knowledge online rather than from books. However, the sheer amount of information on the Internet makes it difficult for knowledge demanders to find what they need. How to manage or filter webpage documents so that knowledge demanders can effectively find what they need has therefore become one of the major topics of webpage classification and management. To attract demanders to browse a webpage, webpage designers often rely on color configuration and on graphic and textural design. However, the visual features of webpage documents designed this way can easily affect the emotions, reading efficiency, and memory of knowledge demanders. Hence, in addition to recommending webpages in appropriate fields, knowledge providers should also take the visual characteristics of webpage documents into consideration. If webpage information and emotional features can be effectively classified, webpage documents whose contents cause discomfort or have a negative emotional impact on knowledge demanders can be filtered out, and the expected webpage documents can be recommended effectively, allowing knowledge demanders to acquire the knowledge contained in those documents more efficiently.

Knowledge providers can classify webpage document information and select appropriate color configurations by using webpage classification technology developed by experts in various fields. However, such classification technology does not consider the detailed perceptions of webpage designers and cannot capture the designers' original meanings, which makes it impossible to classify webpage documents accurately. In addition, knowledge providers have no mechanism for judging the emotional impact of a webpage document's color configuration on demanders. As a result, knowledge providers can only judge from their own perspectives, and demanders with different viewpoints may perceive the same webpage differently. The existing operational model described above is illustrated by the AS-IS model (shown in Figure 1).

Figure 1. AS-IS model


In view of this, this paper uses the color codes and text information contained in webpage document tags as the data for analysis, and considers both user eye movement during webpage browsing and the unconscious perceptions of the webpage document provider when judging webpage document classifications. Furthermore, this paper develops a “Two-dimensional Webpage Document Classification Model” to help knowledge webpage providers effectively recommend webpage documents in the appropriate fields (i.e., webpage specialty categories) required by demanders, and filter out webpage documents that may have a negative emotional impact on them. First, the proposed model uses the “Webpage Block Distribution Analysis (WBDA) module” to determine the major blocks of concern to users and to analyze the important information that conveys the designer's original meaning. Then, from the user and designer viewpoints, the “Webpage Emotion Category Determination (WECD) module” determines the emotional impact of the webpage's color configuration on users. Finally, the “Webpage Specialty Category Determination (WSCD) module” classifies the webpage's information content. The TO-BE model of this paper is shown in Figure 2.

Figure 2. TO-BE model
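
The color-code step attributed to the WECD module above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that colors appear as hexadecimal codes in the page source and that the "major colors" are simply the most frequent codes. The paper's actual calculation (for example, any weighting by the critical blocks found by the WBDA module) is not given in this preview. The names `HEX_COLOR`, `normalize`, and `major_colors` are hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple

# Matches 3- or 6-digit hex color codes such as #fff or #1A2B3C.
# The lookbehind skips numeric HTML entities such as &#123;.
# Assumption: colors are written as hex codes; named colors and rgb() are ignored.
HEX_COLOR = re.compile(r"(?<!&)#([0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def normalize(code: str) -> str:
    """Lowercase a hex code and expand the 3-digit form to 6 digits."""
    code = code.lower()
    if len(code) == 3:
        code = "".join(ch * 2 for ch in code)
    return "#" + code


def major_colors(html: str, top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the top_n most frequent color codes with their counts.

    Plain frequency is an illustrative stand-in for the paper's
    unspecified "major color" calculation.
    """
    codes = [normalize(match) for match in HEX_COLOR.findall(html)]
    return Counter(codes).most_common(top_n)


if __name__ == "__main__":
    sample = (
        '<body style="background:#FFF">'
        '<p style="color:#336699">a</p>'
        '<div style="background-color:#ffffff">b</div>'
        '<span style="color:#336699">c</span>'
        '<i style="color:#ffffff">d</i>'
        "</body>"
    )
    for code, count in major_colors(sample, top_n=2):
        print(code, count)
```

A sketch like this would also count fragment links such as `href="#abc"` as colors, so a fuller version would restrict the search to style attributes and stylesheets. Mapping the resulting colors to an emotion category is left out because the preview does not describe that mapping.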


2. Literature Review

This research involves three major topics: “webpage document mining”, “user webpage browsing behavior analysis”, and “color psychology discussion”.
