
A New Insight on the Morphology of Web Mining

Joshua Ojo Nehinbe

Source Title: Advanced Practical Approaches to Web Mining Techniques and Application

DOI: 10.4018/978-1-7998-9426-1.ch015


Abstract

Recent surveys indicate that about 199 million active and over 1.2 billion inactive websites exist across the globe. The categories of websites have also expanded beyond espionage networks, corporate networks, government agency networks, social networks, search engines, and networks for religious bodies. This diversity has raised complex issues regarding the morphology and classification of webs and web mining. As a result, the validity of the generic web classification, the web mining taxonomy, and contemporary studies on the regularities of web usage, web content, web semantics, web structures, and the process of extracting useful information and interesting patterns from the intricacies of the Internet is frequently questionable. The existing web mining taxonomy can also lead to misinformation, misclassification, and crisscrossing issues, such that numerous web patterns end up marked with crossing and inexplicable lines. Drawing on qualitative virtual interviews with 26 skilled web designers and a focus-group conference of 7 experts in web usage, this chapter discusses these concepts and how they relate to web classification and web mining taxonomy. The themes obtained elucidate the techniques that commonly underpin basic web mining taxonomy. New concepts such as esoteric and exoteric web data; mysterious, inexplicable, and mystifying patterns; and cryptic vocabularies are discussed to assist web analytics. Finally, the author suggests eight classification attributes for web mining patterns (illustrative, expositive, educative, advisory, interpretative, demonstrative, revealing, and informatory) and proposes a new web mining taxonomy to minimize the impact of these concerns globally.


Introduction

The art and science of website design require web designers to possess creative and imaginative skills and the ability to combine standard technologies such as Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), Extensible Markup Language (XML), Scalable Vector Graphics (SVG) for building and synthesizing images, and Application Programming Interfaces (APIs), the intermediary software that enables two web applications to link and communicate with each other (W3C, 2021). Visual studies suggest that modern websites now combine various branches of creative activity, such as music, painting, and literary composition, to produce visual works that are appreciated primarily for their beauty, innovativeness, and quality. Likewise, the ontology and structure of the words frequently published on websites, and the parts of those words, can be classified on the basis of root, stem, prefix, and suffix. For these reasons, websites regularly publish and keep records of countless web data in diverse morphological components and formats.
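As a rough illustration of classifying words by prefix, stem, and suffix, the following minimal sketch strips one known prefix and one known suffix from a word. The affix lists and the function name `split_affixes` are hypothetical and deliberately tiny; this is not the chapter's method, and real morphological analysis would need a proper lexicon and language-specific rules.

```python
# Hypothetical, deliberately small affix lists (assumptions for illustration only).
PREFIXES = ("un", "re", "pre", "mis")
SUFFIXES = ("ing", "ness", "ed", "s")


def split_affixes(word: str, min_stem: int = 3) -> dict:
    """Naively split a word into prefix, stem, and suffix.

    An affix is only stripped if at least `min_stem` characters remain,
    which avoids reducing short words such as "red" to nothing.
    """
    word = word.lower()

    # Take the first listed prefix that matches and leaves a long enough remainder.
    prefix = next(
        (p for p in PREFIXES
         if word.startswith(p) and len(word) - len(p) >= min_stem),
        "",
    )
    rest = word[len(prefix):]

    # Take the first listed suffix that matches and leaves a long enough stem.
    suffix = next(
        (s for s in SUFFIXES
         if rest.endswith(s) and len(rest) - len(s) >= min_stem),
        "",
    )
    stem = rest[:len(rest) - len(suffix)] if suffix else rest

    return {"prefix": prefix, "stem": stem, "suffix": suffix}


if __name__ == "__main__":
    for token in ("reloading", "unlinked", "web"):
        print(token, split_affixes(token))
```

Because the sketch matches on spelling alone, it will mis-split words whose opening or closing letters merely look like an affix, which is one reason web text usually needs more careful pre-processing.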

Web data is a combination of pieces of information and the semantics of facts on websites. Fundamentally, web data subsumes the diversity of web users, the variety of web structures, and the array of web content on websites hosted on the Internet (Busetto et al., 2020; Singh et al., 2014). Web users are the computer services and end-users, such as consumers, customers, and clients, that access websites. Web structure covers the arrangement, organization, composition, configuration, framework, and makeup of websites. Similarly, web content is the variety of information published on websites for the audience, that is, the web users. Information retrieval is central to the concept of web mining. Logically, information retrieved from the web is extracted indirectly from web servers, which usually keep rich logs of the above web data. These logs may record, in standard formats, remote hosts, successful and unsuccessful responses, the parameters required to identify web users, authentication details, the resource requested, the HTTP protocol, and status codes. Nonetheless, web data may require a high level of pre-processing because of the uniqueness and diversity of its sources, kinds, purposes, and meanings. The required pre-processing, the changeable meaning of the language, and the logical combinations of the above groups of web data pose serious daily challenges to web data analytics in two different ways (Chawan & Pamnani, 2010). In this context, modern web data is perceived in terms of its logical semantics and its lexical semantics. The logical semantics of web data concern the common sense, reference, preconception, and conclusions that can be implicitly drawn from web data, while the lexical semantics concern the analysis of the meanings of, and relationships between, particular words or entire texts on websites.
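The kind of server log record described above can be made concrete with a short sketch. It assumes the log uses the widely used Common Log Format; the chapter does not say which format its sources use, and the sample line, the function name `parse_log_line`, and the rule that treats 2xx and 3xx status codes as successful are illustrative assumptions.

```python
import re
from typing import Optional

# Common Log Format (assumed):
# host ident authuser [timestamp] "method resource protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Parse one access-log line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None  # malformed lines are left for a separate cleaning step

    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" in the size field means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # Assumption: 2xx and 3xx responses count as successful.
    record["successful"] = 200 <= record["status"] < 400
    return record


if __name__ == "__main__":
    # Hypothetical sample line using a documentation-reserved IP address.
    sample = (
        '203.0.113.7 - alice [10/Oct/2021:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 2326'
    )
    print(parse_log_line(sample))
```

Returning `None` for lines that do not match keeps malformed entries visible to the caller, which matters because the pre-processing step discussed above typically has to decide what to do with them.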

Recent surveys illustrate the complexity of extracting and classifying information from over 199 million active and over 1.2 billion inactive websites across the globe (Siteefy, 2021; WebsiteSetup, 2021). The lack of statistics on exactly how many of these websites are mobile-friendly and how many are not can limit the accuracy of a web mining taxonomy in categorizing websites by mobile-friendliness and responsiveness (Siteefy, 2021). Consequently, web mining has become a complex issue in recent times. Web mining is a branch of data mining that deals with the extraction of hidden but interesting and predictive information from the interactive information on the web (Griazev & Ramanauskaitė, 2018). The categories of websites on the Internet have grown drastically, from traditional websites designed to attract customers, boost profitability, and gain wider publicity to websites hosted purposely to advance malicious and avenging socio-political ideologies. This trend has also progressed over the years through the evolution of espionage networks, networks of researchers and academicians, networks of corporate organizations, networks of government agencies, networks of social partners, and networks of religious bodies, to cite a few.
