Management of Medical Website Quality Labels via Web Mining

Vangelis Karkaletsis (National Center of Scientific Research “Demokritos”, Greece), Konstantinos Stamatakis (National Center of Scientific Research “Demokritos”, Greece) and Pythagoras Karampiperis (National Center of Scientific Research “Demokritos”, Greece)
DOI: 10.4018/978-1-60566-218-3.ch010

The World Wide Web is an important channel of information exchange in many domains, including medicine. The ever-increasing amount of freely available healthcare-related information creates excellent conditions for the self-education of patients and physicians, but it also entails substantial risks if such information is trusted regardless of the low competence, or even bad intentions, of its authors. This is why medical website certification by renowned authorities, also called quality labelling, is of high importance. It has recently become clear that the labelling process could benefit from Web mining and information extraction techniques, combined with the flexible methods of Web-based information management developed within the Semantic Web initiative. Achieving such synergy is the central aim of the MedIEQ project. The AQUA (Assisting Quality Assessment) system, developed within the MedIEQ project, aims to provide the infrastructure and the means to organize and support various aspects of the daily work of labelling experts.
Chapter Preview


The number of health information websites and online services is increasing day by day. The quality of these websites is known to be highly variable and difficult to assess: they include sites published by government institutions, consumer and scientific organizations, patient associations, health provider institutions, and commercial entities, as well as personal sites (Mayer, 2005). At the same time, patients continue to find new ways of reaching health information, and more than four out of ten health information seekers say the material they find affects their decisions about their health (Eysenbach, 2000; Diaz, 2002). However, health information consumers, such as patients and the general public, find it difficult to assess the quality of this information by themselves because they are not always familiar with medical domains and vocabularies (Soualmia, 2003).

Although opinions diverge about the need for certification of health websites and its adoption by Internet users (HON, 2005), organizations around the world are working to establish quality standards for the certification of health-related web content (Winker, 2000; Kohler, 2002; Curro, 2004; Mayer, 2005). The European Council supported an initiative within eEurope 2002 to develop a core set of “Quality Criteria for Health Related Websites” (EC, 2002). The specific aim was to specify a commonly agreed set of simple quality criteria on which Member States, as well as public and private bodies, may build when developing mechanisms to help improve the quality of the content provided by health-related websites. These criteria should be applied in addition to relevant Community law. The resulting core set of quality criteria may be used as a basis for developing user guides, voluntary codes of conduct, trust marks, certification systems, or any other initiative adopted by relevant parties at European, national, regional or organizational level.

This stress on content quality evaluation contrasts with the fact that most of the current Web is still based on HTML, which only specifies how to lay out the content of a web page for human readers. HTML as such cannot be exploited efficiently by information retrieval techniques to provide visitors with additional information on a website's content. This “current web” must evolve in the coming years from a repository of human-understandable information into a global knowledge repository, where information is machine-readable and processable, enabling the use of advanced knowledge management technologies (Eysenbach, 2003). This change is based on the exploitation of semantic web technologies. The Semantic Web is “an extension of the current web in which information is given a well-defined meaning, better enabling computers and people to work in cooperation” based on metadata (i.e. semantic annotations of the web content) (Berners-Lee, 2001). These metadata can be expressed in different ways using the Resource Description Framework (RDF) language. RDF is the key technology behind the Semantic Web, providing a means of expressing data on the web in a structured way that can be processed by machines.
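To make the idea of machine-readable metadata concrete, the following minimal sketch expresses a quality label as subject-predicate-object statements and serializes them in the N-Triples syntax of RDF. The `http://example.org/quality-label#` namespace, the property names `labelledBy` and `meetsCriterion`, the site URL, and the criterion names are all illustrative assumptions; they are not the vocabulary used by MedIEQ or by any labelling agency.

```python
# Hypothetical vocabulary: this namespace and the property names below are
# illustrative assumptions, not taken from the MedIEQ project.
EX = "http://example.org/quality-label#"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def label_triples(site_url: str, labelled_by: str, criteria_met: list[str]):
    """Describe a labelled website as a list of (subject, predicate, object)."""
    triples = [(iri(site_url), iri(EX + "labelledBy"), literal(labelled_by))]
    for criterion in criteria_met:
        triples.append(
            (iri(site_url), iri(EX + "meetsCriterion"), literal(criterion))
        )
    return triples


def to_ntriples(triples) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


label = label_triples(
    "http://example.org/health-site",   # hypothetical labelled website
    "Example Labelling Agency",         # hypothetical agency name
    ["transparency", "privacy"],        # illustrative criterion names
)
print(to_ntriples(label))
```

Because each statement is an explicit triple, a program can query which criteria a site meets without parsing page layout, which is the property the paragraph above attributes to RDF metadata.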

For medical quality labelling mechanisms to be successful, they must be equipped with semantic web technologies that enable the creation of machine-processable labels as well as the automation of the labelling process. Key ingredients for the latter include web crawling techniques that retrieve new, unlabelled web resources, and web spidering and extraction techniques that facilitate the characterization of retrieved resources and the continuous monitoring of labelled resources, alerting the labelling agency when changes occur that conflict with the labelling criteria.
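The monitoring step described above can be sketched, under simplifying assumptions, as a comparison of content fingerprints between visits. This is not the AQUA system's method: the function names and URLs are hypothetical, and a hash only reveals that a page changed, not whether the change conflicts with a labelling criterion, so a flagged page would still need extraction or expert review.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Hash the page text after collapsing whitespace, so that pure
    formatting differences do not count as changes."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_resources(stored: dict[str, str], current_pages: dict[str, str]) -> list[str]:
    """Return the URLs whose current fingerprint differs from the stored one.
    URLs with no stored fingerprint (newly found pages) are also returned."""
    return sorted(
        url
        for url, text in current_pages.items()
        if stored.get(url) != fingerprint(text)
    )


# Fingerprints recorded when the (hypothetical) site was labelled.
stored = {
    "http://example.org/about": fingerprint("Editorial board: Dr. A, Dr. B"),
    "http://example.org/contact": fingerprint("Contact: clinic@example.org"),
}

# Page text retrieved on a later visit.
current_pages = {
    "http://example.org/about": "Editorial board: Dr. A",           # content changed
    "http://example.org/contact": "Contact:   clinic@example.org",  # whitespace only
    "http://example.org/ads": "Sponsored content",                  # new page
}

to_review = changed_resources(stored, current_pages)
print(to_review)
```

In this sketch the labelling agency would be alerted about the changed and the new page, while the page that differs only in whitespace is ignored.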
