Refining the Results of Automatic e-Textbook Construction by Clustering


Jing Chen (City University of Hong Kong, China), Qing Li (City University of Hong Kong, China) and Ling Feng (Tsinghua University, China)
DOI: 10.4018/978-1-60566-342-5.ch007


The abundance of knowledge-rich information on the World Wide Web makes compiling an online e-textbook both possible and necessary. In our previous work, we proposed an approach to automatically generate an e-textbook by mining the ranked result lists of a search engine. However, the performance of the approach was degraded by Web pages that were relevant but did not actually discuss the desired concept. In this article, we extend the previous work by applying a clustering approach before the mining process. The clustering approach serves as a post-processing stage for the original results retrieved by the search engine, and aims to reach an optimum state in which all Web pages assigned to a concept discuss that exact concept.
Chapter Preview


The World Wide Web has evolved into one of the largest information repositories. It is now feasible for a learner to access both professional and amateur information about any subject of interest. Professional information often includes compiled online dictionaries and glossaries, course syllabi provided by teachers, tutorials on scientific software, overviews of research areas by faculty members at research institutes, and so forth. Discussion boards sometimes offer intuitive descriptions of a subject that are beneficial for students or beginning learners. Both kinds of resources greatly enrich and supplement existing printed learning material. The abundance of knowledge-rich information makes compiling an online e-textbook both possible and necessary.

The most common way of learning through the Web is to use a search engine to find relevant information. However, search engines are designed to meet the most general requirements of a regular Web user. Take Google (Brin & Page, 1998) as an example: the relevance of a Web page is determined by a mixture of the popularity of the page and the textual match between the query and the document (Chakrabarti, 2002). Despite its worldwide success, this combined ranking strategy still faces several problems, such as ambiguous terms and spamming. In the case of learning, it becomes even harder for the search engine to satisfy the need for instructional information, since the ranking strategy cannot take into account the needs of a particular user group, such as learners.
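To make the idea of a combined ranking concrete, the following is a minimal sketch and not a description of how any real search engine computes relevance. It assumes a popularity value already normalized to the range 0 to 1, a bag-of-words cosine similarity as the textual match, and a simple weighted sum; the function names, the weight, and the two example pages are all hypothetical.

```python
import math
from collections import Counter


def text_match(query, document):
    """Cosine similarity between raw term-count vectors (an assumed stand-in
    for the textual-match component)."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def combined_score(query, document, popularity, weight=0.5):
    """Weighted sum of popularity and textual match (illustrative only)."""
    return weight * popularity + (1 - weight) * text_match(query, document)


# Hypothetical pages: (text, popularity in [0, 1]).
QUERY = "kingdom biology"
POPULAR_PAGE = ("united kingdom travel guide", 0.9)
INSTRUCTIONAL_PAGE = ("kingdom biology taxonomy classification", 0.3)

if __name__ == "__main__":
    for text, popularity in (POPULAR_PAGE, INSTRUCTIONAL_PAGE):
        print(text, round(combined_score(QUERY, text, popularity), 3))
```

Under these assumed numbers, the popular but off-topic page outscores the page that matches the query better, which is the kind of outcome that makes a general-purpose ranking hard to rely on for instructional material.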

Recently, many approaches have been proposed to improve the presentation of Web search engine results. A popular solution is clustering, which gives users a more structured way to browse through the search engine results. Clustering mainly aims at solving the ambiguous search term problem. When the search engine is not able to determine the user’s true intention, it returns all Web pages that seem relevant to the query, and the retrieved results can cover widely different topics. For example, a query for “kingdom” that actually refers to biological categories could return thousands of pages related to the United Kingdom. Clustering these results by their snippets or whole pages is the most commonly used approach to address this problem (Ferragina & Gullí, 2004; Zamir & Etzioni, 1999; Zeng, He, Chen, & Ma, 2004). However, the structure of the hierarchy presented is usually determined on the fly. Cluster names and their organized structure are selected according to the content of the retrieved Web pages and the distribution of different topics within the results. The challenge here is how to select meaningful names and organize them into a sensible hierarchy. Vivisimo is a real-life example of this approach.
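As an illustration of clustering results by their snippets, the sketch below groups a few made-up snippets for the query “kingdom”. It is not the method of any of the works cited above: it swaps in a simple single-pass (leader) clustering over bag-of-words vectors with cosine similarity. The stopword list, the similarity threshold of 0.2, and the snippets themselves are assumptions chosen for the example.

```python
import math
from collections import Counter

# Small, assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "for"}

# Hypothetical snippets returned for the ambiguous query "kingdom".
SNIPPETS = [
    "Kingdom is a taxonomic rank in biology, below domain and above phylum.",
    "Travel guide to the United Kingdom: London, Scotland and Wales.",
    "In biology, the animal kingdom is a taxonomic rank grouping all animals.",
    "The United Kingdom travel visa requirements for visitors to London.",
]


def vectorize(snippet):
    """Lowercase, strip punctuation, drop stopwords, count terms."""
    words = [w.strip(".,;:!?\"'()").lower() for w in snippet.split()]
    return Counter(w for w in words if w and w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster_snippets(snippets, threshold=0.2):
    """Single-pass clustering: each snippet joins the most similar existing
    cluster if the similarity reaches the threshold, else starts a new one.
    Returns a list of clusters, each a list of snippet indices."""
    clusters = []  # each entry: {"centroid": Counter, "members": [int]}
    for index, snippet in enumerate(snippets):
        vec = vectorize(snippet)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [index]})
        else:
            best["centroid"].update(vec)  # centroid is the summed term counts
            best["members"].append(index)
    return [cluster["members"] for cluster in clusters]


if __name__ == "__main__":
    for members in cluster_snippets(SNIPPETS):
        print([SNIPPETS[i] for i in members])
```

Because every snippet contains the query term itself, the threshold has to be high enough that one shared word does not merge unrelated topics. This sketch also assigns no names to the clusters, which is the harder problem noted above.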

The clustering approach works well to meet the needs of a regular user. But when the application is narrowed down to an educational learning assistant, it is possible to provide learners with more “suitable” Web pages that satisfy their needs in the pursuit of knowledge. Users seeking educational resources prefer Web pages with a higher quality of content. Such Web pages often satisfy the criteria of being “self-contained,” “descriptive,” and “authoritative” (Chen, Li, Wang, & Jia, 2004). Limited work has been done to distinguish higher quality data on the Web. One important effort is that of Liu, Chin, and Ng (2003), who attempt to mine concept definitions of a specific topic on the Web. Their approach is interactive: the user chooses a topic, and the system automatically discovers related salient concepts and descriptive Web pages, which they call informative pages. Liu et al.’s work (2003) not only proposed a practical system that successfully identified informative pages but also, more importantly, pointed out the novel task of compiling a book on the Web.
