This chapter addresses resource discovery in digital libraries (DLs) and the importance of knowledge organization tools in building DLs. Using the Greenstone digital library (GSDL) software as a case example, we describe a prototype taxonomy generation tool (TGT), a module for the hierarchical classification of contents, designed and built to categorize contents within DLs. TGT was developed as a desktop application in Visual C# on the Microsoft .NET Framework 2.0, using object-oriented programming. TGT implements Z39.19, which provides standard guidelines for constructing, formatting, and managing monolingual controlled vocabularies, including the use of broader, narrower, and related terms and their semantic relationships, and it uses the Simple Knowledge Organization System (SKOS) for vocabulary specification. An XML schema definition was designed to validate the XML taxonomy template against the rules developed for it; as a result, the generated taxonomy template supports controlled vocabulary terms and allows users to select the labels for the taxonomy structure. A pilot user study was then conducted to evaluate the usability and usefulness of TGT and the taxonomy template. In this study, we observed four subjects using TGT and then held a focus group to gather comments. Initial feedback was positive and indicated the importance of having a taxonomy structure in GSDL. Recommendations for future work include incorporating content classification and metadata technologies into TGT.
Overview of Greenstone
Greenstone is a software suite designed to build and distribute DL collections for publishing on the Internet or on CD-ROM. It is an open-source application released under the terms of the GNU General Public License (GPL) and is particularly easy to install and use (Witten, 2003). Cooperation with UNESCO and Human Info has helped to support user testing, internationalization, and the mounting of courses (Witten & Bainbridge, 2005). Greenstone serves as an important tool in support of UNESCO's goal of preserving and distributing the educational, scientific, and cultural information of developing countries. The core facilities that Greenstone aims to provide are the design and construction of document collections and their distribution on the Web and/or CD-ROM, together with customizable structure based on available metadata, an easy-to-use collection-building interface, multilingual support, and multiplatform operation (Witten & Bainbridge, 2005). Although Greenstone initially focused on helping developing countries, its user base has expanded to 70 countries and the reader’s interface has been translated into 45 languages to date, with downloads increasing from a steady 4,500 per month to 6,500 per month over the last 2 years (Witten & Bainbridge, 2007). Greenstone’s popularity comes from a simple, user-friendly interface providing:
Key Terms in this Chapter
Greenstone: Greenstone (http://www.greenstone.org) is produced by the New Zealand Digital Library Project, a research project on text compression at the University of Waikato. It focuses on the personalization and construction of digital collections from the end user's perspective.
Usefulness: The definition of usefulness is debatable, and some distinguish between usability and usefulness. Although it is impossible to quantify the usefulness of a system, attempts have been made to measure how far it is attained with reference to system specifications and to the extent to which the system covers end users’ tasks, rather than through end-user performance testing.
Usability: ISO 9241-11 defines usability as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.” Usability of hypertext/Web is commonly measured using established usability dimensions covering categories of usability defects such as screen design, terminology and system information, system capabilities and user control, navigation, and completing tasks.
Fedora: Fedora (http://www.fedora-commons.org) was originally implemented as a DARPA- and NSF-funded research project at Cornell University and was later funded by the Andrew W. Mellon Foundation. It offers a service-oriented architecture built on a powerful digital object model that supports multiple views of digital objects.
DSpace: DSpace (http://www.dspace.org) was jointly implemented by the Massachusetts Institute of Technology (MIT) and Hewlett-Packard (HP) Laboratories and was released in November 2002. It aims to provide a digital institutional repository system to capture, store, index, preserve, and redistribute an organization’s research data.
Metadata: A set of attributes that describes the content, quality, condition, and other characteristics of a resource.
Digital Libraries: Digital libraries mean different things to different people. Their design therefore depends on how their purpose and functionality are perceived. To the library science community, the roles of traditional libraries are to: (a) provide access to information in any format that has been evaluated, organized, archived, and preserved; (b) have information professionals who make judgments and interpret users’ needs; and (c) provide services and resources to people (e.g., students, faculty, and others). To the computer science community, digital libraries may refer to a distributed text-based information system, a collection of distributed information services, a distributed space of interlinked information systems, or a networked multimedia information system.
Taxonomy: According to the definition by ANSI/NISO (2005), a taxonomy is a collection of controlled vocabulary terms organized into a hierarchical structure, with each term having one or more parent/child (broader/narrower) relationships to other terms. It gives a systematic, high-level view of contents and provides users with a roadmap for discovering the knowledge available. Taxonomies can appear as lists, trees, hierarchies, polyhierarchies, matrices, facets, or system maps.
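The taxonomy definition above can be illustrated with a minimal Python sketch of controlled vocabulary terms linked by broader, narrower, and related relationships. This is not TGT's implementation (TGT was written in Visual C#). The names `Term`, `Taxonomy`, `add_broader`, `add_related`, `ancestors`, `top_terms`, and `render`, and the sample terms, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One controlled-vocabulary term (hypothetical structure, not TGT's)."""
    label: str
    broader: set[str] = field(default_factory=set)   # parent terms (BT)
    narrower: set[str] = field(default_factory=set)  # child terms (NT)
    related: set[str] = field(default_factory=set)   # associative links (RT)


class Taxonomy:
    """Terms keyed by label; a term may have several parents (polyhierarchy)."""

    def __init__(self) -> None:
        self.terms: dict[str, Term] = {}

    def add_term(self, label: str) -> Term:
        # Create the term on first use, otherwise return the existing one.
        return self.terms.setdefault(label, Term(label))

    def ancestors(self, label: str) -> set[str]:
        """All terms reachable by following broader links upward."""
        seen: set[str] = set()
        stack = list(self.terms[label].broader) if label in self.terms else []
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.terms[current].broader)
        return seen

    def add_broader(self, child: str, parent: str) -> None:
        """Record parent as broader than child, and child as narrower than parent."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"{parent!r} -> {child!r} would create a cycle")
        self.add_term(child).broader.add(parent)
        self.add_term(parent).narrower.add(child)

    def add_related(self, a: str, b: str) -> None:
        """Related-term links are symmetric, so store both directions."""
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def top_terms(self) -> list[str]:
        """Terms with no broader term, in alphabetical order."""
        return sorted(t.label for t in self.terms.values() if not t.broader)

    def render(self) -> list[str]:
        """Indented tree view; a term with two parents appears under both."""
        lines: list[str] = []

        def walk(label: str, depth: int) -> None:
            lines.append("  " * depth + label)
            for child in sorted(self.terms[label].narrower):
                walk(child, depth + 1)

        for top in self.top_terms():
            walk(top, 0)
        return lines


# Illustrative terms only; they are not taken from the TGT taxonomy template.
tax = Taxonomy()
tax.add_broader("Digital libraries", "Information systems")
tax.add_broader("Greenstone", "Digital libraries")
tax.add_broader("Greenstone", "Open-source software")  # second parent
tax.add_related("Greenstone", "DSpace")
print("\n".join(tax.render()))
```

Storing each relationship in both directions keeps the broader and narrower views consistent, and the cycle check in `add_broader` keeps the structure hierarchical while still allowing a term to sit under more than one parent.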