User Profiles for Personalizing Digital Libraries

Giovanni Semeraro (University of Bari, Italy), Pierpaolo Basile (University of Bari, Italy), Marco de Gemmis (University of Bari, Italy) and Pasquale Lops (University of Bari, Italy)
DOI: 10.4018/978-1-59904-879-6.ch015

Abstract

Exploring digital collections to find information relevant to a user’s interests is a challenging task. Information preferences vary greatly across users; therefore, filtering systems must be highly personalized to serve each user’s individual interests. Algorithms designed to solve this problem base their relevance computations on user profiles, which maintain representations of the users’ interests. This chapter focuses on the adoption of machine learning to build user profiles that capture user interests from documents. The profiles are then used for intelligent document filtering in digital libraries. The work proposes exploiting the knowledge stored in machine-readable dictionaries to obtain accurate user profiles that describe user interests by referring to concepts in those dictionaries. The main aim of the proposed approach is to present a real-world scenario in which the combination of machine learning techniques and linguistic knowledge helps achieve intelligent document filtering.
Chapter Preview

Our research was mainly inspired by the following works.

  • Syskill & Webert (Pazzani & Billsus, 1997) is an agent that learns a user profile to identify interesting Web pages. Learning proceeds by first converting hypertext markup language (HTML) sources into positive and negative examples, represented as keyword vectors, and then applying learning algorithms such as Bayesian classifiers, a nearest-neighbor algorithm, and a decision tree learner.

  • Personal WebWatcher (Mladenic, 1999) is a Web browsing recommendation service that generates a user profile based on the content analysis of the requested pages. Learning is done by a naïve Bayes classifier where documents are represented as weighted keyword vectors, and classes are “interesting” and “not interesting.”

  • Mooney and Roy (2000) adopt a text categorization method in their Libra system that performs content-based book recommendations by exploiting product descriptions obtained from the Web pages of the Amazon online digital store. Also in this case, documents are represented by using keywords, and a naïve Bayes text classifier is adopted.
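
As a rough illustration of the keyword-based approach these three systems share, the sketch below trains a multinomial naïve Bayes classifier on documents represented as keyword lists and labels a new document as "interesting" or "not interesting". The toy training examples, the add-one (Laplace) smoothing, and all function names are assumptions made for this illustration; they are not taken from Syskill & Webert, Personal WebWatcher, or Libra.

```python
import math
from collections import Counter

def train(examples):
    """Collect per-class document and keyword counts.

    examples: list of (keywords, label) pairs, where keywords is a list of str.
    """
    class_docs = Counter()   # number of training documents per class
    word_counts = {}         # class label -> Counter of keyword frequencies
    vocab = set()            # every keyword seen during training
    for keywords, label in examples:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(keywords)
        vocab.update(keywords)
    return class_docs, word_counts, vocab

def classify(model, keywords):
    """Return the class with the highest log posterior for the keywords."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in keywords:
            if word not in vocab:
                continue  # keywords never seen in training carry no evidence
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical user feedback: keyword lists paired with the user's rating.
examples = [
    ("machine learning text classification".split(), "interesting"),
    ("user profile learning digital library".split(), "interesting"),
    ("football match score league".split(), "not interesting"),
    ("celebrity gossip football".split(), "not interesting"),
]
model = train(examples)

print(classify(model, "learning user profile".split()))
print(classify(model, "football league".split()))
```

Because the model sees only surface keywords, two documents that express the same interest with different words (synonyms) share no evidence, and an ambiguous word contributes the same evidence in every sense.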

The main limitation of these approaches is that they represent items by using keywords. The objective of our research is to create accurate semantic user profiles. Among the state-of-the-art systems that produce semantic user profiles, SiteIF (Magnini & Strapparava, 2001) is a personal agent for a multilingual news Web site that exploits a sense-based representation to build a user profile as a semantic network, whose nodes represent senses of the words in documents requested by the user.

The role of linguistic ontologies in knowledge-retrieval systems is explored in OntoSeek (Guarino, Masolo, & Vetere, 1999), a system designed for content-based information retrieval from online yellow pages and product catalogs. OntoSeek combines an ontology-driven content-matching mechanism based on WordNet with a moderately expressive representation formalism. The approach has shown that structured content representations coupled with linguistic ontologies can increase both recall and precision of content-based retrieval.

We adopted a content-based method that learns user profiles from documents represented by word senses rather than keywords. The senses are obtained through a word sense disambiguation strategy that exploits the WordNet IS-A hierarchy.
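
The idea of using an IS-A hierarchy to choose a word sense can be sketched as follows. This is a deliberately simplified illustration, not the disambiguation procedure used in the chapter: the miniature hierarchy, the sense identifiers such as `bass#fish`, and the inverse-path-length similarity are all assumptions made for the example, standing in for WordNet and for whatever similarity measure the authors actually use.

```python
import math

# Hypothetical miniature IS-A hierarchy: each node maps to its parent (hypernym).
HYPERNYM = {
    "bass#fish": "fish",
    "trout#fish": "fish",
    "fish": "animal",
    "animal": "entity",
    "bass#instrument": "musical_instrument",
    "guitar#instrument": "musical_instrument",
    "musical_instrument": "artifact",
    "artifact": "entity",
}

# Hypothetical sense inventory: each word maps to its candidate senses.
SENSES = {
    "bass": ["bass#fish", "bass#instrument"],
    "trout": ["trout#fish"],
    "guitar": ["guitar#instrument"],
}

def ancestors(node):
    """Return the chain [node, parent, grandparent, ..., root]."""
    chain = [node]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def path_length(a, b):
    """Number of IS-A edges between a and b via their closest common ancestor."""
    depth_from_a = {node: i for i, node in enumerate(ancestors(a))}
    for j, node in enumerate(ancestors(b)):
        if node in depth_from_a:
            return depth_from_a[node] + j
    return math.inf  # no common ancestor

def similarity(a, b):
    """Shorter paths in the hierarchy give higher similarity."""
    return 1.0 / (1.0 + path_length(a, b))

def disambiguate(word, context):
    """Pick the sense of `word` that is closest to the senses of the context words."""
    def score(sense):
        return sum(
            max(similarity(sense, other) for other in SENSES[c])
            for c in context
            if c in SENSES  # context words without known senses are skipped
        )
    return max(SENSES[word], key=score)

print(disambiguate("bass", ["trout"]))
print(disambiguate("bass", ["guitar"]))
```

Once every word has been mapped to a sense in this way, a document can be represented by sense identifiers instead of raw keywords, so that synonyms collapse onto one feature and different senses of the same word become distinct features.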

Key Terms in this Chapter

Synset: A group of data elements that are considered semantically equivalent for the purposes of information retrieval.

Personalization: The process of tailoring products or services to users based on their user profiles.

Word Sense Disambiguation: The problem of determining in which sense a word having a number of distinct senses is used in a given sentence.

User Profile: A structured representation of interests (and disinterests) of a user or group of users.

NLP (Natural Language Processing): A subfield of artificial intelligence and linguistics that studies the problems of automated generation and understanding of natural human languages. It converts samples of human language into more formal representations that are easier for computer programs to manipulate.

WordNet: A semantic lexicon for the English language. It groups English words into sets of synonyms called synsets. It provides short, general definitions, and records the various semantic relations between these synonym sets.

Recommender system: A system that guides users in a personalized way to interesting or useful objects in a large space of possible options.
