NLP and Digital Library Management

Lyne Da Sylva (University of Montreal, Canada)
DOI: 10.4018/978-1-4666-2169-5.ch011

Abstract

Natural Language Processing (NLP) has developed over the past 50 years or so, producing an array of now-mature technologies, such as automatic morphological analysis, word sense disambiguation, parsing, anaphora resolution, natural language generation, and named entity recognition. The proliferation of large digital collections (evolving into digital libraries) and the emerging economic value of information demand efficient solutions for managing information that is available but not always easy to find. This chapter presents the requirements for handling documents in digital libraries and explains how existing NLP technology can facilitate document management.

Introduction

Natural Language Processing (NLP) has developed and matured over the past 50 years or so, from the first machine translation and information retrieval applications to the present. Both of these research areas have been far-reaching and pervasive. In the course of resolving problems of natural language understanding, for both translation and retrieval, many sub-areas of NLP have emerged: automatic morphological analysis, word sense disambiguation, parsing, anaphora resolution, natural language generation, and named entity recognition, among others.

In current NLP research, attention has shifted from machine translation to various kinds of Information Retrieval (IR) applications. The increasing availability of large collections of digital documents has spurred interest in devising useful technology to handle them. Specifically, the notion of “digital libraries” (Adams, 1995; Fox, et al., 1995; Arms, 2000) has emerged, with its own architecture and functionality. This is an area where many mature NLP applications can be brought into play. Digital libraries are mostly associated with IR, which has traditionally used little NLP and yet produced efficient tools; the methods needed to incorporate more sophisticated, NLP-based approaches were, until recently, beyond the reach of IR systems. But digital libraries are much more than simply IR.

This chapter has three objectives: (1) to describe the issues involved in managing a digital library; (2) to explore various NLP applications that can be applied to the task; and (3) to identify new research problems related to these issues.
