This chapter introduces digital libraries as a means of accessing and disseminating cultural heritage. It argues that digital libraries, combined with superimposed information techniques, offer a potentially more substantive approach to the problem of analysing historical documentation. Furthermore, the authors hope that understanding the documentary and technological assumptions, built up through the use of programming and automatic interpreters, will not only point researchers to a better scheme for labelling cultural heritage information but also help bring in other areas, such as multiagent systems, pattern matching, information management, and information visualization based on content association, to solve the vast majority of the problems set out in the work context. The result is a versatile digital library prototype that covers the cultural heritage information its users need.
By and large, cultural heritage digital libraries are characterized by the diversity of their collection contents (Crane, 2002). The literature on this type of digital library shows that, where contents are heterogeneous, the metadata schemes used to describe them become extremely important, since they are used to address problems of information retrieval and interoperability. Baldonado, Chang, Gravano, and Paepcke (1997) proposed a metadata-based interoperability model that used the Dublin Core (Tolosana-Calasanz, Nogueras-Iso, Béjar, Muro-Medrano, & Zarazaga-Soria, 2006) and MARC (Chandler, Foley, & Hafez, 2000) schemes.
Given its simplicity, the unqualified Dublin Core metadata scheme predominates in cultural heritage digital libraries.
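To make the simplicity of unqualified Dublin Core concrete, the following minimal sketch builds one record in the `oai_dc` container format used with OAI-PMH. It is an illustration only, not the chapter's implementation: the function name `build_dc_record` and all sample values are hypothetical, and only the fifteen element names and the two namespace URIs come from the Dublin Core and OAI specifications.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the Dublin Core and OAI-PMH specifications.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The fifteen elements of unqualified Dublin Core.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Build an oai_dc element from a dict of element name -> list of values.

    Hypothetical helper: every element is optional and repeatable,
    and every value is plain, unconstrained text.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not an unqualified Dublin Core element: {name}")
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


# Invented sample values for a historical text.
record = build_dc_record({
    "title": ["Letter concerning a land grant (hypothetical example)"],
    "subject": ["charters"],
    "date": ["13th century"],
    "identifier": ["example:0001"],
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because each value is free text with no imposed vocabulary or syntax, two repositories can describe the same kind of resource in incompatible ways, for instance by writing the same date in different forms.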
Nevertheless, the use of Dublin Core presents problems that several researchers have pointed out. Foulonneau, Cole, Habing, and Shreeves (2005) raised the problem posed by the lack of consistency of Dublin Core metadata in OAI repositories. The comparative study of repositories holding cultural heritage metadata undertaken by Hutt and Riley (2005) pointed to the same issue, and Halbert, Kaczmarek, and Hagedorn (2003) had already warned about the problems detected in the use of Dublin Core in these contexts.
In the field on which our work centers, Dublin Core has gradually had to be complemented by tools that provide the semantic relations the metadata scheme itself lacks (De Gendt, Isaac, Van Der Meij, & Schlobach, 2006). The works reviewed therefore suggest two starting points:
Dublin Core is too simple to be used in cultural heritage digital libraries without other complementary schemes.
A more complete model must be proposed, one that incorporates a descriptive metadata scheme at both the element and the collection level, together with the scheme of the description tool (thesaurus, ontology).
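The second starting point can be sketched as a small data model in which an item-level record, a collection-level description, and a thesaurus are kept as separate but linked structures. This is a minimal sketch under stated assumptions, not the model proposed in the chapter: the class names `Thesaurus`, `CollectionDescription`, and `ItemRecord`, the function `expanded_subjects`, and all sample terms are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Thesaurus:
    """Hypothetical description tool: maps each term to its broader terms."""
    broader: dict[str, list[str]] = field(default_factory=dict)

    def add(self, term: str, broader_terms: tuple[str, ...] = ()) -> None:
        self.broader[term] = list(broader_terms)

    def ancestors(self, term: str) -> list[str]:
        """Return all broader terms reachable from `term`, nearest first."""
        seen: list[str] = []
        stack = list(self.broader.get(term, []))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.broader.get(current, []))
        return seen


@dataclass
class CollectionDescription:
    """Collection-level metadata (hypothetical fields)."""
    identifier: str
    title: str


@dataclass
class ItemRecord:
    """Element-level metadata: Dublin Core element name -> list of values."""
    identifier: str
    dc: dict[str, list[str]]
    collection: CollectionDescription


def expanded_subjects(item: ItemRecord, thesaurus: Thesaurus) -> list[str]:
    """Add broader thesaurus terms to the subjects stated in the item record."""
    subjects = list(item.dc.get("subject", []))
    for subject in item.dc.get("subject", []):
        for term in thesaurus.ancestors(subject):
            if term not in subjects:
                subjects.append(term)
    return subjects


# Invented sample data.
thesaurus = Thesaurus()
thesaurus.add("historical texts")
thesaurus.add("legal documents", ("historical texts",))
thesaurus.add("charters", ("legal documents",))

collection = CollectionDescription("example:col1", "Hypothetical collection")
item = ItemRecord("example:0001", {"subject": ["charters"]}, collection)
print(expanded_subjects(item, thesaurus))
```

The point of the design is that the semantic relation (here, only "broader term") lives in the thesaurus rather than in the Dublin Core record, so a search for a general term can still reach items indexed under a narrower one.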
Historical Information Analysis
The Web information service under consideration is the Digital Aragon Encyclopedia (http://www.enciclopedia-aragonesa.com).
Since the preservation of cultural heritage information is a very wide and diverse field, this chapter shows how to manage one portion of that information: historical texts.
Key Terms in this Chapter
Lucene: A free, open source information retrieval library (http://lucene.apache.org/).
Freeling: An Open Source Suite of Language Analyzers (http://www.lsi.upc.es/~nlp/freeling/).
Open Archives Initiative (OAI): An initiative to develop and promote interoperability standards to facilitate the efficient dissemination of content (http://www.openarchives.org/).
JDO, JPOX: The Java Data Objects (JDO) API is a standard interface-based Java model abstraction of persistence. Java Persistent Objects (JPOX) is a free and fully compliant implementation of the JDO specifications.
Open Archive Cataloguer (zOAC): An application of the OAI-PMH protocol for automatic metadata harvesting and aggregation of bibliographic records, developed on the Zope Web application server.
Dublin Core (DC): A set of metadata descriptors for resources on the Internet. It defines fifteen elements for use in resource description and is endorsed as ISO Standard 15836:2003.
JADE: Java Agents Development Framework (http://jade.tilab.com/).
Machine Readable Cataloging Record (MARC): An earlier standardized format for automated bibliographic records.