Software engineers develop an information model in the systems analysis and design process to represent the concepts, specification, or implementation design of a software system (Fowler and Scott, 1997). This information model is designed using a modeling language such as the Unified Modeling Language (UML) defined by Rumbaugh, Jacobson, and Booch (1999). The software is implemented by translating the information model into code. Similarly, data engineers develop an information model in the database design process to represent the types of data to be stored in a database. This conceptual information model is typically defined using one of the semantic data modeling languages (Hull and King, 1987), such as Entity-Relationship diagrams (Chen, 1976) or NIAM conceptual schemas (Leung and Nijssen, 1988). The database is implemented by translating the information model into a database schema (defined using an implementation data model such as the relational data model or an object-oriented data model). Likewise, document engineers develop an information model when designing the structure of a collection of documents. This information model is implemented by translating it into a document schema. Traditional database information modeling has dealt with structured data such as that found in relational databases. However, much of the information produced with and stored in computers resides in documents whose data has no fixed structure; such data is referred to as semi-structured data. The need for better modeling of documents is nowhere more apparent than in the rapid and chaotic development of the World Wide Web over the last few years. In response to this need, various information models have been proposed to model the semi-structured data found in documents.