A Comparison of Data Modeling in UML and ORM

Terry Halpin (Neumont University, USA)
DOI: 10.4018/978-1-60566-026-4.ch100


The Unified Modeling Language (UML) was adopted by the Object Management Group (OMG) in 1997 as a language for object-oriented (OO) analysis and design. After several minor revisions, a major overhaul resulted in UML version 2.0 (OMG, 2003), and the language is still being refined. Although suitable for object-oriented code design, UML is less suitable for information analysis, since its graphical language provides only weak support for the kinds of business rules found in data-intensive applications, and its textual Object Constraint Language (OCL) is too technical for most business people to understand. Moreover, UML’s graphical language does not lend itself readily to verbalization and multiple instantiation for validating data models with domain experts.

These problems can be remedied by using a fact-oriented approach for information analysis, where communication takes place in simple sentences, each sentence type can easily be populated with multiple instances, and attributes are avoided in the base model. At design time, a fact-oriented model can be used to derive a UML class model or a logical database model. Object Role Modeling (ORM), the main exemplar of the fact-oriented approach, originated in Europe in the mid-1970s (Falkenberg, 1976), and has been extensively revised and extended since, along with commercial tool support (e.g., Halpin, Evans, Hallock, & MacLean, 2003). Recently, a major upgrade to the methodology resulted in ORM 2, a second-generation ORM (Halpin, 2005). Neumont ORM Architect (NORMA), an open source tool accessible online at http://sourceforge.net/projects/orm, is under development to provide deep support for ORM 2 (Curland & Halpin, 2007).

This article provides a concise comparison of the data modeling features within UML and ORM. The next section provides background on both approaches. The following section summarizes the main structural differences between the two approaches, and outlines some benefits of ORM’s fact-oriented approach. A simple example is then used to highlight the need to supplement UML’s class modeling notation with additional constraints, especially those underpinning natural identification schemes. Future trends are then briefly outlined, and the conclusion motivates the use of both approaches in concert to provide a richer data modeling experience, and provides references for further reading.
Chapter Preview


Detailed treatments of early UML use are provided in several articles by Booch, Rumbaugh, and Jacobson (Booch et al., 1999; Jacobson et al., 1999; Rumbaugh et al., 1999). The latest specifications for UML 2 may be accessed at www.uml.org/. The UML notation includes hundreds of symbols, from which various diagrams may be constructed to model different perspectives of an application. Structural perspectives may be modeled with class, object, component, deployment, package, and composite structure diagrams. Behavioral perspectives may be modeled with use case, state machine, activity, sequence, collaboration, interaction overview, and timing diagrams. This article focuses on data modeling, so it considers only the static structure (class and object) diagrams. UML diagrams may be supplemented by textual constraints expressed in the Object Constraint Language (OCL). For detailed coverage of OCL 2.0, see Warmer and Kleppe (2003).

ORM pictures the world simply in terms of objects (entities or values) that play roles (parts in relationships). For example, you are now playing the role of reading, and this article is playing the role of being read. Overviews of ORM may be found in Halpin (2006, 2007b) and a detailed treatment in Halpin and Morgan (2008). For advanced treatment of some specific ORM topics, see Bloesch and Halpin (1997), De Troyer and Meersman (1995), Halpin (2001, 2002, 2004a), Halpin and Bloesch (1999), and Hofstede, Proper, and van der Weide (1993).
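
The idea of objects playing roles in a fact type can be sketched in a few lines of Python. This is only an illustrative sketch, not ORM notation or the API of any ORM tool: the class names `Role` and `FactType`, the method names, and the sample people and company are all hypothetical. It shows a binary fact type, Person works for Company, populated with several instances and read back as plain sentences, which is the kind of verbalization a domain expert could check.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A part played by an object type in a fact type (hypothetical helper)."""
    name: str    # role name, e.g. "employee"
    player: str  # object type that plays the role, e.g. "Person"


@dataclass
class FactType:
    """A binary fact type with a predicate reading and a sample population."""
    first: Role
    predicate: str
    second: Role
    population: list = field(default_factory=list)

    def add_fact(self, first_value: str, second_value: str) -> None:
        # Each fact instance pairs one object per role.
        self.population.append((first_value, second_value))

    def verbalize(self) -> list:
        # Read every fact instance back as a simple sentence.
        return [
            f"{self.first.player} '{a}' {self.predicate} {self.second.player} '{b}'."
            for a, b in self.population
        ]


# Illustrative data only: the names below are made up.
works_for = FactType(
    Role("employee", "Person"), "works for", Role("employer", "Company")
)
works_for.add_fact("Ann", "Acme")
works_for.add_fact("Bob", "Acme")

for sentence in works_for.verbalize():
    print(sentence)
```

Populating the fact type with more than one instance is what makes constraints visible: here two people work for the same company, so a reviewer can see at once that the employer role is not unique.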

Key Terms in this Chapter

Unified Modeling Language (UML): Language adopted by the Object Management Group as a modeling language for object-oriented analysis and design of software systems. UML includes several sublanguages and diagram notations for modeling different aspects of software systems.

Entity Type: An entity is a non-lexical object that in the real world is identified using a definite description that relates it to other things (e.g., the Country that has CountryCode ‘US’). Typically, an entity may undergo changes over time. An entity type is a kind of entity, for example, Person, Country. In UML, an entity is called an object, and an entity type is called a class.

Role: In ORM, a role is a part played in a fact type (relationship type). In UML, this is known as an association-end. For example, in the fact type Person works for Company, Person plays the role of employee, and Company plays the role of employer.

Business Rule: A constraint or derivation rule that applies to the business domain. An alethic/deontic static constraint restricts the possible/permitted states of the business, and a dynamic constraint restricts the possible/permitted transitions between states. A derivation rule declares how a fact may be derived from existing facts, or how an object is defined in terms of existing objects.
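
The three kinds of business rule named above can be told apart by what they inspect. The following is a minimal sketch under stated assumptions: the rules themselves (at most one employer per person, no person dropped between states, colleagues derived from a shared employer) are hypothetical examples chosen for illustration, as are the function names. A state is modeled as a set of (person, company) facts for Person works for Company.

```python
from itertools import combinations


def satisfies_static(state: set) -> bool:
    """Static constraint (hypothetical rule): each Person works for at most
    one Company. It inspects a single state."""
    persons = [person for person, _ in state]
    return len(persons) == len(set(persons))


def satisfies_dynamic(old: set, new: set) -> bool:
    """Dynamic constraint (hypothetical rule): a person recorded in the old
    state must still be recorded in the new state. It inspects a transition."""
    return {person for person, _ in old} <= {person for person, _ in new}


def derive_colleagues(state: set) -> set:
    """Derivation rule (hypothetical rule): two distinct persons are
    colleagues if they work for the same company."""
    return {
        frozenset((p1, p2))
        for (p1, c1), (p2, c2) in combinations(sorted(state), 2)
        if c1 == c2 and p1 != p2
    }


# Illustrative states only: the names below are made up.
old_state = {("Ann", "Acme"), ("Bob", "Acme")}
new_state = {("Ann", "Acme"), ("Bob", "Initech")}

print(satisfies_static(old_state))              # checks one state
print(satisfies_dynamic(old_state, new_state))  # checks a state transition
print(derive_colleagues(old_state))             # derives new facts from existing ones
```

The sketch treats the constraints as alethic, in that a violating state or transition is simply reported as invalid. A deontic reading would instead accept the state and record the violation.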
