Introduction
Information systems were first developed in the early 1960s to process orders, billing, inventory control, payroll, and accounts payable. Research on information systems soon followed. Harry Stern started the “Information Systems in Management Science” column in the journal Management Science to provide a forum for discussion beyond research papers (Banker & Kauffman, 2004). Ackoff (1967) led the earliest research on management information systems for decision-making purposes and published it in Management Science. Gorry and Scott Morton (1971) first used the term ‘decision support systems’ (DSS) in a paper and constructed a framework for improving management information systems. Research topics in information systems and DSS have since diversified. One of the major topics has been how to get systems design right.
In the late 1970s, the growing success of database management systems (DBMSs) led to the proliferation of databases in organizations around the world (Takecian et al., 2013). These databases are designed to handle routine business transactions. Meanwhile, as the volume of data collected and stored grew, the need to analyze that data for managerial decision making increased, and the drive to optimize transactional databases for analytical purposes intensified. In the early 1990s, the term ‘data warehouse’ was coined, and the concept was developed in the years that followed. As an active component of DSS, which is part of today’s business intelligence systems, data warehousing became one of the most important developments in the information systems field during the mid-to-late 1990s. As the business environment has become more global, competitive, complex, and volatile, customer relationship management (CRM) and e-commerce initiatives have created requirements for large, integrated data repositories and advanced analytical capabilities. A data warehouse is a system that integrates heterogeneous data sources to support the decision-making process (Vela et al., 2013). By using a data warehouse, companies can make decisions about customer-specific strategies such as customer profiling, customer segmentation, and cross-selling analysis (Cunningham et al., 2006). How to design and develop a data warehouse has therefore become an important issue for information systems designers and developers.
Data modeling for a data warehouse differs from data modeling for an operational database. An operational system, e.g., an online transaction processing (OLTP) system, is used to run a business in real time, based on current data. An OLTP system usually adopts entity-relationship (ER) modeling and an application-oriented database design (Han & Kamber, 2006). An informational system, such as a data warehouse, is designed to support decision making based on historical point-in-time and prediction data for complex queries or data mining applications (Hoffer et al., 2007). A data warehouse schema is viewed as a dimensional model (Ahmad et al., 2004; Han & Kamber, 2006; Levene & Loizou, 2003). It typically adopts either a star or snowflake schema and a subject-oriented database design (Han & Kamber, 2006). Schema design is the most critical part of data warehouse design.
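To make the contrast with ER modeling concrete, the following is a minimal sketch of a star schema, assuming a hypothetical sales subject area. The table and column names (fact_sales, dim_customer, dim_product, dim_date), the sample rows, and the helper functions are illustrative assumptions and are not taken from the works cited above. A central fact table holds numeric measures and foreign keys, and each dimension table describes one axis of analysis.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
# All names and sample values below are illustrative assumptions.
STAR_SCHEMA_DDL = """
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT,
    segment      TEXT
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    quarter  INTEGER
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity     INTEGER,
    revenue      REAL
);
"""


def build_star_schema(conn):
    """Create the dimension tables and the central fact table."""
    conn.executescript(STAR_SCHEMA_DDL)


def load_sample_rows(conn):
    """Insert a handful of made-up rows so the query below returns something."""
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(1, "Acme", "corporate"), (2, "Bo", "retail")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "hardware"), (2, "Manual", "books")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20230101, 2023, 1), (20240101, 2024, 1)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 20230101, 2, 200.0),
            (1, 2, 20240101, 1, 30.0),
            (2, 1, 20230101, 1, 100.0),
            (2, 2, 20230101, 3, 90.0),
        ],
    )


def revenue_by_segment_and_year(conn):
    """Subject-oriented query: total revenue per customer segment and year."""
    return conn.execute(
        """
        SELECT c.segment, d.year, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        JOIN dim_date     AS d ON d.date_key     = f.date_key
        GROUP BY c.segment, d.year
        ORDER BY c.segment, d.year
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_sample_rows(connection)
    for row in revenue_by_segment_and_year(connection):
        print(row)
```

In this sketch, every analytical question is answered by joining the fact table to one or more dimensions and aggregating. A snowflake variant would further normalize a dimension, for example by moving category out of dim_product into its own table.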
Data warehouse design is a lengthy, time-consuming, and costly process. Any miscalculated step can lead to failure. A 2005 study from Gartner Group shows that about 50% of data warehouse projects tend to fail due to problems during data warehouse design and construction (Takecian et al., 2013). A lengthy development process is regarded as the most important cause of failure. Often, by the time the systems become available, some of their functional features are already obsolete. Researchers have therefore devoted considerable effort to the study of design- and development-related issues and methodologies.