Big data management is a critical issue for state-of-the-art enterprises. However, size, which exceeds petabytes in some cases, is not the only concern. Global distribution and a variable degree of structure complicate efficient management of complex business objects in a globally distributed environment. Such collections of versatile data objects used in strategic corporate activities are often referred to as enterprise content. Efficient management of enterprise content therefore requires an innovative set of data models that embraces the versatile data and metadata objects used in mission-critical software products and applications. The models to be developed should support the entire software development lifecycle, from problem domain modeling to enterprise system implementation, maintenance, and component-based expansion. The set of models should cover data object representation for both the problem domain and the computing environment. Thus, the primary aim of the paper is to instantiate such a set of models.
The problem of big data management is made even more complex by the heterogeneous nature of the data, which ranges from well-structured relational databases to non-normalized trees and lists and weakly structured multimedia data. The approach presented focuses on more efficient and uniform management procedures for heterogeneous enterprise data. It involves a set of novel mathematical models, formal methods, and supporting CASE tools for object-based representation and manipulation of heterogeneous enterprise systems data. The suggested architecture is based on enterprise web portal technologies.
Unfortunately, a brute-force application of the so-called “industrial” enterprise software development methodologies (such as IBM RUP, Microsoft MSF, Oracle CDM, etc.) to heterogeneous enterprise data management, without an object-based, model-level theoretical basis, results either in unreasonably narrow “single-vendor” solutions or in excessive time and cost. On the other hand, the existing generalized approaches to information systems modeling and integration, such as category- and ontology-based approaches, Cyc, SYNTHESIS, and similar projects (Barendregt, 1984; Birnbaum et al., 2005; Guha & Lenat, 1990; Güngördü & Masters, 2003; Kalinichenko & Stupnikov, 2009; Lenat & Reed, 2002; Panton et al., 2004), do not result in practically applicable (i.e., scalable, robust, ergonomic) implementations, since they are separated from state-of-the-art industrial technologies. A number of international and federal research programs indicate that the technological problems of heterogeneous enterprise data management are critical (Kanazawa et al., 2009).
Another weak point of the existing approaches is their focus on severe problem domain uncertainties rather than on partial instance inconsistencies. In general, the problem domain of enterprise software resource management is well defined, owing to a number of adopted standards. However, certain objects of well-defined classes may remain undefined for quite a long time. Thus, the models developed should focus on well-defined class associations rather than on large dictionaries with uncertain items (i.e., on semantic networks rather than ontologies).