Introduction
Object-relational impedance mismatch (ORIM) occurs when object-oriented application development, a hierarchical paradigm, meets the relational database layer, a set-based paradigm. ORIM is categorised into several layers of granularity, from concept to language (Chen et al., 2014; Ireland et al., 2009). In implementation terms, addressing ORIM means bridging the gap between invoking a method within an application and generating the Structured Query Language (SQL) that the method requires. Using inline SQL can meet this need, but this approach is intolerant of schema changes and can introduce security flaws, such as SQL injection. Using a stored procedure layer as an alternative can help mimic the object-oriented model in the database, but comes at the cost of moving application logic into the data layer, tightening the coupling between these two layers, potentially moving the logic out of source control, and necessitating SQL skills to make future changes. The need to address relations as objects at the application-database interface bred a third solution: a class of tools known as object-relational mapping (ORM) frameworks, designed to bridge the gap between object-oriented method calls and the generation of SQL queries. ORM frameworks differ in specifics, but typically store an internal data model and use rule bases and heuristics to generate SQL from this data model in response to application requests. The resultant SQL is then presented to the relational database management system (RDBMS).
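The contrast between inline SQL and ORM-style query generation can be illustrated with a minimal sketch. It is not drawn from any particular ORM framework: the `Customer` class, the `build_select` and `find` helpers, and the rule that a table is named after its lowercased class are all assumptions made for illustration, using only Python's standard `sqlite3` and `dataclasses` modules.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Hypothetical application object; its fields act as the 'data model'."""
    id: int
    name: str


def build_select(model, **filters):
    """Generate a parameterised SELECT from a dataclass.

    Toy mapping rule (an assumption): table name = lowercased class name,
    column names = dataclass field names.
    """
    columns = [f.name for f in fields(model)]
    unknown = set(filters) - set(columns)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = f"SELECT {', '.join(columns)} FROM {model.__name__.lower()}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return sql, tuple(filters.values())


def find(conn, model, **filters):
    """Run the generated SQL and map each result row back to an object."""
    sql, params = build_select(model, **filters)
    return [model(*row) for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

# Inline SQL built by string concatenation: the crafted input changes the
# meaning of the WHERE clause, so every row is returned.
user_input = "x' OR '1'='1"
inline_rows = conn.execute(
    "SELECT id, name FROM customer WHERE name = '" + user_input + "'"
).fetchall()

# Generated, parameterised SQL: the same input is treated purely as a value.
mapped_rows = find(conn, Customer, name=user_input)
ada = find(conn, Customer, name="Ada")
```

In this sketch the application calls `find(conn, Customer, name="Ada")` and never writes SQL itself; the statement is derived from the class definition. Real ORM frameworks apply far richer rule bases and heuristics than this single rule, and the quality of the SQL they generate is what the remainder of this paper examines.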
Relational databases are data stores that operate according to the long-established principles of relational algebra (Date, 1990; Astrahan et al., 1976; Held, Stonebraker & Wong, 1975; Codd, 1974; Stoll, 1963). In contrast to document-style databases, which store unstructured or semi-structured attribute-value pairs, relational database design is based on relations, or sets of tuples of related values, which are stored in tables, linked with keys and queried with SQL. It is claimed that four of the five most popular database tools in recent use are based on the relational model (Solid IT, 2018).
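The distinction between the two storage styles can be sketched briefly. The example below is illustrative only: the `customer`, `orders` and `order_item` tables and the sample values are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for a relational database.

```python
import sqlite3

# Document style: one semi-structured record of nested attribute-value pairs.
order_document = {
    "order_id": 10,
    "customer": {"name": "Ada"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Relational style: the same facts held as tuples in tables linked by keys.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE order_item (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    sku      TEXT NOT NULL,
    qty      INTEGER NOT NULL,
    PRIMARY KEY (order_id, sku)
);
""")
conn.execute("INSERT INTO customer VALUES (?, ?)", (1, "Ada"))
conn.execute("INSERT INTO orders VALUES (?, ?)", (10, 1))
conn.executemany(
    "INSERT INTO order_item VALUES (?, ?, ?)",
    [(10, "A1", 2), (10, "B7", 1)],
)

# A set-based query reassembles the order by joining on the keys.
rows = conn.execute(
    """
    SELECT c.name, i.sku, i.qty
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    JOIN order_item AS i ON i.order_id = o.order_id
    WHERE o.order_id = ?
    ORDER BY i.sku
    """,
    (10,),
).fetchall()
```

The nested document mirrors the shape of an application object, whereas the relational form spreads the same order across three relations and rebuilds it with joins. That difference in shape is the gap an ORM framework has to bridge.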
This paper investigates whether ORM frameworks are well-suited to producing SQL queries, given that ORM tools are themselves object-oriented constructs. More specifically, it asks whether the mismatch postulated by ORIM can be observed in the methods and outputs of ORM tools, as measured by relative performance. In essence: are ORM tools producing efficient SQL?
We pursue this aim in three ways: firstly, through a literature review of ORIM and associated relevant topics; secondly, through a survey of practising database professionals aimed at gathering expert opinions on the perceived effectiveness of ORM tooling, using thematic analysis to construct appropriate narratives; and thirdly, by investigating, through empirical experimentation using industrial software and a benchmark data set, the performance impact of ORM-generated queries on a relational database, extending our prior research (Colley, Stanier & Asaduzzaman, 2018) in this area. By combining all three approaches, we aim to establish whether the opinions of our survey participants are borne out by the findings of the ORM testing; whether the results of the ORM testing accord with the findings of other researchers; and whether ORIM is a solved problem through the use of ORMs, or whether ORIM at the application-database interface remains a current and relevant issue to be addressed by future research.
The remainder of this paper is structured as follows. The Literature Review section defines ORIM in more detail, summarises prior research into the issue, and describes how the issue can manifest in relational database systems. The Problem Investigation section describes the investigation and is split into two sub-sections: the Domain Expert Views sub-section explains the methodology and process of gathering domain-expert views and presents the results, and the Empirical Investigation sub-section describes the experimental investigation into the impacts of ORM-generated queries and the results of this work. The Conclusion section draws together these results in the context of existing research, and the Future Work section discusses ideas for further research into mitigating ORM-generated query performance issues and future strategies for addressing ORIM in the data layer.