Software Architecture Patterns in Big Data: Transition From Monolithic Architecture to Microservices

Serkan Ayvaz (Bahçeşehir University, Turkey) and Yucel Batu Salman (Bahçeşehir University, Turkey)
DOI: 10.4018/978-1-7998-2142-7.ch004

Traditional monolithic systems consist of tightly coupled software components that are built into a single unit. Monolithic systems have scalability issues because all components of the system must be compiled and deployed even for simple modifications. This chapter explores the evolution of the software systems used in big data from monolithic systems to service-oriented architectures. More specifically, it investigates in detail the challenges and strengths of implementing service-oriented architectures such as microservices and serverless computing. It also discusses the advantages of migrating to service-oriented architectures and the patterns of migration.
Chapter Preview


With the widespread use of the Internet and rapid technological advancements in big data, the amount of data in all parts of modern life, such as finance, trade, health, and science, has been increasing exponentially (Khan et al., 2014). The amount of data created in a single day is now estimated to be around 2.5 quintillion bytes (Petrov, 2019). The data created in the last two years alone account for more than 90% of all the data available in the world today. According to the DOMO report (“Data never sleeps,” 2018), “By 2020, it’s estimated that 1.7MB of data will be created every second for every person on earth.”

Technological advancements have transformed how individuals access information in recent decades. Data are now collected from a variety of sources. Mobile phones, wearable technologies, and other smart devices have become an essential part of daily life (Çiçek, 2015). Data are plentiful and easily accessible. More than 80 percent of data are unstructured data gathered from the web, such as posts on social media sites, digital images, videos, news feeds, journals, and blogs (“The most plentiful,” 2016).

As a result, software systems have been evolving over the decades to adapt to technological changes. Large-scale software solutions now require more complex system architectures. Monolithic systems and the von Neumann architecture have served their purposes for a long time. Currently, the complex nature of enterprise software solutions demands scalable data abstraction for distributed systems (Gorton & Klein, 2014) and necessitates moving beyond the von Neumann architecture.

From the early days of Remote Procedure Calls (RPC) (Birrell & Nelson, 1984) in the 1980s to Simple Object Access Protocol (SOAP) (Box et al., 2000) and Representational State Transfer (REST) (Fielding, 2000) over HTTP in the early 2000s, distributed technologies have evolved rapidly (“Beyond buzzwords,” 2018). These technological advances heavily influenced the development of software systems that communicate over a network in a distributed environment and led to wider industry adoption. A few years later, progress in cloud computing, virtualization, and containerization technologies provided infrastructure support for large-scale distributed software systems. Concurrently, the emergence of big data technologies such as Hadoop (Borthakur, 2007; White, 2012) and Spark (Zaharia, Chowdhury, Franklin, Shenker, & Stoica, 2010) enabled the processing of massive amounts of data on distributed hardware in the cloud. These technologies facilitated breaking large software systems into smaller software components, each serving a single function on distributed hardware.

In traditional monolithic systems, all components of a software application are tightly coupled, self-contained, and composed into one unit. A major drawback of monolithic architecture is that all components of the entire system must be compiled and deployed even when a simple update is made to the application logic (Fazio et al., 2016). Because the entire code base must be regression tested after every change, the stability of the system is harder to maintain. Additionally, the costs of testing and deployment are high (Singleton, 2016). Thus, code releases usually occur less frequently, and the time from development to production is often very long. Monolithic architectures can satisfy the needs of small to medium-sized systems. However, this pattern is not well suited to large-scale software systems, as they require frequent code releases for bug fixes, modifications, and new features. Even successful small-scale applications tend to grow rapidly over time. In such cases, small-scale monolithic applications can quickly become overwhelmingly large. Thus, management and maintenance can become extremely difficult as monolithic systems grow more complex (Chen, Li, & Li, 2017). Consequently, the risks of monolithic system deployment may delay the continued development of new features and bug fixes.
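
The deployment drawback described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the component names, the `DeploymentUnit` type, and the helper functions are hypothetical and chosen only for illustration. The sketch also assumes that services have no dependencies on one another, which real systems rarely satisfy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentUnit:
    """A group of components that are built, tested, and released together."""
    name: str
    components: frozenset


def units_to_redeploy(units, changed_component):
    """Return the deployment units that contain the changed component."""
    return [u for u in units if changed_component in u.components]


def components_to_retest(units, changed_component):
    """Return every component shipped in a unit affected by the change."""
    affected = set()
    for unit in units_to_redeploy(units, changed_component):
        affected |= unit.components
    return sorted(affected)


# Hypothetical component names, used only for illustration.
COMPONENTS = ["billing", "ingestion", "reporting", "search"]

# Monolith: a single deployment unit holds every component.
monolith = [DeploymentUnit("app", frozenset(COMPONENTS))]

# One service per component: each component is its own deployment unit.
services = [DeploymentUnit(c, frozenset([c])) for c in COMPONENTS]

monolith_retest = components_to_retest(monolith, "billing")
services_retest = components_to_retest(services, "billing")

print("monolith:", monolith_retest)
print("services:", services_retest)
```

Under these assumptions, a change to the billing component puts all four components of the monolith back through regression testing and deployment, whereas only the billing service is affected when each component is packaged separately.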
