Resource Cost and Workflow Aware Sizing and Scaling of Microservices
Application development has evolved from a single monolithic fragment of code in one file to encapsulated, modular pieces built on data and function abstractions, using various techniques, software architectures, and design patterns. Generally, most software deployment strategies have revolved around monolithic code execution. With the advent of service-oriented architecture (SOA) and the cloud computing paradigm, deployment has recently become a factor in software design itself, by way of microservice-based applications. This requirement is tied to the scalability of an application and its readiness to exploit the pay-per-use model of cloud setups.
Monolithic application design is an architecture in which all application components are bundled together in a single application. Such a design creates a bulky, rigid combination of business functionalities in which the unified functions are built, deployed, and scaled together. The architecture is easy to maintain, as only a single application needs to be monitored. It also seems easy to scale, as the scaling manager only needs to create multiple application instances under a load balancer. However, as more and more functionalities are added, the monolithic application becomes bulky and rigid. This leads to significant drawbacks: a heavy code base, laborious deployment, versioning, and scaling, lower fault tolerance, and lower cost efficiency. SOA came into existence to overcome these drawbacks and adopts a more disaggregated application architecture. The application is split into several components that talk to each other through a standard communication mode, which introduces flexibility. Because the components are separate, it is easier to reuse functionality and to share it among other services. Though SOA is much more flexible than a monolithic architecture, its components remain coupled to one another, which leads to performance, deployment, and maintainability issues. As an alternative to this tight coupling, the microservice architecture emerged, based on the concept of loosely coupled components that use a stateless messaging framework.
Microservices are small, independent, and loosely coupled services that implement the business functionalities of an application. The microservice architecture is adopted to benefit from independent demand curves, better scalability, agility, a smaller code base, easier development and deployment, fault tolerance, and cost efficiency. Each microservice is independently scalable according to the workflow that reaches it. Scaling a microservice with the right size is vital to avoid under-allocation or over-allocation.
In this work, sizing and scaling algorithms are designed to optimize provisioning cost while gaining performance benefits. Predicting workload demand beyond the current scheduling interval allows scaling decisions in the current cycle to take the near-future workload pattern into account, which aids better provisioning and helps meet SLAs. In addition, application characterization provides insight into choosing the type and size of microservice to scale with, given the provisioning cost. It plays a crucial role in selecting the right size and scaling strategy in microservice deployments. A sizing algorithm should not only aim for cost optimization but also be computationally inexpensive, so that it can make quick, real-time decisions while aiming for global cost optimization. The thesis proposes several heuristic algorithms that consider the correlation between workload characterization and workload pattern to make sizing decisions with low computational complexity. In homogeneous sizing, the right size of microservice is chosen from among the available sizes. In heterogeneous sizing, the right combination of microservice sizes to scale with at a given scheduling interval is found.
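The difference between homogeneous and heterogeneous sizing can be illustrated with a minimal sketch. It is not the thesis's algorithm: the size catalogue, the capacities and costs, and the greedy rule used for the heterogeneous case are all assumptions made here for illustration only.

```python
from typing import NamedTuple


class Size(NamedTuple):
    name: str
    capacity: int  # requests/s one instance can serve (hypothetical)
    cost: int      # cost units per scheduling interval (hypothetical)


# Hypothetical size catalogue; not taken from the thesis.
SIZES = [Size("small", 100, 10), Size("medium", 250, 22), Size("large", 600, 50)]


def homogeneous_sizing(demand, sizes):
    """Pick the single size whose replicas cover the demand at least cost."""
    options = []
    for s in sizes:
        count = -(-demand // s.capacity)  # integer ceiling division
        options.append((s.name, count, count * s.cost))
    return min(options, key=lambda option: option[2])


def heterogeneous_sizing(demand, sizes):
    """Greedy mix of sizes: fill with the most cost-efficient size first."""
    plan, remaining = {}, demand
    for s in sorted(sizes, key=lambda s: s.cost / s.capacity):
        count = remaining // s.capacity
        if count:
            plan[s.name] = count
            remaining -= count * s.capacity
    if remaining > 0:
        # The leftover is now smaller than every capacity, so one instance
        # of the cheapest size is enough to cover it.
        cheapest = min(sizes, key=lambda s: s.cost)
        plan[cheapest.name] = plan.get(cheapest.name, 0) + 1
    by_name = {s.name: s for s in sizes}
    total = sum(by_name[name].cost * n for name, n in plan.items())
    return plan, total


print(homogeneous_sizing(700, SIZES))    # ('medium', 3, 66)
print(heterogeneous_sizing(700, SIZES))  # ({'large': 1, 'small': 1}, 60)
```

With these assumed numbers, a demand of 700 requests/s is served most cheaply in the homogeneous case by three medium instances (66 cost units), while mixing one large and one small instance covers the same demand for 60. A greedy rule like this one is fast but not guaranteed to be optimal.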
The second part of the work deals with coordinated microservice scaling. It is motivated by the fact that although microservices are independent of each other, they are still part of a single application and therefore exhibit various correlations. Microservices implement business functionalities, and a user request flows through several of them, creating a workflow that introduces correlations among them. Also, because functionalities can be invoked in sequence according to application and user behavior, the likelihood of one functionality being invoked after another can be estimated using probabilistic models, which introduces further correlations. Coordinated scaling of a microservice application is presented to capture these correlations, and the resulting gain in performance is evaluated.
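One way to picture how workflow correlations can drive coordinated scaling is the sketch below. It assumes an acyclic workflow, hypothetical service names, hypothetical transition probabilities, and a fixed per-replica capacity; the thesis's actual probabilistic model and scaling policy may differ.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: probability that a request handled by one
# microservice goes on to invoke another (assumed values, acyclic).
TRANSITIONS = {
    "frontend": {"catalog": 0.5, "cart": 0.25},
    "catalog": {"cart": 0.5},
    "cart": {"payment": 0.5},
    "payment": {},
}

# Hypothetical requests/s that a single replica of each service can serve.
CAPACITY = {"frontend": 100, "catalog": 100, "cart": 100, "payment": 100}


def expected_load(entry_rates, transitions):
    """Propagate external arrival rates along the workflow graph."""
    predecessors = {svc: set() for svc in transitions}
    for src, targets in transitions.items():
        for dst in targets:
            predecessors[dst].add(src)
    load = {svc: float(entry_rates.get(svc, 0)) for svc in transitions}
    # Visit upstream services first so their load is final before it is
    # pushed to the services they invoke.
    for svc in TopologicalSorter(predecessors).static_order():
        for dst, probability in transitions[svc].items():
            load[dst] += load[svc] * probability
    return load


def coordinated_replicas(entry_rates, transitions, capacity):
    """Size every service in one pass from the propagated load."""
    load = expected_load(entry_rates, transitions)
    return {svc: math.ceil(load[svc] / capacity[svc]) for svc in load}


print(coordinated_replicas({"frontend": 400}, TRANSITIONS, CAPACITY))
```

With 400 requests/s arriving at the hypothetical `frontend`, the propagated load is 200 for `catalog`, 200 for `cart`, and 100 for `payment`. Downstream services can therefore be scaled in the same cycle as the entry point instead of each reacting separately once its own load rises.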
The proposed algorithms are evaluated for various applications and workloads. The evaluation gives insight into how an application uses its resources, which can then be exploited to make sizing and scaling decisions according to the workload. The sizing algorithms are discussed for both the homogeneous and heterogeneous cases, and are evaluated both with and without migration. They are compared with state-of-the-art algorithms, and various useful observations and insights are discussed. It is found that microservice characterization provides vital insight into selecting the right size(s) and type of scaling strategy (homogeneous or heterogeneous) to adopt while optimizing provisioning cost. Further, exploiting the correlations among the microservices and functionalities of the application to scale in a coordinated way yields up to a 50% reduction in performance degradation. In conclusion, some ideas for further research are elaborated.