Cloud Architecture Key Design Principles

"Memories are the architecture of our identity." -- Brian Solis

What are the fundamental design principles that help build well-architected cloud solutions? They are scalability and elasticity, automation, loose coupling, security, caching, cost optimization, thinking in parallel, and designing for failure. We will start with scalability and elasticity, as these two are among the most compelling reasons for cloud adoption, alongside cost and other features.


Scalability

Scalability is the ability of a system to handle a growing input or workload without a change to its design. Cloud infrastructure and applications are designed on the premise that the load on the application can grow. If the design has no mechanism to absorb that growth, the system will suffer: it will either stop functioning or underperform. We need to design the system so that components can be added when demand increases, without changing the design.

Although components can be added manually to handle extra load such as seasonal traffic, automatic scalability, where components are added automatically based on runtime metrics such as CPU, memory, or storage utilization, is a much better design.
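To make metric-driven scaling concrete, here is a minimal sketch of the decision such a mechanism has to take. The proportional rule, the function name `desired_instance_count`, and the default target and bounds are all assumptions chosen for illustration; they are not any cloud provider's actual algorithm.

```python
import math


def desired_instance_count(current: int, cpu_percent: float,
                           target_cpu: float = 60.0,
                           min_instances: int = 1,
                           max_instances: int = 10) -> int:
    """Return how many instances would bring average CPU close to target_cpu.

    Assumption: load spreads evenly, so total load is current * cpu_percent.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    # Spread the total load over enough instances to reach the target.
    needed = math.ceil(current * cpu_percent / target_cpu)
    # Clamp to the configured bounds so a bad metric cannot scale without limit.
    return max(min_instances, min(max_instances, needed))
```

With four instances at 90% average CPU and a 60% target, the sketch asks for six instances. With four instances at 15%, it asks for one, so the same rule also covers releasing capacity when the load drops.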

There are two ways to manage scalability: horizontal and vertical. Vertical scalability is the older approach, in which the application is moved to a new server with more CPU, memory, or storage. As a result, it can lead to some downtime. Horizontal scalability is the more modern and common approach. In horizontally scalable systems, additional resources such as servers are added automatically to maintain the same performance as the load increases. Design horizontally scalable cloud applications.

To summarize, a scalable architecture is critical to take advantage of a scalable infrastructure. In such an architecture, increasing resources results in a proportional increase in the system's performance. In addition, a scalable service can handle heterogeneity, is operationally efficient and resilient, and becomes more cost-effective as it grows.


Elasticity

Let's talk about elasticity as a design principle in architecting cloud applications. Elasticity and scalability are generally considered together when architecting cloud solutions. Elasticity is the ability of a system to use resources dynamically and efficiently: it acquires resources to maintain the SLA as the workload increases and releases them as the workload decreases. The critical aspect of elasticity is the dynamic deallocation or removal of resources when they are not needed. This avoids the cost of over-provisioned resources such as servers, power, space, and maintenance.

Don't assume that components will always be in good health, and don't assume that they have a fixed location. Use designs that can re-launch and bootstrap your instances. Enable dynamic configuration so that each instance can answer, on boot, the question: Who am I and what is my role?
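One way to picture the bootstrap question is an instance that reads its identity and role from its environment at startup. The sketch below rests on assumptions: the variable names `INSTANCE_ID` and `INSTANCE_ROLE`, the role table, and the service names are hypothetical. A real deployment might read the same information from instance metadata or a configuration service.

```python
import os

# Hypothetical role table: what each role should start on boot.
ROLE_SERVICES = {
    "web": ["nginx", "app-server"],
    "worker": ["queue-consumer"],
}


def bootstrap_config(env=None) -> dict:
    """Answer "who am I and what is my role?" from the environment at boot."""
    env = os.environ if env is None else env
    # INSTANCE_ID and INSTANCE_ROLE are assumed names, not a standard.
    instance_id = env.get("INSTANCE_ID", "unknown")
    role = env.get("INSTANCE_ROLE", "web")
    if role not in ROLE_SERVICES:
        raise ValueError(f"unknown role: {role!r}")
    return {
        "instance_id": instance_id,
        "role": role,
        "services": ROLE_SERVICES[role],
    }
```

Because nothing about the role is baked into the machine image, the same image can be re-launched as a web node or a worker simply by changing its environment.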


Automation

DevOps, in which automation is a key practice, has become essential in many software engineering organizations. Automation is one of the fundamental design principles for architecting applications on a cloud platform. It removes the need for human intervention, particularly for repetitive tasks, system integration, or batch jobs. Operations become more efficient, and organizations free up staff time, particularly for maintenance staff. The time saved can be spent on other high-priority tasks in line with the organization's business objectives.

Moreover, with thoroughly tested scripts, we not only automate start, stop, and terminate operations, but also minimize failures by handling them in code. For example, when the system throws an error, we look up the error and fix the script so that we don't need to handle it manually the next time. Over time, these automated processes make the system resilient, running with significantly less human intervention.
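As an illustration of handling failures in code rather than by hand, the following sketch retries a failing operation with an increasing delay before giving up. The helper name `run_with_retries` and its defaults are assumptions made for this example. Retrying on every exception type is a simplification, and a real script would usually retry only errors known to be transient.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run task(); on failure wait and retry, doubling the delay each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to a human
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In normal use the default `time.sleep` is fine.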

Loose Coupling

Enterprise systems have many modules or services (the term used in modern microservices architecture), each encapsulating a distinct business capability, such as a shopping cart service, checkout service, billing service, warehouse service, or support service. In well-architected systems these modules are loosely coupled, typically through web services or messaging frameworks (for example, JMS in Java).

Design everything as a Black Box.

Loose coupling is a fundamental design principle in building any system, even a monolithic one, and it becomes critically important in building distributed and cloud applications. The reasons are many. We can replace, modify, maintain, or test part of the application in isolation, as a separate module or individual component, without taking down the entire application. The price of taking down the whole system could be huge. Imagine if Google, CNN, BBC, Amazon, or any other critical application went down, even for a few seconds, for maintenance, to release a new feature, or to fix some bugs.
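A small sketch can show what coupling through messages looks like. Here an in-process `queue.Queue` stands in for a real message broker, and the service names and message fields are hypothetical. The point is only that the producer and the consumer share a message format, not a direct call.

```python
import queue
import threading


def checkout_service(orders, outbox: queue.Queue) -> None:
    """Producer: publishes order events and knows nothing about consumers."""
    for order in orders:
        outbox.put({"order_id": order, "event": "order_placed"})
    outbox.put(None)  # sentinel: no more messages


def billing_service(inbox: queue.Queue, invoices: list) -> None:
    """Consumer: reacts to events and knows nothing about the producer."""
    while True:
        message = inbox.get()
        if message is None:
            break
        invoices.append(f"invoice-for-{message['order_id']}")


def run_demo(orders):
    bus = queue.Queue()  # stand-in for a message broker
    invoices = []
    consumer = threading.Thread(target=billing_service, args=(bus, invoices))
    consumer.start()
    checkout_service(orders, bus)
    consumer.join()
    return invoices
```

Either side can be replaced, tested, or restarted on its own as long as the message format stays the same, which is the black-box idea in miniature.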


Security

Security is paramount for any organization, whether a startup, a small or medium-sized business, or a large enterprise. It is even more critical for organizations that handle public health and financial data. These organizations are also bound by many security and compliance regulations.

Design with security in mind. Design security into every layer.

When designing systems, security must be considered from the beginning rather than added in bits and pieces after the application is deployed to production. Any security-related incident could be catastrophic.

We can divide security into physical security, security of the platform on which the application runs (such as the operating system and web server), and application security. When designing data security, data should be secured both in transit and at rest. In other words, data should be encrypted both when it is stored and when it is transmitted.

With the cloud you lose some physical control, but not your ownership. A few guidelines related to cloud security: Restrict external access to specified IP ranges. Encrypt data at rest and encrypt data in transit (SSL/TLS). Consider an encrypted file system. Rotate your credentials. When passing arguments, pass them in encrypted form. Use multi-factor authentication.
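The first guideline, restricting access to specified IP ranges, can be sketched with Python's standard `ipaddress` module. The ranges below are documentation-only example networks, and the `is_allowed` helper is a hypothetical name. In practice this rule usually lives in a firewall or security group rather than in application code.

```python
import ipaddress

# Example allowlist using reserved documentation ranges (assumed values).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_allowed(client_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True only if client_ip parses and falls inside an allowed range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable input is rejected, never trusted
    return any(address in network for network in networks)
```

Rejecting anything that fails to parse keeps the check closed by default, which matches the idea of designing security into every layer.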


Caching

Cloud computing is built on distributed computing. On the one hand, distributed computing applies the basic computer science technique of divide-and-conquer to improve workload processing, and loose coupling improves modifiability, maintainability, and scalability. These are extremely important features for enterprise systems. On the other hand, distributed computing adds some challenges: more indirection and more layers to communicate through before the final result is produced. These indirections and layers increase latency and thus affect how fast the end user or caller receives an output.

Not all information in the system needs to be fetched or calculated each time a request is processed. Some information rarely changes: country names, city names, a person's demographic details, and the like. In database terms, this is master data or lookup values. Similarly, much content, such as images, videos, or documents, doesn't change frequently in production systems.

Since this information is mostly static, we can apply the caching design principle of cloud architecture to improve request processing time and reduce operational costs. Data no longer has to travel through every layer from bottom to top, which reduces data transfer cost. Computing resource usage is also reduced.

As discussed above, caching improves request processing time, saves data transfer cost, and saves computing resource cost. Let's look at the types of caching. There are two: application data caching and edge caching. In application data caching, mainly static data, such as master data, is cached in an in-memory cache. You can use many products to manage application data caching, such as Amazon ElastiCache (a managed Memcached or Redis service), Redis (an in-memory database), and Ehcache (commonly used with Hibernate). You can also implement your own application data cache for problems of smaller scope.
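For the "custom cache for a smaller problem" case, a minimal sketch might look like the following. The `TTLCache` class, its `loader` callback, and the five-minute default lifetime are assumptions for illustration. It has no eviction or size limit and is not thread-safe, which a production cache would need to address.

```python
import time


class TTLCache:
    """Tiny in-memory cache: load a value once, reuse it until it expires."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader    # called on a miss, e.g. a database lookup
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # hit: skip the slow lookup
        value = self._loader(key)  # miss or expired: reload
        self._entries[key] = (now + self._ttl, value)
        return value
```

A lookup table of country names is a natural fit: wrap the database query in `loader`, and repeated requests for the same code are served from memory until the entry expires.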

The other type of caching, which is very common in cloud architecture, is edge caching. It is the standard caching solution for content delivery. In edge caching, content is served by edge servers located closer to the viewer, which reduces latency and improves overall system performance. Amazon CloudFront is a typical example of edge caching.

Cost Optimization

Cost optimization is the most important design principle. Cloud costs, particularly in the public cloud, are to a large extent based on the OpEx (operating expenditure) model. Cost optimization therefore becomes an extremely important consideration.

Some principles are common: use the right services for the right duration. For example, if a medium-size EC2 instance provides the required performance, using a large or xlarge instance will only cost more. Once you are done with a service, stop or terminate it if you are using on-demand instances. You can also consider reserved instances and spot instances instead of on-demand EC2 instances to optimize costs.

Auto-scaling is also a very good way to optimize cost. With auto-scaling, you can not only scale out by adding more instances horizontally to maintain performance when the workload increases, but also scale in by terminating resources automatically when they are no longer needed, for example by configuring scaling rules with the Amazon CloudWatch service.

The main points here are: use the right service for the right job, and do not use more resources, or use them for longer, than you need. Look into the pricing options provided by the cloud provider (for example, on-demand, reserved, and spot instances) and their pros and cons, and select the best option for your use case.
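The trade-off between pricing options can be made concrete with a little arithmetic. The sketch below uses a deliberately simplified model: on-demand capacity is billed per hour used, while reserved capacity is billed for every hour of the month at a lower rate. Any hourly rates passed in are placeholders, not real prices, and actual provider pricing has more dimensions than this.

```python
HOURS_PER_MONTH = 730  # common approximation for one month


def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay only for the hours the instance actually runs."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate: float) -> float:
    """Simplified assumption: pay for the whole month at a discounted rate."""
    return hourly_rate * HOURS_PER_MONTH


def cheaper_option(hours_used: float, on_demand_rate: float,
                   reserved_rate: float) -> str:
    """Name the cheaper option for the expected monthly usage."""
    if on_demand_cost(on_demand_rate, hours_used) <= reserved_cost(reserved_rate):
        return "on-demand"
    return "reserved"
```

With a hypothetical on-demand rate of 0.10 per hour and a reserved rate of 0.06 per hour, an instance that runs 200 hours a month is cheaper on demand, while one that runs all 730 hours is cheaper reserved.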

Think Parallel

Many software engineering problems can be solved in less time with parallel processing. For example, a data processing job can be divided into many parts, and each part can be processed in parallel. A MapReduce job is a good example of parallel processing. When you are designing applications to run on a cloud platform, parallel thinking becomes even more important and valuable because the cloud offers massive resources.

There are two main reasons for using parallel computing. First, it saves time (wall-clock time). Second, it helps solve large problems.

Some guidelines for parallel thinking: experiment with different architectures for multi-threading and concurrent requests. Run parallel MapReduce jobs. Use Elastic Load Balancing with Auto-Scaling to distribute loads across multiple machines.
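The divide, process, and combine pattern behind MapReduce can be sketched on a single machine. This example counts words across text chunks using a thread pool. The function names are hypothetical, and threads are used only to keep the sketch simple: for CPU-bound work, `ProcessPoolExecutor` or a distributed framework would be the more realistic choice.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> Counter:
    """Map step: count the words in one chunk independently of the others."""
    return Counter(chunk.lower().split())


def parallel_word_count(chunks, max_workers: int = 4) -> Counter:
    """Run the map step in parallel, then reduce the partial counts into one."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order as the workers finish.
        for partial in pool.map(count_words, chunks):
            total.update(partial)  # reduce step: merge partial results
    return total
```

Because each chunk is processed without reference to any other, adding workers or machines only changes how the chunks are distributed, not the logic.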

Design for Failure

Werner Vogels, CTO of Amazon, is famously quoted as having once said: “Everything fails, all the time.” Let's face it: he's right. In distributed computing and cloud computing, keeping the design-for-failure mindset helps design systems that handle failures better.

Design for failure, and nothing will really fail!  

A few guidelines: Avoid single points of failure, and assume everything fails. Work backward from the goal that applications should continue to function even if the underlying physical hardware fails, is removed, or is replaced.
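Avoiding a single point of failure often comes down to having somewhere else to send a request. The following sketch tries each replica in turn and fails only when all of them have failed. The `call_with_failover` helper is a hypothetical name, and each replica is modelled as a plain callable standing in for a real network client.

```python
def call_with_failover(replicas, request):
    """Send request to the first replica that answers; assume any can fail."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next replica
    # Every replica failed (or none were configured): report it clearly.
    raise RuntimeError(f"all {len(replicas)} replicas failed") from last_error
```

The caller sees an error only when no replica is left, so losing a single machine degrades capacity rather than availability.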

