Serverless, containers or VMs? How to choose the right compute type for your workloads on AWS?
AWS offers the broadest and most sophisticated range of cloud services, and this is one of its greatest strengths in the public cloud market.
When you first open the console, you are presented with more than 175 services to choose from. That can be quite intimidating as a first introduction to the platform! For a fair number of new users, this can be enough to discourage them or put off their first trial of AWS.
There is a lot of high quality documentation out there, but it might be difficult to find what you really need — nothing more, nothing less — when you first start on AWS. It is easy to get lost in this wide ocean of information. This is even truer if you are trying to learn by yourself and do not have anyone to guide you as you discover AWS. I know, I’ve been through that!
If you are a more experienced user, it can also be challenging to choose the right services for your project. We are usually specialized in specific areas of AWS and it is difficult to keep up with all the others. How can you be sure to choose the correct services out of the 175+ options out there? Is it okay not to make the most suitable choice? Do you feel any fear of missing out on the very best option?
This is a broad topic. However, this post focuses on making the best (or good enough!) choice for your situation when it comes to Compute services. Which flavor works best for you at the moment: serverless, containers or VMs? As you’ve often heard for many other choices you make, there is no one-size-fits-all…
VM-style compute
This relates to AWS services such as EC2 and Lightsail, which are similar to a Virtual Private Server (VPS).
This compute flavor is the oldest one in the cloud and is still the most common one, so there is no risk cloud vendors will consider it as legacy anytime soon.
It is not going to significantly affect your current way of working if you are already familiar with managing your servers on premises or in a VPS environment. You will continue to maintain the OS, patch it, install software, and so forth. It can be practical for you to perform troubleshooting at the OS level, run commands, etc. It can also be handy if you want to migrate existing workloads (e.g. apps, servers) to AWS by following a ‘lift-and-shift’ approach.
AWS is responsible for the physical hosting, hardware and virtualization (the hypervisor). You are responsible for the OS and anything above that. This can be a great improvement if you are moving from hosting by yourself on premises. You can also benefit from many cool features such as snapshots, auto-scaling, sharing images (AMI), etc.
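To give a feel for how little ceremony is involved once the basics are in place, here is a minimal boto3 sketch that launches a single EC2 instance. The AMI ID, key pair and region are hypothetical placeholders, to be replaced with values valid for your own account:

```python
import boto3

# Minimal sketch: launch one EC2 instance. The AMI ID, key pair
# and region are placeholders for values from your own account.
ec2 = boto3.client("ec2", region_name="eu-west-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI ID
    InstanceType="t3.micro",
    KeyName="my-key-pair",             # placeholder key pair name
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "lift-and-shift-demo"}],
    }],
)
print(response["Instances"][0]["InstanceId"])
```

From there, everything above the hypervisor (patching, hardening, software installation) remains your job, exactly as it would on premises.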
Many organizations start with this type of compute when they move their first workloads to AWS because it is easier to migrate existing stuff and it may not fundamentally change their way of working.
Container-style compute
This relates to AWS services such as ECS, EKS, ECR, Fargate and Lightsail Containers, which are based on the open source Docker and Kubernetes technologies.
If you are already familiar with containers, or willing to make the move soon, this is a great place to start. ECS and EKS both provide orchestration services for your containers: ECS is AWS’s own orchestrator for Docker containers, while EKS is managed Kubernetes. I am not going to discuss here which one is best because it is not the topic of this post and, again, there is no one-size-fits-all. The advantage of both ECS and EKS is that you can control all your container deployments from a single console. With the newly announced ECS Anywhere and EKS Anywhere features (re:Invent 2020), you can even manage containers deployed outside AWS, such as on premises or in a different cloud platform. Both consoles also embed ECR, which allows you to manage your own container images.
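As a small illustration of that single point of control, this boto3 sketch lists your ECS clusters and the number of services running in each (assuming default credentials are configured):

```python
import boto3

# Sketch: enumerate ECS clusters and the services deployed in each,
# the kind of overview the ECS console gives you in one place.
ecs = boto3.client("ecs", region_name="eu-west-1")

for cluster_arn in ecs.list_clusters()["clusterArns"]:
    services = ecs.list_services(cluster=cluster_arn)["serviceArns"]
    print(f"{cluster_arn}: {len(services)} service(s)")
```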
When it comes to the deployment of your containers from ECS or EKS, this can be done in three different types of environment:
- Deploy containers outside AWS (ECS Anywhere and EKS Anywhere)
- Deploy containers on your own EC2 instances
- Deploy containers on AWS Fargate
Containers on EC2 allow you to have full control of the underlying OS. You still manage the EC2 instances the containers are deployed to, which means you need to patch them, manage disk space, auto-scaling, etc. What ECS and EKS bring is that you can orchestrate all the container deployments from their console. Having access to the OS can be practical for troubleshooting or running commands directly inside a live container. However, it can also mean more operations overhead, especially if you don’t have sysadmin skills and have a large number of EC2 instances to maintain.
Fargate lets you host your containers in an AWS-managed environment. You don’t need to worry at all about the OS because it is entirely taken care of by AWS, which means less operations overhead for you. The hourly cost for compute capacity is higher than the equivalent on EC2, but you only pay for the time you use Fargate and for what your containers need. It tracks actual demand more closely, so there is less under-utilization of the capacity you pay for than with EC2.
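As a sketch of what ‘no servers to manage’ looks like in practice, here is how an existing task definition could be run on Fargate with boto3. The cluster, task definition, subnet and security group names are placeholders for resources you would have created beforehand:

```python
import boto3

# Sketch: run a container task on Fargate. Note there is no AMI,
# instance type or OS anywhere in the call; AWS manages all of it.
ecs = boto3.client("ecs", region_name="eu-west-1")

ecs.run_task(
    cluster="my-cluster",            # placeholder cluster name
    taskDefinition="my-app:1",       # placeholder family:revision
    launchType="FARGATE",
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],     # placeholder
            "securityGroups": ["sg-0123456789abcdef0"],  # placeholder
            "assignPublicIp": "ENABLED",
        }
    },
)
```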
Using containers on AWS is generally a good fit if you are moving towards microservices architecture patterns and DevOps practices. Containers require less operations overhead, so you can focus more on the ‘Dev’ part and less on the ‘Ops’ one. Fargate is even more relevant if you have limited sysadmin skills to maintain the underlying OS, because AWS will do it for you.
Containers may also be useful for more traditional monolithic applications. If you only have a few containers in scope, you might find the new Lightsail Containers service interesting. It gives you access to a fully managed Docker container service (similar to Fargate) through a much simpler interface than ECS. It comes with fewer features, but you may not need them, so you can quickly onboard your containers without in-depth AWS expertise. The starting price is also very competitive and it includes a managed load balancer, so your containerized websites can be reached from the Internet.
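To show how small the entry point is, here is a sketch of creating a Lightsail container service with boto3. The service name is a placeholder; ‘power’ and ‘scale’ choose the capacity tier and the number of nodes behind the managed load balancer:

```python
import boto3

# Sketch: create a Lightsail container service. 'power' picks the
# capacity tier (nano is the smallest and cheapest) and 'scale' the
# number of compute nodes behind the included load balancer.
lightsail = boto3.client("lightsail", region_name="eu-west-1")

lightsail.create_container_service(
    serviceName="my-containerized-site",  # placeholder name
    power="nano",
    scale=1,
)
```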
Serverless-style compute
This relates to the AWS Lambda service. Lambda is often referred to as Function-as-a-Service (FaaS) because all you have to bring is your code and its dependencies, which constitute what we call a ‘function’. AWS takes care of everything else. Under the hood, a Lambda function is basically a container, but AWS takes it a step further by managing most aspects of the orchestration. You just need to choose one of the runtime flavors provided by AWS Lambda (Python, Node.js, Go, .NET, Java or Ruby) and bring your function on top of it. In case none of these runtimes fits your needs, you can now even bring your own container image into Lambda. This recently announced feature (re:Invent 2020) allows you to easily build your custom runtime.
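For reference, this is roughly all the code a ‘function’ amounts to: a minimal Python handler. The event shape shown is a simplified assumption; in reality it depends on the trigger:

```python
import json

# Minimal Python Lambda function: you deploy this handler and its
# dependencies; AWS provides and operates the runtime around it.
def handler(event, context):
    # 'event' carries the trigger payload (API Gateway request, S3
    # notification, etc.); 'context' carries runtime metadata.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```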
The paradigm behind serverless functions is that your DevOps team focuses even more on the ‘Dev’ part and can abstract away the ‘Ops’ one. This is why serverless is often presented as a ‘No-Ops’ model, an evolution of DevOps that clearly indicates the shift of emphasis from ‘Ops’ to ‘Dev’.
Serverless is not only a shift towards DevOps or No-Ops, but also in terms of architecture patterns. Adopting microservices architecture is almost always necessary if you want to create serverless-based applications or replatform existing ones. A serverless function is intended to be simple, loosely-coupled and execute for the shortest possible time. It is ‘ephemeral’ by nature and requires a trigger for every invocation. For example, if serverless functions are used to serve an e-commerce application, each functionality will require a set of functions (e.g. one function for log-in, another one for password reset, another one for ordering, etc.).
A serverless function is charged per invocation, according to its execution duration and allocated memory. This is the compute option that fits actual demand the most closely: you are only charged for the time your function actually runs. It also scales very well and supports a large number of concurrent executions (1,000 per region by default, a soft limit that can be raised) without you having to worry about capacity. One performance drawback that is often mentioned is the ‘cold-start’ delay that occurs when the function is not ‘warm’ because it has not been invoked recently. This can be a problem for applications where latency is an important factor. However, the actual delay has been reduced by recent AWS improvements and will probably continue to improve in the near future. Furthermore, developers have found workarounds to minimize ‘cold-start’ occurrences.
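To make the pricing model concrete, here is a back-of-the-envelope estimate. The unit prices are assumptions based on the publicly listed us-east-1 rates at the time of writing (about $0.20 per million requests and $0.0000166667 per GB-second); always check the current pricing page before relying on them:

```python
# Back-of-the-envelope Lambda cost estimate. Unit prices are
# assumptions based on published us-east-1 rates and may change.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second

invocations = 3_000_000                # per month
duration_s = 0.2                       # 200 ms average execution
memory_gb = 0.512                      # 512 MB allocated

gb_seconds = invocations * duration_s * memory_gb
cost = invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND
print(f"~${cost:.2f} per month")       # roughly $5.72, before free tier
```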
Comparison
Now that we have outlined the different compute types, the following table gives a quick overview of their pros & cons. These criteria have been used to build it:
- Operations burden / DevOps friendliness:
The ‘Ops’ vs. ‘Dev’ ratio. The less infrastructure maintenance, the more ‘DevOps’ friendly.
- Ease of use / AWS skills required:
The level of understanding of AWS services needed, both in terms of breadth and depth.
- Scalability effectiveness:
The effectiveness of the scaling capabilities and the delay for scaling up and down.
- Pricing:
The AWS ‘raw’ costs. You can of course apply some optimization tweaks, and you should not forget to estimate your own costs associated with one technology or another (e.g. operations overhead).
- Vendor lock-in:
The difficulty of migrating from AWS to another cloud vendor. Generally speaking, the more dependent you are on AWS managed services, the more difficult it is to move away from AWS.
I did not include security because it is a whole topic in itself. It is difficult to determine if one option is more secure than another because it really depends on how it is used.
Conclusion
There are many criteria to consider when choosing the right compute type. Organizations that are new to the cloud are likely to start with virtual machines, especially if they have existing IT infrastructure that they are used to maintaining themselves. As they grow in their cloud adoption journey and get more familiar with other technologies, they may move towards containers and serverless. However, this implies new paradigms in development, operations and architecture practices. On the other hand, many startups that are more cloud-savvy and do not have any legacy make the jump directly to managed containers and serverless. They build cloud-native solutions utilizing the full potential delivered by the cloud and the different AWS services that are relevant to them. Tech startups focus heavily on the ‘Dev’ side and understand that most value is created by spending time on the business logic. They happily outsource as much ‘Ops’ as they can, leveraging AWS managed services to build their solutions on and automating what they can using tools such as CI/CD and Infrastructure-as-Code (IaC).
Is serverless the future of cloud computing? There is undoubtedly a trend here, but I would not say the other types of compute in the cloud are going to become obsolete anytime soon. Serverless is still relatively new (Lambda started in 2014) in comparison to VMs in the cloud (EC2 in 2006) and there is still a long way to go before the former replaces the latter. There can be strong justifications for using VMs rather than serverless functions: cloud vendor lock-in concerns, legacy applications, monolithic architectures, skills and development practices are all valid reasons against making the move to serverless now. When a new customer moves from on-premises computing to the AWS cloud, they typically start with VMs, then containers, and only then serverless. New ways of working and building applications are necessary with each shift from one type of compute to the next. Quite often, organizations are not using a single type of compute: their application landscape consists of a mix of VMs, containers and serverless functions. Some legacy applications have been lifted and shifted to the cloud, others have been replatformed, whereas new cloud-native applications have been built entirely serverless.
What about security? It has not been covered much here because it is a whole subject in itself. The short answer is that your security posture can improve as you increase the AWS responsibility domain: AWS does more for you and you have less to do by yourself. AWS embeds many security controls when it manages things for you, and it is its core business to do it well. However, you have to understand the new security paradigms containers and serverless bring. If you don’t use them the right way, they might also make your security worse. What the ‘right way’ of doing security looks like for each type of compute might be a good topic for a future follow-up post.