In my previous article I provided a high-level overview of the differences between Microservices and Monoliths. In this article I do the same, but for containerization. We are slowly reaching the point of my Istio blog series where we finally start working with Istio itself. Following this article will be one focusing on Kubernetes. From there it’s Istio all the way.
A Quick Anecdote
I had a tech podcast playing on my iPhone whilst on my way to a client. The topic was Docker containerization and its benefits versus virtualization. At one point during the conversation, the guest speaker mentioned that on his MacBook Pro, he could run about 4-6 virtual machines at the same time before hitting serious performance issues. I still remember thinking to myself: “Well that’s pretty good. I’m happy if I get away with running 4 at the very most on my MacBook.” He then continued that on the same MacBook, he could run over 65 Docker containers at the same time and that his machine handled it well enough. I almost swerved my car! How is that even possible? Well, fast forward a few weeks and you’ll hear me saying what I say now: It’s Entirely Possible!
It’s more about the “Why”, than it is about the “How”
I’ve read countless articles that explain how to set up a Docker container and deploy it on-premises or in the cloud. The problem was that I couldn’t get my head around why Docker containers are the way forward as opposed to virtual machines. When approaching new tech, it’s vital to grasp the reasoning behind that technology and its intended use, which is why this article focuses on “why” containerization would be preferred over virtualization.
What is Containerization?
For those who don’t know yet, containerization is an operating-system-level virtualization method that allows the deployment of distributed applications without having to set up a virtual machine for each app.
If you currently work with virtualization, you’ll know it requires the setup and configuration of a VM image. These images are launched using virtual machine software such as VirtualBox, VMWare, etc. For each of these VM images, you’ll need to configure the amount of memory and CPU it will use. You’ll also specify additional attributes, such as which operating system it will run, shared folders, disk space, etc. This type of environment works very well for running multiple operating systems on a single host/machine. One might even ask: “Why containerization if we have virtualization?”
Firstly, with containerization, and more specifically Docker (a well-known container engine), the setup is different. To create a container, you first need the Docker software installed on the host operating system. The Docker software is used to run one or more containers on the host.
Next, you need to create a Dockerfile. The Dockerfile is a simple text file that contains a few commands, most of them Docker instructions, with some being general command line instructions to run inside your container. Once created, you run a Docker command against the Dockerfile to build what is called a Docker image. Now if you’ve heard some discussions around Docker, you’ll constantly hear references to images and containers. So what’s the difference?
Docker Images vs Docker Containers
In short, a Docker image is a template for the actual container or containers that will run your application. So, if I have a node.js app that I want to run inside a Docker container, I first need to create a Dockerfile, which is used to build the Docker image. Then, when I want to launch the actual runtime container that starts up my node.js app, I execute a Docker command, which uses that image to create and launch the running container.
Let’s try explaining that one more time…
Because I have a node.js app that I want to run inside a container, my first dependency is node.js itself. Node.js in turn depends on a core operating system such as Windows, Linux, OS X, etc. Long story short: inside my container, I need a core OS, node.js, and then my actual app. What I do here is create a Dockerfile that includes the following instructions (NOTE: The instructions below are given in layman’s terms and are not actual instructions):
- Create a Docker image based on the Alpine Linux template on Docker Hub (Alpine is a bare-bones Linux distribution whose base image is only a few megabytes)
- Execute the command to install node.js inside the image
- Copy my node.js project from directory x on my machine into directory x inside the image
- Execute the command to start my node.js app inside the image
The actual instructions inside the Dockerfile will obviously look different, but hopefully you get the idea of how a Dockerfile is used.
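To make that concrete, here is a minimal sketch of what such a Dockerfile could look like. The app layout, port, and start command are assumptions for illustration, not taken from the article:

```dockerfile
# Start from the small Alpine Linux base image on Docker Hub
FROM alpine:3.12

# Install node.js (and npm) inside the image
RUN apk add --no-cache nodejs npm

# Copy my project from the build context into a directory inside the image
WORKDIR /usr/src/app
COPY . .
RUN npm install

# The command that starts my node.js app when a container launches
EXPOSE 3000
CMD ["node", "index.js"]
```

Each line maps onto one of the layman’s instructions above: a base OS, node.js on top of it, the project copied in, and a start command.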
Now that our Dockerfile is created, we run a Docker command to build an image based on these instructions. If successful, you will have a template image that has Alpine Linux and node.js installed, as well as your project inside a specific directory. The image itself never runs. It is merely used to launch one or more containers that use the image as a blueprint. This is very powerful, as you might want two or more running instances of your app based on the very same image.
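The build step and the launch step map onto two Docker CLI commands. The image and container names below are made up for the example:

```shell
# Build an image named my-node-app from the Dockerfile in the current directory
docker build -t my-node-app .

# Launch two containers from that single image template,
# mapping different host ports to the app's port
docker run -d --name app-1 -p 3000:3000 my-node-app
docker run -d --name app-2 -p 3001:3000 my-node-app
```

Note that `docker build` is run once, while `docker run` can be executed as many times as you want instances of the app.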
By now you should have an understanding of the differences between an image and a container, as well as how each is created. The next step is to understand why a container would be preferred over a virtual machine.
Containerization vs Virtualization
To understand if you should be using virtual machines or containers, you need to get your head around their core focuses:
If you’re looking to run a monolith app on-premises that will not be used in the cloud, then virtual machines are probably the way to go. Also, if you want one virtual instance to house SQL Server, IIS, and a number of web apps, then once again virtual machines are preferred. Virtualization provides data persistence and allows the installation of a full-fledged environment, on which you would then run one or many of your applications and data stores. Everything is managed and exists in one virtual environment.
Containerization, however, follows a different path. Firstly, containers are stateless, which means that by default they do not persist data stored within the container. The data needs to be stored externally. This might seem like a bad thing, until you start realizing that containers are not meant to run complete environments, but are instead recommended for small, lightweight application runtimes.
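Statelessness in practice means anything written to the container’s own filesystem disappears when the container is removed, so durable data is mounted in from outside. With Docker this is done with the volume flag; the volume name and path below are illustrative:

```shell
# Mount a named volume into the container; data written to
# /var/lib/data survives even if the container is deleted
docker run -d -v app-data:/var/lib/data my-node-app
```

For microservices that keep their state in a remote database or object store, even this is often unnecessary.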
In my previous article, we converted a monolith app into 4 separate microservices applications. Each of these apps is a node.js runtime that exposes logic through APIs (web services). These apps can easily exist by themselves and don’t store any data or files on the file system. Containerization would be perfect for these apps, as they do not require excessive memory or CPU usage. The data they store can be hosted remotely, and integration with that data store would be via API calls.
Let’s take a look at a use case
Meet the Developer
So I’m developer Joe. Referencing my previous article, let’s say I developed a node.js microservice application that does the following:
- Provides 5 additional web APIs with similar functionality
For my app to run, it needs node.js installed on the underlying operating system. The total size of my app is around 3 megabytes. When my app runs idle, it needs around 15 megs of RAM, and a fraction of the CPU.
Meet Administrator Sam
I want to deploy my app for Acme Corp. I go to administrator Sam, who is in charge of virtualization. He asks for my app’s minimum requirements. I explain to him that I need about 100 megs of RAM and 10 megs of hard drive space to be safe, and that I need a certain version of node.js installed on the OS. He doesn’t want my app conflicting with other apps in existing VMs, so he sets up a new virtual machine for me. He installs Alpine Linux as the operating system and assigns 1 CPU, 512MB of RAM, and 1GB of hard drive space. I get remote access to the VM and off I go.
There’s nothing really wrong with this approach, except that it’s a bit overkill. My app only needs a fraction of what was given to me in the end. And I’m being kind in my example. In the real world, because virtual machines are managed by other specialists, I have to adhere to their standards. In other words, Sam the administrator might not know anything about Linux, and instead sets up a VM running Windows Server with 4GB RAM, 20GB hard drive space, and so on. This is honestly what happens most of the time.
Meet Administrator Kim
Kim is in charge of containerization. I approach her with the same requirements as I did Sam. However, because she has a server running Docker (a basic server running Windows, OS X, or Linux with Docker installed), I merely have to provide her with my app’s Dockerfile (less than 1 KB in size), which she copies to her server. Kim runs the command to build the Docker image and executes the instructions to launch my container. My container is up and running on the server and can be accessed using the server’s IP address and a port I specified in the Dockerfile.
With containerization, there was no need to pre-allocate memory and CPU resources to my app’s container. The Docker host assigns the container only what it requires and nothing more. If we go back to the anecdote in the first section of the article, you can start to see how it was possible for the guest speaker to run over 65 containers on one machine.
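Containers share the host’s kernel and consume resources on demand, but if you do want hard caps, Docker lets you opt in per container rather than reserving resources up front. The limit values below are arbitrary examples:

```shell
# Optional upper bounds; without these flags the container
# simply uses whatever it actually needs
docker run -d --memory=128m --cpus=0.5 my-node-app

# Live view of actual memory and CPU usage across running containers
docker stats
```

This is the inverse of the VM model: limits are an optional ceiling, not a mandatory allocation.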
Some benefits of Containerization
Below are just some of the benefits of going the container route, keeping in mind once again the context of your solution:
- Performance – In my example above, my container would be accessible within a second of launching. Containers start up seriously fast if they are configured correctly
- Size – Again, using my example above, my app’s docker image would be around 40 megs in size uncompressed. I say uncompressed, because if I had to deploy my image to IBM Bluemix, the image would be compressed to 15 megs.
- Portability – Many of the major cloud vendors already have infrastructure and services to support containerization. This includes IBM Bluemix, Microsoft Azure, AWS and Google Cloud Services. Deploying your images and launching containers in these clouds is as easy as 2-3 terminal commands
- Scalability – This is a big one. Let’s say your app is running inside a container on the Bluemix cloud. It gets bombarded by users and starts peaking in terms of usage. Bluemix has the means of scaling your app by spawning new instances of your app’s container to support the demand. Kubernetes is a perfect tool that does this and much more.
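As a small taste of the upcoming Kubernetes article: assuming the app is already running as a Kubernetes Deployment (the deployment name here is hypothetical), scaling out to more container instances is a single command:

```shell
# Run 5 replicas of the app's container instead of 1
kubectl scale deployment my-node-app --replicas=5
```

Kubernetes can also do this automatically based on observed load, which is part of what makes it such a good fit for containerized microservices.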
Hopefully you now have an understanding of containerization and “why” one would use it versus a virtual machine. Again, this article is not aimed at teaching you containerization, but at helping you understand why and when to consider it. Cheers for now!