Virtual containers have seen tremendous adoption and growth within all industries. However, in terms of IT asset management, containers are not being managed and are an unknown area of risk for many of our clients. Because containers are a newer technology, there is very little information about managing them and how to address the emerging SAM and ITAM challenges they bring.
Due to this lack of public information, Anglepoint has published this two-part article on navigating the world of virtual containers, with an emphasis on asset management and licensing.
In Part I, we cover the history of containers, define what containers are and the different pieces of the container ecosystem, and take a good look at the benefits of virtual containers. In Part II, we look at containers in the cloud and CaaS, asset management best practices, and licensing considerations. Read Virtual Containers Part II.
Let’s get started!
A brief history of virtual containers.
The first proper containers came from the Linux world as LXC (LinuX Containers) in 2008. However, it wasn’t until 2013 that containers entered the IT public consciousness, when Docker came onto the scene with enterprise usage in mind. Even then, though, it was more of an enthusiast’s technology. In 2014, Google open sourced Kubernetes, which manages and ‘orchestrates’ containers, and version 1.0 followed in 2015. However, it wasn’t until 2017 that Docker and Kubernetes had matured enough to be considered for production use within corporate environments. That same year, VMware, Microsoft, and Amazon began to support and offer solutions for Kubernetes and Docker on their top-tier cloud infrastructure.
What is a container?
Often, people conflate the term ‘container’ with multiple technologies that make up the container ecosystem. Let’s look at what a modern container is at the most fundamental level.
On the left side of diagram 1 is an operating system which has several different processes (applications) installed and running. These processes are all installed in the same environment, or namespace if you’re talking about Linux, and can interact with each other. A container is simply the isolation of a single process, wrapped up in (just as it sounds) a container. This container is isolated from the host operating system and can only “see” and interact with what is explicitly allowed. The example below illustrates the point.
Let’s start with a traditional model in which we are installing applications on the OS. In this example, we’ve installed NGINX Web Server (a process), but there are also several dependencies installed that support the main application, NGINX Web Server.
Let’s say that we also want to install NodeJS, which requires some of the same dependencies as NGINX Web Server, but perhaps our version of NodeJS requires a different version of those dependencies. Using the traditional model, this would require a complicated configuration to ensure that each of our applications is pointing to the correct versions of the dependencies. It would also be important to ensure that the configuration changes were maintained whenever an application or dependency was updated.
Now if we were to use containers in this scenario, it would become easier to manage. The process (NGINX Web Server in this example) would be bundled in a container with the dependencies that it relies on. When we want to add another process (NodeJS), it resides in its own container along with its dependencies. This way, we don’t have to worry about version conflicts as everything is isolated.
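To make the version-conflict scenario concrete, here is a minimal Python sketch. It is a toy model, not how a container runtime works: the shared host is a dictionary that can hold one version of each library, and a "container" is just a bundle of a process name and its own private copy of the dependencies. The library names and version numbers are hypothetical and chosen only to force a conflict.

```python
# Toy model: a shared host holds one version of each library,
# while each container carries its own private copy.

class VersionConflict(Exception):
    pass


def install_shared(host_libs, app, requirements):
    """Install an app's dependencies into a single shared environment."""
    for lib, version in requirements.items():
        installed = host_libs.get(lib)
        if installed is not None and installed != version:
            raise VersionConflict(
                f"{app} needs {lib} {version}, but {installed} is already installed"
            )
        host_libs[lib] = version


def build_container(app, requirements):
    """Bundle an app with a private copy of its dependencies."""
    return {"process": app, "libs": dict(requirements)}


# Hypothetical dependency versions, for illustration only.
nginx_reqs = {"openssl": "1.1", "zlib": "1.2"}
node_reqs = {"openssl": "3.0", "zlib": "1.2"}

# Traditional model: the second install collides with the first.
host = {}
install_shared(host, "nginx", nginx_reqs)
try:
    install_shared(host, "nodejs", node_reqs)
    shared_ok = True
except VersionConflict as err:
    shared_ok = False
    print("shared install failed:", err)

# Container model: each process keeps its own versions, so nothing collides.
containers = [
    build_container("nginx", nginx_reqs),
    build_container("nodejs", node_reqs),
]
```

In the shared model the second install fails, while the two containers coexist with different versions of the same library.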
Using containers is especially useful when developing applications. Someone might be developing on a laptop, testing on a server, and then deploying to the cloud or a co-worker’s desktop. These environments are likely all different: they may have different versions of a dependency installed, or a slightly different hardware configuration, either of which creates additional troubleshooting effort. Containers, however, abstract away the hardware layer. They are platform agnostic. You could run the container on a laptop, a server, or the cloud and it’s going to run the same. Using the traditional model, migrating an application from on-premises to the cloud or across cloud platforms is an onerous process. With containers, this process is streamlined and greatly simplified.
So we’ve gone over containers themselves, but there are other terms and technologies in the container ecosystem that we need to be familiar with. Let’s take a look at those.
Container images are what most people are referring to when they talk about a container. A container image is the actual static file that contains the process and its dependencies. A container image becomes a container when it is run.
Container images themselves are immutable; all changes made to a container image become new ‘layers’ of the image. This happens because changes are made through a git-like push/pull mechanism. One benefit of image ‘layers’ is that they create a natural audit trail when used in conjunction with a container registry (defined below). All changes are visible over time, and we can see the details of each change, including who made it. These ‘layers’ are also hierarchical, and container images can have parent/child relationships. For example, in our previous example a container was running NGINX, but let’s say that we also needed a container running NGINX and PHP. A child container image could be created that references and builds off our main NGINX container image.
Let’s imagine that we discovered a vulnerability in one of the dependencies we had deployed. In the traditional virtual machine (VM) world, we would have to patch each of our VMs that had this vulnerability. Hopefully we would have an automated way of doing this, but even so, verifying that the patches were successful and the applications unaffected would be extremely time-consuming. With containers, we would only need to update the container image once, and every container redeployed from that image would pick up the fix. Additionally, any child container images that reference the updated parent image would pick up the fix when they are rebuilt.
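The layering and parent/child ideas can be sketched in a few lines of Python. This is a simplified model under our own assumptions, not Docker’s actual image format: the `Image` class, the `build_child` helper, and all package versions are hypothetical. Each image is immutable, a change produces a new image with one more layer, and a child image resolves its packages through its parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Image:
    """An immutable image: a parent reference plus this image's own layers."""
    name: str
    layers: tuple = ()                 # each layer is a dict of {package: version}
    parent: Optional["Image"] = None

    def with_layer(self, changes):
        """Return a NEW image with one more layer; the original is untouched."""
        return Image(self.name, self.layers + (dict(changes),), self.parent)

    def packages(self):
        """Flatten the parent's packages first, then our layers; later layers win."""
        merged = dict(self.parent.packages()) if self.parent else {}
        for layer in self.layers:
            merged.update(layer)
        return merged


def build_child(name, parent, changes):
    """Build a child image on top of the given parent image."""
    return Image(name, (dict(changes),), parent)


# Hypothetical versions, for illustration only.
nginx_v1 = Image("nginx").with_layer({"nginx": "1.0", "openssl": "1.1.0"})
nginx_php_v1 = build_child("nginx-php", nginx_v1, {"php": "8.0"})

# Patch the vulnerable dependency once, in the parent image.
nginx_v2 = nginx_v1.with_layer({"openssl": "1.1.1"})

# Rebuilding the child against the patched parent picks up the fix.
nginx_php_v2 = build_child("nginx-php", nginx_v2, {"php": "8.0"})
```

Note that `nginx_v1` is never modified. The patch exists only as an extra layer on `nginx_v2`, which is what makes the history of changes easy to audit. In this sketch the child is rebuilt explicitly; how a real build pipeline triggers that rebuild is outside its scope.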
Part of the container image is the manifest, better known as a ‘Dockerfile’ if using Docker’s terminology. The container manifest is a structured text file that contains the configuration settings and instructions needed to build the container image.
The container registry is a repository of container images. Public registries exist, such as Docker Hub, as do private registries which organizations can run to host their own internally developed images or clone public images.
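Because the manifest is just structured text, a short Python sketch can show its general shape. The `render_manifest` helper is our own invention for illustration; the `FROM`, `RUN`, `COPY`, and `CMD` instructions it emits are standard Dockerfile instructions, while the base image tag, build step, and paths in the example are assumptions.

```python
import json


def render_manifest(base_image, run_steps, copies, command):
    """Assemble manifest text from a base image, build steps, and a start command."""
    lines = [f"FROM {base_image}"]
    lines += [f"RUN {step}" for step in run_steps]
    lines += [f"COPY {src} {dst}" for src, dst in copies]
    # Exec-form CMD: a JSON array holding the executable and its arguments.
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"


# Hypothetical example: a child image built on top of an NGINX parent image.
manifest = render_manifest(
    base_image="nginx:stable",
    run_steps=["apt-get update"],
    copies=[("site/", "/usr/share/nginx/html/")],
    command=["nginx", "-g", "daemon off;"],
)
print(manifest)
```

The `FROM` line is where the parent/child relationship between images is declared: the image built from this manifest starts from the named parent image and adds its own layers on top.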
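As a rough illustration of what a registry does, here is a minimal in-memory sketch in Python. The `Registry` class, its `push` and `pull` methods, and the author names are all hypothetical; real registries are network services with authentication and content-addressed storage, none of which is modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """In-memory stand-in for a container registry."""
    images: dict = field(default_factory=dict)    # "name:tag" -> image content
    history: list = field(default_factory=list)   # (author, reference) push log

    def push(self, author, name, tag, content):
        """Store an image under name:tag and record who pushed it."""
        reference = f"{name}:{tag}"
        self.images[reference] = content
        self.history.append((author, reference))
        return reference

    def pull(self, name, tag):
        """Fetch the image content stored under name:tag."""
        return self.images[f"{name}:{tag}"]


public = Registry()
private = Registry()

# A public image is published, then cloned into the organization's own registry.
public.push("upstream", "nginx", "1.0", {"nginx": "1.0"})
private.push("alice", "nginx", "1.0", public.pull("nginx", "1.0"))

# An internally developed image lives only in the private registry.
private.push("bob", "billing-app", "0.1", {"billing-app": "0.1"})
```

The `history` list is the part that matters most for asset management: it records which images entered the private registry and who pushed them.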
Nodes & Clusters
A node is the hardware supporting the container environment. This could be a server, VM, or a cloud instance. In some cases, a group of nodes will be working together to support a container environment – this is referred to as a cluster.
Pods & Orchestrator
A pod is one or more containers which are grouped and managed by an orchestrator. An orchestrator is where rules and operations for scaling, failover, and running container workloads are created. So, while Docker offers tools and solutions for container creation and deployment, Kubernetes is an example of an orchestrator.
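The relationship between nodes, clusters, pods, and an orchestrator can be sketched as a small Python model. This is a toy scheduler under our own assumptions, not how Kubernetes is implemented: pods are represented by name only, each node has a fixed pod capacity, and the `Node` and `Orchestrator` classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                                # how many pods this node can host
    pods: list = field(default_factory=list)

    def free(self):
        return self.capacity - len(self.pods)


@dataclass
class Orchestrator:
    cluster: list                                # the group of nodes working together

    def schedule(self, pod):
        """Place a pod on the node with the most free capacity."""
        node = max(self.cluster, key=Node.free)
        if node.free() <= 0:
            raise RuntimeError("cluster is full")
        node.pods.append(pod)
        return node.name

    def scale(self, pod_name, replicas):
        """Scaling: run several replicas of the same pod."""
        return [self.schedule(f"{pod_name}-{i}") for i in range(replicas)]

    def fail_node(self, name):
        """Failover: drop a node and reschedule its pods elsewhere."""
        failed = next(n for n in self.cluster if n.name == name)
        self.cluster.remove(failed)
        for pod in failed.pods:
            self.schedule(pod)


orchestrator = Orchestrator([Node("node-a", 2), Node("node-b", 2), Node("node-c", 2)])
placements = orchestrator.scale("web", 3)

# Simulate losing a node; its pod is moved to one of the surviving nodes.
orchestrator.fail_node("node-a")
running = sorted(pod for node in orchestrator.cluster for pod in node.pods)
```

The rules live in the orchestrator rather than in the containers themselves, which is why the same container image can be scaled up or moved between nodes without being changed.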
Virtual Containers vs. Virtual Machines
Another way to understand containers is to compare them with virtual machines, a technology that more people are familiar with.
Referring to diagram 2, we see that both VMs and containers start with Infrastructure, which could be a physical host or a cloud platform like AWS or Azure. The Host Operating System comes next; this would be something like Windows Server or ESX. After the Host OS comes the hypervisor technology for VMs and the container runtime (e.g. Docker) for containers.
Now, on the VM side, we see that each individual VM has a full OS installed, and the applications and dependencies are also installed on the VMs. Additionally, the hypervisor is virtualizing the hardware the VMs are running on, which requires compute resources.
Conversely, with containers, we don’t need to install an entire OS. All that runs in the container is the process and its dependencies; this means that from a storage standpoint, the container size is only a fraction of the size of the VM. The container is also much less resource-intensive to run from a computational power standpoint.
Another significant difference between virtual machines and containers is that a VM typically runs on a start-and-stop schedule, whereas the lifecycle of a container traditionally mirrors the lifecycle of the process it’s running. In other words, when the process starts, the container starts, and when the process ends, the container stops running. Let’s illustrate this: Google is one of the largest contributors to the container platform. When we go to Google Search or YouTube, these are processes running in containers. Starting YouTube, for example, creates a new container, and exiting YouTube kills that container. In fact, Google starts and stops over 2 billion containers each week, and it is able to manage demand dynamically using container orchestration.
This is an extreme use case, and it may not make sense to start up and kill other instances on a whim, such as SQL Server. However, containers do give a scalability and elasticity that isn’t easily achieved through traditional virtualization.
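The way a container’s lifecycle follows its process can be mimicked with an ordinary operating-system process in Python. This sketch provides no isolation at all and is not a container runtime; the `ToyContainer` class is hypothetical and only shows the "container is running exactly while its process is running" idea.

```python
import subprocess
import sys


class ToyContainer:
    """A 'container' whose lifetime is tied to the single process it wraps."""

    def __init__(self, argv):
        # Starting the process is what starts the container.
        self.process = subprocess.Popen(argv)

    def status(self):
        # poll() returns None while the process is still alive.
        return "running" if self.process.poll() is None else "exited"

    def wait(self):
        """Block until the wrapped process ends and return its exit code."""
        return self.process.wait()


# The wrapped process simply sleeps briefly and then exits on its own.
container = ToyContainer([sys.executable, "-c", "import time; time.sleep(0.5)"])
status_while_running = container.status()
exit_code = container.wait()
status_after_exit = container.status()
```

There is no separate shutdown step in this model: once the process ends, the container is simply reported as exited.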
Quick recap of container benefits.
Let’s quickly recap the benefits of containers.
Containers are lightweight
Containers are predictable in offering a consistent sandbox
Containers are isolated
Containers are platform agnostic
These benefits allow for an increase in development agility and ease of deployment. That’s why container adoption rates and growth have increased year over year. In fact, Gartner predicts that by 2020, 50% of organizations will have deployed containers in their environments.