A Beginner’s Guide to Containers: Taking Action and Mastering Containerization

Timothy Rudenko
8 min read · Jun 21, 2023



Containerization

Most people in tech have heard of containers by this point. While most technology is driven by hype and inevitably fades out of memory, containers are the most important contribution to technology since Linux.

“It works on my machine” has been a joke repeated ad nauseam between developers and sysadmins alike. The process of getting an application into production has been a problematic ordeal for years.

Virtual Machines (VMs) were a temporary solution aimed at solving that very issue, but with convenience came a heavy expense in system resources. This approach was unsustainable as applications got bigger, network traffic increased, and scalability became a necessity rather than a nice feature.

Containers bridged the gap between running an application in a development environment and running it in a production VM. They serve as a way to ship the developer’s exact environment to production without the need for a sysadmin to work through a list of dependencies to install and services to configure.

A container is a prepackaged, modular Linux environment complete with environment variables, dependencies, runtime, libraries, and pinned versions. Because containers are isolated, they don’t conflict with one another, and network namespaces ensure interoperability between networked services. Most importantly, they are efficient.

Containers can be created and destroyed without causing the same impact that a failed VM would. Due to their smaller size, it takes only seconds to restart a container.

These are just a few of the benefits that containers can bring to the table. For this reason, the industry has doubled down on containerization.

The Rise of Docker

Container technology is revolutionary, not because it’s a completely novel idea, but because it combines previously existing Linux tools like cgroups, kernel namespaces, and union file systems in a way that someone without a decade of industry experience can use effectively.

The first company to chain these tools together was DotCloud, a platform-as-a-service (PaaS) company. Facing the familiar issue of isolation, founder Solomon Hykes sought to develop a solution that would solve the problem both for DotCloud and for the world as a whole.

This directly led to the creation of the universally known containerization tool, Docker, with its grand unveiling at PyCon 2013.

Docker’s initial release generated waves across the community, so much so that DotCloud re-branded to Docker, Inc. and started focusing on their flagship product. Being first to market, Docker is here to stay and continues to be the most widely used containerization system.

Container Fundamentals

Before containers can be understood, images must be understood first.

An image is an executable software package that holds everything needed to run an application.

That can include:

  • Runtime
  • Source code
  • Libraries
  • System tools

Images serve as the blueprint for a container. The container is the running instance of that blueprint.

The concept of Object Oriented Programming provides a wonderful analogy for this. The image is the class, and the container is the object. Multiple separate containers can be created from the same image.
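
For example, using the ‘docker run’ command covered later in this article, two separate containers can be started from the same NGINX image (the names ‘web1’ and ‘web2’ are just hypothetical examples):

docker run -d --name web1 nginx
docker run -d --name web2 nginx

Both containers run independently, yet they were created from one and the same image.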

Additionally, an image that already exists can be used as a building block for a new custom image. This operates much like inheritance.

Many pre-built images of popular software such as NGINX and PostgreSQL exist to serve as a base image. These provide a platform for developers and sysadmins to build their software on top of.
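
As a rough sketch of that idea (Dockerfiles are covered in more detail later in this article; the ‘my-site’ directory here is hypothetical), a custom image can simply start from the official NGINX image and add content on top:

FROM nginx
COPY my-site /usr/share/nginx/html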

The difference between images and containers is an important distinction that must be grasped going forward.

REMEMBER:

  • Images = Blueprint
  • Containers = Running instance

Docker Architecture

Docker isn’t a single, simple piece of software; it’s composed of multiple components that allow containers to be run efficiently.

It’s easy to go very deep into the architecture; however, the fundamental components at play are as follows:

  • Docker Daemon — The process that facilitates the creation and management of containers on the host OS.
  • Docker CLI — The primary interface that sends commands to the daemon through a REST API.
  • Docker Registry — A repository that stores images publicly or privately, similar to a package manager’s repositories.

The Docker daemon runs on the host OS and, by default, requires root privileges.

NOTE: This has become a point of contention for some sysadmins and developers, leading to the creation of Podman by Red Hat, an alternative container runtime. Podman won’t be covered in this article; it will have its own.

In order to interface with the Docker daemon, Docker provides a CLI (Command Line Interface) tool. The Docker CLI allows a user to build, pull, and push images. Additionally, containers can be started, stopped, and monitored with the CLI.
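
A few common CLI commands, assuming a hypothetical container named ‘my-nginx’, look like this:

docker pull nginx                     # download the NGINX image from the registry
docker run -d --name my-nginx nginx   # start a container in the background
docker ps                             # list running containers
docker stop my-nginx                  # stop the running container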

A Docker registry is where developers upload and download the images used in Docker environments. Docker Hub acts as the default location for images. Private registries can also be used by individuals and companies alike.

Running a Container

The popularity of container technology is partially due to its low barrier to entry. As long as you have Docker installed, you can run:

‘docker run -it ubuntu’

And you will be in an Ubuntu environment in a matter of seconds.

Let’s break down that command:

  • docker = The docker command
  • run = Runs a command in a new container
  • -it = The flags stand for interactive and TTY respectively. Most people remember the pair as “interactive terminal”, which is a handy mnemonic if you’re just getting into it.
  • ubuntu = This is the image that you want to run interactively

The reason you don’t have to provide a final argument, such as ‘bash’, to indicate that you want to start a shell in the container is that the Ubuntu image’s default command is already set to ‘bash’.
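
You can override that default by appending a command of your own:

docker run -it ubuntu bash   # start the shell explicitly
docker run ubuntu ls /       # run a one-off command and exit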

This becomes more important when we move on to building images.

Running containers interactively is useful, but far less so if we can’t save any data to our host machine. Fortunately, when we run a container, we can specify a directory on our host machine and link it to a directory inside the container. Any changes made to files within that directory while the container is running will be saved on the host machine and persist even after the container is stopped or fails.

This is advantageous because we can perform operations on data without relying on programs installed on our host machine. Fedora Silverblue embraces this concept and takes it even further.

To set volumes at runtime, we need to pass an additional flag to our Docker command:

‘docker run -it -v host_directory:container_directory ubuntu’

breakdown:

  • -v = This option tells the docker command that the next argument is going to be a volume mapping.
  • host_directory = This can be any directory on your host. Typically, you will create a directory specifically for whatever container image you run.
  • container_directory = This is the directory that will exist inside of the container. If it doesn’t exist, it will be created at runtime.

The important thing to remember is that whenever you perform any mapping in Docker, the order will always be host:container.
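
A concrete example, using a hypothetical ‘~/ubuntu-data’ directory on the host:

docker run -it -v ~/ubuntu-data:/data ubuntu

Anything written to /data inside the container ends up in ~/ubuntu-data on the host and survives after the container is gone.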

The last important topic to get you started is the concept of ports. By default, your container runs inside its own virtual network called a network namespace. If I host an Nginx web server and try to access it via localhost, nothing will happen. That’s because the port needs to be forwarded to the host.

This sounds confusing but getting it to work is extremely simple.

‘docker run -p host_port:container_port nginx’

breakdown:

  • -p = This stands for ‘publish’ in the documentation, but most people remember it as ‘port’.
  • host_port = The network port on the host that you will map to a port on the container.
  • container_port = The port that the application inside the container listens on.

A real example would be something like this:

‘docker run -p 8080:80 nginx’

Port 8080 on my host maps to port 80 on the container. Whenever I access localhost:8080 it will translate to 80 on the container. This is extremely useful because you can have multiple nginx instances that are mapped to different ports on your host machine.
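
For example, two independent NGINX containers can sit behind different host ports:

docker run -d -p 8080:80 nginx   # first instance, reachable at localhost:8080
docker run -d -p 8081:80 nginx   # second instance, reachable at localhost:8081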

Building a Custom Image

NOTE: Since Docker is the most commonly used option, these tutorials will be focusing on the Docker ecosystem. Podman commands are 1-to-1 compatible with Docker’s and can be used as well.

The beauty of containerization is that you can infinitely build and expand upon images, adding your own files, configurations, or packages. This allows you to create a custom image with your desired changes, and once you’ve tested it, you can be confident that it will run as intended.

To start creating images, begin with a base image. This can be any image available in a registry, whether it’s public or private. On top of the base image, build instructions are added to further extend the image with your desired changes and configurations.

To create a custom image, it’s important to have an understanding of how filesystems work within the context of containerization.

[Figure: the layered structure of a container image. Source: Packt Publishing]

Technologies like Podman and Docker rely on a layered approach to their images. The lowest layer is the base image, and anything you add to it will create its own separate layer. This aspect is crucial to consider when creating custom images. Each command you execute in your Dockerfile will add an entire filesystem layer on top of the previous changes.
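
As a minimal sketch (the ‘api’ directory here is hypothetical), assume a Dockerfile like this:

FROM ubuntu
COPY api /app/api
RUN apt update
RUN apt upgrade -y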

In this Dockerfile, there are two ‘RUN’ commands. Once the first one completes, a new filesystem layer is created, and the second one, which upgrades my packages, adds another layer on top of it. Together with the base image and the ‘COPY’, that makes four layers in total.

This is not ideal. The aim is to have fewer layers to keep the size of the image as lean as possible.

A quick optimization would be to combine the two ‘RUN’ commands into one.
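
Using the same sketch, that would look like:

FROM ubuntu
COPY api /app/api
RUN apt update && apt upgrade -y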

The previously four-layered image is now three layers. On its own that isn’t a huge difference, but small optimizations like this add up when it comes to building images.

To build your image, go into the directory where your Dockerfile is located and run:

‘docker build -t <image name> .’
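
For example, with a hypothetical image name:

docker build -t my-custom-api .

The trailing ‘.’ tells Docker to use the current directory as the build context.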

One thing to note is that images can take time to build. It’s important to utilize Docker’s and Podman’s built-in caching solutions.

If you copy the hypothetical ‘api’ directory at the very beginning of the Dockerfile, as in the sketch above, and then change a file in that directory, every layer after the ‘COPY’ is invalidated and the ‘RUN’ commands will have to run again.

If you specify the ‘COPY’ command further down in the Dockerfile, the previous build instructions will be cached and the build process will be significantly shortened.
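
Continuing the sketch from above, moving the ‘COPY’ below the ‘RUN’ instruction means a change to ‘api’ only invalidates the final layer, and the package layers come straight from the cache:

FROM ubuntu
RUN apt update && apt upgrade -y
COPY api /app/api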

Another thing to consider is interactive programs. Running a command such as ‘apt upgrade’ will prompt you for confirmation. If you don’t set ‘-y’ to auto-approve the upgrade, your build will fail.

Closing

A lot of these concepts may seem novel and foreign, but as with anything else, familiarity comes with time. Simply practicing and playing around will help solidify these concepts.


Timothy Rudenko

Former Red Hat SRE and Python instructor. I write about Programming, Linux, and Workflow. Learn technology in a foundational and reliable way.