What is Kubernetes and Why it is so Important for Machine Learning Engineering and MLOps


Source: UnsplashIf you follow up technology trends, data science, artificial intelligence and machine learning, chances are that you have come across with the term Kubernetes (or K8s, its geek nickname).

Created by Google, Kubernetes is an open source container orchestration platform. The name “Kubernetes” comes from the greek language, meaning “helmsman” or “captain”. But before understanding the reason for all these nautical references, we first need to address the question: what are containers?

Why do we need containers

Source: UnsplashContainers exploded in 2013 with the creation of Docker. Their creation was motivated by a longstanding issue in software engineering: how to get software to run reliably when moved from one environment to another. Such environments could range from anything from laptops, bare metal servers sitting in a good old private data center or virtual machines in the cloud. Packaging an application into a container means that the end result of running it across any of these environments will be consistent.

This is specially critical as technology organizations shift away from monolithic architecture patterns, such as service oriented architecture (SOA), to more loosely coupled, microservices based approaches such as service mesh — which in itself are good candidates for another article.

The need for container orchestrators

Containers solve a large part of the issue, but they are not enough, in that they are more of an abstraction. To get a working product, we still need a container runtime — an additional layer to coordinate and allocate proper resources to running containers. To solve that problem, the folks at Docker created containerd, a container runtime.

“Why not use containerd, then?”— you might rightfully ask.Yes, you could simply spin up a virtual machine and use Docker or Docker Compose to run one or more different containers.Well, the challenge here is that as application architectures become more complex, we need to manage things such as:

This is where Kubernetes comes in.

Meet the Helmsman

Source: UnsplashImagine that we have a transportation business in the Port of Rotterdam, and we must routinely ship multiple containers to Port of Southampton, in the United Kingdom. One way to do it would be using a simple, small ship that which is able to carry a couple of containers. It is able to move fast and take these containers from Rotterdam in a seamless fashion. However, it has the following caveats:

If we instead decide to leverage an bigger, full fledged cargo ship to move our containers from Rotterdam to Southampton, we can do it in a much more reliable, scalable way:

In this basic example we can picture a simple container runtime as a the small ship, and Kubernetes as a full fledged cargo ship. It was created by Google with a basic feature set in mind:

The combination of these features also makes it possible to leverage some bonus aspects. Being able to easily replicate application instances also eases the task of deploying application versions without any kind of disruption — which enables for continuous deploymentand more agile practices.

And that’s not all. By leveraging Kubernetes manifests’ declarative approach for creating resources within the cluster, we are paving the way for, amongst other things, enabling self-service architectures where data engineers, data scientists and users in general can provision their own infrastructure with very low (human) effort.

Why Kubernetes Is So Important for Machine Learning and MLOps

Source: UnsplashRecently, companies have started realising that in machine learning, research and development plays a huge part. But if not enough time, budget and effort is spent in properly designing and deploying machine learning systems in a reliable, available, discoverable and efficient manner, it becomes challenging to reap all of its benefits.

Keeping Your Machine Learning Models on the Right Track: Getting Started with MLflow, Part 2
Learn how to use MLflow Model Registry to track, register and deploy Machine Learning Models effectively.mlopshowto.com
Keeping Your Machine Learning Models on the Right Track: Getting Started with MLflow, Part 1
Learn why Model Tracking and MLflow are critical for a successful machine learning projectmlopshowto.com
About MLOps And Why Most Machine Learning Projects Fail — Part 1
If you have been involved with Machine Learning (or you are aiming to be), you might be aware that reaping the…mlopshowto.com
By thinking of machine learning systems with an engineering mindset, it becomes clear that Kubernetes is a good match to achieve the aforementioned reliability, availability and time-to-market challenges. Kubernetes helps adopting the main principles of MLOps, and by doing that we are increasing the chances that our machine learning systems will provide the expected value after all the investment made in the research, development and design steps.

Finally, there is also a secondary aspect which is related to the current MLOps ecosystem which builds upon Kubernetes — with existing tools and frameworks such as Kubeflow, Pachyderm, ArgoandSeldonhaving been created to address common MLOps challenges as examples.

Like with everything in life, it is not all fun and games

Conceptualizing and implementing a Kubernetes architecture does not happen from one day to the other, and managing Kubernetes clusters comes with additional responsibilities, namely in terms of governance, security, support, maintenance and costs —critical components which are also good topics for a next article.