Intro & Kubernetes Overview

On this page: An introduction to Kubernetes, its components, and what to expect when securing it.

This guide is aimed at those of us who have been tasked, possibly begrudgingly, with analyzing the risk of a Kubernetes cluster, a task we often need to perform before deciding how much effort to invest in securing it.

You never know who’s going to need to understand the basics of Kubernetes one day: security engineers, developers, administrators, firefighters, day care workers… anyone may discover that there’s a cluster in their organization and that someone needs to figure it out.

In my experience, the most common scenario has been larger organizations with teams expertly trained to handle monolithic stacks, suddenly told they need to determine if a new company with more agile development practices presents any risk prior to the acquisition. And that turns into a Friday meeting to discuss what Kubernetes is… is it virtualization? What does it mean that machines are ephemeral?

Goals of this Guide

Whatever your role is in this scenario, we can break it down into 3 steps:

  1. Assess the risk

  2. Identify low hanging fruit

  3. “Secure” it with whatever means necessary

The goal here is to help you prioritize where and how to gather enough information to determine the impact your Kubernetes clusters have on your organization and how likely they are to be exploited, and to build an initial risk assessment that helps prioritize resources for you and your team.

You’ll learn how unintuitive it can be to accurately describe your cluster from an impact perspective. We won’t use terms like “pretty big” or “average” or “kinda pretty.”

And how Kubernetes doesn’t necessarily care about IPs or networks.

Maybe you’ll even pick up some attacker perspectives on what exactly could happen if someone compromised a Kubernetes component. And how to mitigate that threat.

Kubernetes Overview

Kubernetes is a massive open source project that started at Google as a successor to its internal cluster manager, Borg (see this blog post and academic paper for more details). Google turned it over to the open source community, and it has taken on a life of its own, now with more than 3.4 million lines of Go code, at least 3,000 contributors, and over 100 vendors selling a Kubernetes distribution of some sort.

But why is Kubernetes so popular and why did it appear in your environment?

For many, the common answer is related to scalability and multi-cloud support. While a container can run your application in a semi-confined environment, it doesn’t have the ability to natively scale up and down, provide authorization controls, or integrate into other services. It’s just a container.

Kubernetes, as a container orchestrator, does all the heavy lifting of what we were told containers were going to be good at: Upward and downward scalability, inter-container network controls, load balancing, monitoring, etc.

Sysdig has a great blog series on Securing Kubernetes Components that can give you an introduction to how a cluster’s components work together. It won’t apply to every cluster, but it’s good information to help you understand which components do what.

Kubernetes isn’t the only orchestrator (others include Rancher, Nomad, and Mesos) but it has trounced the competition in terms of number of installations, vendor buy-in, and just buzz “wordiness.”

Think of Kubernetes as a platform or operating system, not a network of containers. At the center of the OS is the kernel, which in this case is the Kubernetes API hosted in its control plane.

This API controls all the core services in the cluster, so care must be taken that the necessary authentication and authorization controls are in place, that traffic is protected with TLS, and that the API isn’t exposed to attackers.
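As a sketch of what those controls look like in practice, here are a few real kube-apiserver flags that address each concern. The file paths are illustrative kubeadm-style defaults, and exact flags vary by version and distribution, so treat this as a starting point rather than a checklist:

```shell
# Illustrative kube-apiserver hardening flags (paths are kubeadm-style examples)
kube-apiserver \
  --anonymous-auth=false \
  --authorization-mode=Node,RBAC \
  --client-ca-file=/etc/kubernetes/pki/ca.crt \
  --tls-cert-file=/etc/kubernetes/pki/apiserver.crt \
  --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
```

`--anonymous-auth=false` rejects unauthenticated requests, `--authorization-mode=Node,RBAC` enforces authorization on every call, and the TLS flags provide transport security. Limiting who can reach the API at the network level is a separate, equally important step.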

From there, an OS has a bunch of small components that provide the glue between the larger pieces, such as IPC or storage management. For Kubernetes, this glue is the set of core services that run within the cluster, like:

  • etcd: a distributed key-value store

  • DNS: the cluster’s internal name resolution service (typically CoreDNS)

  • kubelet: the primary “node agent” that runs on each node

  • kube-proxy: the network proxy running on each node

These are all critical components.
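If you already have read access to a cluster, one quick way to see many of these components is with kubectl. This assumes a kubeadm-style cluster where control-plane services run as Pods in the kube-system namespace; on managed offerings that hide the control plane, the output will be much sparser:

```shell
# List control-plane and node components (etcd, DNS, kube-proxy, etc.)
kubectl get pods -n kube-system -o wide

# See the nodes whose kubelets are registered with the API server
kubectl get nodes -o wide
```

This kind of inventory is often the first concrete data point for a risk assessment: it tells you what’s actually running and where.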

Up to this point, if you were using a managed service provider, you can lean into the shared responsibility model because your provider is likely managing these services for you.

GCP’s managed Kubernetes (GKE), for example, doesn’t give you access to the control plane at all; if you need to configure something, you do it via the GCP console.

Finally, your OS is running a set of applications - your workloads.

This is where you really need to start paying attention to the security of the applications. A workload in Kubernetes could be Pods of containers that are running an application in some kind of container runtime. (Historically that was likely Docker; containerd and CRI-O are increasingly common.)

Other Key Kubernetes Terms

  • Node: an individual machine or VM in the cluster

  • Namespace: an abstraction used to support multiple virtual clusters on the same physical cluster

  • Pod: the smallest and simplest Kubernetes object. A Pod represents a set of running containers on your cluster

    For a broader summary of key Kubernetes terms and concepts, see the official glossary docs or this cheat sheet.
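To make those three terms concrete, here is a minimal, hypothetical Pod manifest; the names `team-a` and `web` are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web            # a Pod: the smallest deployable unit
  namespace: team-a    # a Namespace: a virtual partition of the cluster
spec:
  containers:
  - name: app
    image: nginx:1.25  # the container the Pod runs
```

When this manifest is applied, the scheduler assigns the Pod to a Node, and that Node’s kubelet starts the container.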

In some ways, this is easier because you can rely on your understanding of applications or ask developers to weigh in. Where it gets messy is understanding the risks involved with a service mesh: which capabilities are necessary? What happens when a container runs in privileged mode?
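As an example of why this matters: a single field in a Pod spec can undo most of the isolation a container provides. A hedged sketch of what a reviewer should flag:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: risky-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    securityContext:
      privileged: true     # grants broad host capabilities and device access,
                           # roughly root-equivalent on the Node
```

Finding every Pod that requests `privileged: true` (or unnecessary Linux capabilities) is exactly the kind of low hanging fruit this guide is about.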

For a more detailed overview of a Kubernetes cluster, take a look at Kubernetes Concepts.

While Kubernetes has features aplenty, it also has many associated security challenges. Some have been discussed for years, like insecure defaults and the lack of default network controls. Others are more nuanced problems, like Pod Security Policies having some known weaknesses, or what happens when you edit the status of a load balancer in a multi-tenant cluster. Only you and your team will be able to determine to what degree you need to compensate for each problem.
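To illustrate the “lack of default network controls” point: Kubernetes allows all Pod-to-Pod traffic by default, so a common first compensating control is a default-deny NetworkPolicy in each namespace. The namespace name below is a placeholder, and note that NetworkPolicy is only enforced if your network plugin supports it:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a     # hypothetical namespace
spec:
  podSelector: {}       # empty selector matches every Pod in the namespace
  policyTypes:
  - Ingress
  - Egress              # with no allow rules, all traffic is denied by default
```

Teams then layer narrower allow rules on top for the traffic each workload actually needs.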

Eventually you’ll want to know about all of these complexities and insecurities, but we won’t try to enumerate them here.

Instead, let’s talk about what information you’re going to need to determine the overall risk in the context of your environment, be it an agile Fintech start-up or a droning monolithic mega-corporation.