On this page: You need to understand your environment before you can try to secure it. We'll start with where your clusters are and their size, which turns out to be surprisingly complicated.
Every company’s use of Kubernetes, business goals, and risk profile is different, so it’s essential that you first understand your environment.
Otherwise, you risk not knowing what’s live in your environment, or spending a lot of time on hardening steps that don’t meaningfully reduce risk to your company.
In this section of the guide, we define the impact your Kubernetes clusters have on your organization and five key aspects to understanding your environment:
- Where are the clusters?
- How “big” are the clusters?
- How are you deploying Kubernetes?
- What’s running in your cluster?
- What’s running next to your cluster?
Defining the Impact
Taking the simplified equation:
IMPACT * EXPLOITABILITY = RISK
We’ll work through how to establish the impact that your clusters have on your organization. That is, if there was a compromise, what areas of the business would be affected?
1. Where are the Clusters?
Are your clusters deployed as EC2 instances within AWS, or on-prem in your data center? These days, it’s also smart to ask whether there’s a hybrid approach, where some components run in a cloud and some on-prem. This step is as much about taking inventory of the services as it is about identifying the hosting platforms.
Cloud providers all make it easy to report on your services. Logging into GCP, for example, will give you a list of all your clusters in a nice interface, or you can collect the list like so:
> gcloud container clusters list
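GCP isn’t the only provider with a one-line inventory command. As a sketch (assuming the AWS and Azure CLIs are installed and authenticated; the region below is just an example), the equivalents for the other major managed offerings look like this:

```shell
# List GKE clusters (as above)
gcloud container clusters list

# List EKS clusters in one region (us-east-1 is an example;
# repeat for each region you operate in)
aws eks list-clusters --region us-east-1

# List AKS clusters across the subscription
az aks list --output table
```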
On-prem services may be harder to track down. If the organization deploys clusters through deployment tools or a continuous delivery system, they might show up in the source code. Consider hunting through your organization’s source code for strings like kubernetes; this can at least hint at where Kubernetes is being used.
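A rough first pass might look like the following sketch (the path is a placeholder, and the extra patterns like helm and kustomize are assumptions you’d adjust for your org’s tooling):

```shell
# Case-insensitively list files that mention Kubernetes-adjacent tooling.
# /path/to/repos is a placeholder for wherever your org's code lives.
grep -rilE 'kubernetes|kubectl|helm|kustomize' /path/to/repos
```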
2. How “Big” Are The Clusters?
Imagine this scenario: your manager says that they want to know the scale of their Kubernetes footprint within the organization and you’ll need to provide the numbers at the next all-hands call.
A reasonable request, but how do you measure the size or complexity of a Kubernetes cluster?
Things that Don’t Work
- You can’t say “We have X clusters,” because each cluster can have any number of Nodes/servers in it.
- You can’t say “We have X Nodes today,” without knowing if Nodes are going to autoscale up or down depending on the workload.
- You can’t say “We have X processes running,” because many processes don’t necessarily represent a workload, and some execute for only a few minutes at a time.
Determining the “size” of a cluster is like measuring the balloons at a kids’ birthday party. Do you care about the total number of balloons, or do you care more about whether each kid gets enough to be happy at the party? And just because you have 1,000 balloons doesn’t mean that you need to maintain the space for that many kids before and after the party.
If this were a monolith, you’d have one gigantic, dusty black balloon eclipsing the sun, looming over crying children… But measurement would be much easier.
Then how should you answer the size question?
Measuring By Complexity
“I keep teetering between ‘Kubernetes is a brilliant abstraction that removes so much work’ and ‘Kubernetes is bloated garbage that adds ridiculous complexity to otherwise simple tasks.’”
Jon Barber 🤖 (@BonJarber), June 12, 2020
There are a few ways to represent the “size” of a Kubernetes cluster that are helpful from a security perspective.
I think the most useful one is to measure by complexity.
The number of Nodes isn’t as important as the number of NodePools – usually associated with a single set of workloads. The number of Pods isn’t as important as the number of objects like Deployments – sets of Pods that are configured to automatically scale and rebuild themselves.
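To make that concrete, here’s a sketch of counting those higher-level objects with kubectl (an assumption: the node-pool label shown is GKE’s cloud.google.com/gke-nodepool, so substitute your platform’s equivalent, such as eks.amazonaws.com/nodegroup on EKS):

```shell
# Count distinct node pools via the platform's node label
# (GKE's label shown; assumes your nodes carry it)
kubectl get nodes -L cloud.google.com/gke-nodepool --no-headers \
  | awk '{print $NF}' | sort -u | wc -l

# Count Deployments across all namespaces
kubectl get deployments -A --no-headers | wc -l
```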
Here are other important objects that help you figure out the scale:
- Affects resources: Daemonsets, Services, Jobs
- Affects complexity: Ingress, StatefulSets, ConfigMaps, Operators
Here’s a one-liner that uses kubectl to dump a Kubernetes cluster’s objects and count them:
$ kubectl get all -A -o json | grep "\"kind\": \"[A-Z]" | wc -l
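That gives you a single total. A small variation (same kubectl access assumed; note the wrapping List object gets counted too) breaks the count down per kind, which is often more useful for gauging complexity:

```shell
# Group the dumped objects by kind instead of producing one total
kubectl get all -A -o json \
  | grep -o '"kind": "[A-Za-z]*"' \
  | sort | uniq -c | sort -rn
```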
Measuring by Cost
The other common way to measure size is with historical data about compute usage.
In some ways, this will give you a more accurate picture of the overall costs when you present the information to the higher-ups, but it doesn’t tell your security team anything about the amount of effort that’s going to be necessary to take control of the environments.
If you want help analyzing your cluster from a cost perspective, I can recommend Kubecost. It’s free for a single cluster, and you can look at the source code to run some of the metrics yourself.