Tainting and Labeling Kubernetes Nodes to Run Special Workloads: A quick guide that is finally NOT confusing

All right folks, I intend to keep this one short, and that’s what I will do. It’s supposed to be easy, but the official documentation (1, 2) makes it unnecessarily confusing, so maybe I can help fill in the gap.

I will use one of our business requirements at Buffer as the example for this post.

Quick recap

So, we need a few nodes that are dedicated to running cronjobs, and nothing else. At the same time, we want to make sure the cronjobs are scheduled to these nodes, and nowhere else. This means we need 2 things:

  • Tainted nodes that don’t take other workloads
  • The workload that only goes to the destination nodes
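To see why both pieces are needed, here is a minimal Python sketch of the two checks. It is a simplified model for illustration, not the real scheduler: the dict shapes, the function name can_schedule and the general-purpose instance group label "nodes" are all made up for this example.

```python
# Simplified model of the two scheduling checks; NOT the real scheduler.
# Taints and tolerations are modelled as (key, value, effect) tuples.
DEDICATED_TAINT = ("dedicated", "frequent-cronjob-nodes", "NoSchedule")
IG_LABEL = {"kops.k8s.io/instancegroup": "frequent-cronjob-nodes"}


def can_schedule(pod, node):
    """Return True if the pod may be placed on the node in this model."""
    # Check 1: every NoSchedule taint on the node must be tolerated by the pod.
    untolerated = [
        taint for taint in node["taints"]
        if taint[2] == "NoSchedule" and taint not in pod["tolerations"]
    ]
    # Check 2: every nodeSelector entry must be present in the node's labels.
    selected = all(
        node["labels"].get(key) == value
        for key, value in pod["nodeSelector"].items()
    )
    return not untolerated and selected


cron_node = {"taints": [DEDICATED_TAINT], "labels": dict(IG_LABEL)}
# Hypothetical general-purpose node: no taints, a different instance group label.
general_node = {"taints": [], "labels": {"kops.k8s.io/instancegroup": "nodes"}}

cron_pod = {"tolerations": [DEDICATED_TAINT], "nodeSelector": dict(IG_LABEL)}
other_pod = {"tolerations": [], "nodeSelector": {}}
toleration_only_pod = {"tolerations": [DEDICATED_TAINT], "nodeSelector": {}}

for name, pod in [("cron_pod", cron_pod),
                  ("other_pod", other_pod),
                  ("toleration_only_pod", toleration_only_pod)]:
    print(name, can_schedule(pod, cron_node), can_schedule(pod, general_node))
```

The last pod is the interesting one: a toleration only allows a pod onto the tainted nodes, it does not keep the pod away from every other node. That is the job of the node selector.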

Now, let’s start with the nodes, then move on to the workload.

Nodes

Since the requirement breaks down into 2 parts (see above), there are 2 things we need to specify for the nodes. As always, kops is my weapon of choice.

In kops, you can do this by running kops edit ig <INSTANCE GROUP IN INTEREST> and editing the instance group spec to look like this:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: steven.k8s.com
  name: frequent-cronjob-nodes
spec:
  image: kope.io/k8s-1.13-debian-stretch
  machineType: m4.xlarge
  maxSize: 2
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: frequent-cronjob-nodes
  role: Node
  subnets:
  - us-east-1b
  - us-east-1c
  taints:
  - dedicated=frequent-cronjob-nodes:NoSchedule

Tainting nodes

This prevents other workloads from being scheduled to these nodes. It’s achieved by these 2 lines:

taints: 
- dedicated=frequent-cronjob-nodes:NoSchedule

Labeling nodes

This helps a specialized workload locate the nodes. It’s achieved by these 2 lines:

nodeLabels:
  kops.k8s.io/instancegroup: frequent-cronjob-nodes

I know there are people out there who don’t use kops. If you are one of them, here are 2 commands to help:

kubectl taint nodes <NODE IN INTEREST> dedicated=frequent-cronjob-nodes:NoSchedule

kubectl label nodes <NODE IN INTEREST> kops.k8s.io/instancegroup=frequent-cronjob-nodes

Workload

As with the nodes, we need to do 2 things in the deployment/cronjob YAML file. I’m including a complete YAML to save our eyes from this (yeah, you know what I’m talking about).

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  namespace: dev
  name: steven-cron
  labels:
    app: steven-cron
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            kops.k8s.io/instancegroup: frequent-cronjob-nodes
          tolerations:
          - key: dedicated
            value: frequent-cronjob-nodes
            operator: "Equal"
            effect: NoSchedule
          containers:
          - name: steven-cron
            image: buffer/steven-cron
            command: ["php", "./src/Crons/index.php"]
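          # Job pods must set restartPolicy to OnFailure or Never; OnFailure is an assumption here
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: buffer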

Tolerating taints

This makes sure the workload can be scheduled to the tainted nodes. It’s achieved by these lines:

tolerations: 
- key: dedicated   
  value: frequent-cronjob-nodes   
  operator: "Equal"   
  effect: NoSchedule
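If you are wondering what "tolerating" means exactly, here is a small Python sketch of the matching rules as I understand them from the Kubernetes docs: the key and effect have to match, and with operator "Equal" the value has to match too, while "Exists" ignores the value. The class and function names are made up for this illustration and are not a Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(tol, taint):
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        # Value is ignored; an empty key with Exists matches every taint.
        return True
    # Operator "Equal": key and value must both be identical.
    return tol.key == taint.key and tol.value == taint.value


def untolerated_taints(taints, tolerations):
    """Taints that no toleration in the list matches."""
    return [t for t in taints
            if not any(tolerates(tol, t) for tol in tolerations)]


node_taints = [Taint("dedicated", "frequent-cronjob-nodes", "NoSchedule")]
cron_toleration = Toleration(key="dedicated", operator="Equal",
                             value="frequent-cronjob-nodes", effect="NoSchedule")
wrong_value = Toleration(key="dedicated", operator="Equal",
                         value="something-else", effect="NoSchedule")
exists_toleration = Toleration(key="dedicated", operator="Exists")

print(untolerated_taints(node_taints, [cron_toleration]))  # nothing left over
print(untolerated_taints(node_taints, [wrong_value]))      # the taint still blocks
```

In this post I stick with "Equal" so the toleration names the exact node group it is meant for.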

Specifying destination nodes

This makes sure the workload is scheduled only to the specified nodes. It’s achieved by these 2 lines:

nodeSelector:   
  kops.k8s.io/instancegroup: frequent-cronjob-nodes
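One thing worth spelling out: a node selector is an exact match on every key/value pair it lists, and extra labels on the node don't matter. A tiny Python sketch of that rule follows; the function name and the sample hostname label value are hypothetical.

```python
def matches_node_selector(node_labels, node_selector):
    """True if every selector key/value pair is present on the node, exactly."""
    return all(node_labels.get(key) == value
               for key, value in node_selector.items())


selector = {"kops.k8s.io/instancegroup": "frequent-cronjob-nodes"}

# Extra labels on the node are fine.
labeled_node = {
    "kops.k8s.io/instancegroup": "frequent-cronjob-nodes",
    "kubernetes.io/hostname": "node-a",  # hypothetical value
}
# A label value that is off by one character does not match.
misspelled_node = {"kops.k8s.io/instancegroup": "frequent-cronjob-node"}

print(matches_node_selector(labeled_node, selector))
print(matches_node_selector(misspelled_node, selector))
```

So if your pods sit in Pending forever, a typo in the label value on either side is a good first thing to check.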

Profit

This is it. We can now rest assured that the right workload will go to the right nodes. With this approach, we can start building specialized node groups for specialized workloads, say GPU nodes for machine learning or memory-intensive nodes for local caching.

I hope this helps. Until next time, please feel free to hit me up on Twitter should you have any questions.