PaperLB

Adil H
Feb 21, 2023

This blog has moved to https://didil.substack.com/

A Kubernetes Network Load Balancer Implementation

Introduction

Edit (2023-02-24): The initial version of this article / library used service annotations for configuration. This was changed after I received helpful feedback suggesting a CRD for configuration instead.

You might have noticed that vanilla Kubernetes does not come with a Load Balancer implementation. If you create a LoadBalancer Service in a self-hosted cluster setup, its status will remain “pending” and it won’t show an external IP that you can use to access the service. It should look something like this:

$ kubectl get services
NAME                       TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
k8s-pod-info-api-service   LoadBalancer   10.43.12.233   <pending>     5000:31767/TCP   6s

On the other hand, when you create a LoadBalancer Service in a managed Kubernetes cluster such as GCP GKE or AWS EKS, the service receives an External IP from a Load Balancer assigned by the cloud provider. If you’re curious how that works, you can check the Kubernetes documentation about the Cloud Controller Manager.

The idea behind PaperLB is to allow “LoadBalancer” type Kubernetes services to work with external network load balancers in any environment. PaperLB allows you to use an external L4 Load Balancer of your choice (an nginx server for example) in front of your Kubernetes cluster services. It should work on development clusters running locally as well as on cloud virtual machines or bare metal.

Figure: PaperLB high-level architecture

How Does It Work?

PaperLB is implemented as a Kubernetes Operator and includes:

  • Custom Resource Definitions
  • Kubernetes Controllers that manage the Custom Resources and interact with your Load Balancer.

The main idea is:

  • You create a Kubernetes LoadBalancer type service and a LoadBalancerConfig configuration object.
  • A controller notices the Service and LoadBalancerConfig and creates a “LoadBalancer” object.
  • A controller notices the “LoadBalancer” object and updates your network Load Balancer via an HTTP request using the config data + the service/nodes info.

Code Repositories

While working on this project I created 3 GitHub repositories that support the demo. Please feel free to explore/fork/contribute to these repositories:

  • K8s Pod Info API: A simple JSON API that returns information about the Kubernetes Pod where it’s running. This will allow us to see that the load balancing is effectively happening.
  • NGINX Load Balancer Updater API: Allows updating an Nginx L4 Load Balancer setup through a JSON API. This is the component that will receive the updates from the PaperLB Operator and that is responsible for updating the Network Load Balancer configuration.
  • PaperLB: The Kubernetes Operator implementation.

Load Balancer Updater API

The NGINX Load Balancer Updater API implementation I have built works for Nginx, but this is just an example of an updater implementation. Updaters can be built for other Network Load Balancers as long as you can think of a programmatic update mechanism. This diagram shows how the Nginx implementation works:

Figure: Load Balancer Updater API

  • PaperLB sends the updates via HTTP requests to the NGINX Load Balancer Updater API.
  • The NGINX Load Balancer Updater API updates the Nginx config files on disk.
  • The Nginx Config Watcher notices the config file changes.
  • The Nginx server is reloaded.
  • The new load balancing rules are applied.

PaperLB operator implementation

The operator was built using Kubebuilder. Here is how it works:

When a service of type “LoadBalancer” is created and a “LoadBalancerConfig” Custom Resource exists, the Service Controller creates a LoadBalancer custom resource in the cluster.

Service:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: k8s-pod-info-api
  name: k8s-pod-info-api-service
  # optional annotation to use a config other than the default config
  # annotations:
  #   lb.paperlb.com/config-name: "my-special-config"
spec:
  ports:
    - port: 5000
      protocol: TCP
      targetPort: 4000
  selector:
    app: k8s-pod-info-api
  type: LoadBalancer

LoadBalancerConfig:

apiVersion: lb.paperlb.com/v1alpha1
kind: LoadBalancerConfig
metadata:
  name: default-lb-config
  namespace: paperlb-system
spec:
  default: true
  httpUpdaterURL: "http://192.168.64.1:3000/api/v1/lb"
  host: "192.168.64.1"
  portRange:
    low: 8100
    high: 8200
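
The portRange field suggests that the operator picks a listening port between low and high for each LoadBalancer it creates. The article does not describe the allocation strategy, so the following is only a sketch under the assumption that the lowest free port is chosen. PortRange, pickPort and ErrNoFreePort are hypothetical names, not PaperLB's API.

```go
package main

import (
	"errors"
	"fmt"
)

// PortRange mirrors the portRange field of the LoadBalancerConfig.
type PortRange struct {
	Low  int
	High int
}

// ErrNoFreePort is returned when every port in the range is already taken.
var ErrNoFreePort = errors.New("no free port in range")

// pickPort returns the lowest port in [Low, High] that is not marked as used.
// Assumption: PaperLB's real allocation strategy may differ.
func pickPort(r PortRange, used map[int]bool) (int, error) {
	for p := r.Low; p <= r.High; p++ {
		if !used[p] {
			return p, nil
		}
	}
	return 0, ErrNoFreePort
}

func main() {
	r := PortRange{Low: 8100, High: 8200}
	used := map[int]bool{} // no load balancer exists yet

	port, err := pickPort(r, used)
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	fmt.Println("allocated port:", port)
}
```

With an empty set of used ports the sketch picks the low end of the range, which matches the port 8100 seen in the demo.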

Operator logs:

2023-02-21T17:43:24Z    INFO    Creating a Load Balancer        {"controller": "service", "controllerGroup": "", "controllerKind": "Service", "Service": {"name":"k8s-pod-info-api-service","namespace":"default"}, "namespace": "default", "name": "k8s-pod-info-api-service", "reconcileID": "fb694caf-376f-4dc7-bfc0-fd981be7eda7", "LoadBalancer.Name": "k8s-pod-info-api-service"}

LoadBalancer Custom Resource:

apiVersion: lb.paperlb.com/v1alpha1
kind: LoadBalancer
metadata:
  name: k8s-pod-info-api-service
spec:
  configName: default-lb-config
  host: 192.168.64.1
  httpUpdater:
    url: http://192.168.64.1:3000/api/v1/lb
  port: 8100
  protocol: TCP
  targets:
    - host: 192.168.64.6
      port: 31767
    - host: 192.168.64.5
      port: 31767
status:
  phase: PENDING

The IP address for our Load Balancer (“host” in the spec) is “192.168.64.1”. The port set on the targets is the NodePort allocated for the service. The IP addresses “192.168.64.6” and “192.168.64.5” that you can see in the resource are the external IPs of the Kubernetes cluster nodes. The demo runs on a local k3s cluster.
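
The targets list can be pictured as the cross product of the node external IPs and the Service NodePort. Here is a minimal sketch of that step. Node is a simplified stand-in for a Kubernetes node, buildTargets is a hypothetical helper, and skipping nodes that are not ready is an assumption rather than confirmed PaperLB behaviour.

```go
package main

import "fmt"

// Target mirrors one entry of spec.targets in the LoadBalancer resource.
type Target struct {
	Host string
	Port int
}

// Node is a simplified, hypothetical stand-in for a Kubernetes node.
type Node struct {
	Name       string
	ExternalIP string
	Ready      bool
}

// buildTargets pairs the external IP of every ready node with the NodePort.
// Nodes that are not ready or have no external IP are skipped (assumption).
func buildTargets(nodes []Node, nodePort int) []Target {
	targets := make([]Target, 0, len(nodes))
	for _, n := range nodes {
		if !n.Ready || n.ExternalIP == "" {
			continue
		}
		targets = append(targets, Target{Host: n.ExternalIP, Port: nodePort})
	}
	return targets
}

func main() {
	nodes := []Node{
		{Name: "k3s-local-agent-1", ExternalIP: "192.168.64.6", Ready: true},
		{Name: "k3s-local-server", ExternalIP: "192.168.64.5", Ready: true},
	}
	for _, t := range buildTargets(nodes, 31767) {
		fmt.Printf("%s:%d\n", t.Host, t.Port)
	}
}
```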

$ kubectl get nodes -o wide
NAME                STATUS   ROLES                       AGE     VERSION        INTERNAL-IP    EXTERNAL-IP    OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k3s-local-agent-1   Ready    <none>                      4d21h   v1.26.1+k3s1   192.168.64.6   192.168.64.6   Ubuntu 22.04.1 LTS   5.15.0-60-generic   containerd://1.6.15-k3s1
k3s-local-server    Ready    control-plane,etcd,master   4d21h   v1.26.1+k3s1   192.168.64.5   192.168.64.5   Ubuntu 22.04.1 LTS   5.15.0-60-generic   containerd://1.6.15-k3s1

The Load Balancer Controller then notices the new LoadBalancer resource. It uses an HTTP client to notify the NGINX Load Balancer Updater API via the url provided in the config. After the HTTP request succeeds, the Load Balancer Controller updates the LoadBalancer Resource and sets “.status.phase” to “READY”.
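
The notify-then-mark-ready step could look roughly like the sketch below. The JSON payload (UpdateRequest), the use of POST, and the rule that any non-2xx answer keeps the phase at PENDING are assumptions made for illustration, since the real updater API schema is not shown in this article. A local test server stands in for the updater so the example runs without a cluster.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// Target mirrors one entry of spec.targets.
type Target struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// UpdateRequest is a hypothetical payload; the real updater schema may differ.
type UpdateRequest struct {
	BackendName string   `json:"backendName"`
	LBPort      int      `json:"lbPort"`
	LBProtocol  string   `json:"lbProtocol"`
	Targets     []Target `json:"targets"`
}

// nextPhase returns "READY" only when the updater answered with a 2xx status.
func nextPhase(statusCode int, err error) string {
	if err != nil || statusCode < 200 || statusCode > 299 {
		return "PENDING"
	}
	return "READY"
}

// postUpdate sends the payload to the updater URL and returns the status code.
func postUpdate(ctx context.Context, client *http.Client, url string, body UpdateRequest) (int, error) {
	data, err := json.Marshal(body)
	if err != nil {
		return 0, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Local stand-in for the updater API.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	code, err := postUpdate(context.Background(), client, srv.URL, UpdateRequest{
		BackendName: "default_k8s-pod-info-api-service",
		LBPort:      8100,
		LBProtocol:  "TCP",
		Targets:     []Target{{Host: "192.168.64.6", Port: 31767}, {Host: "192.168.64.5", Port: 31767}},
	})
	fmt.Println("phase:", nextPhase(code, err))
}
```

Keeping the phase decision in a small pure function makes it easy to unit test separately from the network call.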

Operator logs:

2023-02-21T17:43:24Z    INFO    Updating load balancer via http updater {"controller": "loadbalancer", "controllerGroup": "lb.paperlb.com", "controllerKind": "LoadBalancer", "LoadBalancer": {"name":"k8s-pod-info-api-service","namespace":"default"}, "namespace": "default", "name": "k8s-pod-info-api-service", "reconcileID": "ea06e903-860a-4e28-bb8a-c7d3945eac63", "oldPhase": "", "newPhase": "READY"}

The Service Controller now notices that the LoadBalancer resource status phase is “READY” and updates the Service resource to set the Load Balancer External IP on it.

Operator logs:

2023-02-21T17:43:24Z    INFO    Adding Load Balancer Host to service    {"controller": "service", "controllerGroup": "", "controllerKind": "Service", "Service": {"name":"k8s-pod-info-api-service","namespace":"default"}, "namespace": "default", "name": "k8s-pod-info-api-service", "reconcileID": "117b697e-227d-486a-b9e1-641e16ce457f", "host": "192.168.64.1"}

Services list:

$ kubectl get services
NAME                       TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)          AGE
k8s-pod-info-api-service   LoadBalancer   10.43.12.233   192.168.64.1   5000:31767/TCP   59m

Let’s now have a look at the Nginx configuration file that was generated:

# /etc/nginx/streams.d/default_k8s-pod-info-api-service.conf
# generated by nginx-lb-updater, do not edit as changes can be overwritten
upstream default_k8s-pod-info-api-service {
    server 192.168.64.6:31767;
    server 192.168.64.5:31767;
}

server {
    listen 8100 ;
    proxy_pass default_k8s-pod-info-api-service;
    proxy_timeout 5s;
    proxy_connect_timeout 2s;
}

This looks like a valid stream configuration for an NGINX L4 Load Balancer listening on TCP port 8100.
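
To illustrate how an updater might produce such a file, here is a minimal sketch that renders a stream config with Go's text/template. It is not the actual nginx-lb-updater code: StreamConfig, upstreamName and render are hypothetical names, the timeout directives are left out for brevity, and the namespace_name naming rule is inferred from the file name above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Target is one upstream server (node IP + NodePort).
type Target struct {
	Host string
	Port int
}

// StreamConfig holds the values needed to render one stream config file.
type StreamConfig struct {
	UpstreamName string
	ListenPort   int
	UDP          bool
	Targets      []Target
}

// streamTmpl renders an upstream block plus a server block.
var streamTmpl = template.Must(template.New("stream").Parse(`upstream {{.UpstreamName}} {
{{- range .Targets}}
    server {{.Host}}:{{.Port}};
{{- end}}
}

server {
    listen {{.ListenPort}}{{if .UDP}} udp{{end}};
    proxy_pass {{.UpstreamName}};
}
`))

// upstreamName joins namespace and service name, as the file name above suggests.
func upstreamName(namespace, name string) string {
	return namespace + "_" + name
}

// render returns the nginx stream configuration as a string.
func render(c StreamConfig) (string, error) {
	var sb strings.Builder
	if err := streamTmpl.Execute(&sb, c); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(StreamConfig{
		UpstreamName: upstreamName("default", "k8s-pod-info-api-service"),
		ListenPort:   8100,
		Targets: []Target{
			{Host: "192.168.64.6", Port: 31767},
			{Host: "192.168.64.5", Port: 31767},
		},
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

Setting UDP to true adds the udp parameter to the listen directive, which is how the nginx stream module distinguishes UDP listeners from TCP ones.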

We can now test if our setup works via a couple of curl commands:

# first request
$ curl -s 192.168.64.1:8100/api/v1/info | jq
{
  "pod": {
    "name": "k8s-pod-info-api-84dc7c9bdd-mz74t",
    "ip": "10.42.0.27",
    "namespace": "default",
    "serviceAccountName": "default"
  },
  "node": {
    "name": "k3s-local-server"
  }
}
# second request
$ curl -s 192.168.64.1:8100/api/v1/info | jq
{
  "pod": {
    "name": "k8s-pod-info-api-84dc7c9bdd-22p6g",
    "ip": "10.42.1.15",
    "namespace": "default",
    "serviceAccountName": "default"
  },
  "node": {
    "name": "k3s-local-agent-1"
  }
}

We can see that our 2 requests have hit different pods, running on different nodes in this case. It works! 🎆 🎊

Notes:

  • The setup described also works for a UDP Load balancer.
  • Turning on a node / shutting down a node triggers a network load balancer update.
  • Deleting the service also triggers a network load balancer update to delete the corresponding config. This works using Kubernetes Finalizers.
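
The finalizer-based cleanup mentioned in the last note follows a common controller pattern: add a finalizer while the object is live, and on deletion run the cleanup before removing the finalizer so Kubernetes can complete the delete. The sketch below models that logic with plain Go types so it runs without a cluster. Object, reconcile and the finalizer name are hypothetical, and a real controller would use the Kubernetes API objects instead.

```go
package main

import "fmt"

// finalizerName is a hypothetical finalizer key, not necessarily PaperLB's.
const finalizerName = "lb.paperlb.com/finalizer"

// Object is a simplified stand-in for a Kubernetes object's metadata.
type Object struct {
	Finalizers []string
	Deleting   bool // stands in for a non-nil metadata.deletionTimestamp
}

// contains reports whether s is present in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// remove returns a copy of list without any occurrence of s.
func remove(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// reconcile applies the finalizer pattern and reports which action it took.
func reconcile(obj *Object, cleanup func() error) (string, error) {
	if !obj.Deleting {
		if !contains(obj.Finalizers, finalizerName) {
			obj.Finalizers = append(obj.Finalizers, finalizerName)
			return "finalizer-added", nil
		}
		return "synced", nil
	}
	if contains(obj.Finalizers, finalizerName) {
		// Remove the external load balancer config before releasing the object.
		if err := cleanup(); err != nil {
			return "cleanup-failed", err
		}
		obj.Finalizers = remove(obj.Finalizers, finalizerName)
		return "finalizer-removed", nil
	}
	return "nothing-to-do", nil
}

func main() {
	obj := &Object{}
	cleanup := func() error {
		fmt.Println("deleting load balancer config")
		return nil
	}

	action, _ := reconcile(obj, cleanup)
	fmt.Println(action)

	obj.Deleting = true // the user deleted the Service
	action, _ = reconcile(obj, cleanup)
	fmt.Println(action)
}
```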

Conclusion

PaperLB aims to provide a simple but extensible mechanism to integrate network load balancers with self-hosted Kubernetes clusters. It’s a very young open source library, but I hope to receive feedback and contributions from others to build it into a more mature solution.

Leave comments below if you have any questions or feedback!
