
04: MetalLB Load Balancing

4.1 Introduction

In cloud environments like AWS, Azure, or GCP, when you create a Service of type LoadBalancer, the cloud provider automatically provisions a load balancer, assigns it a public IP, and routes traffic to your Kubernetes nodes. On bare‑metal or hosted servers (like Hetzner), there is no cloud provider to do this for you.

This is exactly the gap that MetalLB fills.

MetalLB is a load‑balancer implementation for bare‑metal Kubernetes clusters. It gives you the same experience as a cloud LoadBalancer:

• You create a Service of type LoadBalancer.
• An external IP is assigned from a pool you define.
• Traffic arriving on that IP is routed to the correct Kubernetes Service.

In LocalCloudLab, MetalLB is a critical piece because:

• Envoy Gateway will be exposed via a LoadBalancer Service.
• HTTPS (port 443) and HTTP (port 80) must be reachable from the internet.
• You may want additional external services (e.g., Grafana, Seq) reachable from outside.

This section explains how MetalLB works, how to choose IP ranges, how to install it with Helm (using non‑deprecated charts), and how to verify and troubleshoot it.

4.2 How Load Balancing Works on Bare‑Metal

On cloud platforms, the provider controls routers, firewalls, and public IP pools. In your single‑server LocalCloudLab setup, the network is much simpler:

• Your provider gives your server 1 public IP.
• The server is connected to a Layer 2 network (a switch).
• Other IPs in the same subnet can be claimed by your server via ARP announcements.

MetalLB exploits this property of Layer 2 networks: it claims ownership of additional IP addresses by answering ARP requests on the local network, effectively telling the router and its neighbors:

“If someone looks for IP 172.18.255.200, send the packets to me.”

This is done via ARP (Address Resolution Protocol) for IPv4, or NDP for IPv6. Once traffic arrives at your server, the Linux kernel and kube‑proxy handle forwarding to the right Service and Pods inside Kubernetes.

High‑level flow:

[Internet] → [Your provider router] → [Server NIC] → [MetalLB‑owned IP] → [Service] → [Pod]

MetalLB provides the “extra IP” illusion that cloud load balancers normally provide automatically.
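The announcement model above can be sketched in a few lines of Python. This is a conceptual illustration only, not MetalLB's actual election code: the key idea is that exactly one healthy node answers ARP for each external IP, and if that node disappears, the remaining speakers converge on a new announcer.

```python
import zlib

# Conceptual sketch of Layer 2 mode (NOT MetalLB's real implementation):
# for each external IP, exactly one healthy node answers ARP; if that
# node fails, the survivors agree on a replacement.

def choose_announcer(external_ip: str, healthy_nodes: list[str]) -> str:
    """Deterministically pick one node to answer ARP for this IP."""
    if not healthy_nodes:
        raise RuntimeError(f"no healthy node can announce {external_ip}")
    ranked = sorted(healthy_nodes)
    # A stable hash means every node computes the same answer
    # without extra coordination.
    return ranked[zlib.crc32(external_ip.encode()) % len(ranked)]

nodes = ["node-a", "node-b", "node-c"]
owner = choose_announcer("172.18.255.200", nodes)
print(owner)
# Failover: drop the current owner and re-elect among the survivors.
survivor = choose_announcer("172.18.255.200", [n for n in nodes if n != owner])
print(survivor)
```

On a single-server LocalCloudLab cluster the “election” is trivial: the one node always announces every IP, and failover never comes into play.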

4.3 Choosing a MetalLB IP Range

Before installing MetalLB, you must choose a pool of IP addresses that MetalLB is allowed to use. This is one of the most important decisions.

Criteria for picking the range:

1. The IPs must be in the same Layer 2 network as your server.
2. The IPs must NOT be already in use.
3. The range must NOT conflict with DHCP ranges.
4. The provider must route that subnet to your server or allow ARP/neighbor advertisement.

On many hosted environments (like Hetzner):

• Your server may have a main internal IP, e.g. 172.18.0.5
• You can safely choose a small pool at the high end of that subnet:
      172.18.255.200 – 172.18.255.220

You must confirm in your hosting panel or documentation which private range is attached to your server, then pick a small, unused part of that range.

Important guidelines:

• Use a small range (10–20 IPs) for easier management.
• Reserve specific IPs on paper for specific purposes (e.g., 1 for Envoy Gateway, 1 for Grafana).
• Do not pick 127.0.0.0/8, 10.0.2.0/24 (used by some hypervisors), or random public ranges.

Example pool we’ll use throughout this guide (adapt to your real network):

Address range: 172.18.255.200–172.18.255.220
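The criteria above can be sanity-checked programmatically before you commit to a pool. A minimal Python sketch using the standard library, assuming the server sits in a 172.18.0.0/16 network as in this guide's examples (substitute your real subnet and node IP):

```python
import ipaddress

# Example values from this guide; replace with your real network.
subnet = ipaddress.ip_network("172.18.0.0/16")       # assumed /16 L2 subnet
node_ip = ipaddress.ip_address("172.18.0.5")         # the server's own IP
pool_start = ipaddress.ip_address("172.18.255.200")
pool_end = ipaddress.ip_address("172.18.255.220")

pool = [ipaddress.ip_address(i)
        for i in range(int(pool_start), int(pool_end) + 1)]

assert all(ip in subnet for ip in pool), "pool must lie inside the L2 subnet"
assert node_ip not in pool, "pool must not include the node's own address"
print(f"{len(pool)} addresses available for MetalLB")  # 21 addresses available for MetalLB
```

This catches the two most common mistakes early: a range that falls outside the server's Layer 2 subnet, and a range that collides with the node's own address.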

4.4 Installing MetalLB with Helm (Non‑Deprecated)

MetalLB used to be configured via a ConfigMap; newer versions use CRDs like IPAddressPool and L2Advertisement. Older Helm charts or YAMLs you might find online are often deprecated. We will use the current recommended Helm installation flow.

Prerequisites:

• k3s cluster running and Ready.
• kubectl configured and working.
• Helm installed on the server (or on your Windows dev machine, using kubeconfig).

If Helm is not installed on the server:

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

4.4.1 Create Namespace for MetalLB

kubectl create namespace metallb-system

4.4.2 Add the Official MetalLB Helm Repository

helm repo add metallb https://metallb.github.io/metallb
helm repo update

4.4.3 Install MetalLB

helm install metallb metallb/metallb -n metallb-system

This will:

• Install the MetalLB controller.
• Install the MetalLB speaker DaemonSet (runs on every node).
• Create the required CRDs (IPAddressPool, L2Advertisement, etc.).

Check that everything is running:

kubectl get pods -n metallb-system

Expected pods:

• metallb-controller-xxxxx
• metallb-speaker-xxxxx (DaemonSet, 1 per node)

4.5 Configuring MetalLB in Layer 2 Mode

MetalLB supports two operating modes:

• Layer 2 (ARP / NDP)
• BGP

For LocalCloudLab, Layer 2 mode is simpler and works perfectly if your provider allows ARP announcements (most do for a private subnet attached to your server).

Newer versions of MetalLB use CRDs rather than a ConfigMap. You must define at least:

• An IPAddressPool resource.
• An L2Advertisement resource that references that pool.

4.5.1 IPAddressPool Configuration

Create a file named metallb-ip-pool.yaml (on your dev machine Git repo, under e.g. k8s/metallb/):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: localcloudlab-pool
  namespace: metallb-system
spec:
  addresses:
    - 172.18.255.200-172.18.255.220

Apply it:

kubectl apply -f k8s/metallb/metallb-ip-pool.yaml

Make sure you adjust the addresses field to match your real network.

4.5.2 L2Advertisement Configuration

Create metallb-l2advertisement.yaml:

apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: localcloudlab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - localcloudlab-pool

Apply:

kubectl apply -f k8s/metallb/metallb-l2advertisement.yaml

MetalLB is now ready to hand out IPs from 172.18.255.200–172.18.255.220 to any Service of type LoadBalancer in the cluster.
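Conceptually, the controller now acts like a small first-fit allocator over that pool: each new LoadBalancer Service gets the first free address, and a Service stays pending when the pool is exhausted. A toy Python sketch of the idea (illustration only, not MetalLB's code):

```python
import ipaddress

# Toy first-fit allocator mirroring what the MetalLB controller does
# for each new LoadBalancer Service (illustration only).
POOL = [str(ipaddress.ip_address(i))
        for i in range(int(ipaddress.ip_address("172.18.255.200")),
                       int(ipaddress.ip_address("172.18.255.220")) + 1)]

assigned: dict[str, str] = {}   # Service name -> external IP

def allocate(service: str):
    """Hand out the first free pool IP, or None (Service stays <pending>)."""
    if service in assigned:
        return assigned[service]   # idempotent for an already-known Service
    free = [ip for ip in POOL if ip not in assigned.values()]
    if not free:
        return None
    assigned[service] = free[0]
    return free[0]

print(allocate("envoy-gateway"))   # 172.18.255.200
print(allocate("grafana"))         # 172.18.255.201
```

This is also why a small, documented pool is enough for LocalCloudLab: each externally exposed Service consumes exactly one address.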

4.6 Validating MetalLB Behavior

Now we confirm that MetalLB actually assigns IPs and forwards traffic correctly.

4.6.1 Deploy a Test LoadBalancer Service

Create a basic nginx deployment and service:

kubectl create namespace lb-test

kubectl create deployment nginx-test --image=nginx -n lb-test

kubectl expose deployment nginx-test --port=80 --type=LoadBalancer -n lb-test

Now check the Service:

kubectl get svc -n lb-test

You should see something like:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE
nginx-test   LoadBalancer   10.43.120.45   172.18.255.200    80:31544/TCP   1m

Key points:

• EXTERNAL-IP is in the pool you configured.
• It may take a few seconds for the IP to appear.

4.6.2 Test from Inside the Cluster

kubectl run curl-test --image=radial/busyboxplus:curl -it --rm -n lb-test -- curl http://nginx-test.lb-test.svc.cluster.local

You should receive the default nginx HTML page.

4.6.3 Test from Outside (Your PC or Any External Client)

From your Windows machine:

curl http://172.18.255.200

Or open the IP in a browser. You should see the nginx welcome page.

If this works:

• MetalLB pool is configured properly.
• ARP is working on the provider network.
• Traffic is successfully routed into the cluster.

4.7 Troubleshooting MetalLB Issues

If MetalLB does not behave as expected, use the following troubleshooting guide.

4.7.1 EXTERNAL-IP Stuck at <pending>

If kubectl get svc shows:

EXTERNAL-IP   <pending>

Possible causes:

• MetalLB not running (check pods in metallb-system).
• IPAddressPool or L2Advertisement misconfigured or missing.
• Wrong namespace used in YAML.
• MetalLB CRDs not installed (old or partial Helm install).

Check:

kubectl get ipaddresspools.metallb.io -n metallb-system
kubectl get l2advertisements.metallb.io -n metallb-system
kubectl describe svc nginx-test -n lb-test

4.7.2 External IP Assigned but Not Reachable

Symptoms:

• EXTERNAL-IP is visible.
• Ping/curl from outside time out.

Possible causes:

• Provider firewall blocks the chosen IP range.
• Server firewall (UFW) drops packets.
• Wrong subnet or non‑routable addresses.

Check from another machine on the same Layer 2 segment (in L2 mode MetalLB answers ARP directly, so the external IP will not appear in ip addr on the node itself):

arping -I <interface> 172.18.255.200
ip neigh | grep 172.18.255.200

ARP must resolve the external IP to the server's MAC address.

4.7.3 Connectivity Works Internally but Not Externally

If curl from inside the cluster works but from outside fails:

• Check provider firewall rules.
• Ensure that port 80/443 is allowed to the server.
• Confirm that the external IP is in the provider’s routed subnet.

4.7.4 MetalLB Speaker Not Running

Run:

kubectl get pods -n metallb-system -o wide

If speaker pods are in CrashLoopBackOff, inspect logs:

kubectl logs -n metallb-system <speaker-pod-name>

Look for:

• Permission errors.
• Interface detection issues.
• Problems binding to ARP/NDP.

4.8 Integrating MetalLB with Envoy Gateway

Now that MetalLB is functioning, it will be the foundation for Envoy Gateway exposure.

4.8.1 Envoy Gateway Service Type

When Envoy Gateway is installed, it is typically exposed through a Service similar to this (exact names and selectors depend on your installation):

kind: Service
apiVersion: v1
metadata:
  name: envoy-gateway
  namespace: envoy-gateway-system
spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443
  selector:
    app.kubernetes.io/name: envoy-gateway

MetalLB will automatically assign an external IP from the pool.

4.8.2 Reserving an IP for the Gateway

It’s a good idea to dedicate one IP to the main Envoy Gateway Service, for example:

172.18.255.200  → Envoy Gateway
172.18.255.201  → Grafana (optional external access)
172.18.255.202  → Seq (optional external access)

MetalLB itself doesn’t “reserve” IPs by name, but you can set an annotation on a Service to request a specific IP:

metadata:
  annotations:
    metallb.universe.tf/loadBalancerIPs: "172.18.255.200"

This tells MetalLB to try to allocate that particular IP (must be inside the pool).
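The constraint behind the annotation is easy to state: the requested address must lie within a configured IPAddressPool, otherwise the Service is left unallocated. A hypothetical Python sketch of that check (MetalLB performs the equivalent validation internally):

```python
import ipaddress

# Sketch of the constraint behind metallb.universe.tf/loadBalancerIPs:
# a requested IP must fall inside a configured pool (illustration only).
def request_allowed(requested: str, pool_start: str, pool_end: str) -> bool:
    ip = int(ipaddress.ip_address(requested))
    return (int(ipaddress.ip_address(pool_start))
            <= ip
            <= int(ipaddress.ip_address(pool_end)))

print(request_allowed("172.18.255.200", "172.18.255.200", "172.18.255.220"))  # True
print(request_allowed("172.18.254.10", "172.18.255.200", "172.18.255.220"))   # False
```

If you request an address outside every pool, the Service behaves exactly like the <pending> case in section 4.7.1, so checking your reserved IPs against the pool boundaries up front saves a debugging round-trip.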

4.8.3 DNS for Gateway and Services

Once Envoy Gateway gets an IP (e.g., 172.18.255.200), you can create DNS records:

search.hershkowitz.co.il → 172.18.255.200
checkin.hershkowitz.co.il → 172.18.255.200

Envoy Gateway will then route based on hostnames and HTTPRoutes.
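The resulting routing model is simple: one external IP, many hostnames, each Host header mapped to a backend Service. A minimal sketch of the idea (the backend names here are hypothetical placeholders, not names from this guide's manifests):

```python
# Sketch of host-based routing, the idea the HTTPRoutes will express:
# one IP, many hostnames, each Host header mapped to a backend.
routes = {
    "search.hershkowitz.co.il": "search-api",     # placeholder backend name
    "checkin.hershkowitz.co.il": "checkin-api",   # placeholder backend name
}

def route(host_header: str) -> str:
    """Return the backend for a Host header, or a default 404 backend."""
    return routes.get(host_header.lower(), "default-404-backend")

print(route("search.hershkowitz.co.il"))   # search-api
print(route("unknown.example.com"))        # default-404-backend
```

Before the DNS records exist, you can exercise the same mapping with curl's --resolve flag, which pins a hostname to an IP without touching DNS, e.g. curl --resolve search.hershkowitz.co.il:80:172.18.255.200 http://search.hershkowitz.co.il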

This is where LocalCloudLab becomes a truly realistic, production-like environment.

4.9 Best Practices for MetalLB in LocalCloudLab

To keep the environment robust and easy to manage, follow these best practices:

4.9.1 Keep the IP Pool Small and Documented

• Use 10–20 IPs.
• Maintain a small table in your documentation:

      172.18.255.200 – Envoy Gateway
      172.18.255.201 – Monitoring (Grafana)
      172.18.255.202 – Seq
      172.18.255.203 – Spare testing IP

4.9.2 Expose Only What You Need

• Don’t expose PostgreSQL, Redis, or RabbitMQ as LoadBalancer.
• Keep data services internal to the cluster.
• Only Envoy (and optional admin tools) should be accessible externally.

4.9.3 Monitor Disk and Network

MetalLB itself is lightweight, but the services behind it can generate:

• High HTTP traffic
• Large logs
• High database load

Use Prometheus and Grafana to monitor:

• HTTP request rates
• Error rates
• Network throughput
• Pod restarts

4.9.4 Keep Config in Git

Store MetalLB manifests in your Git repository:

k8s/
  metallb/
    metallb-ip-pool.yaml
    metallb-l2advertisement.yaml

This aligns with the GitOps philosophy:

• All cluster config in Git.
• Changes go through pull requests.
• Easy to recreate the environment on a new server.

4.9.5 Understand Your Provider’s Network Model

Each provider may have small differences:

• Some require you to attach additional IPs or subnets explicitly.
• Some require static routes or vSwitch/VLAN configuration.

Always verify in your provider dashboard how private/public IPs are routed.


End of Section 04 — MetalLB Load Balancing

At this point, your LocalCloudLab Kubernetes cluster:

• Runs on a hardened Ubuntu server.
• Uses k3s as a lightweight, full Kubernetes distribution.
• Has MetalLB installed and configured in Layer 2 mode.
• Can assign real external IP addresses to Services of type LoadBalancer.
• Is ready for Envoy Gateway to be exposed publicly with HTTP and HTTPS.

In the next section (Section 05 – Envoy Gateway), you will:

• Install Envoy Gateway via Helm.
• Configure a main Gateway resource.
• Add HTTPRoutes for Search API and Checkin API.
• Integrate TLS certificates through cert-manager.
• Verify host-based routing and secure ingress.

This is where your APIs become truly internet‑accessible through a modern, production‑grade gateway layer.
