03: Installing Kubernetes (k3s)¶
3.1 Introduction¶
Kubernetes is the control center of LocalCloudLab. It orchestrates containers, manages resources, provides self-healing, and forms the base on which every other component (Envoy Gateway, PostgreSQL, Redis, RabbitMQ, observability stack) depends.
In a typical enterprise environment, Kubernetes clusters are created using tools like kubeadm, managed services (EKS, AKS, GKE), or vendor distributions. For LocalCloudLab, we want something that is:
• Lightweight
• Easy to install and upgrade
• Fully CNCF-compliant
• Resource-efficient on a single server
• Transparent enough to learn from
This is exactly why we choose k3s as the Kubernetes distribution for LocalCloudLab.
In this section, we will:
• Compare k3s with kubeadm and microk8s
• Explain the architecture of k3s
• Install k3s with Traefik disabled
• Manage kubeconfig on both Linux and Windows
• Inspect the underlying container runtime (containerd)
• Understand where k3s stores its files and state
• Verify the installation and troubleshoot common issues
3.2 Why k3s Instead of kubeadm or microk8s¶
There are multiple options to run Kubernetes on a single server. The most common are:
• kubeadm – official Kubernetes bootstrap tool
• microk8s – Canonical’s Kubernetes distribution
• k3s – Rancher’s lightweight Kubernetes
Each has strengths and weaknesses, but for LocalCloudLab our priorities are:
• Low complexity
• Low resource overhead
• High transparency and control
• Good community support
• Easy upgrades
• Production-like behavior
3.2.1 kubeadm (Full control, high complexity)¶
kubeadm is powerful but requires more manual effort:
Pros:
• Full, upstream Kubernetes – no shortcuts.
• Maximum control over every component.
• Ideal for real production clusters with multiple nodes.
Cons for LocalCloudLab:
• Heavier control plane resource usage.
• Requires more configuration (etcd, certificates, networking).
• Overkill for a single-node training and development cluster.
• Longer learning curve for an initial environment.
k3s internally uses the same core Kubernetes concepts but simplifies the outer shell.
3.2.2 microk8s (Canonical’s integrated Kubernetes)¶
microk8s is a snap-based, modular distribution from Canonical.
Pros:
• Easy “snap install microk8s” experience.
• Add-ons for DNS, ingress, storage, etc.
• Good integration with Ubuntu.
Cons:
• Uses snap for packaging, which complicates inspection and debugging.
• Some add-ons can lag behind upstream charts and projects.
• Less commonly used in production compared to k3s, especially for edge/single-node.
microk8s is an excellent option for some use cases but does not align as closely with the “transparent, GitOps-friendly, edge/Dev cluster” philosophy of LocalCloudLab.
3.2.3 Why k3s is Ideal for LocalCloudLab¶
k3s was explicitly designed for environments like LocalCloudLab:
• Lightweight:
Runs the full Kubernetes control plane with a smaller memory and CPU footprint.
• CNCF-certified:
It is a “real” Kubernetes, not a partial reimplementation.
• Embedded datastore:
Uses SQLite by default for single-node setups, removing etcd complexity.
• containerd runtime:
Uses containerd instead of Docker, matching modern Kubernetes standards.
• Simple install & upgrade:
A single install script; upgrades are predictable and well-documented.
• Production-ready:
Widely used in edge computing, IoT, and real workloads.
Because LocalCloudLab is intended to be both a learning platform and a serious technical lab, k3s is the best match: full Kubernetes behavior with minimal friction.
3.3 Understanding the Architecture of k3s¶
Although k3s is lightweight, it is still a fully compliant Kubernetes distribution. It removes unnecessary components, optimizes others, and integrates everything into a single efficient binary while maintaining all core Kubernetes behavior.
Below is a detailed breakdown of each architectural layer.
3.3.1 Control Plane Components in k3s¶
k3s bundles the entire Kubernetes control plane into a compact, optimized package:
• kube-apiserver
• kube-scheduler
• kube-controller-manager
• cloud-controller-manager
• admission controllers
• RBAC and API audit logging
• Authentication + authorization layers
These components behave just like upstream Kubernetes, meaning:
• All official tools work
• All Helm charts behave as expected
• All Kubernetes CRDs work
k3s remains fully compatible with kubectl, Helm, CRDs, GitOps tools, Envoy Gateway, cert-manager, etc.
3.3.2 Datastore Behavior: SQLite vs etcd¶
One of the strongest advantages of k3s is its embedded datastore.
Default (single-node mode):
• SQLite
Option (multi-node mode):
• External etcd or embedded etcd cluster
SQLite benefits for LocalCloudLab:
• No additional processes to manage
• No need to configure HA or clustering
• Very fast on NVMe SSDs
• Perfect for lightweight production-lab scenarios
etcd becomes relevant only when:
• Multi-node HA clusters are needed
• There is a requirement for extremely high write throughput
3.3.3 containerd as the Runtime (Docker support removed)¶
Unlike older Kubernetes installations, Docker is no longer used as the runtime: upstream Kubernetes removed its built-in Docker integration (dockershim) in v1.24.
k3s uses containerd, which is:
• The standard Kubernetes runtime since dockershim was removed in v1.24
• Faster and more lightweight than Docker
• More secure (reduced attack surface)
• Easier to integrate with Kubernetes tooling
• Less resource-heavy
Containerd allows:
• Pulling images
• Running containers
• Logging containers
• Managing namespaces and snapshots
k3s automatically installs and manages containerd internally; it runs as a child process of the k3s service rather than as a separate systemd unit.
3.3.4 Networking Stack (CNI)¶
By default, k3s uses Flannel for networking.
Flannel provides:
• Pod-to-pod networking
• Stable overlay network via VXLAN
• Simple, reliable routing without complexity
Later sections will describe how Envoy Gateway operates above this layer.
3.3.5 The role of Traefik (and why we disable it)¶
By default, k3s installs Traefik as its built-in Ingress controller.
We disable it because:
• We use Envoy Gateway instead
• Running two ingress controllers creates conflicts
• Disabling Traefik frees the RAM and CPU its pods would otherwise consume
• Simplifies traffic routing and troubleshooting
Disabling Traefik is done during install:
INSTALL_K3S_EXEC="--disable traefik"
3.3.6 k3s as a Systemd Service¶
Once installed, k3s runs as:
/etc/systemd/system/k3s.service
It includes:
• Automatic restarts
• Logging via journalctl
• Automatic recovery if the server reboots
• Version-managed upgrades
This service supervises both the control plane and worker node responsibilities.
3.4 Installing k3s (Traefik Disabled)¶
This is the official LocalCloudLab installation command:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
Breakdown:
curl -sfL https://get.k3s.io
→ Downloads the official installer script
INSTALL_K3S_EXEC="--disable traefik"
→ Passes installation arguments to disable Traefik
sh -
→ Executes installer
This performs:
• Installing k3s binaries
• Creating systemd service
• Installing Flannel
• Configuring containerd
• Starting the Kubernetes control plane
3.4.1 Verifying Installation and Cluster Health¶
Check service health:
sudo systemctl status k3s
View logs:
sudo journalctl -u k3s -f
Check Kubernetes node status:
kubectl get nodes -o wide
Expected output:
NAME       STATUS   ROLES                  VERSION
myserver   Ready    control-plane,master   v1.xx.x
3.4.2 Common Installation Problems¶
If kubectl get nodes shows NotReady:
Possible causes:
• Swap is still enabled → disable it completely
• Missing kernel modules → run modprobe overlay and modprobe br_netfilter
• sysctl not applied → run sysctl --system
• Low entropy → install haveged or rng-tools
• Firewall blocking port 6443 → allow it with ufw
• Incorrect DNS configuration
These must be resolved before using the cluster.
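Most of these causes can be checked from the shell before digging deeper. Below is a minimal preflight sketch (our own helper names, not part of k3s) that reads /proc directly; note a module compiled into the kernel will not appear in /proc/modules, so a FAIL there is a hint, not proof:

```shell
#!/usr/bin/env bash
# Preflight checks for the common k3s NotReady causes listed above.

check_swap() {
  # Swap must be disabled: /proc/swaps should contain only its header line.
  if [ "$(tail -n +2 /proc/swaps 2>/dev/null | wc -l)" -eq 0 ]; then
    echo "OK: swap is disabled"
  else
    echo "FAIL: swap is enabled (run: sudo swapoff -a)"
  fi
}

check_module() {
  # Required kernel modules should appear in /proc/modules.
  if grep -q "^$1 " /proc/modules 2>/dev/null; then
    echo "OK: module $1 loaded"
  else
    echo "FAIL: module $1 not listed (try: sudo modprobe $1)"
  fi
}

check_swap
check_module overlay
check_module br_netfilter
```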
3.5 Managing kubeconfig (Linux & Windows)¶
k3s stores its primary kubeconfig at:
/etc/rancher/k3s/k3s.yaml
To allow your Linux user to run kubectl:
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $USER:$USER ~/.kube/config
chmod 600 ~/.kube/config
3.5.1 Using the same kubeconfig from Windows (Dev Machine)¶
Your Windows machine will be used for:
• Writing YAML manifests
• Running kubectl commands
• Running Helm charts
• Managing deployments from VS2022
Steps:
1. Copy the kubeconfig file contents from the server.
2. On Windows, create: %USERPROFILE%\.kube\config
3. Inside the kubeconfig, replace:
server: https://127.0.0.1:6443
with your server’s public IP:
server: https://<public-ip>:6443
Note: the API server certificate must include this IP. If kubectl later fails with a TLS error, re-run the k3s installer with the extra argument --tls-san <public-ip>.
4. Install kubectl on Windows:
choco install kubernetes-cli
5. Test the connection:
kubectl get nodes
If authentication fails, check that:
• The server firewall is not blocking port 6443
• The token inside the kubeconfig was copied completely and unmodified
• The k3s service is running
• Your Windows machine can reach the server over the network
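The server-address edit can also be scripted on the Linux side before copying the file to Windows. A small sketch — the helper name, output filename, and use of sed are our own choices, not part of k3s:

```shell
# rewrite_server FILE IP — write FILE.remote with the apiserver address
# rewritten from the local loopback to IP, then show the result.
rewrite_server() {
  sed "s#https://127\.0\.0\.1:6443#https://$2:6443#" "$1" > "$1.remote"
  grep "server:" "$1.remote"
}

# usage on the server (the IP shown is a documentation placeholder):
#   rewrite_server /etc/rancher/k3s/k3s.yaml 203.0.113.10
```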
3.6 Deep Dive: Understanding containerd in k3s¶
containerd is the default container runtime used by Kubernetes (the built-in Docker integration, dockershim, was removed in v1.24). It is lightweight, efficient, secure, and deeply integrated into k3s.
k3s installs and manages containerd automatically. To master Kubernetes operations, it is important to understand how containerd works under the hood.
3.6.1 containerd Architecture Overview¶
containerd consists of:
• containerd daemon (runs as a system service)
• runc (low-level OCI runtime)
• snapshotter (manages container filesystems)
• content store (stores image layers)
• metadata store (tracks containers, images, tasks)
On a k3s server there is no separate containerd systemd unit: containerd runs as a child process of the k3s service, with its socket at /run/k3s/containerd/containerd.sock. The ctr and crictl symlinks that the k3s installer creates already point at this socket. Verify containerd is running:
ps aux | grep containerd
containerd exposes a gRPC API that Kubernetes uses to:
• pull images
• create containers
• start/stop containers
• log outputs
• manage runtime networking
• track resource usage
3.6.2 containerd Namespaces¶
containerd logically separates workloads by namespaces.
List namespaces:
sudo ctr namespaces list
You will typically see:
• k8s.io → Kubernetes-managed containers
• default → manually created containers (rare in LocalCloudLab)
Each namespace contains:
• Containers
• Images
• Snapshots
• Tasks
3.6.3 Listing Kubernetes Containers (ctr)¶
To inspect running containers:
sudo ctr -n k8s.io containers list
To inspect tasks (running processes):
sudo ctr -n k8s.io tasks list
3.6.4 Viewing container logs¶
Container logs are stored here:
/var/log/pods
/var/log/containers
ctr itself has no log subcommand; to read logs at the runtime level, use the crictl binary that k3s bundles:
sudo crictl logs <container-id>
However, you generally use:
kubectl logs <pod> [-f]
3.6.5 Inspecting Images¶
List images:
sudo ctr -n k8s.io images list
Remove an image:
sudo ctr -n k8s.io images rm <image>
3.6.6 containerd Snapshotter¶
Snapshotters implement the container filesystem.
Default snapshotter for k3s:
overlayfs
Snapshots live at:
/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs
You do not usually modify these manually, but understanding their structure is essential when diagnosing disk usage.
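When diagnosing disk usage here, ranking the largest entries is the usual first step. A sketch — the helper name is ours:

```shell
# largest_snapshots DIR [N] — list the N (default 10) largest entries in DIR by size.
largest_snapshots() {
  du -sh "$1"/* 2>/dev/null | sort -hr | head -n "${2:-10}"
}

# usage on the server:
#   largest_snapshots /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots
```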
3.6.7 Why containerd is better for Kubernetes environments¶
Advantages over Docker:
• Lower CPU and RAM usage
• Faster pod startup time
• No embedded unnecessary extra components (e.g., Docker API proxy)
• Kubernetes interacts directly with containerd via CRI
• More deterministic logging and events
• Fewer attack surfaces
containerd is now the standard container runtime for Kubernetes.
3.7 k3s Filesystem Layout¶
To operate a Kubernetes server effectively, you must know where k3s stores its files. k3s has a predictable and clean directory structure.
Here are the most important paths:
3.7.1 /etc/rancher/k3s¶
Contains:
• k3s.yaml (main kubeconfig)
• token file (for nodes joining cluster)
• TLS certificates for API server
• Basic configuration files
This directory should be included in backups.
3.7.2 /var/lib/rancher/k3s¶
This is the heart of k3s.
It contains:
• SQLite datastore (when using embedded DB)
• containerd state directories
• Flannel CNI configuration
• Pod manifests (static pods)
• Kubelet working directories
Critical subdirectories:
/var/lib/rancher/k3s/agent
/var/lib/rancher/k3s/server
3.7.3 /var/lib/kubelet¶
Kubelet stores:
• Pod volumes
• Mounted secrets/configmaps
• CSI storage contents
• Pod logs (symlinks to /var/log/containers)
This directory is constantly active.
3.7.4 /var/lib/rancher/k3s/agent/containerd¶
Because k3s runs its own embedded containerd, image and snapshot data lives here rather than in the standalone default of /var/lib/containerd.
Stores:
• Container images
• Snapshots
• Metadata
This directory often grows large, since every pulled image and running container consumes snapshot space; monitor it together with persistent-volume data (e.g., Loki logs and PostgreSQL).
3.7.5 /etc/systemd/system/k3s.service¶
The systemd unit that manages the lifecycle of the k3s server:
sudo systemctl status k3s
sudo systemctl restart k3s
sudo systemctl stop k3s
3.7.6 /var/log¶
Logs from:
• containerd
• system services
• kubelet (via journald)
• networking components
Understanding log locations is critical for debugging cluster behavior.
3.8 Verifying Cluster Health¶
After installation, we must confirm Kubernetes is fully functional.
3.8.1 Verify Node Readiness¶
kubectl get nodes -o wide
Expected:
STATUS: Ready
ROLES: control-plane,master
VERSION: v1.xx.x
3.8.2 Inspect CoreDNS¶
kubectl -n kube-system get pods -l k8s-app=kube-dns
If CoreDNS is CrashLooping, the cause is usually:
• Missing networking modules
• CNI initialization failure
• DNS issues in /etc/resolv.conf
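The resolv.conf cause is easy to test for: hosts running systemd-resolved often list a loopback stub (127.0.0.53), which CoreDNS cannot forward to. A sketch, parameterized on the file path (the helper name is ours):

```shell
# check_resolv FILE — warn if FILE lists a loopback nameserver, which
# can break CoreDNS forwarding inside pods.
check_resolv() {
  if grep -Eq '^nameserver (127\.|::1)' "$1"; then
    echo "WARN: loopback nameserver in $1 — point k3s at a real resolver"
    echo "      (e.g. start k3s with --resolv-conf pointing at a clean file)"
  else
    echo "OK: no loopback nameservers in $1"
  fi
}

# usage: check_resolv /etc/resolv.conf
```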
3.8.3 Check System Pods¶
kubectl -n kube-system get pods -o wide
Look for:
• coredns
• local-path-provisioner
• metrics-server (if enabled)
• flannel
• kube-proxy
3.8.4 Run a test deployment¶
kubectl create deployment test --image=nginx
kubectl expose deployment test --port=80 --type=ClusterIP
Test internal connectivity:
kubectl run curl --image=radial/busyboxplus:curl -it --rm -- curl http://test
Clean up afterwards:
kubectl delete service test
kubectl delete deployment test
3.8.5 Check API Server Health¶
kubectl get --raw='/healthz'
On current Kubernetes versions, the more granular endpoints are preferred:
kubectl get --raw='/readyz?verbose'
3.8.6 Check container runtime status¶
sudo ctr -n k8s.io containers list
sudo tail /var/lib/rancher/k3s/agent/containerd/containerd.log
(There is no separate containerd systemd unit in k3s; the embedded containerd logs to this file.)
3.9 Troubleshooting Startup Problems¶
Problem 1 — Node is NotReady¶
Causes:
• Swap enabled
• Missing sysctl settings
• br_netfilter or overlay kernel modules not loaded
• Flannel failed to initialize
• DNS misconfiguration
Problem 2 — CoreDNS CrashLoopBackOff¶
Check:
kubectl logs -n kube-system <coredns-pod>
Typical causes:
• /etc/resolv.conf has invalid DNS servers
• Flannel did not bring up the CNI interface
Problem 3 — containerd keeps restarting¶
Check logs (the embedded containerd has no systemd unit of its own; it logs to a file):
sudo tail -f /var/lib/rancher/k3s/agent/containerd/containerd.log
Possible causes:
• Corrupt image store
• Insufficient disk space
• Snapshotter failure
Problem 4 — kube-apiserver fails to start¶
Check:
sudo journalctl -u k3s -f
Look for:
• Certificate errors
• SQLite lock issues
• Firewall blocking port 6443
Problem 5 — Pod stuck in ContainerCreating¶
Check:
kubectl describe pod <pod>
Typical causes:
• CNI not ready
• Missing network interface
• containerd cannot pull image
3.10 Upgrading k3s Safely¶
One of the major advantages of k3s is its clean and predictable upgrade process. Upgrading a Kubernetes cluster can be dangerous when done incorrectly, but k3s simplifies the workflow while still requiring careful planning.
3.10.1 Checking Current k3s Version¶
k3s --version
kubectl version
(the --short flag was removed in recent kubectl releases; plain kubectl version now prints the concise form)
These commands confirm:
• k3s binary version
• Kubernetes API version
• Client vs. server version differences
3.10.2 Upgrade Channels¶
k3s supports stable and latest channels:
curl -sfL https://get.k3s.io | sh -
→ stable channel
curl -sfL https://get.k3s.io | INSTALL_K3S_CHANNEL=latest sh -
→ latest channel
LocalCloudLab should stick to stable, unless testing new features.
3.10.3 Manual Upgrade Procedure¶
The safest production-like upgrade method:
1. Stop k3s:
sudo systemctl stop k3s
2. Install the new version:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
3. Start k3s:
sudo systemctl start k3s
4. Verify the cluster is ready:
kubectl get nodes
kubectl get pods -A
3.10.4 Automated Upgrades with the system-upgrade-controller¶
Rancher’s system-upgrade-controller can automate k3s upgrades through Plan custom resources. It is a separate component rather than something built into k3s, and it pays off mainly on multi-node clusters, so LocalCloudLab sticks to manual upgrades.
3.10.5 Validating the Upgrade¶
After upgrading:
kubectl get pods -A
kubectl get nodes -o wide
sudo journalctl -u k3s -f
Check for:
• CNI issues
• Module failures
• containerd compatibility
• Deprecated API warnings
3.11 Backing Up k3s Cluster State¶
Even on a single-node environment, backing up Kubernetes state is essential. k3s stores cluster data in a predictable set of locations.
Backups protect against:
• Corrupted datastore
• Failed upgrades
• Accidental deletions
• System crashes
3.11.1 Backing Up SQLite Datastore¶
SQLite file location:
/var/lib/rancher/k3s/server/db/state.db
Backup:
sudo systemctl stop k3s
sudo cp /var/lib/rancher/k3s/server/db/state.db /root/k3s_state_backup.db
sudo systemctl start k3s
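Before restarting k3s (or any time both files are quiescent), it is worth verifying the copy. A minimal sketch using cmp — the helper name is ours:

```shell
# verify_backup SRC DST — byte-compare a backup against the original.
verify_backup() {
  if cmp -s "$1" "$2"; then
    echo "OK: backup matches source"
  else
    echo "FAIL: backup differs from source"
    return 1
  fi
}

# usage, between the cp and the systemctl start above:
#   verify_backup /var/lib/rancher/k3s/server/db/state.db /root/k3s_state_backup.db
```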
3.11.2 Backing Up Certificates¶
sudo cp -r /etc/rancher/k3s /root/k3s_certs_backup/
Certificates include:
• kube-apiserver certs
• kubeconfig token
• TLS assets
3.11.3 Backing Up Manifests and Static Pods¶
sudo cp -r /var/lib/rancher/k3s/server/manifests /root/k3s_manifests_backup/
3.11.4 Restoring a k3s Backup¶
To restore:
1. Stop k3s:
sudo systemctl stop k3s
2. Replace state.db:
sudo cp /root/k3s_state_backup.db /var/lib/rancher/k3s/server/db/state.db
3. Restore certs if needed:
sudo cp -r /root/k3s_certs_backup/* /etc/rancher/k3s/
4. Restart:
sudo systemctl start k3s
3.11.5 Automated Scheduled Backups¶
Create a script:
nano /usr/local/bin/k3s-backup.sh
Contents:
#!/bin/bash
set -euo pipefail
# Stop k3s so the SQLite file is not being written, copy it with a date stamp, restart.
systemctl stop k3s
cp /var/lib/rancher/k3s/server/db/state.db /root/state-$(date +%F).db
systemctl start k3s
Make executable:
chmod +x /usr/local/bin/k3s-backup.sh
Create cron job:
sudo crontab -e
0 2 * * * /usr/local/bin/k3s-backup.sh
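Dated backups accumulate over time. A small rotation sketch that deletes copies older than seven days — the helper name is ours, and the /root location and filename pattern mirror the script above:

```shell
# prune_backups DIR DAYS — delete state-*.db backups in DIR older than DAYS days.
prune_backups() {
  find "$1" -maxdepth 1 -name 'state-*.db' -mtime +"$2" -print -delete
}

# usage, e.g. as a second line in the cron job:
#   prune_backups /root 7
```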
3.12 Resetting or Reinstalling k3s¶
Sometimes k3s needs to be reset completely to recover from corruption or to rebuild the lab.
3.12.1 Completely Uninstall k3s¶
sudo /usr/local/bin/k3s-uninstall.sh
This removes:
• The systemd service
• The k3s binaries
• All Kubernetes data
• All containerd state
3.12.2 Cleaning containerd state manually (optional)¶
If the embedded containerd is in a bad state, wipe its directory while k3s is stopped (remember: there is no separate containerd service in k3s):
sudo systemctl stop k3s
sudo rm -rf /var/lib/rancher/k3s/agent/containerd
sudo systemctl start k3s
This wipes:
• All container images
• All snapshots
• All metadata
k3s recreates the directory on start and re-pulls images as pods are rescheduled.
3.12.3 Full Clean Reset (Hard Reset)¶
To wipe EVERYTHING:
sudo /usr/local/bin/k3s-uninstall.sh
sudo rm -rf /etc/rancher
sudo rm -rf /var/lib/rancher
sudo rm -rf /var/lib/kubelet
The uninstall script already stops the service and removes most of this state; the extra rm commands catch any leftovers.
This returns the server to a clean state for reinstalling k3s.
3.13 Preparing for Envoy Gateway Installation¶
Before installing Envoy Gateway, we must ensure that:
• LoadBalancer services work
• MetalLB is prepared
• CNI (Flannel) is fully functional
• Kubernetes environment is stable
• TLS certificates can be issued
3.13.1 Validate Node Networking¶
Check for correct pod CIDR allocation:
ip addr | grep flannel
Check routes:
ip route | grep flannel
3.13.2 Verify LoadBalancer Behavior¶
MetalLB will later assign IPs to Envoy.
Before installing MetalLB, ensure:
• No conflicting IP ranges
• Server interface allows ARP broadcasts
3.13.3 Prepare MetalLB Pool Range¶
For example (Hetzner):
172.18.255.200 - 172.18.255.220
This must align with your hosting network adapter.
3.13.4 Ensure cert-manager Will Work¶
Check entropy:
cat /proc/sys/kernel/random/entropy_avail
On modern kernels (5.6+) this reports a fixed value of 256 and entropy is not a practical concern; on older kernels, values above ~1000 are healthy, and haveged or rng-tools can help if it stays persistently low.
3.13.5 Confirm Cluster Stability¶
Run:
kubectl get pods -A
kubectl get nodes
kubectl describe node <nodename>
All system pods must be healthy before adding networking components like Envoy.
3.14 Summary of Section 3¶
In this section, we accomplished:
• Understanding why k3s is ideal for LocalCloudLab
• Installing k3s with Traefik disabled
• Managing kubeconfig for Linux and Windows
• Diving deep into containerd internals
• Understanding filesystem layout for Kubernetes components
• Verifying cluster health
• Troubleshooting common issues
• Planning safe upgrades and backups
• Resetting and reinstalling k3s
• Preparing the cluster for Envoy Gateway, MetalLB, and cert-manager
Your Kubernetes cluster is now fully operational and ready for the next major chapter: Installing MetalLB, Envoy Gateway, TLS certificates, observability tools, and the full LocalCloudLab stack.
(End of Section 03 — Complete)