02: Fresh Linux Server Setup¶

2.1 Introduction¶

This section is a production-grade guide to preparing a fresh Linux server for LocalCloudLab. Unlike basic tutorials, it explains not only what to do but why each step matters in real-world DevOps and cloud-native environments.

2.2 Choosing the Right Linux Distribution¶

When selecting the operating system for LocalCloudLab, the goal is to balance stability, security, community support, and compatibility with modern cloud‑native tooling. While many Linux distributions can support Kubernetes, not all are equally suited.

Ubuntu 22.04 LTS (Long-Term Support) is recommended because:

• It offers 5 years of security updates.
• It has predictable kernel versions and update cycles.
• It is widely supported by Kubernetes distributions including k3s.
• Most Helm charts and DevOps tools document Ubuntu-compatible instructions.
• Its package manager (APT) is stable and widely understood.
• Community support is extensive.

Alternatives:

• Debian 12 – extremely stable; slower-moving but excellent for servers.
• Rocky Linux / AlmaLinux – strong RHEL-based choices.
• Fedora – modern, but its roughly six-month release cadence is too fast for production stability.

Avoid:

• CentOS Stream – a continuously updated distribution that tracks just ahead of RHEL, so kernel and package changes arrive with less notice than on an LTS release.
• Custom/minimal distros unless you already have expertise.

Ubuntu is chosen for LocalCloudLab because it reduces friction and maximizes compatibility.

2.3 Server Hardware Architecture¶

Even though LocalCloudLab is built to run on a single server, the hardware still matters. Kubernetes, containers, databases, monitoring systems, and logging pipelines all consume resources.

2.3.1 vCPU Considerations¶

Kubernetes schedules workloads based on CPU cores. More vCPUs mean:

• More concurrent workloads
• Faster container orchestration
• Better responsiveness under load

Minimum recommended:

• 4 vCPUs for light workloads

Ideal for LocalCloudLab:

• 8–16 vCPUs

2.3.2 RAM Considerations¶

Memory is usually the first bottleneck in local clusters.

Approximate memory usage:

• k3s control plane + core services: 700–1200 MB
• Envoy Gateway: 200–500 MB
• Prometheus: 500–900 MB
• Loki: 300–600 MB
• Jaeger: 200–500 MB
• PostgreSQL: 400–800 MB
• Redis & RabbitMQ: 200–400 MB combined
• Your .NET APIs: depends on load (150–350 MB each)
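As a rough sanity check, the ranges above can be added up. The sketch below is only arithmetic on the listed figures; the helper name sum_mb and the choice of three APIs are assumptions made for illustration.

```shell
# Sum a list of megabyte figures using POSIX sh arithmetic.
sum_mb() {
    total=0
    for mb in "$@"; do
        total=$((total + mb))
    done
    echo "$total"
}

# Lower and upper bounds taken from the list above (APIs excluded).
low=$(sum_mb 700 200 500 300 200 400 200)
high=$(sum_mb 1200 500 900 600 500 800 400)

# Hypothetical example: three .NET APIs at 150-350 MB each.
apis=3
echo "Platform services: ${low}-${high} MB"
echo "With ${apis} APIs: $((low + apis * 150))-$((high + apis * 350)) MB"
```

The platform services alone come to roughly 2.5–4.9 GB, and about 3–6 GB with three APIs, before the operating system and page cache are counted. That is why 8 GB is tight and 16 GB is a more comfortable starting point.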

Minimum:

• 8 GB (not recommended; too tight)

Recommended:

• 16 GB for development
• 32 GB for comfortable performance

2.3.3 Storage (SSD / NVMe)¶

Containers, logs, and databases are very I/O intensive.

Storage requirements:

• Minimum: 100 GB SSD
• Recommended: 200–300 GB NVMe

NVMe drives increase:

• Database throughput
• Container image extraction speed
• Log ingestion capacity
• Overall responsiveness

2.3.4 SWAP Usage¶

SWAP prevents crashes when RAM runs out, but should not be heavily relied upon.

Best practice for Kubernetes nodes:

• Keep SWAP minimal or disabled.
• Disable it with sudo swapoff -a and remove the swap entry from /etc/fstab so it stays off after a reboot.

If using SWAP:

• Limit it to 1–2 GB.
• Never rely on SWAP to absorb sustained load.

2.4 Networking Model for Hosted Servers¶

Unlike home labs, hosted servers (Hetzner, DigitalOcean, Linode, etc.) have unique networking characteristics.

2.4.1 Understanding Public IPs¶

Your server gets:

• One primary public IPv4 address
• One or more private subnets (depending on provider)
• A gateway IP from the hosting network

MetalLB must allocate addresses from a range that is reachable on one of the server's network interfaces.

Example (Hetzner):

• Public IP: 49.13.xx.xx
• Internal range: 172.18.0.0/16
• MetalLB pool: 172.18.255.200–172.18.255.220
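One way to check that a planned pool really sits inside the interface's subnet is a small containment test. This is a minimal sketch in POSIX sh; ip_to_int and in_cidr are hypothetical helper names, and the addresses are the ones from the example above.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $((addr & mask)) -eq $((net & mask)) ]
}

# Check both ends of the example MetalLB pool against the internal range.
for ip in 172.18.255.200 172.18.255.220; do
    if in_cidr "$ip" 172.18.0.0 16; then
        echo "$ip is inside 172.18.0.0/16"
    else
        echo "$ip is OUTSIDE 172.18.0.0/16"
    fi
done
```

Checking only the first and last address is enough here because a contiguous pool that starts and ends inside one subnet lies entirely within it.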

2.4.2 Routing Basics¶

In Layer 2 mode, MetalLB answers ARP requests for the addresses in its pool, so the provider's network delivers traffic for those addresses to your server. MetalLB uses this to:

• Announce extra IPs assigned to LoadBalancers
• Allow external traffic to reach Envoy Gateway and services

2.4.3 Firewalls¶

Hosting firewalls may exist in parallel with:

• Linux UFW
• Kubernetes NetworkPolicies

All must be aligned to avoid:

• Dropped packets
• Misrouted traffic
• Services appearing offline

If your provider includes a cloud firewall:

• Allow ports 22, 80, 443
• Optionally allow 6443 (Kubernetes API) for remote management

2.5 Initial Security Procedures¶

Before installing any software, secure the base operating system.

2.5.1 Update System¶

sudo apt update && sudo apt upgrade -y

2.5.2 Create a Non-Root User¶

sudo adduser boten
sudo usermod -aG sudo boten

Why:

• Reduce risk of accidental system damage
• Restrict attack surface
• Follow least-privilege principles

2.5.3 SSH Hardening¶

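Edit the SSH configuration. Because the changes below disable password logins, install your public key first (section 2.5.4) and confirm that key-based login works before restarting SSH: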

sudo nano /etc/ssh/sshd_config

Recommended changes:

PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
PermitEmptyPasswords no

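Validate the configuration, then restart. Keep your current session open until a new key-based login succeeds:

sudo sshd -t
sudo systemctl restart ssh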
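To confirm that the hardened values are the ones sshd will actually use, the effective configuration can be inspected with sshd -T, which prints lowercase "key value" pairs. The sketch below is one possible check; check_sshd_settings is a hypothetical helper name, and the sample input stands in for real sshd -T output.

```shell
# Read effective sshd settings on stdin (the "key value" format printed by
# `sshd -T`) and report any that differ from the values recommended above.
check_sshd_settings() {
    awk '
        BEGIN {
            bad = 0
            want["permitrootlogin"] = "no"
            want["passwordauthentication"] = "no"
            want["pubkeyauthentication"] = "yes"
            want["permitemptypasswords"] = "no"
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "MISMATCH %s=%s (want %s)\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            for (k in want) if (!(k in seen)) { printf "MISSING %s\n", k; bad = 1 }
            exit bad
        }
    '
}

# Demonstration on sample data. On the server, the real check would be:
#   sudo sshd -T | check_sshd_settings
printf '%s\n' 'permitrootlogin no' 'passwordauthentication no' \
    'pubkeyauthentication yes' 'permitemptypasswords no' |
    check_sshd_settings && echo "sample settings look hardened"
```

Checking the effective configuration rather than the file itself also catches overrides from drop-in files that sshd_config may include.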

2.5.4 Configure SSH Keys (From Windows)¶

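On the Windows machine (PowerShell with the built-in OpenSSH client), generate a key pair:

ssh-keygen -t ed25519 -C "localcloudlab"

On the server, logged in as the new user, add the public key:

mkdir -p ~/.ssh
chmod 700 ~/.ssh
echo "<your-public-key>" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys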

2.5.5 Install UFW Firewall¶

sudo apt install ufw -y
sudo ufw allow OpenSSH
sudo ufw allow 80
sudo ufw allow 443
sudo ufw enable

2.5.6 Enable Fail2ban¶

sudo apt install fail2ban -y
sudo systemctl enable fail2ban
sudo systemctl start fail2ban

Fail2ban protects against:

• SSH brute-force attacks
• Repeated authentication failures

2.5.7 System Auditing¶

sudo apt install auditd -y
sudo systemctl enable auditd
sudo systemctl start auditd

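auditd records security-relevant events, such as logins, privilege changes, and any file or syscall activity matched by your audit rules, in /var/log/audit/audit.log.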

2.6 User Management Best Practices¶

A well‑maintained Linux server follows clear user management principles. Even if you are the only user accessing the machine, adopting production-grade standards from day one increases security, reduces risk, and prepares the system for future expansion.

2.6.1 Why User Management Matters¶

Improper user management is one of the leading causes of:

• Unauthorized privilege escalation
• Accidental system damage
• Insecure automation scripts
• Compliance failures (in regulated environments)
• Difficulty auditing system actions

LocalCloudLab enforces a clean, organized user model.

2.6.2 Creating a Dedicated Administrative User¶

We already created:

sudo adduser boten
sudo usermod -aG sudo boten

This user becomes the primary administrator. Root login remains disabled, reducing attack vectors.

2.6.3 The sudoers File¶

Fine-tune sudo privileges with visudo, which checks the syntax before saving:

sudo visudo

Recommended practices:

• Use NOPASSWD only for automation scripts
• Use command-restricted sudo rules for CI/CD bots
• Log all sudo usage through auditd

2.6.4 SSH Key Hygiene¶

Best practices:

• Use one SSH key per device
• Use ed25519 keys for strong security
• Avoid reusing keys across multiple servers
• Rotate keys annually (or more frequently if required)

2.6.5 Locking and Unlocking Users¶

To lock:

sudo usermod -L username

To unlock:

sudo usermod -U username

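Locking disables the account's password, which matters when an account is compromised or no longer required. It does not block SSH key logins; to disable the account entirely, also expire it with sudo usermod --expiredate 1 username.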

2.6.6 Expiring Passwords (even if passwords are rarely used)¶

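Set an account expiry date (-E expires the whole account, not only its password; use -M with a number of days to set a maximum password age):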

sudo chage -E 2025-12-31 username

List password aging details:

sudo chage -l username

Even in SSH-key environments, this keeps accounts hygienic.

2.7 Advanced Firewall Theory and Strategy¶

Many servers rely only on a basic firewall. LocalCloudLab takes a layered, production-inspired approach.

2.7.1 Layers of Firewalling¶

There are three overlapping firewall layers:

• Cloud provider firewall (if available) – for example, Hetzner firewall rules.
• Local OS firewall (UFW or nftables) – applied directly on the machine.
• Kubernetes NetworkPolicies – applied inside the cluster itself.

Each layer has a role and must be configured in harmony.

2.7.2 UFW: Your First Line of Defense¶

Current configuration:

sudo ufw allow OpenSSH
sudo ufw allow 80
sudo ufw allow 443
sudo ufw enable

2.7.3 Recommended Additional Hardening¶

Block everything else:

sudo ufw default deny incoming
sudo ufw default allow outgoing

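Kubernetes uses these ports on the node:

sudo ufw allow 6443/tcp    # Kubernetes API server
sudo ufw allow 10250/tcp   # kubelet
sudo ufw allow 8472/udp    # Flannel VXLAN

On a single-node cluster, only 6443 needs to be reachable from outside, and only if you manage the cluster remotely. Ports 10250 and 8472 carry node-to-node traffic and should not be exposed to the internet. If you add nodes later, restrict these ports to the private network, for example: sudo ufw allow from 172.18.0.0/16 to any port 8472 proto udp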

2.7.4 nftables (Advanced)¶

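On Ubuntu 22.04, UFW manages its rules through iptables, which is itself backed by nftables (iptables-nft). Advanced users can write custom nftables rulesets, such as rate-limiting new SSH connections. The rule below assumes an inet filter table with an input chain already exists:

sudo nft add rule inet filter input tcp dport 22 ct state new limit rate 10/minute accept

A rate-limit rule only takes effect when connections over the limit are dropped by a later rule or by the chain policy. With UFW alone, sudo ufw limit OpenSSH gives a similar result. Either approach complements fail2ban for brute-force protection.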

2.7.5 Kubernetes NetworkPolicies¶

Even if the node firewall is secure, internal traffic can still move between pods unless restricted.

Later in the book, we introduce NetworkPolicies to:

• Deny all cross-namespace traffic by default
• Allow only necessary flows (e.g., Search → PostgreSQL)
• Prevent Redis, RabbitMQ, and other backing services from being exposed

2.8 OS Hardening for Kubernetes Nodes¶

Production Kubernetes nodes require specific OS-level tuning. LocalCloudLab follows the same principles.

2.8.1 Disable SWAP (Required for Kubernetes)¶

Immediate disable:

sudo swapoff -a

Remove from /etc/fstab:

sudo nano /etc/fstab
# Comment out any swap entries
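If you would rather not edit the file by hand, the change can be previewed with sed first. This is a minimal sketch, assuming swap entries have the usual whitespace-separated "swap" type field; comment_swap_entries is a hypothetical helper, and it only prints the result without modifying the file.

```shell
# Print an fstab-style file with every active swap entry commented out.
# The input file itself is left untouched.
comment_swap_entries() {
    sed -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Demonstration on a sample file (the entries are illustrative).
sample=$(mktemp)
printf '%s\n' 'UUID=1234 / ext4 defaults 0 1' \
    '/swap.img none swap sw 0 0' > "$sample"
comment_swap_entries "$sample"
rm -f "$sample"
```

On the server, you could run comment_swap_entries /etc/fstab, review the output, and only then apply it, keeping a backup copy of the original file. Afterwards, swapon --show should print nothing.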

Why?

• Kubernetes scheduling expects SWAP disabled
• SWAP causes unpredictable latency
• Control plane components assume stable memory behavior

2.8.2 Kernel Modules Needed by Kubernetes¶

Ensure required modules are enabled:

sudo modprobe overlay
sudo modprobe br_netfilter

Persist them:

echo "overlay" | sudo tee -a /etc/modules-load.d/k8s.conf
echo "br_netfilter" | sudo tee -a /etc/modules-load.d/k8s.conf

2.8.3 sysctl Networking Parameters for Containers¶

Add the following to /etc/sysctl.d/k8s.conf:

net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

Apply:

sudo sysctl --system

These settings enable:

• Pod-to-pod routing
• NAT for services
• Gateway communication
• Load balancing correctness
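The file described above can also be written non-interactively, which is handy in provisioning scripts. The sketch below is one way to do it; write_k8s_sysctl is a hypothetical helper that overwrites its target, so running it twice does not duplicate lines.

```shell
# Write the three settings to the given path (default: the path used above).
# Writing under /etc requires root, so call this from a root shell there.
write_k8s_sysctl() {
    target=${1:-/etc/sysctl.d/k8s.conf}
    cat > "$target" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
}

# Demonstration against a temporary file instead of /etc.
demo=$(mktemp)
write_k8s_sysctl "$demo"
cat "$demo"
rm -f "$demo"
```

The net.bridge.* keys only exist once the br_netfilter module is loaded, so load the modules from section 2.8.2 before running sudo sysctl --system.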

2.8.4 Disable Unnecessary Services¶

List active services:

systemctl --type=service

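Disable what you don't need. These desktop-oriented services are often absent on server images, so skip any that are not installed:

sudo systemctl disable --now bluetooth.service
sudo systemctl disable --now cups.service
sudo systemctl disable --now avahi-daemon.service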

This reduces attack surface and frees memory.

2.8.5 Protecting Critical Binaries¶

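Use immutable file attributes to protect the account files. While the attribute is set, tools such as adduser, passwd, and chage, as well as package installs that create system users, will fail: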

sudo chattr +i /etc/passwd
sudo chattr +i /etc/shadow

To modify them later, remove immutability:

sudo chattr -i /etc/passwd
sudo chattr -i /etc/shadow

2.8.6 Ensuring High Entropy (important for TLS-heavy workloads)¶

Entropy is required for cryptographic operations such as:

• TLS handshake
• SSH key generation
• JWT signing
• Random number generation

Install haveged:

sudo apt install haveged -y

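On older kernels and entropy-starved virtual machines, this helps certificate creation and renewal under cert-manager avoid stalls. Kernels 5.6 and later (including the 5.15 kernel in Ubuntu 22.04) no longer block /dev/random once the generator is initialized, so there haveged is a safeguard rather than a requirement.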

2.9 Essential Tools and Packages¶

A well-prepared server must include debugging, monitoring, and development utilities.

2.9.1 Basic Tools¶

sudo apt install -y curl wget git unzip tar vim nano htop tmux

2.9.2 Useful Diagnostics Tools¶

sudo apt install -y net-tools traceroute dnsutils lsof tcpdump nmap

2.9.3 JSON and YAML Utilities¶

sudo apt install -y jq yq

jq: JSON formatter and processor.

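yq: YAML formatter and processor. If your release's APT repositories do not provide a yq package, sudo snap install yq is an alternative.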

2.9.4 System Monitoring Tools¶

Install:

sudo apt install bmon iftop sysstat -y

These help diagnose networking and performance issues in real time.

2.9.5 File System Tools¶

sudo apt install ncdu tree -y

Useful for discovering disk usage patterns (Loki logs, container images, PostgreSQL data).

2.10 Preparing Ubuntu for Kubernetes (k3s)¶

This section describes how to tune the OS specifically for k3s, a lightweight Kubernetes distribution.

2.10.1 Disable Swap Permanently¶

We already disabled SWAP temporarily. Ensure it's permanent by editing /etc/fstab and removing swap entries.

2.10.2 Check cgroups Support¶

Container runtimes require cgroupv1 or cgroupv2.

Verify:

ls /sys/fs/cgroup

On Ubuntu 22.04, cgroupv2 is standard and fully supported by k3s.
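Listing the directory does not say directly which version is in use. A more explicit check looks at the filesystem type mounted at /sys/fs/cgroup: cgroup2fs indicates v2 and tmpfs indicates v1. The sketch below wraps that mapping in a hypothetical helper, cgroup_version_from_fstype.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup version label.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# On a Linux host, report the version for the running system.
if [ -d /sys/fs/cgroup ]; then
    cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup)"
fi
```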

2.10.3 CPU Governor Tuning¶

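On bare-metal servers, set the CPU governor to performance mode. Most virtual machines do not expose frequency scaling, in which case skip this step:

sudo apt install cpufrequtils -y
echo 'GOVERNOR="performance"' | sudo tee /etc/default/cpufrequtils
sudo systemctl restart cpufrequtils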

Benefits:

• More predictable latency
• Stable behavior for cluster components

2.10.4 Entropy Improvement (Again: important!)¶

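TLS-heavy systems can stall when entropy is low. In addition to haveged (section 2.8.6), install rng-tools:

sudo apt install rng-tools -y

Then configure the entropy source. Use a hardware source such as /dev/hwrng when the server exposes one; pointing HRNGDEVICE at /dev/urandom only feeds the kernel's own output back to it and adds no new entropy:

sudo nano /etc/default/rng-tools
HRNGDEVICE=/dev/hwrng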

Restart:

sudo systemctl restart rng-tools

2.10.5 Reboot After All Preparations¶

A reboot ensures:

• Kernel modules are loaded
• sysctl parameters are active
• Disabled services remain inactive
• cgroup settings are stable

sudo reboot

2.11 Post‑Installation Verification Checklist¶

After preparing the Linux server, it is essential to verify that the system is functioning correctly before proceeding to install Kubernetes (k3s). This step ensures stability and avoids failures later when the environment becomes more complex.

Below is a rigorous, production-grade verification procedure.

2.11.1 Verify System Logs Are Clean¶

Check for critical errors:

sudo journalctl -p 3 -xb

Look for:

• Failed services
• Kernel errors
• Filesystem issues
• Network initialization failures

Priority 3 shows errors and worse, so investigate every entry before continuing.

2.11.2 Verify DNS Resolution Works¶

Test common DNS resolutions:

dig google.com
dig hershkowitz.co.il
dig search.hershkowitz.co.il

Also verify internal resolution after Kubernetes is installed later.

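If DNS fails:

• Check /etc/resolv.conf (on Ubuntu 22.04 it is managed by systemd-resolved; resolvectl status shows the upstream servers in use)
• Ensure hosting provider DNS settings are correct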

2.11.3 Verify Network Interfaces¶

List interfaces:

ip addr

Expected output should show:

• eth0 (or a similar primary NIC, such as ens3 or enp1s0)
• lo (loopback)
• No misconfigured virtual adapters

2.11.4 Check Routing Table¶

Inspect default route:

ip route

Look for:

• Single, clear default gateway
• No duplicate routes
• Correct subnet masks
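The "single default gateway" check is easy to automate by counting default entries in the ip route output. This is a small sketch; count_default_routes is a hypothetical helper, and the sample route uses a documentation address rather than a real one.

```shell
# Count default routes in `ip route` output read from stdin.
# grep exits non-zero when the count is 0, so mask that with `|| true`.
count_default_routes() {
    grep -c '^default ' || true
}

# Demonstration on sample output. On the server: ip route | count_default_routes
printf '%s\n' 'default via 192.0.2.1 dev eth0 proto dhcp metric 100' \
    '192.0.2.0/24 dev eth0 proto kernel scope link' | count_default_routes
```

A result other than 1 for IPv4 is worth investigating before installing Kubernetes, since k3s picks its node address from the interface that holds the default route.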

2.11.5 Verify Kernel Modules¶

Ensure required modules are loaded:

lsmod | grep br_netfilter
lsmod | grep overlay

If missing, load again:

sudo modprobe br_netfilter
sudo modprobe overlay
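A plain grep also matches lines where the module only appears in another module's "used by" column. For scripts, an exact match on the first column is more reliable. The sketch below assumes lsmod-style input (also the format of /proc/modules); module_loaded is a hypothetical helper.

```shell
# Succeed only if a module with exactly this name appears in the first
# column of lsmod-style input on stdin.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Report the state of both required modules on the running system.
for m in overlay br_netfilter; do
    if [ -r /proc/modules ] && module_loaded "$m" < /proc/modules; then
        echo "$m: loaded"
    else
        echo "$m: not loaded (or built into the kernel)"
    fi
done
```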

2.11.6 Confirm sysctl Parameters¶

Verify:

sudo sysctl net.bridge.bridge-nf-call-iptables
sudo sysctl net.ipv4.ip_forward

Expected output:

net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

2.11.7 Check Entropy Availability (Critical for TLS)¶

cat /proc/sys/kernel/random/entropy_avail

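Healthy values on older kernels:

• > 1500 recommended
• < 500 may cause TLS failures

On Linux 5.18 and later (and on stable kernels that received the same random-number-generator rewrite), this file always reports 256, which is normal and not a sign of low entropy.

If the value is low on an older kernel:

• Ensure haveged or rng-tools are running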

2.11.8 Validate Firewall Rules¶

List UFW rules:

sudo ufw status verbose

Ensure:

• SSH is allowed
• HTTP/HTTPS are allowed
• All other ports default deny
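The default-policy part of this check can be scripted against the verbose status output. The sketch below assumes the usual "Status:" and "Default:" header lines that ufw prints; ufw_defaults_ok is a hypothetical helper, and the sample text stands in for real output.

```shell
# Succeed if `ufw status verbose` output on stdin shows an active firewall
# with deny-incoming / allow-outgoing defaults.
ufw_defaults_ok() {
    status=$(cat)
    echo "$status" | grep -q '^Status: active' &&
        echo "$status" | grep -q '^Default: deny (incoming), allow (outgoing)'
}

# Demonstration on sample output. On the server:
#   sudo ufw status verbose | ufw_defaults_ok
printf '%s\n' 'Status: active' 'Logging: on (low)' \
    'Default: deny (incoming), allow (outgoing), disabled (routed)' |
    ufw_defaults_ok && echo "defaults look correct"
```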

2.11.9 Confirm Disk Health¶

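Check SMART status. Replace /dev/sda with your disk (for example /dev/nvme0n1); virtual servers usually do not expose SMART data, so skip this step there:

sudo apt install smartmontools -y
sudo smartctl -a /dev/sda

Verify:

• No reallocated sectors
• No pending sectors
• No I/O errors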

2.11.10 Final Reboot Before Proceeding¶

A clean reboot ensures all system adjustments are active:

sudo reboot

After reboot, re-run:

sudo journalctl -p 3 -xb

2.12 Performance Tuning for Container Workloads¶

Containerized environments require certain OS-level performance optimizations. These tuning strategies significantly improve stability and throughput, especially when running databases, gateways, and observability stacks on a single node.

2.12.1 Filesystem Tuning (ext4 recommended)¶

Check mount options:

sudo mount | grep ext4

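Recommended options:

• noatime (this already implies nodiratime, so listing both is harmless but redundant)
• nodiratime
• discard (if the SSD supports TRIM; periodic trimming through the fstrim.timer unit is a common alternative to continuous discard)

Edit /etc/fstab accordingly.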

2.12.2 I/O Scheduler Optimization¶

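For SSD/NVMe, check the current scheduler. Replace sda with your device (for example nvme0n1); the active scheduler is shown in brackets:

cat /sys/block/sda/queue/scheduler

Preferred:

• none
• mq-deadline

Set the scheduler (this lasts until the next reboot):

echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler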
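The scheduler file lists every available scheduler and marks the active one with square brackets. The sketch below extracts the bracketed name and reports it for each block device; active_scheduler is a hypothetical helper.

```shell
# Extract the bracketed (active) scheduler name from a scheduler line on
# stdin. A line without brackets is printed unchanged.
active_scheduler() {
    sed -E 's/.*\[([^]]+)\].*/\1/'
}

# Report the active scheduler for every block device on the running system.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    echo "${dev%%/*}: $(active_scheduler < "$f")"
done
```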

2.12.3 journald Configuration (Prevent Log Bloat)¶

Default journald settings can consume gigabytes of disk space.

Edit:

sudo nano /etc/systemd/journald.conf

Set, under the [Journal] section:

SystemMaxUse=200M
RuntimeMaxUse=100M
Compress=yes
Storage=auto

Restart:

sudo systemctl restart systemd-journald

2.12.4 Increase ulimits for Container Engines¶

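Edit limits. These apply to PAM login sessions; systemd services such as k3s take their limits from LimitNOFILE in the unit file instead: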

sudo nano /etc/security/limits.conf

Add:

* soft nofile 100000
* hard nofile 100000

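Check the system-wide limit before changing it:

sysctl fs.file-max

Recent Ubuntu releases already set fs.file-max far above 200000, and lowering it would be counterproductive. Raise it only if the current value is smaller:

sudo sysctl -w fs.file-max=200000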

2.12.5 TCP Tuning for High‑Load Environments¶

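Add to /etc/sysctl.d/k8s-network.conf. Note that net.core.somaxconn already defaults to 4096 on kernels 5.4 and later, so only a higher value raises the listen backlog:

net.core.somaxconn = 8192
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65000
net.core.netdev_max_backlog = 4096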

Apply:

sudo sysctl --system

These settings optimize:

• Connection handling
• Gateway responsiveness
• Database connections
• High-throughput logging

2.13 Linux Backup Strategy for Server Stability¶

A good server is not only secure but also resilient. Backup practices ensure you can recover quickly after corruption, accidental deletion, bad upgrades, or hardware failure.

2.13.1 What Must Be Backed Up¶

Critical system areas:

/etc           – system configs
/var/lib       – container runtime & service data
/root          – root user configs
/home          – user data
/opt           – optional installed tools
/var/log       – logs for troubleshooting

2.13.2 Backing Up Post‑Installation State¶

Create a full system archive:

sudo tar -czf localcloudlab_system_backup.tar.gz /etc /var/lib /root /home /opt

Store backups:

• Off-server storage (S3, Google Drive, rclone sync)
• Local NAS
• Encrypted USB drive

2.13.3 Automated Backups with cron¶

Create script:

sudo nano /usr/local/bin/server-backup.sh

Contents:

#!/bin/bash
set -euo pipefail
# Archive system configuration and service data, stamped with today's date
tar -czf "/root/server-backup-$(date +%F).tar.gz" /etc /var/lib

Make executable:

sudo chmod +x /usr/local/bin/server-backup.sh

Add a cron job to root's crontab (the entry runs daily at 03:00):

sudo crontab -e

0 3 * * * /usr/local/bin/server-backup.sh
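A daily archive with no cleanup will eventually fill the disk. One possible retention step is sketched below: prune_backups is a hypothetical helper that keeps the newest N archives matching the naming scheme used by the script above. It relies on the ISO date in the file name, which sorts correctly as plain text.

```shell
# Delete all but the newest $2 archives named server-backup-YYYY-MM-DD.tar.gz
# in directory $1. ISO dates sort lexically, so a reverse sort is newest-first.
prune_backups() {
    dir=$1
    keep=$2
    ls -1 "$dir"/server-backup-*.tar.gz 2>/dev/null | sort -r |
        tail -n +"$((keep + 1))" |
        while IFS= read -r old; do
            rm -f -- "$old"
        done
}

# Demonstration in a temporary directory with four empty sample archives.
demo=$(mktemp -d)
for d in 2025-01-01 2025-01-02 2025-01-03 2025-01-04; do
    : > "$demo/server-backup-$d.tar.gz"
done
prune_backups "$demo" 2
ls -1 "$demo"
rm -rf "$demo"
```

In the backup script, a call such as prune_backups /root 7 after the tar command would keep one week of archives; the retention count is an assumption to adjust to your disk size and recovery needs.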

2.13.4 Using Restic or Borg for Advanced Backups¶

Restic advantages:

• Deduplication
• Encryption
• Supports object storage

Borg advantages:

• Compression
• Extremely fast restores

These tools are ideal for production-grade backup strategies.

2.13.5 Disaster Recovery Considerations¶

Think in terms of:

• RPO (Recovery Point Objective)
• RTO (Recovery Time Objective)

Worst-case scenario:

• Full OS corruption
• Need to restore server from scratch

LocalCloudLab is designed so that:

• Kubernetes configuration is stored in Git
• Infrastructure scripts are reproducible
• Data backups restore quickly

This dramatically reduces downtime.

2.14 Common Mistakes and How to Avoid Them¶

Over years of DevOps experience, certain setup mistakes appear repeatedly. LocalCloudLab helps avoid them from the start.

Mistake 1 – Swap Still Enabled¶

Fix: sudo swapoff -a, then comment out the swap entry in /etc/fstab so it stays disabled after a reboot.

Mistake 2 – Missing Kernel Modules¶

Fix: sudo modprobe overlay && sudo modprobe br_netfilter

Mistake 3 – Firewall Blocking Kubernetes Ports¶

Fix: Allow necessary internal ports.

Mistake 4 – Not Rebooting After Major Changes¶

Fix: sudo reboot

Mistake 5 – Incorrect sysctl Parameters¶

Fix: Reapply with: sudo sysctl --system

Mistake 6 – DNS Misconfiguration¶

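Fix: On Ubuntu 22.04, /etc/resolv.conf is managed by systemd-resolved, so direct edits are overwritten. Inspect the active servers with resolvectl status and change them in the netplan configuration under /etc/netplan/.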

Mistake 7 – Running Services as Root¶

Fix:

• Use service accounts
• Restrict container privileges

Mistake 8 – Filling Disk with Logs¶

Fix:

• journald limits
• Loki retention configuration

Mistake 9 – Forgetting Fail2ban or SSH Hardening¶

Fix:

• Always secure SSH
• Disable root login

Mistake 10 – Ignoring Backups¶

Fix:

• Automate them
• Store off-server copies
