Introduction

Warning

These docs describe my setup; they may or may not work for you.



My home operations repository

... managed with Flux, Renovate and GitHub Actions 🐱

👋 Welcome to my Home Operations repository. This is a mono repository for my home infrastructure and Kubernetes cluster implementing Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate and GitHub Actions.


๐Ÿค Thanks

Thanks to all the people who donate their time to the Kubernetes @Home Discord community. A lot of inspiration for my cluster comes from the people that have shared their clusters using the k8s-at-home GitHub topic. Be sure to check out the Kubernetes @Home search for ideas on how to deploy applications or get ideas on what you can deploy.

🔧 Hardware

Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose
MikroTik RB5009UG+S+IN | 1 | -- | 1GB NAND | 1GB | RouterOS 7.10 | Router
HP ProCurve 1810G-24 | 1 | -- | -- | 512MB | -- | Switch
HP EliteDesk 800 G2 mini | 1 | 240GB NVMe | 256GB SSD | 16GB | Talos 1.5.5 | k8s Master
HP 260 G3 DM | 1 | 256GB SSD | 540GB NVMe | 16GB | Talos 1.5.5 | k8s Master
DELL Wyse 5060 | 1 | 240GB SSD | -- | 16GB | Talos 1.5.5 | k8s Master
Lenovo M910x | 1 | 256GB NVMe | -- | 8GB | Talos 1.5.5 | k8s Worker
HP ProDesk G5 mini | 1 | 256GB NVMe | 500GB NVMe | 16GB | Talos 1.5.5 | k8s Worker
Raspberry Pi 3B | 1 | 32GB SDCard | -- | 1GB | Raspbian | Pi-hole
NAS | 1 | 120GB SSD | 8TB ZRAID0 | 16GB | TrueNAS Core | NFS/Backup

NAS (Detailed)
Type | Item
CPU | Intel Core i5-6500 3.2 GHz Quad-Core Processor
CPU Cooler | Intel Stock
Motherboard | MSI H110M PRO-VH Micro ATX LGA1151
Memory | Crucial Ballistix Sport LT 16 GB (2 x 8 GB) DDR4-3200 CL16
Storage (Boot) | Kingston A400 120 GB 2.5" SSD
Storage (Data) | Seagate IronWolf NAS 4 TB 3.5" 5400 RPM Internal Hard Drive x 3
Storage Controller | 10Gtek® Internal SAS/SATA RAID Controller PCI Express Host Bus Adapter for LSI 9211-8I, LSI SAS2008 Chip, 8-Port 6Gb/s
Case | Fractal Design Node 804 MicroATX Mid Tower Case
Power Supply | Corsair CV550 550 W 80+ Bronze Certified ATX Power Supply

Network

My DNS setup may seem a bit complicated at first, but it allows for completely automatic management of DNS entries for Services and Ingress objects.

Components

NGINX

NGINX is my cluster Ingress controller. It is assigned a LoadBalancer IP provided by Cilium so I can access the services directly.
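
Cilium's LB IPAM can hand out that address via an annotation on the controller's Service. Below is a minimal sketch, not my actual manifest; the namespace and IP are placeholders.

# Hypothetical Service pinning the NGINX controller to a Cilium-assigned IP
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: networking
  annotations:
    io.cilium/lb-ipam-ips: "192.168.1.80"   # placeholder address requested from Cilium LB IPAM
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: https
      port: 443
      targetPort: 443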

external-dns

external-dns runs in my cluster and is connected to my domain DNS server. It automatically manages records for all my Ingresses that have the external-dns/is-public: true annotation set.
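
For illustration, an Ingress that opts in would carry that annotation; the hostname and backend below are made-up examples.

# Hypothetical Ingress published by external-dns
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-server
  namespace: default
  annotations:
    external-dns/is-public: "true"   # tells external-dns to manage this record
spec:
  ingressClassName: nginx
  rules:
    - host: echo.${SECRET_DOMAIN}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo-server
                port:
                  number: 80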

Cloudflared

In order to expose my services to the outside world, I have a Cloudflare tunnel directly into my cluster using Cloudflared; that way I don't need to open ports on my router.
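
A minimal cloudflared config along these lines (the tunnel ID, hostname and in-cluster service address are placeholders, not my real values):

# config.yaml for cloudflared (illustrative only)
tunnel: 00000000-0000-0000-0000-000000000000
credentials-file: /etc/cloudflared/credentials.json
ingress:
  # send all hostnames on my domain to the in-cluster ingress controller
  - hostname: "*.${SECRET_DOMAIN}"
    service: https://ingress-nginx-controller.networking.svc.cluster.local:443
    originRequest:
      noTLSVerify: true
  # anything else gets a 404
  - service: http_status:404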

How it all works together

When I am connected to my home network, my DNS server is set to Pi-hole, which runs on my network. I have configured it to forward all requests for my domain to the k8s-gateway address exposed via Cilium, which provides internal DNS resolution.

# /etc/dnsmasq.d/99-k8s-gateway-forward.conf
server=/${SECRET_DOMAIN}/${CILIUM_K8S_GATEWAY_ADDR}

When I am outside my home network and request an address for one of my domains, the query goes to my domain's DNS server, which responds with the DNS record set by cloudflared.

โ˜๏ธ Cloud services

While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about two things: (1) dealing with chicken/egg scenarios, and (2) services I critically need whether my cluster is online or not.

The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Authentik. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in.

Service | Use | Cost
GitHub | Hosting this repository and continuous integration/deployments | Free
Cloudflare | Domain, DNS and proxy management | Free
B2 Storage | Offsite application backups | Free
Terraform Cloud | Store Terraform state online | Free
Total: ~$0/month

Kubernetes

My cluster is provisioned on bare-metal using Talos.

This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate server handles (NFS) file storage.

Core Components

  • metallb: A network load-balancer implementation using standard routing protocols
  • kube-vip: Provides static virtual IPs for services
  • cert-manager: Creates SSL certificates for services in my Kubernetes cluster.
  • external-dns: Automatically manages DNS records from my cluster in a cloud DNS provider.
  • longhorn: Distributed block storage for persistent volumes
  • traefik: Ingress controller to expose HTTP traffic to pods over DNS
  • sops: Manages secrets for Kubernetes, Talos and Terraform, which are committed to Git.

GitOps

Flux watches my cluster folder (see Directory structure) and makes the changes to my cluster based on the YAML manifests.

The way Flux works for me here is that it recursively searches the cluster/apps folder until it finds the top-most kustomization.yaml in each directory and then applies all the resources listed in it. That kustomization.yaml will generally only contain a namespace resource and one or more Flux kustomizations. Those Flux kustomizations will generally have a HelmRelease or other resources related to the application underneath them, which are then applied.
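
As an illustrative sketch (the namespace, app name and GitRepository name are made up, not my actual layout), a top-level kustomization.yaml and the Flux Kustomization it points at could look like this:

# cluster/apps/default/kustomization.yaml (hypothetical)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml        # the namespace resource
  - ./echo-server/ks.yaml   # one Flux Kustomization per app
---
# cluster/apps/default/echo-server/ks.yaml (hypothetical)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: echo-server
  namespace: flux-system
spec:
  interval: 30m
  path: ./cluster/apps/default/echo-server/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-ops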

Renovate watches my entire repository looking for dependency updates; when one is found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.

Directory structure

My home-ops repository contains the following directories under cluster.

๐Ÿ“ cluster  # My main kubernetes cluster
โ”œโ”€โ”€๐Ÿ“ apps  # Apps deployed into my cluster grouped by namespace (see below)
โ”œโ”€โ”€๐Ÿ“ base  # Flux entrypoint
โ”œโ”€โ”€๐Ÿ“ core  # Important applications that should never be pruned by flux
โ””โ”€โ”€๐Ÿ“ crds  # Custom resource definitions (CRDs) that need to exist globally

Storage

Storage in my cluster is handled in a number of ways. The in-cluster storage is provided by a longhorn cluster that is running on a number of my nodes.

Distributed storage

The bulk of my cluster storage relies on democratic-csi. This ensures that my data is replicated across my storage nodes.
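
In practice a workload just claims a volume from the appropriate storage class; the class name below is a placeholder, not my actual one.

# Hypothetical PVC backed by the distributed storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-config
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: distributed-block   # placeholder class name
  resources:
    requests:
      storage: 5Gi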

NFS storage

Finally, I have my NAS that exposes several exports over NFS. Given that NFS is a very bad idea for storing application data (see for example this GitHub issue), I only use it to store data at rest, such as my personal media files, Linux ISOs, backups, etc.
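
When a workload does need something from the NAS, a static NFS PersistentVolume can be mounted into it; the server address and export path below are placeholders.

# Hypothetical NFS PersistentVolume pointing at an export on the NAS
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-nfs
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - nfsvers=4.1
  nfs:
    server: 192.168.1.10    # placeholder NAS address
    path: /mnt/tank/media   # placeholder export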

Backups

Longhorn creates a daily backup of each PVC to my NAS. I have configured TrueNAS to upload those backups to B2 daily.
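
Longhorn expresses this kind of schedule as a RecurringJob; the one below is a sketch with assumed values, not my exact configuration.

# Illustrative Longhorn RecurringJob for daily PVC backups
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  task: backup          # back volumes up to the configured backup target (my NAS)
  cron: "0 3 * * *"     # every day at 03:00
  groups:
    - default           # applies to volumes in the default group
  retain: 7             # keep the last 7 backups
  concurrency: 2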

Helper Apps

Outside of the core of Kubernetes I run some apps that help maintain the state of the cluster.

kyverno

Kyverno is a policy engine for Kubernetes; policies can validate, mutate, generate and clean up Kubernetes resources.

These are the policies I'm currently running on my cluster.
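
To give a flavour of what a policy looks like, here is a small validating policy; it is illustrative and not one of my actual policies.

# Illustrative Kyverno ClusterPolicy requiring a standard label on Deployments
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-app-label
spec:
  validationFailureAction: Audit   # report violations instead of blocking
  rules:
    - name: check-app-label
      match:
        any:
          - resources:
              kinds:
                - Deployment
      validate:
        message: "Deployments must set the app.kubernetes.io/name label."
        pattern:
          metadata:
            labels:
              app.kubernetes.io/name: "?*"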

reloader

A Kubernetes controller to watch changes in ConfigMap and Secrets and do rolling upgrades on Pods with their associated Deployment, StatefulSet, DaemonSet and DeploymentConfig.

It applies changes to Pods after a ConfigMap/Secret is modified, without me having to restart them manually; an annotation on the workload enables it, as shown below.
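
For example (an illustrative Deployment, not one of mine), opting in looks like this:

# Hypothetical Deployment opted in to automatic reloads
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
  namespace: default
  annotations:
    reloader.stakater.com/auto: "true"   # roll pods when referenced ConfigMaps/Secrets change
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
        - name: echo-server
          image: docker.io/library/nginx:1.25   # placeholder image
          envFrom:
            - configMapRef:
                name: echo-server-config        # changes here trigger a rolling restart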

descheduler

Kubernetes scheduler decisions are influenced by its view of the cluster at the point in time when a new pod appears for scheduling. As Kubernetes clusters are very dynamic and their state changes over time, there may be a desire to move already-running pods to other nodes for various reasons (see the policy sketch after this list):

  • Some nodes are under or over utilized.
  • The original scheduling decision no longer holds true, as taints or labels are added to or removed from nodes and pod/node affinity requirements are no longer satisfied.
  • Some nodes failed and their pods moved to other nodes.
  • New nodes are added to clusters.
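
A descheduler policy addressing the first point could look roughly like the sketch below; the thresholds are assumptions, not my actual configuration.

# Illustrative descheduler policy evicting pods from over-utilized nodes
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  LowNodeUtilization:
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        thresholds:          # nodes below all of these are considered under-utilized
          cpu: 20
          memory: 20
          pods: 20
        targetThresholds:    # nodes above any of these are considered over-utilized
          cpu: 50
          memory: 50
          pods: 50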

node-feature-discovery

Kubernetes add-on for detecting hardware features and system configuration!

My main usage is to detect the USB Bluetooth adapter.
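
A workload that needs the adapter can then select nodes on the label NFD publishes for that USB device; the exact label key depends on the device's class/vendor/product IDs, so the one below is a placeholder.

# Hypothetical Deployment pinned to the node with the Bluetooth adapter
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bluetooth-bridge
  namespace: home
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bluetooth-bridge
  template:
    metadata:
      labels:
        app: bluetooth-bridge
    spec:
      nodeSelector:
        # placeholder key; NFD publishes usb-<class>_<vendor>_<device>.present labels
        feature.node.kubernetes.io/usb-e0_8087_0029.present: "true"
      containers:
        - name: bluetooth-bridge
          image: ghcr.io/example/bluetooth-bridge:latest   # placeholder image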