Introduction
My home operations repository
... managed with Flux, Renovate and GitHub Actions
Welcome to my Home Operations repository. This is a mono repository for my home infrastructure and Kubernetes cluster implementing Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate and GitHub Actions.
Thanks
Thanks to all the people who donate their time to the Kubernetes @Home Discord community. A lot of inspiration for my cluster comes from the people who have shared their clusters using the k8s-at-home GitHub topic. Be sure to check out the Kubernetes @Home search for ideas on how to deploy applications or get ideas on what you can deploy.
Hardware
Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose |
---|---|---|---|---|---|---|
MikroTik RB5009UG+S+IN | 1 | -- | 1GB NAND | 1GB | RouterOS 7.10 | Router |
HP ProCurve 1810G-24 | 1 | -- | -- | 512MB | -- | Switch |
HP EliteDesk 800 G2 mini | 1 | 240GB NVMe | 256GB SSD | 16GB | Talos 1.5.5 | k8s Master |
HP 260 G3 DM | 1 | 256GB SSD | 540GB NVMe | 16GB | Talos 1.5.5 | k8s Master |
DELL Wyse 5060 | 1 | 240GB SSD | -- | 16GB | Talos 1.5.5 | k8s Master |
Lenovo M910x | 1 | 256GB NVMe | -- | 8GB | Talos 1.5.5 | k8s Worker |
HP ProDesk G5 mini | 1 | 256GB NVMe | 500GB NVMe | 16GB | Talos 1.5.5 | k8s Worker |
Raspberry Pi 3B | 1 | 32GB SDCard | -- | 1GB | Raspbian | Pi-hole |
NAS | 1 | 120GB SSD | 8TB RAIDZ | 16GB | TrueNAS CORE | NFS/Backup |
NAS (Detailed)
Type | Item |
---|---|
CPU | Intel Core i5-6500 3.2 GHz Quad-Core Processor |
CPU Cooler | Intel Stock |
Motherboard | MSI H110M PRO-VH Micro ATX LGA1151 |
Memory | Crucial Ballistix Sport LT 16 GB (2 x 8 GB) DDR4-3200 CL16 |
Storage (Boot) | Kingston A400 120 GB 2.5" SSD |
Storage (Data) | Seagate IronWolf NAS 4 TB 3.5" 5400 RPM Internal Hard Drive x 3 |
Storage Controller | 10Gtek® Internal SAS/SATA RAID Controller PCI Express Host Bus Adapter for LSI 9211-8I, LSI SAS2008 Chip, 8-Port 6Gb/s |
Case | Fractal Design Node 804 MicroATX Mid Tower Case |
Power Supply | Corsair CV550 550 W 80+ Bronze Certified ATX Power Supply |
Network
My DNS setup may seem a bit complicated at first, but it allows for completely automatic management of DNS entries for Services and Ingress objects.
Components
NGINX
NGINX is my cluster Ingress controller. It is bound to a LoadBalancer IP provided by Cilium so I can access the services directly.
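As a sketch, assuming Cilium's LB-IPAM is used to hand out the address (the namespace, selector and IP below are placeholders, not my actual configuration):

```yaml
# Sketch: requesting a specific LoadBalancer IP from Cilium's LB-IPAM for the
# ingress controller Service. Namespace, selector and address are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: networking
  annotations:
    io.cilium/lb-ipam-ips: "192.168.1.30"  # illustrative address on my LAN
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443
```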
external-dns
external-dns runs in my cluster and is connected to my domain's DNS server. It automatically manages records for all my Ingresses that have the external-dns/is-public: true annotation set.
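For example, an Ingress carrying that annotation might look like this (the hostname, names and ports are placeholders):

```yaml
# Sketch: an Ingress that external-dns will publish to my public DNS zone.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  annotations:
    external-dns/is-public: "true"   # tells external-dns to create a public record
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```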
Cloudflared
In order to expose my services to the outside world, I have a Cloudflare Tunnel running directly into my cluster using Cloudflared; that way I don't need to open any ports on my router.
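A sketch of what the Cloudflared configuration can look like; the tunnel ID, hostname and origin service below are placeholders:

```yaml
# Sketch of a cloudflared tunnel configuration. Traffic for the public
# hostname is handed to the in-cluster ingress controller, so no router
# ports need to be opened.
tunnel: <tunnel-id>
credentials-file: /etc/cloudflared/credentials.json
ingress:
  - hostname: "*.example.com"
    service: https://ingress-nginx-controller.networking.svc.cluster.local:443
    originRequest:
      noTLSVerify: true          # the ingress controller serves its own certs
  - service: http_status:404     # catch-all rule required by cloudflared
```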
How it all works together
When I am connected to my home network, my DNS server is the Pi-hole running on my network. I have configured it to forward all requests for my domain to the k8s-gateway address (a load balancer IP provided by Cilium), which provides internal DNS resolution for my cluster.
# /etc/dnsmasq.d/99-k8s-gateway-forward.conf
server=/${SECRET_DOMAIN}/${CILIUM_K8S_GATEWAY_ADDR}
When I am outside my home network and request an address for one of my domains, the query goes to my domain's DNS server, which responds with the DNS record that was set by Cloudflared.
Cloud services
While most of my infrastructure and workloads are self-hosted, I do rely on the cloud for certain key parts of my setup. This saves me from having to worry about two things: (1) chicken-and-egg scenarios, and (2) services I critically need whether my cluster is online or not.
The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Authentik. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in.
Service | Use | Cost |
---|---|---|
GitHub | Hosting this repository and continuous integration/deployments | Free |
Cloudflare | Domain, DNS and proxy management | Free |
B2 Storage | Offsite application backups | Free |
Terraform Cloud | Store Terraform state online | Free |
Total | | ~$0/mo |
Kubernetes
My cluster is provisioned on bare metal using Talos Linux.
This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate server provides (NFS) file storage.
Core Components
- metallb: A network load-balancer implementation using standard routing protocols
- kube-vip: Provides static virtual IPs for services
- cert-manager: Creates SSL certificates for services in my Kubernetes cluster (a sample issuer is sketched after this list).
- external-dns: Automatically manages DNS records from my cluster in a cloud DNS provider.
- longhorn: Distributed block storage for persistent volumes
- traefik: Ingress controller to expose HTTP traffic to pods over DNS
- sops: Manages secrets for Kubernetes, Talos and Terraform so they can be committed to Git.
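As an example of the cert-manager piece mentioned above, here is a ClusterIssuer sketch using ACME DNS-01 validation through Cloudflare; the e-mail address and secret names are placeholders:

```yaml
# Sketch: a cert-manager ClusterIssuer solving ACME DNS-01 challenges via Cloudflare.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: me@example.com
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token   # Secret holding a scoped Cloudflare API token
              key: api-token
```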
GitOps
Flux watches my cluster folder (see Directory structure) and makes the changes to my cluster based on the YAML manifests.
The way Flux works for me here is that it recursively searches the cluster/apps folder until it finds the top-level kustomization.yaml in each directory and then applies all the resources listed in it. That kustomization.yaml will generally only contain a namespace resource and one or more Flux kustomizations. Those Flux kustomizations will generally have a HelmRelease or other resources related to the application underneath them, which will be applied.
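A sketch of that layout; the file paths, names and GitRepository reference below are illustrative rather than exactly what this repository uses:

```yaml
# cluster/apps/default/kustomization.yaml -- the "top level" kustomization for a namespace
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - echo-server/ks.yaml   # one Flux Kustomization per application
---
# cluster/apps/default/echo-server/ks.yaml -- the Flux Kustomization for a single app
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: echo-server
  namespace: flux-system
spec:
  interval: 10m
  path: ./cluster/apps/default/echo-server/app   # contains the HelmRelease and friends
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-ops
```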
Renovate watches my entire repository looking for dependency updates. When they are found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
Directory structure
My home-ops repository contains the following directories under cluster.
📁 cluster       # My main kubernetes cluster
├──📁 apps       # Apps deployed into my cluster grouped by namespace (see below)
├──📁 base       # Flux entrypoint
├──📁 core       # Important applications that should never be pruned by flux
└──📁 crds       # Custom resource definitions (CRDs) that need to exist globally
Storage
Storage in my cluster is handled in a number of ways. The in-cluster storage is provided by a longhorn cluster that is running on a number of my nodes.
Distributed storage
The bulk of my cluster storage relies on democratic-csi. This ensures that my data is replicated across my storage nodes.
NFS storage
Finally, I have my NAS that exposes several exports over NFS. Given that NFS is a poor fit for storing application data (see for example this GitHub issue), I only use it to store data at rest, such as my personal media files, Linux ISOs, backups, etc.
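For illustration, a statically provisioned NFS PersistentVolume of the kind I use for this bulk data; the server address, export path and size are placeholders:

```yaml
# Sketch: a PersistentVolume backed by an NFS export on the NAS.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-nfs
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany            # media can be mounted by multiple pods
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.10       # illustrative NAS address
    path: /mnt/tank/media      # illustrative export path
  mountOptions:
    - nfsvers=4.1
```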
Backups
Longhorn creates a daily backup of each PVC to my NAS. I have configured TrueNAS to upload those backups daily to Backblaze B2.
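A sketch of how such a schedule can be expressed with a Longhorn RecurringJob; the cron expression, group and retention count below are illustrative:

```yaml
# Sketch: a Longhorn RecurringJob that backs up volumes in the "default" group every night.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"   # every day at 03:00
  task: backup        # write a backup to the configured backup target (my NAS)
  groups:
    - default
  retain: 7           # keep a week of backups
  concurrency: 2
```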
Helper Apps
Outside of the core of Kubernetes I run some apps that help maintain the state of the cluster.
kyverno
Kyverno is a policy engine for Kubernetes; policies can validate, mutate, generate and clean up Kubernetes resources.
These are the policies I'm currently running on my cluster.
- remove-cpu-limits: This policy removes CPU limits from all Pods.
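A rough sketch of what such a mutation policy can look like; this is my approximation, not necessarily the exact policy in this repository:

```yaml
# Sketch: a Kyverno mutation policy that strips CPU limits from incoming Pods.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: remove-cpu-limits
spec:
  rules:
    - name: remove-cpu-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        foreach:
          - list: "request.object.spec.containers"
            preconditions:
              all:
                # only patch containers that actually declare a CPU limit
                - key: "{{ element.resources.limits.cpu || '' }}"
                  operator: NotEquals
                  value: ""
            patchesJson6902: |-
              - op: remove
                path: /spec/containers/{{elementIndex}}/resources/limits/cpu
```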
reloader
A Kubernetes controller to watch changes in ConfigMap and Secrets and do rolling upgrades on Pods with their associated Deployment, StatefulSet, DaemonSet and DeploymentConfig.
This lets me apply changes to Pods after modifying a ConfigMap/Secret without having to restart them manually; it is enabled via an annotation (see the sketch below).
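A sketch of that opt-in, using reloader's auto annotation on a Deployment (names and image are placeholders):

```yaml
# Sketch: a Deployment that reloader will roll whenever its referenced
# ConfigMaps or Secrets change.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
  annotations:
    reloader.stakater.com/auto: "true"   # watch referenced ConfigMaps/Secrets
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: ghcr.io/example/app:latest
          envFrom:
            - configMapRef:
                name: example-app-config   # changes here trigger a rolling restart
```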
descheduler
Kubernetes scheduler decisions are influenced by its view of the cluster at the point in time when a new pod appears for scheduling. As Kubernetes clusters are very dynamic and their state changes over time, there may be a desire to move already running pods to other nodes for various reasons:
- Some nodes are under or over utilized.
- The original scheduling decision does not hold true any more, as taints or labels are added to or removed from nodes and pod/node affinity requirements are no longer satisfied.
- Some nodes failed and their pods moved to other nodes.
- New nodes are added to clusters.
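A minimal sketch of a descheduler policy covering some of these cases, using the classic v1alpha1 ConfigMap format; the strategies and thresholds are illustrative, not my exact settings:

```yaml
# Sketch: a DeschedulerPolicy that evicts duplicate pods, pods violating node
# affinity, and rebalances under-/over-utilized nodes.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  RemoveDuplicates:
    enabled: true
  RemovePodsViolatingNodeAffinity:
    enabled: true
    params:
      nodeAffinityType:
        - requiredDuringSchedulingIgnoredDuringExecution
  LowNodeUtilization:
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        thresholds:        # nodes below these values are considered under-utilized
          cpu: 20
          memory: 20
          pods: 20
        targetThresholds:  # nodes above these values are candidates for eviction
          cpu: 50
          memory: 50
          pods: 50
```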
node-feature-discovery
Kubernetes add-on for detecting hardware features and system configuration!
My main usage is to detect the USB Bluetooth adapter.
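A sketch of how such a label can be consumed; the label name below is illustrative, since the real one is derived from the adapter's USB class/vendor/product IDs as discovered by node-feature-discovery:

```yaml
# Sketch: pinning a Bluetooth-dependent workload to the node where
# node-feature-discovery detected the USB adapter.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bluetooth-app
spec:
  selector:
    matchLabels:
      app: bluetooth-app
  template:
    metadata:
      labels:
        app: bluetooth-app
    spec:
      nodeSelector:
        # illustrative NFD label for a USB Bluetooth adapter
        feature.node.kubernetes.io/usb-e0_0a12_0001.present: "true"
      containers:
        - name: app
          image: ghcr.io/example/bluetooth-app:latest
```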