homelab
Talos cluster
This repo contains the Ansible code to init the Talos k8s cluster and the apps inside that cluster. The Ansible is designed to be idempotent, so you can run the commands repeatedly against a cluster and it will only bring things up if they aren't already in place. There are two phases (sketched below):
- init the cluster infra
- bring up the apps:
  - core apps
  - user apps
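A minimal sketch of how the two phases could be chained from a top-level playbook; the playbook file names here are assumptions, not taken from the repo.

```yaml
# site.yml (hypothetical) - runs the phases in order
- import_playbook: infra.yml       # phase 1: init the cluster infra
- import_playbook: core_apps.yml   # phase 2a: core apps
- import_playbook: user_apps.yml   # phase 2b: user apps
```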
Init cluster infra
To run the infra playbook you need three nodes running in Talos maintenance mode (booted from a live USB). If Talos was already installed on those nodes, you'll need to wipe the disk (for example by installing another OS) and then boot from the Talos USB again. The playbook will:
- install a specific version of talosctl
- apply the machine configs and bootstrap the cluster (see the sketch after this list)
- install Cilium networking
- init the virtual IP
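The apply/bootstrap step boils down to driving talosctl from Ansible. A minimal sketch, assuming hypothetical inventory variables for node IPs and the rendered config paths:

```yaml
# Hypothetical tasks - node IPs, paths and variable names are placeholders.
- name: Apply machine config to a node in maintenance mode
  ansible.builtin.command: >
    talosctl apply-config --insecure
    --nodes {{ node_ip }}
    --file talos/rendered/{{ inventory_hostname }}.yaml

- name: Bootstrap etcd on the first control plane node
  ansible.builtin.command: >
    talosctl bootstrap
    --nodes {{ first_cp_ip }}
    --endpoints {{ first_cp_ip }}
  run_once: true
```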
Steps to run before the init cluster infra step:
rm -rf talos/rendered/
rm ~/.talos/config
Command
make infra
Init core apps
Prerequisites: create a secret_vars.yaml with an 'email_addr' value and a cloudflare_config.json with the relevant values (a sketch follows). I've cloned the Cloudflare DDNS repo so that I don't have to pin the Docker image to "latest" as the original repo does.
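A minimal sketch of secret_vars.yaml; only email_addr is mentioned above and the value is a placeholder. The cloudflare_config.json sits alongside it and follows the schema of the cloned Cloudflare DDNS project.

```yaml
# secret_vars.yaml (hypothetical layout) - keep this file out of version control
email_addr: admin@example.com   # placeholder value
```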
- MetalLB
- NGINX ingress controller
- cert-manager
- Blocky (DNS proxy and ad blocker)
- WireGuard
- dynamic DNS
- NFS storage CSI
- kube-prometheus-stack
- loki-stack
- version-checker
To run Ansible that uses the Kubernetes community package, you need to run the Ansible commands from a Python venv:
. venv/homelab/bin/activate
make core-apps
make all-apps
make app app=<app-name> to make a single app
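For context, a minimal sketch of the kind of task that needs the Kubernetes community collection (and the Python kubernetes client) available inside the activated venv; the namespace name is illustrative, not taken from the repo.

```yaml
# Hypothetical task - requires the kubernetes.core collection and the 'kubernetes'
# Python package installed in the venv.
- name: Ensure a namespace exists
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: monitoring
```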
Core explained
MetalLB provides three IPs that it load balances from (see the pool sketch after the list); the fourth address below is the Talos control plane VIP, which Talos manages itself.
- 192.168.222.197 - the cluster load balancer for apps etc
- 192.168.222.198 - wireguard server
- 192.168.222.199 - load balancer used for internal network traffic to access the blocky dns server
- 192.168.222.200 - talos control plane vip
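A minimal sketch of a MetalLB pool covering the three load-balanced addresses above; the pool name and namespace are assumptions, only the address range comes from the list.

```yaml
# Hypothetical manifests - only the address range is taken from the list above.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.222.197-192.168.222.199
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - homelab-pool
```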
Blocky is the network-wide DNS server: all traffic on the VPN or local networks resolves DNS via Blocky. It also serves as an ad blocker by DNS-sinkholing any unwanted traffic. We run 3 replicas and a Redis server to share a cache of DNS queries across the instances. Blocky is also configured to use DNS over TLS (DoT), which encrypts DNS packets so third parties, especially our ISP, can't snoop.
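A minimal sketch of the relevant parts of a Blocky config, assuming a recent Blocky release (key names have changed between versions); the upstream, blocklist URL and Redis hostname are placeholders.

```yaml
upstreams:
  groups:
    default:
      - tcp-tls:1.1.1.1:853          # DNS over TLS upstream, so queries can't be read in transit
blocking:
  denylists:
    ads:
      - https://example.com/ads.txt  # placeholder blocklist URL
  clientGroupsBlock:
    default:
      - ads
redis:
  address: blocky-redis:6379         # shared query cache across the 3 replicas
```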
WireGuard is our VPN and allows users to tunnel safely and securely into our network. Once the router directs us to the WireGuard app, DNS is resolved by our Blocky server, which points us at the cluster IP of the app we asked for.
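A minimal sketch of how the WireGuard server could claim its dedicated MetalLB address; the service name, selector and annotation style are assumptions, only the IP comes from the list above (51820 is simply the standard WireGuard port).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: wireguard
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.222.198
spec:
  type: LoadBalancer
  selector:
    app: wireguard
  ports:
    - name: wg
      port: 51820
      targetPort: 51820
      protocol: UDP
```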
External (non-local) traffic flows from the VPN into the router, on to Blocky for DNS, and then to whichever app was requested in the cluster.
Internal (local) traffic resolves DNS via Blocky from inside the network and then goes out to the internet or to whatever local service it needs.
Monitoring
The monitoring namespace contains kube-prometheus-stack, which sets up Prometheus, Grafana and Alertmanager. It also contains the loki-stack chart, which deploys Loki for logging. It's important to note that this chart is no longer supported and can't take the Loki version past 2.9.3. Find the Loki dashboard here.
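A minimal sketch of deploying the two charts with Ansible's Helm module; release names, the namespace and the exact values layout are assumptions, and the Loki tag is pinned because the chart is unmaintained.

```yaml
- name: Deploy kube-prometheus-stack
  kubernetes.core.helm:
    name: kube-prometheus-stack
    chart_ref: prometheus-community/kube-prometheus-stack
    release_namespace: monitoring

- name: Deploy loki-stack (no longer maintained, so keep it pinned)
  kubernetes.core.helm:
    name: loki-stack
    chart_ref: grafana/loki-stack
    release_namespace: monitoring
    values:
      loki:
        image:
          tag: 2.9.3   # the chart can't go past this version
```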
Version Checker
Version-checker scans the cluster and checks the versions of running images against their repos for the latest releases. However, the scanner doesn't pick up images in the GitHub container registry (ghcr), so I need to check those images manually:
- forgejo (external to the cluster)
- mealie
- sonarr (hotio)
- radarr (hotio)
- prowlarr (hotio)
Init user apps
mealie
Contains ~3000 recipes scraped programmatically. The Mealie app doesn't appear in version-checker because its image is hosted on GitHub, so I need to check it manually. A Kubernetes CronJob triggers the DB backup at 1am on the 1st of each month; the backup is then pushed to a bucket at 3am on the 2nd of each month, triggered by a cron job in TrueNAS.
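A minimal sketch of the monthly backup CronJob; only the schedule (1am on the 1st) comes from the text above, the image and command are placeholders.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mealie-db-backup
spec:
  schedule: "0 1 1 * *"        # 01:00 on the 1st of every month
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: example/db-backup:stable   # placeholder image
              args: ["--dump", "mealie"]        # placeholder command
```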
actual-budget
The Actual Budget server keeps data in sync between the clients. The server and the clients should run the same version. Backups are taken on the client device.
photo-prism
Photos are pushed from the client devices (while they're on charge) to the PhotoPrism instance. PhotoPrism uses MariaDB for metadata, and the actual images are kept on the NFS (TrueNAS). If you ever get into trouble with the MariaDB instance (which has a PVC), don't hesitate to blast the PVC and restart the MariaDB instance; you can then restore the DB from the NFS backup found at photo_prism/storage/backup/mysql. Backups are pushed to a bucket nightly (via a TrueNAS sync task).
joplin-server
Joplin Server keeps notes in sync across devices. The server uses a Postgres instance to store the data; currently backups are handled on the client.
kanboard
Data is stored on the NFS and is split into plugins, board data and certs (the certs aren't really relevant here because I use the NGINX ingress). The entire kanboard folder is fairly small and is pushed to a bucket nightly; the DB is found at kanboard/data/db.sqlite.
sonarr, radarr, prowlarr, flaresolverr, rclone and jellyfin
These apps are found in the arrs namespace and work by sharing volume mounts and connecting to an externally hosted seedbox. Files are then rclone-mounted into a shared directory and worked on by the arr apps (see the sketch below).
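A minimal sketch of the shared-mount pattern: each arr deployment mounts the same claim so the rclone mount is visible to all of them. The claim name, mount path and image tag are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sonarr          # same pattern for radarr, prowlarr, jellyfin
  namespace: arrs
spec:
  selector:
    matchLabels: { app: sonarr }
  template:
    metadata:
      labels: { app: sonarr }
    spec:
      containers:
        - name: sonarr
          image: ghcr.io/hotio/sonarr:release   # placeholder tag
          volumeMounts:
            - name: media
              mountPath: /data                  # placeholder shared path
      volumes:
        - name: media
          persistentVolumeClaim:
            claimName: arrs-media               # placeholder claim shared across the apps
```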
find-my-device (fmd)
An open source, self-hosted device tracker that lets you locate, ring, wipe and issue other commands to your device when it's lost. It aims to be a secure open source alternative to Google's Find My Device. Server source and Android app source. Uses ntfy to push notifications.
ntfy
The server pushes notifications to the Find My Device phone app (website).
readeck
A nice platform to save, read, tag and highlight online articles. Useful for understanding an article; when you revisit it you can read through the highlights rather than the whole article.
