k8s-build-env
Build-environment overlay on the Talos loco cluster (Hetzner Cloud). Deploys Forgejo (v15 LTS, git forge + OCI registry) and Woodpecker (CI) on top of the generic platform provided by talos-hcloud-cluster.
The platform repo is the sole owner of the cluster itself, shared-pg, OpenBao, ESO, ingress-nginx, cert-manager, Longhorn, and the monitoring stack. This repo owns everything Forgejo- and Woodpecker-specific: helm releases, ExternalSecrets, OIDC/OAuth registration, gRPC ingress, and the CI pipeline definitions.
Services
| Service | URL | Purpose |
|---|---|---|
| Forgejo | git.loop-coop.net | Git forge (v15 LTS) + OCI container/helm registry |
| Woodpecker | ci.loop-coop.net | CI/CD: Kubernetes backend, pipelines run as pods |
Container images and Helm charts for cluster workloads are pushed to and pulled from Forgejo's OCI endpoint at git.loop-coop.net/projects/*.
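As a minimal sketch of how a reference under that endpoint might be composed, assuming images live directly under the `projects` org (the helper name `image_ref` and the image name `my-app` are hypothetical and not part of this repo):

```shell
# Hypothetical helper: build an OCI reference under the projects org.
REGISTRY_HOST="git.loop-coop.net"
REGISTRY_ORG="projects"

image_ref() {
  # $1 = image name, $2 = tag (defaults to "latest")
  printf '%s/%s/%s:%s\n' "$REGISTRY_HOST" "$REGISTRY_ORG" "$1" "${2:-latest}"
}

# Example usage (needs a prior `docker login git.loop-coop.net`):
#   docker push "$(image_ref my-app 1.2.3)"
```

Keeping the host and org in variables means a registry move only touches two lines.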
Architecture
┌─ Platform (talos-hcloud-cluster) ───────────────────────────────┐
│ database/shared-pg (CNPG, 3 replicas) │
│ openbao/openbao (3-node raft HA) │
│ external-secrets operator + per-ns SecretStores │
│ ingress-nginx, cert-manager, Longhorn, Prometheus stack │
└──────────────────────────┬──────────────────────────────────────┘
│
┌─ k8s-build-env (this repo, namespace: forgejo) ─────────────────┐
│ helm_release.forgejo │
│ └─ DB: shared-pg/forgejo, PVC: gitea-shared-storage (10Gi) │
│ └─ Admin via ESO-synced `forgejo-admin` Secret │
│ └─ FORGEJO__database__PASSWD via ESO-synced `forgejo-db` │
│ helm_release.woodpecker │
│ └─ DB: shared-pg/woodpecker (datasource via ESO `woodpecker-db`) │
│ └─ OAuth client with Forgejo via `woodpecker-forgejo-oauth` │
│ └─ gRPC NodePort 30900 + TLS Ingress grpc.ci.<domain> │
│ 8 ExternalSecrets (see external-secrets.tf) │
│ null_resource.forgejo_oidc_source (register Loop Portal OIDC) │
└─────────────────────────────────────────────────────────────────┘
Storage
| StorageClass | Replicas | Used by this repo |
|---|---|---|
| `longhorn` (default) | 1 | `gitea-shared-storage` (10Gi for Forgejo) |
The legacy longhorn-r3 SC was retired 2026-05-03. The name was a leftover from a 3-replica design abandoned 2026-04-10 (WG-mesh sync overhead consumed 80%+ worker CPU). After the bump to numberOfReplicas=1 the SC was functionally identical to default longhorn, so consumers migrated and the SC was removed from the platform module. The Forgejo PVC was tactically pinned at 10 GiB during the 2026-05-03 incident rebuild; it can grow back to 30 GiB once worker volumes are resized. PVC expansion is allowed, but Kubernetes does not permit shrinking a PVC, so 10 GiB is the floor.
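The expansion-only rule can be sketched as a pre-apply guard. This is an illustration only: the function name `pvc_resize_allowed` is hypothetical, and sizes are assumed to be whole GiB values.

```shell
# Hypothetical guard: succeeds when moving from $1 GiB to $2 GiB
# is an expansion or a no-op, fails when it would be a shrink.
pvc_resize_allowed() {
  current_gib=$1
  requested_gib=$2
  [ "$requested_gib" -ge "$current_gib" ]
}

# Example usage:
#   pvc_resize_allowed 10 30 && echo "expansion, safe to apply"
#   pvc_resize_allowed 10 5  || echo "shrink, would be rejected"
```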
Woodpecker
- Server queues pipelines; agents (K8s backend) execute them as pods in the `forgejo` namespace.
- PostgreSQL on shared-pg; datasource injected via `envFrom` from the `woodpecker-db` Secret.
- OAuth2 with Forgejo (confidential client). Client credentials in OpenBao at `secret/infra/forgejo/woodpecker-oauth`, synced to the `woodpecker-forgejo-oauth` Secret by ESO.
- Webhooks auto-registered with the internal URL via `WOODPECKER_WEBHOOK_HOST`.
- `WOODPECKER_REPO_OWNERS` auto-trusts repos from `admin_users` (default `aienv-admin, claude, stefan`) plus `projects`, `qubeos`, `upstreams`.
- Agents pull pipeline step images via the `forgejo-pull` imagePullSecret (managed by the platform module).
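To make the trusted-owner list concrete, the sketch below shows the comma-joined values that would result from the defaults above. The repo performs this join in its tofu code; the `join_csv` helper is hypothetical and only illustrates the resulting strings.

```shell
# Hypothetical helper: join all arguments with commas.
# The subshell body keeps the IFS change from leaking to the caller.
join_csv() (
  IFS=,
  printf '%s\n' "$*"
)

# Default admin_users, then the same list plus the trusted orgs.
WOODPECKER_ADMIN=$(join_csv aienv-admin claude stefan)
WOODPECKER_REPO_OWNERS=$(join_csv aienv-admin claude stefan projects qubeos upstreams)
```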
Secrets management
Secrets flow from OpenBao (platform-managed) to K8s via the External Secrets Operator. The openbao SecretStore in each namespace is seeded by the platform and points at openbao-active.openbao.svc:8200 (leader-only routing).
ExternalSecrets defined in this repo — each consumes a path under secret/infra/* in OpenBao:
| ExternalSecret | OpenBao path | K8s Secret shape | Consumers |
|---|---|---|---|
| `forgejo-pg-credentials` | `secret/infra/pg/forgejo` | `username`, `password` | CNPG migrations + scripts (shared-pg is the real cluster) |
| `forgejo-admin` | `secret/infra/forgejo/admin` | `username`, `password` | Forgejo helm `gitea.admin.existingSecret` |
| `forgejo-db` | `secret/infra/pg/forgejo` | `FORGEJO__database__PASSWD` (templated) | Forgejo pod env (`gitea.additionalConfigFromEnvs`) |
| `forgejo-oidc-client` | `secret/infra/oidc/forgejo` | `client-id`, `client-secret` | `null_resource.forgejo_oidc_source` script |
| `hetzner-s3-creds` | `secret/infra/s3/hetzner` | `ACCESS_KEY_ID`, `SECRET_ACCESS_KEY` | CNPG barmanObjectStore (on platform-owned clusters) |
| `woodpecker-pg-credentials` | `secret/infra/pg/woodpecker` | `username`, `password` | shared-pg role mgmt scripts |
| `woodpecker-db` | `secret/infra/pg/woodpecker` | `WOODPECKER_DATABASE_DATASOURCE` (templated) | Woodpecker pod env (`extraSecretNamesForEnvFrom`) |
| `woodpecker-forgejo-oauth` | `secret/infra/forgejo/woodpecker-oauth` | `WOODPECKER_FORGEJO_CLIENT`, `WOODPECKER_FORGEJO_SECRET` | Woodpecker pod env |
The Forgejo admin Secret is ESO/bao-authoritative — the helm release uses gitea.admin.existingSecret = "forgejo-admin" so Helm never overwrites the password.
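One way to confirm that ESO has synced these Secrets is to read each ExternalSecret's `Ready` condition. The sketch below assumes `kubectl` is pointed at the cluster and that the ExternalSecrets live in the `forgejo` namespace; the function names are hypothetical.

```shell
# Print the Ready condition status ("True" or "False") of one ExternalSecret.
es_ready_status() {
  # $1 = ExternalSecret name, $2 = namespace (assumed default: forgejo)
  kubectl -n "${2:-forgejo}" get externalsecret "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Succeed only if every listed ExternalSecret reports Ready=True.
all_es_ready() {
  for name in "$@"; do
    [ "$(es_ready_status "$name")" = "True" ] || return 1
  done
}

# Example usage:
#   all_es_ready forgejo-admin forgejo-db woodpecker-db && echo "secrets synced"
```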
Woodpecker OAuth2 setup (one-time bootstrap)
First deploy only. The register-woodpecker-oauth.sh script needs write access to k8s OpenBao, so it runs outside tofu:
cd ~/work/identity/k8s-build-env
KUBECONFIG=$(pwd)/.kubeconfig DOMAIN=loop-coop.net \
bao-exec k8s-bao -- bash scripts/register-woodpecker-oauth.sh
The script creates the OAuth2 app in Forgejo, then writes the credentials to secret/infra/forgejo/woodpecker-oauth. After that, the woodpecker-forgejo-oauth ExternalSecret syncs them into Kubernetes automatically.
Prerequisites
- `talos-hcloud-cluster` platform deployed: provides cluster, shared-pg, OpenBao, ESO, ingress-nginx, cert-manager, Longhorn, monitoring.
- OpenBao KV paths seeded under `secret/infra/*` (see table above). Admin creds at `secret/infra/forgejo/admin` must exist before the helm release starts.
- `tofu-state` HTTP backend reachable:
  - From workstation: `http://localhost:8223` (qrexec tunnel to the openbao qube; creds via `bao-exec tofu-state`).
  - From cluster CI: `http://10.90.0.12:8223` (mesh IP of the openbao qube via WireGuard).
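The two backend endpoints can be told apart at runtime. The sketch below is an illustration under assumptions: it treats a non-empty `CI` environment variable (Woodpecker sets `CI=woodpecker` in pipeline steps) as the signal for "running in cluster CI", and it assumes the state path `identity/k8s-build-env` is the same on both endpoints. The function name is hypothetical.

```shell
# State path as used in the workstation backend override (assumed identical in CI).
STATE_PATH="identity/k8s-build-env"

# Print the tofu-state backend address for the current environment.
state_backend_address() {
  if [ -n "${CI:-}" ]; then
    base="http://10.90.0.12:8223"   # mesh IP, reachable from cluster CI
  else
    base="http://localhost:8223"    # qrexec tunnel on the workstation
  fi
  printf '%s/%s\n' "$base" "$STATE_PATH"
}

# Example usage:
#   curl -fsS -o /dev/null -u "$USER_NAME:$PASSWORD" "$(state_backend_address)"
```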
Project structure
versions.tf                      # HTTP backend (10.90.0.12:8223), providers, platform remote state
variables.tf                     # domain, admin_users, forgejo_admin_user
terraform.tfvars                 # domain default
external-secrets.tf              # 8 ExternalSecret CRs (OpenBao → K8s Secrets)
forgejo.tf                       # namespace, kubeconfig file, helm_release.forgejo, OIDC registration
woodpecker.tf                    # helm_release.woodpecker, gRPC NodePort patch, TLS gRPC Ingress
scripts/
  register-oidc.sh               # registers Loop Portal OIDC source in Forgejo
  register-woodpecker-oauth.sh   # one-time bootstrap, see above
.woodpecker/ci.yml               # fmt + validate (push/PR)
.woodpecker/deploy.yml           # tofu apply (manual trigger)
backend_override.tf is gitignored. The committed versions.tf backend points at the mesh IP for CI; your workstation overrides to localhost:8223 via backend_override.tf:
terraform {
  backend "http" {
    address        = "http://localhost:8223/identity/k8s-build-env"
    lock_address   = "http://localhost:8223/identity/k8s-build-env"
    unlock_address = "http://localhost:8223/identity/k8s-build-env"
  }
}

data "terraform_remote_state" "platform" {
  backend = "http"
  config  = { address = "http://localhost:8223/platform/talos-platform" }
}
Variables
| Variable | Default | Description |
|---|---|---|
| `domain` | `loop-coop.net` | Base domain for ingress hostnames |
| `admin_users` | `["aienv-admin","claude","stefan"]` | Comma-joined into `WOODPECKER_ADMIN` + `WOODPECKER_REPO_OWNERS` |
| `forgejo_admin_user` | `claude` | `gitea.admin.username` (password comes from ESO-synced Secret) |
No password TF variables — admin credentials are ESO-managed.
CI/CD
- ci.yml (push / PR): `tofu fmt -check -recursive` + `tofu init -backend=false && tofu validate`.
- deploy.yml (manual): installs `kubectl` in the ci-runner image, then `tofu init && tofu apply -auto-approve`. Uses the `tofu_state_user` + `tofu_state_password` Woodpecker org secrets to authenticate against the HTTP backend at `10.90.0.12:8223`.
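The two pipelines can be approximated locally with the sketch below. It is not the content of the pipeline files. It assumes the org secrets arrive as `TOFU_STATE_USER` and `TOFU_STATE_PASSWORD` environment variables (hypothetical names) and passes them through `TF_HTTP_USERNAME` / `TF_HTTP_PASSWORD`, the variables the HTTP state backend reads for basic auth. The `kubectl` install step is omitted.

```shell
# Mirror of the ci.yml checks: formatting, then validation without a backend.
ci_checks() {
  tofu fmt -check -recursive &&
    tofu init -backend=false &&
    tofu validate
}

# Mirror of the deploy.yml apply step. Runs in a subshell so the exported
# backend credentials do not leak into the calling shell.
deploy() (
  export TF_HTTP_USERNAME="${TOFU_STATE_USER:?TOFU_STATE_USER is not set}"
  export TF_HTTP_PASSWORD="${TOFU_STATE_PASSWORD:?TOFU_STATE_PASSWORD is not set}"
  tofu init && tofu apply -auto-approve
)
```

Running `ci_checks` before pushing catches the same formatting and validation failures the push/PR pipeline would report.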
Woodpecker org secrets (projects org)
| Secret | Purpose |
|---|---|
| `tofu_state_user` | HTTP basic-auth user for the tofu-state backend |
| `tofu_state_password` | HTTP basic-auth password for the tofu-state backend |
| `forgejo_token` | Forgejo API token for release uploads etc. |
| `forgejo_registry_username` | Pull/push creds for git.loop-coop.net OCI registry |
| `forgejo_registry_password` | Pull/push creds for git.loop-coop.net OCI registry |
Deploy
Manual (workstation)
cd /home/user/work/identity/k8s-build-env
bao-exec admin -- bao-exec tofu-stack -- /usr/bin/tofu init
bao-exec admin -- bao-exec tofu-stack -- /usr/bin/tofu apply
The tofu-stack scope injects the state-backend creds via env.
CI (manual Woodpecker trigger)
bao-exec woodpecker -- woodpecker-cli pipeline create --branch main projects/k8s-build-env
Relationship to talos-hcloud-cluster
Everything under "Platform" in the architecture diagram above is provided by talos-hcloud-cluster. This repo adds:
- Forgejo + Woodpecker helm releases.
- 8 ExternalSecrets with `infra/*` bao paths.
- Loop Portal OIDC source registration in Forgejo.
- Woodpecker ↔ Forgejo OAuth app (one-time bootstrap script).
- gRPC TLS Ingress at `grpc.ci.<domain>` for external Woodpecker agents.
- CI pipelines for this repo itself.
Generic resources (namespace, SecretStore, forgejo-pull imagePullSecret, TLS cert backup, Prometheus ServiceMonitor/PodMonitor, RBAC RoleBindings) stay in talos-hcloud-cluster/stack/modules/services/.
Cost
Marginal: this repo adds no new Hetzner infrastructure, only helm releases on top of the platform cluster, which costs ~55 EUR/month.