statevault 29bd00ace6 fix(forgejo): codify persistence.size = 19Gi (matches live PVC)
Live Forgejo PVC was already expanded to 19Gi on 2026-05-12 after
registry blob accumulation (Odoo base images + kaniko cache layers)
filled the 10Gi tactical pin from the 2026-05-03 incident rebuild.
Codifies the live state so the next `tofu apply` doesn't drift to
10Gi (Longhorn would reject the shrink anyway, but the chart spec
would be misleading).

Comment in-file captures: bump rationale, why 20Gi was rejected by
the longhorn admission webhook (loco-wo-0 over-provisioned by ~70
MiB at OverProvisioningPercentage=100), and the long-term fix path
(split registry to its own PVC OR raise overprovisioning).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 14:13:55 +02:00
.woodpecker fix(ci): install kubectl in deploy step 2026-04-23 11:55:44 +02:00
docs fix(forgejo): switch storageClass from longhorn-r3 → longhorn 2026-05-03 12:47:44 +02:00
scripts feat(woodpecker): codify org-state bootstrap (SA patch + secrets + repo activation) 2026-05-04 13:49:28 +02:00
.gitignore fix(forgejo): move Forgejo DB to shared-pg, drop dedicated cluster 2026-04-23 11:00:04 +02:00
.terraform.lock.hcl Integrate Forgejo with OpenBao OIDC authentication 2026-04-02 03:46:35 +02:00
bao.yml feat(forgejo): codify claude admin PAT auto-issuance 2026-05-04 11:39:32 +02:00
external-secrets.tf feat: take ownership of Forgejo + Woodpecker from platform 2026-04-23 12:34:56 +02:00
forgejo-claude-pat.tf feat(forgejo): outbound mail via Stalwart with per-service token 2026-05-06 20:54:30 +02:00
forgejo-pull-secret.tf fix(woodpecker): codify forgejo-pull imagePullSecret in forgejo namespace 2026-05-04 13:09:14 +02:00
forgejo.tf fix(forgejo): codify persistence.size = 19Gi (matches live PVC) 2026-05-13 14:13:55 +02:00
mkdocs.yml Docs overhaul: add OIDC, RBAC, monitoring, expanded secrets coverage 2026-04-02 17:59:07 +02:00
README.md docs(readme): update tofu-state path example to canonical keyspace 2026-05-10 07:31:09 +02:00
terraform.tfvars chore: remove Harbor from tf (Harbor was retired) 2026-04-23 11:11:07 +02:00
variables.tf chore(variables): point smtp_token description at secret/infra/smtp/forgejo 2026-05-06 21:10:24 +02:00
versions.tf chore(state): repoint platform remote-state read to platform/talos-platform 2026-05-01 17:59:17 +02:00
woodpecker-bootstrap.tf feat(forgejo): outbound mail via Stalwart with per-service token 2026-05-06 20:54:30 +02:00
woodpecker.tf feat(woodpecker): persist CI pool node labels in tofu 2026-05-11 16:07:44 +02:00

k8s-build-env

Build-environment overlay on the Talos loco cluster (Hetzner Cloud). Deploys Forgejo (v15 LTS, git forge + OCI registry) and Woodpecker (CI) on top of the generic platform provided by talos-hcloud-cluster.

The platform repo is the sole owner of the cluster itself, shared-pg, OpenBao, ESO, ingress-nginx, cert-manager, Longhorn, and the monitoring stack. This repo owns everything Forgejo- and Woodpecker-specific: helm releases, ExternalSecrets, OIDC/OAuth registration, gRPC ingress, and the CI pipeline definitions.

Services

| Service | URL | Purpose |
|---|---|---|
| Forgejo | git.loop-coop.net | Git forge (v15 LTS) + OCI container/helm registry |
| Woodpecker | ci.loop-coop.net | CI/CD (Kubernetes backend; pipelines run as pods) |

Container images and Helm charts for cluster workloads are pushed to and pulled from Forgejo's OCI endpoint at git.loop-coop.net/projects/*.
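As a rough illustration of that naming scheme, the sketch below builds a fully qualified reference under the projects owner. The helper name image_ref, the latest tag, and the chart filename in the comments are hypothetical; only the registry host and owner come from this README.

```bash
#!/usr/bin/env bash
# Sketch only: REGISTRY and OWNER mirror the endpoint named above;
# image_ref is a hypothetical helper, not something shipped in this repo.
REGISTRY="git.loop-coop.net"
OWNER="projects"

# image_ref NAME TAG -> fully qualified OCI reference
image_ref() {
  printf '%s/%s/%s:%s\n' "$REGISTRY" "$OWNER" "$1" "$2"
}

# Typical use (needs docker/helm plus registry credentials, so left commented):
#   docker push "$(image_ref ci-runner latest)"
#   helm push mychart-0.1.0.tgz "oci://${REGISTRY}/${OWNER}"
```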

Architecture

┌─ Platform (talos-hcloud-cluster) ───────────────────────────────┐
│  database/shared-pg (CNPG, 3 replicas)                          │
│  openbao/openbao (3-node raft HA)                               │
│  external-secrets operator + per-ns SecretStores                │
│  ingress-nginx, cert-manager, Longhorn, Prometheus stack        │
└──────────────────────────┬──────────────────────────────────────┘
                           │
┌─ k8s-build-env (this repo, namespace: forgejo) ─────────────────┐
│  helm_release.forgejo                                           │
│    └─ DB: shared-pg/forgejo, PVC: gitea-shared-storage (19Gi)   │
│    └─ Admin via ESO-synced `forgejo-admin` Secret                │
│    └─ FORGEJO__database__PASSWD via ESO-synced `forgejo-db`     │
│  helm_release.woodpecker                                        │
│    └─ DB: shared-pg/woodpecker (datasource via ESO `woodpecker-db`) │
│    └─ OAuth client with Forgejo via `woodpecker-forgejo-oauth`  │
│    └─ gRPC NodePort 30900 + TLS Ingress grpc.ci.<domain>        │
│  8 ExternalSecrets (see external-secrets.tf)                    │
│  null_resource.forgejo_oidc_source (register Loop Portal OIDC)  │
└─────────────────────────────────────────────────────────────────┘

Storage

| StorageClass | Replicas | Used by this repo |
|---|---|---|
| longhorn (default) | 1 | gitea-shared-storage (19Gi for Forgejo) |

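The legacy longhorn-r3 StorageClass was retired on 2026-05-03. Its name was a leftover from a 3-replica design abandoned on 2026-04-10, when WireGuard-mesh sync overhead consumed more than 80% of worker CPU. Once numberOfReplicas was reduced to 1, the class was functionally identical to the default longhorn class, so consumers migrated and it was removed from the platform module.

The Forgejo PVC was tactically pinned at 10 GiB during the 2026-05-03 incident rebuild, then expanded to 19 GiB on 2026-05-12 after registry blobs (Odoo base images and kaniko cache layers) filled it; forgejo.tf codifies persistence.size = 19Gi to match. A 20 GiB request was rejected by the Longhorn admission webhook because it would have over-provisioned loco-wo-0 by about 70 MiB at OverProvisioningPercentage=100. Growing back to 30 GiB requires resizing the worker volumes, raising over-provisioning, or splitting the registry onto its own PVC. PVC expansion is allowed; Kubernetes rejects any request to shrink a PVC below its current size.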
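The expansion-only rule can be checked before an apply with a small guard. This is a minimal sketch, assuming sizes are whole-GiB quantities such as 10Gi or 30Gi; to_gib and is_expansion are hypothetical helpers, not part of this repo.

```bash
#!/usr/bin/env bash
# Sketch only: handles whole-GiB quantities such as "10Gi" or "30Gi".

# to_gib QUANTITY -> integer GiB (strips the "Gi" suffix)
to_gib() {
  printf '%s\n' "${1%Gi}"
}

# is_expansion LIVE DESIRED -> exit 0 when DESIRED >= LIVE (no shrink)
is_expansion() {
  [ "$(to_gib "$2")" -ge "$(to_gib "$1")" ]
}

# The live size could be read with (needs cluster access, so left commented):
#   kubectl -n forgejo get pvc gitea-shared-storage \
#     -o jsonpath='{.status.capacity.storage}'
if is_expansion 10Gi 30Gi; then
  echo "10Gi -> 30Gi: ok (expansion)"
fi
if ! is_expansion 30Gi 10Gi; then
  echo "30Gi -> 10Gi: refused (shrink)"
fi
```

Comparing the codified size against the live one this way catches a stale value in forgejo.tf before the storage layer rejects it.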

Woodpecker

  • Server queues pipelines; agents (K8s backend) execute them as pods in the forgejo namespace.
  • PostgreSQL on shared-pg; datasource injected via envFrom from the woodpecker-db Secret.
  • OAuth2 with Forgejo (confidential client). Client credentials in OpenBao at secret/infra/forgejo/woodpecker-oauth, synced to the woodpecker-forgejo-oauth Secret by ESO.
  • Webhooks auto-registered with the internal URL via WOODPECKER_WEBHOOK_HOST.
  • WOODPECKER_REPO_OWNERS allow-lists repos owned by admin_users (default aienv-admin, claude, stefan) plus projects, qubeos, and upstreams.
  • Agents pull pipeline step images via the forgejo-pull imagePullSecret (managed by the platform module).
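The owner list in the bullets above is a comma-joined string. The sketch below shows the joining in plain bash, assuming the default admin_users; in the repo itself the value comes from the admin_users tofu variable, and join_by_comma is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: illustrates the comma-joined shape, not the tofu code itself.

# join_by_comma ITEM... -> "a,b,c"
join_by_comma() {
  local IFS=,
  printf '%s\n' "$*"
}

admin_users=(aienv-admin claude stefan)    # default from this README
extra_owners=(projects qubeos upstreams)   # additional owners listed above

WOODPECKER_ADMIN="$(join_by_comma "${admin_users[@]}")"
WOODPECKER_REPO_OWNERS="$(join_by_comma "${admin_users[@]}" "${extra_owners[@]}")"

echo "WOODPECKER_ADMIN=${WOODPECKER_ADMIN}"
echo "WOODPECKER_REPO_OWNERS=${WOODPECKER_REPO_OWNERS}"
```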

Secrets management

Secrets flow from OpenBao (platform-managed) to K8s via the External Secrets Operator. The openbao SecretStore in each namespace is seeded by the platform and points at openbao-active.openbao.svc:8200 (leader-only routing).

ExternalSecrets defined in this repo — each consumes a path under secret/infra/* in OpenBao:

| ExternalSecret | OpenBao path | K8s Secret shape | Consumers |
|---|---|---|---|
| forgejo-pg-credentials | secret/infra/pg/forgejo | username, password | CNPG migrations + scripts (shared-pg is the real cluster) |
| forgejo-admin | secret/infra/forgejo/admin | username, password | Forgejo helm gitea.admin.existingSecret |
| forgejo-db | secret/infra/pg/forgejo | FORGEJO__database__PASSWD (templated) | Forgejo pod env (gitea.additionalConfigFromEnvs) |
| forgejo-oidc-client | secret/infra/oidc/forgejo | client-id, client-secret | null_resource.forgejo_oidc_source script |
| hetzner-s3-creds | secret/infra/s3/hetzner | ACCESS_KEY_ID, SECRET_ACCESS_KEY | CNPG barmanObjectStore (on platform-owned clusters) |
| woodpecker-pg-credentials | secret/infra/pg/woodpecker | username, password | shared-pg role mgmt scripts |
| woodpecker-db | secret/infra/pg/woodpecker | WOODPECKER_DATABASE_DATASOURCE (templated) | Woodpecker pod env (extraSecretNamesForEnvFrom) |
| woodpecker-forgejo-oauth | secret/infra/forgejo/woodpecker-oauth | WOODPECKER_FORGEJO_CLIENT, WOODPECKER_FORGEJO_SECRET | Woodpecker pod env |

The Forgejo admin Secret is ESO/bao-authoritative — the helm release uses gitea.admin.existingSecret = "forgejo-admin" so Helm never overwrites the password.
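To confirm that ESO has actually synced these Secrets, a check along the following lines may help. It is a sketch under two assumptions: that all ExternalSecrets live in the forgejo namespace, and that each reports a Ready condition in its status. not_ready is a hypothetical helper, and the statuses in the demonstration are made up.

```bash
#!/usr/bin/env bash
# Sketch only: not_ready is a hypothetical helper, not part of this repo.

# not_ready: reads "<name> <Ready-status>" lines on stdin and prints the
# names whose Ready status is anything other than "True" (including missing).
not_ready() {
  awk '$2 != "True" { print $1 }'
}

# Against a live cluster the input could come from (left commented):
#   kubectl -n forgejo get externalsecrets.external-secrets.io \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
#     | not_ready

# Offline demonstration with made-up statuses:
printf '%s\n' 'forgejo-admin True' 'forgejo-db False' 'woodpecker-db True' | not_ready
```

An empty result means every listed ExternalSecret is Ready.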

Woodpecker OAuth2 setup (one-time bootstrap)

First deploy only. The register-woodpecker-oauth.sh script needs write access to k8s OpenBao, so it runs outside tofu:

cd ~/work/identity/k8s-build-env
KUBECONFIG=$(pwd)/.kubeconfig DOMAIN=loop-coop.net \
  bao-exec k8s-bao -- bash scripts/register-woodpecker-oauth.sh

The script creates the OAuth2 app in Forgejo, then writes the credentials to secret/infra/forgejo/woodpecker-oauth. After that, the woodpecker-forgejo-oauth ExternalSecret syncs them into Kubernetes automatically.
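The two steps can be pictured roughly as follows. This is not the contents of scripts/register-woodpecker-oauth.sh. It is a sketch that assumes a Forgejo API token in FORGEJO_TOKEN, an app named woodpecker, a KV mount called secret, and the key names client_id and client_secret, none of which are confirmed by this README.

```bash
#!/usr/bin/env bash
# Sketch only: NOT the real bootstrap script. FORGEJO_TOKEN, the app name
# "woodpecker", the "secret" mount, and the bao key names are assumptions.

# oauth_payload DOMAIN -> JSON body for Forgejo's create-OAuth2-application API
oauth_payload() {
  printf '{"name":"woodpecker","redirect_uris":["https://ci.%s/authorize"],"confidential_client":true}\n' "$1"
}

# register_app DOMAIN: create the app and print Forgejo's JSON response
register_app() {
  curl -fsS -X POST \
    -H "Authorization: token ${FORGEJO_TOKEN:?set FORGEJO_TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "$(oauth_payload "$1")" \
    "https://git.$1/api/v1/user/applications/oauth2"
}

# store_credentials CLIENT_ID CLIENT_SECRET: write both values to OpenBao KV
store_credentials() {
  bao kv put -mount=secret infra/forgejo/woodpecker-oauth \
    client_id="$1" client_secret="$2"
}

# Only the payload is printed here; the network calls are left to the caller.
oauth_payload loop-coop.net
```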

Prerequisites

  • talos-hcloud-cluster platform deployed — provides cluster, shared-pg, OpenBao, ESO, ingress-nginx, cert-manager, Longhorn, monitoring.
  • OpenBao KV paths seeded under secret/infra/* (see table above). Admin creds at secret/infra/forgejo/admin must exist before the helm release starts.
  • tofu-state HTTP backend reachable:
    • From workstation: http://localhost:8223 (qrexec tunnel to the openbao qube; creds via bao-exec tofu-state).
    • From cluster CI: http://10.90.0.12:8223 (mesh IP of the openbao qube via WireGuard).
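Which of the two addresses applies can be decided from the environment. The sketch below assumes the CI variable is set inside Woodpecker pipeline steps and unset on a workstation; state_base_url and state_address are hypothetical helpers.

```bash
#!/usr/bin/env bash
# Sketch only: assumes CI is non-empty inside pipeline steps; the helper
# names are hypothetical.

# state_base_url: mesh IP inside CI, qrexec tunnel on a workstation
state_base_url() {
  if [ -n "${CI:-}" ]; then
    echo "http://10.90.0.12:8223"
  else
    echo "http://localhost:8223"
  fi
}

# state_address KEYSPACE -> full backend address for one state
state_address() {
  printf '%s/%s\n' "$(state_base_url)" "$1"
}

state_address identity/k8s-build-env
```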

Project structure

versions.tf           # HTTP backend (10.90.0.12:8223), providers, platform remote state
variables.tf          # domain, admin_users, forgejo_admin_user
terraform.tfvars      # domain default
external-secrets.tf   # 8 ExternalSecret CRs (OpenBao → K8s Secrets)
forgejo.tf            # namespace, kubeconfig file, helm_release.forgejo, OIDC registration
woodpecker.tf         # helm_release.woodpecker, gRPC NodePort patch, TLS gRPC Ingress
scripts/
  register-oidc.sh               # registers Loop Portal OIDC source in Forgejo
  register-woodpecker-oauth.sh   # one-time bootstrap — see above
.woodpecker/ci.yml    # fmt + validate (push/PR)
.woodpecker/deploy.yml # tofu apply (manual trigger)

backend_override.tf is gitignored. The committed versions.tf backend points at the mesh IP for CI; your workstation overrides to localhost:8223 via backend_override.tf:

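# backend_override.tf (gitignored): workstation-only overrides for versions.tf
terraform {
  backend "http" {
    # localhost:8223 is the qrexec tunnel to the openbao qube
    address        = "http://localhost:8223/identity/k8s-build-env"
    lock_address   = "http://localhost:8223/identity/k8s-build-env"
    unlock_address = "http://localhost:8223/identity/k8s-build-env"
  }
}

# Platform remote state, read through the same tunnel
data "terraform_remote_state" "platform" {
  backend = "http"
  config  = { address = "http://localhost:8223/platform/talos-platform" }
}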

Variables

| Variable | Default | Description |
|---|---|---|
| domain | loop-coop.net | Base domain for ingress hostnames |
| admin_users | ["aienv-admin","claude","stefan"] | Comma-joined into WOODPECKER_ADMIN + WOODPECKER_REPO_OWNERS |
| forgejo_admin_user | claude | gitea.admin.username (password comes from ESO-synced Secret) |

No password TF variables — admin credentials are ESO-managed.

CI/CD

  • ci.yml (push / PR): tofu fmt -check -recursive + tofu init -backend=false && tofu validate.
  • deploy.yml (manual): installs kubectl in the ci-runner image, then tofu init && tofu apply -auto-approve. Uses the tofu_state_user + tofu_state_password Woodpecker org secrets to authenticate against the HTTP backend at 10.90.0.12:8223.
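The two pipelines boil down to a handful of tofu invocations, sketched below as shell functions. The http backend reads basic-auth credentials from TF_HTTP_USERNAME and TF_HTTP_PASSWORD. The names TOFU_STATE_USER and TOFU_STATE_PASSWORD are assumptions about how the pipeline exposes the org secrets, and the TOFU variable exists only so that a stub can stand in for the real binary.

```bash
#!/usr/bin/env bash
# Sketch only: TOFU_STATE_USER / TOFU_STATE_PASSWORD are assumed env names for
# the org secrets; TOFU can point at a stub when no tofu binary is available.
TOFU="${TOFU:-tofu}"

# ci_checks: the same three commands ci.yml runs on push / PR
ci_checks() {
  "$TOFU" fmt -check -recursive &&
    "$TOFU" init -backend=false &&
    "$TOFU" validate
}

# deploy: hand the backend credentials to tofu via the http backend's env vars
deploy() {
  export TF_HTTP_USERNAME="${TOFU_STATE_USER:?missing TOFU_STATE_USER}"
  export TF_HTTP_PASSWORD="${TOFU_STATE_PASSWORD:?missing TOFU_STATE_PASSWORD}"
  "$TOFU" init && "$TOFU" apply -auto-approve
}
```

Running ci_checks locally before pushing gives the same pass/fail signal as ci.yml without waiting for a pipeline.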

Woodpecker org secrets (projects org)

| Secret | Purpose |
|---|---|
| tofu_state_user | HTTP basic-auth user for the tofu-state backend |
| tofu_state_password | HTTP basic-auth password for the tofu-state backend |
| forgejo_token | Forgejo API token for release uploads etc. |
| forgejo_registry_username | Pull/push creds for git.loop-coop.net OCI registry |
| forgejo_registry_password | Pull/push creds for git.loop-coop.net OCI registry |

Deploy

Manual (workstation)

cd ~/work/identity/k8s-build-env
bao-exec admin -- bao-exec tofu-stack -- /usr/bin/tofu init
bao-exec admin -- bao-exec tofu-stack -- /usr/bin/tofu apply

The tofu-stack scope injects the state-backend creds via env.

CI (manual Woodpecker trigger)

bao-exec woodpecker -- woodpecker-cli pipeline create --branch main projects/k8s-build-env

Relationship to talos-hcloud-cluster

Everything under "Platform" in the architecture diagram above is provided by talos-hcloud-cluster. This repo adds:

  • Forgejo + Woodpecker helm releases.
  • 8 ExternalSecrets with infra/* bao paths.
  • Loop Portal OIDC source registration in Forgejo.
  • Woodpecker ↔ Forgejo OAuth app (one-time bootstrap script).
  • gRPC TLS Ingress at grpc.ci.<domain> for external Woodpecker agents.
  • CI pipelines for this repo itself.

Generic resources (namespace, SecretStore, forgejo-pull imagePullSecret, TLS cert backup, Prometheus ServiceMonitor/PodMonitor, RBAC RoleBindings) stay in talos-hcloud-cluster/stack/modules/services/.

Cost

Marginal. This repo adds no new Hetzner infrastructure, only helm releases on top of the existing platform cluster (which runs at ~55 EUR/month).