openova/platform/cert-manager/blueprint.yaml
e3mrah 83f9fc429a
fix(bp-cert-manager): add CRD-establishment gate to close ClusterIssuer race (#149) (#1355)
Closes #149 (prov #24, c776423270f4ae30): bp-cert-manager terminal failure
"no matches for kind ClusterIssuer in version cert-manager.io/v1" — the
post-install ClusterIssuer hook (weight 5) fires before the cert-manager.io
ClusterIssuer CRD reaches status.conditions[?(@.type=="Established")].status
== "True". The upstream Jetstack subchart installs CRDs as regular templates
(no helm.sh/hook), so kubectl apply returns when the resource is CREATED,
not when the apiextensions-apiserver has finished Establishing it.
Establishment is asynchronous in the apiserver and has been observed to take
up to 30s on fresh Hetzner cold-start k3s.

Target-state fix per docs/INVIOLABLE-PRINCIPLES.md #4 (no hardcoded
band-aids): a post-install,post-upgrade hook-weight -10 Job that polls
every CRD in values.crdGate.crds for Established=True. Only after the
gate exits 0 does the ClusterIssuer hook (weight 5) fire. Models the
canonical webhook-gate pattern from bp-external-secrets-stores (#137,
#143), using the same SA + ClusterRole + ClusterRoleBinding + Job set.
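
The gate described above could look roughly like the following hook Job. This is a minimal sketch, not the chart's actual template: the resource name `cert-manager-crd-gate`, the placeholder image, the second CRD in the loop, and the literal 300s/2s timings are assumptions for illustration (the commit says the real chart reads these from `values.crdGate`). The ServiceAccount and ClusterRoleBinding are omitted for brevity.

```yaml
# Sketch only: names, image, CRD list, and literal timings are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: cert-manager-crd-gate            # hypothetical name
  annotations:
    helm.sh/hook: post-install,post-upgrade
    helm.sh/hook-weight: "-10"           # lower weight runs first, ahead of the weight-5 ClusterIssuer hook
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
spec:
  backoffLimit: 0                        # fail the release instead of retrying indefinitely
  template:
    spec:
      serviceAccountName: cert-manager-crd-gate   # hypothetical; bound to the ClusterRole below
      restartPolicy: Never
      containers:
        - name: wait-for-crds
          image: registry.example.com/kubectl:placeholder   # placeholder; must provide sh and kubectl
          command:
            - /bin/sh
            - -c
            - |
              # Poll each CRD until its Established condition is "True" or the budget runs out.
              deadline=$(( $(date +%s) + 300 ))
              for crd in clusterissuers.cert-manager.io certificates.cert-manager.io; do
                until [ "$(kubectl get crd "$crd" -o jsonpath='{.status.conditions[?(@.type=="Established")].status}' 2>/dev/null)" = "True" ]; do
                  if [ "$(date +%s)" -ge "$deadline" ]; then
                    echo "timed out waiting for $crd to become Established" >&2
                    exit 1
                  fi
                  sleep 2
                done
                echo "$crd is Established"
              done
---
# Read-only access to CRD objects is all the gate needs.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cert-manager-crd-gate            # hypothetical name
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["get"]
```

Because Helm runs hooks of the same type in ascending weight order and waits for a Job hook to complete, a non-zero exit here stops the release before the ClusterIssuer hook is attempted.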

300s budget gives ~10x headroom over worst-case observed Established
latency while still failing fast on a genuinely broken upstream.

Chart 1.1.2 -> 1.2.0 (minor bump: new templates + new values stanza).
HR pins in clusters/_template + clusters/omantel + clusters/otech
bumped to 1.2.0.
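
For orientation, a version pin of this kind might look like the fragment below. It assumes "HR" refers to a Flux HelmRelease; the namespace, interval, and `sourceRef` name are hypothetical and not taken from the repository.

```yaml
# Sketch only: assumes Flux HelmRelease; namespace, interval, and sourceRef are hypothetical.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-cert-manager
  namespace: flux-system                 # assumed namespace
spec:
  interval: 10m                          # assumed reconcile interval
  chart:
    spec:
      chart: bp-cert-manager
      version: 1.2.0                     # the pin bumped from 1.1.2 by this commit
      sourceRef:
        kind: HelmRepository
        name: example-blueprints         # hypothetical source name
```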

Per principle 16: canonical seam = the chart's templates/clusterissuer-*.yaml
post-install hook. Per principle 18: every gate knob (enabled, crds,
timeoutSeconds, intervalSeconds, image, imagePullPolicy) templatable.
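
A values stanza exposing those knobs might be shaped as follows. The key names come from the list above; every default except the 300s timeout is an assumption, as is the choice to model `image` as a single string.

```yaml
# Sketch of the crdGate values stanza; defaults are assumptions unless noted.
crdGate:
  enabled: true                          # assumed default
  crds:                                  # CRDs polled for Established=True
    - clusterissuers.cert-manager.io
  timeoutSeconds: 300                    # budget stated in the commit message
  intervalSeconds: 2                     # assumed polling interval
  image: registry.example.com/kubectl:placeholder   # placeholder image reference
  imagePullPolicy: IfNotPresent          # assumed default
```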

## Claimed TCs

- prov #24 bp-cert-manager Ready=True (and downstream HRs that depend on
  cert-manager: bp-cilium-gateway, bp-harbor, bp-gitea, bp-keycloak,
  bp-openbao, bp-catalyst-platform — all unblocked once cert-manager
  goes Ready)

Co-authored-by: openova-bot <claude@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 08:28:06 +04:00


apiVersion: catalyst.openova.io/v1alpha1
kind: Blueprint
metadata:
  name: bp-cert-manager
  labels:
    catalyst.openova.io/section: pts-3-3-security-and-policy
spec:
  version: 1.2.0
  card:
    title: cert-manager
    summary: TLS certificate automation. Let's Encrypt issuer with Dynadot DNS-01 for omani.works pool, HTTP-01 for BYO domains.
  visibility: unlisted  # mandatory infra, auto-installed by bootstrap kit
  manifests:
    chart: ./chart
  depends: []
  # ── Outputs advertised to dependent Blueprints (#113) ────────────────────
  # Blueprints that issue Certificates (cilium-gateway, harbor, gitea, etc.)
  # consume `issuerName` rather than hardcoding "letsencrypt-prod" so that
  # operators can swap the active issuer (DNS-01 vs HTTP-01) via the
  # bp-catalyst-platform umbrella values without editing every dependent
  # chart. The chart's templates/clusterissuer-letsencrypt-dns01.yaml ships
  # both issuers; this output names the one a dependent Blueprint should
  # default to in production.
  outputs:
    # Default issuer name. As of openova#159 (bp-cert-manager-dynadot-webhook
    # ships) the wildcard-capable DNS-01 issuer is enabled by default in
    # the chart's values.yaml; dependents that issue wildcard certs (the
    # Cilium Gateway's *.<sub>.<pool> TLS listener) reference this name
    # directly. Cluster overlays MAY revert to letsencrypt-http01-prod by
    # flipping certManager.issuers.dns01.enabled=false in the umbrella
    # chart values; the http01 issuer remains templated for that path.
    issuerName: letsencrypt-dns01-prod
    # Kind is always ClusterIssuer for Catalyst: the wildcard cert lives
    # in cilium-gateway and is consumed cluster-wide.
    issuerKind: ClusterIssuer
    # The TARGET-STATE wildcard issuer name. Equal to issuerName above now
    # that DNS-01 is the default; kept as a separate field so a dependent
    # Blueprint that explicitly needs the wildcard variant can pin to it
    # independent of which issuer is currently the default.
    wildcardIssuerName: letsencrypt-dns01-prod
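
As an illustration of how a dependent Blueprint might consume these outputs, the sketch below shows a cert-manager Certificate whose `issuerRef` carries the `issuerName` and `issuerKind` values. The Certificate name, namespace, secret name, and DNS name are hypothetical; a real dependent chart would presumably template the issuer fields from the outputs rather than hardcode them.

```yaml
# Sketch only: metadata, secretName, and dnsNames are hypothetical placeholders.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: gateway-wildcard-tls             # hypothetical name
  namespace: example-gateway             # hypothetical namespace
spec:
  secretName: gateway-wildcard-tls       # Secret that will hold the issued key pair
  dnsNames:
    - "*.example.com"                    # placeholder wildcard; wildcards require a DNS-01 issuer
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer                  # outputs.issuerKind
    name: letsencrypt-dns01-prod         # outputs.issuerName / outputs.wildcardIssuerName
```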