Commit Graph

18 Commits

Author SHA1 Message Date
e3mrah
ab67a48fe7
fix(blueprints): align blueprint.yaml spec.version with Chart.yaml version (#817) (#819)
TestBootstrapKit_BlueprintCardsHaveRequiredFields was failing on main for
9 blueprints because their platform/<name>/chart/Chart.yaml version had
been bumped without a matching update to platform/<name>/blueprint.yaml
spec.version. The pre-existing failure forced 7 recent PRs to self-merge
with --admin, masking real CI failures.

Aligned spec.version to match Chart.yaml version on:

  cert-manager   1.1.1 -> 1.1.2
  flux           1.1.3 -> 1.1.4
  crossplane     1.1.3 -> 1.1.4
  sealed-secrets 1.1.1 -> 1.1.2
  spire          1.1.4 -> 1.1.7
  nats-jetstream 1.1.1 -> 1.1.2
  openbao        1.2.0 -> 1.2.14
  keycloak       1.3.1 -> 1.3.2
  gitea          1.2.1 -> 1.2.3

Verified locally:

  $ go test ./... -run TestBootstrapKit_BlueprintCardsHaveRequiredFields -count=1
  --- PASS: TestBootstrapKit_BlueprintCardsHaveRequiredFields (0.01s)
      ... all 10 sub-tests pass (cilium + the 9 above)

The existing test (tests/e2e/bootstrap-kit/main_test.go:145) is itself
the drift guardrail: it fails CI whenever Chart.yaml is bumped without a
matching blueprint.yaml bump. No additional script needed.
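
For illustration only, the same drift check can be sketched in shell. The
real guardrail is the Go test named above; the function names below are
hypothetical, and the blueprint.yaml layout (a top-level `spec:` key with a
two-space-indented `version:` child) is an assumption.

```shell
# Hypothetical helpers; not part of the repository.

# Print the top-level "version:" value from a Chart.yaml.
chart_version() {
  awk '/^version:/ { print $2; exit }' "$1" | tr -d "\"'"
}

# Print spec.version from a blueprint.yaml.
# Assumes "spec:" is top-level and "version:" is indented by two spaces.
blueprint_spec_version() {
  awk '
    /^spec:/                 { in_spec = 1; next }
    /^[^ \t#]/               { in_spec = 0 }
    in_spec && /^  version:/ { print $2; exit }
  ' "$1" | tr -d "\"'"
}

# Return 0 when both versions match, 1 (with a message) when they drift.
check_drift() {
  local chart bp
  chart="$(chart_version "$1")"
  bp="$(blueprint_spec_version "$2")"
  if [ "$chart" != "$bp" ]; then
    echo "drift: Chart.yaml=$chart blueprint.yaml spec.version=$bp"
    return 1
  fi
  echo "ok: $chart"
}
```

Called as `check_drift platform/<name>/chart/Chart.yaml platform/<name>/blueprint.yaml`,
it exits non-zero whenever the two versions differ.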

Closes #817 once verified on main.

Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>
2026-05-04 22:32:49 +04:00
e3mrah
83ec889f06
feat(platform): add global.imageRegistry to remaining bp-* charts + bp-catalyst-platform (PR 3/3, #560) (#580)
Charts bumped:
- bp-keycloak 1.2.0 -> 1.2.1 (subchart stub; per-component image.registry knobs documented)
- bp-crossplane 1.1.3 -> 1.1.4 (subchart stub)
- bp-crossplane-claims 1.1.0 -> 1.1.1 (global.kubectlImage added; kubectl Job image templated; Hetzner ubuntu-24.04 server images intentionally untouched)
- bp-velero 1.2.0 -> 1.2.1 (subchart stub)
- bp-kyverno 1.0.0 -> 1.0.1 (subchart stub; per-controller image.registry knobs documented)
- bp-trivy 1.0.0 -> 1.0.1 (subchart stub; both operator + scanner image.registry knobs documented)
- bp-grafana 1.0.0 -> 1.0.1 (subchart stub)
- bp-flux 1.1.3 -> 1.1.4 (subchart stub; per-controller image.repository knobs documented)
- bp-catalyst-platform 1.1.13 -> 1.1.14 (global.imageRegistry + images.{catalystApi,catalystUi,marketplaceApi,console,smeTag} added; all 14 Catalyst-authored image refs templated: catalyst-api, catalyst-ui, marketplace-api, console + 10 SME services)

Post-handover per-Sovereign overlays set global.imageRegistry to harbor.<sovereign-fqdn> so every container image pull routes through the Sovereign's own Harbor proxy_cache.

Closes (partial): issue #560 — all 23 bp-* charts now carry global.imageRegistry

Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
2026-05-02 13:21:53 +04:00
e3mrah
2d1799d738
fix(bp-crossplane): split XRDs+Compositions into bp-crossplane-claims (#247)
Resolves install ordering on fresh clusters where the apiserver rejects
CompositeResourceDefinition CRs because the apiextensions.crossplane.io
CRDs registered by the crossplane subchart aren't live yet at apply time.

- bp-crossplane bumped 1.1.2 -> 1.1.3 (controller-only payload)
- NEW bp-crossplane-claims@1.0.0 carries XRDs + Compositions
- Flux HelmRelease for crossplane-claims uses dependsOn: [bp-crossplane]
- composition-validate.sh + fixtures relocate to the new chart
- blueprint-release CI: opt-out annotation
  catalyst.openova.io/no-upstream=true permits zero-deps charts that
  legitimately ship only Catalyst-authored CRs (the original hollow-chart
  rule remains in force for every other umbrella chart)

Live error this fixes (from otech.omani.works):
  no matches for kind "CompositeResourceDefinition" in version
  "apiextensions.crossplane.io/v1" -- ensure CRDs are installed first

Pattern: intra-chart CRD-ordering breaks -> split charts + Flux dependsOn.
Apply universally to similar cases going forward.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:55:05 +04:00
e3mrah
f658757962
fix(bp-crossplane): resolve CHART_DIR to absolute path in composition-validate.sh (#237)
CI invokes the script as `bash <script> "platform/crossplane/chart"` from
the repo root. The script then `cd`s into that relative path, which works,
but every later `"$CHART_DIR/<sub>"` reference (notably FIXTURE_DIR for
Case 6) inherits the now-stale relative prefix and resolves under the
wrong cwd. Fix: resolve CHART_DIR via `(cd ... && pwd)` to an absolute
path BEFORE the chdir.
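
A minimal sketch of the pattern described above. `resolve_dir` is a
hypothetical name used only for illustration; the actual script may inline
the expression.

```shell
# Resolve a possibly-relative directory to an absolute path.
# The subshell keeps the caller's cwd unchanged; pwd prints the absolute path.
resolve_dir() {
  (cd "$1" && pwd)
}
```

With `CHART_DIR="$(resolve_dir "$1")"` set before `cd "$CHART_DIR"`, later
references such as `"$CHART_DIR/tests/fixtures"` stay valid no matter what
the current directory is.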

Local repro before fix:

  $ bash platform/crossplane/chart/tests/composition-validate.sh \
        platform/crossplane/chart
  ...
  Case 6: every fixture XRC kind is matched by an XRD
  FAIL: fixtures dir platform/crossplane/chart/tests/fixtures missing

Local result after fix:

  $ bash platform/crossplane/chart/tests/composition-validate.sh \
        platform/crossplane/chart
  ...
  Case 6: every fixture XRC kind is matched by an XRD
    PASS
  All bp-crossplane Day-2 CRUD Composition gates green.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:36:07 +02:00
e3mrah
8592d20919
feat(bp-crossplane): 6 XRDs + Compositions for Day-2 CRUD (RegionClaim/ClusterClaim/NodePoolClaim/LoadBalancerClaim/PeeringClaim/NodeActionClaim) (#236)
Adds the 6 CompositeResourceDefinitions and matching Compositions that
back the catalyst-api Day-2 CRUD endpoints. catalyst-api writes XRCs of
these kinds; Crossplane materialises them into provider-hcloud (and a
small number of provider-kubernetes) managed resources. Per
docs/INVIOLABLE-PRINCIPLES.md #3, every cloud-side op flows through
provider-hcloud — never bespoke hcloud-go calls or shell-outs to the
hcloud CLI.

XRDs (canonical group: compose.openova.io/v1alpha1):

  - RegionClaim       → composes the Phase-0 quartet via provider-hcloud:
                        Network + NetworkSubnet + Firewall + Server (cp1)
                        + LoadBalancer + LoadBalancerNetwork +
                        LoadBalancerService×2 + LoadBalancerTarget. Mirrors
                        infra/hetzner/main.tf 1:1 so deletion of a
                        RegionClaim cascades the whole slice.
  - ClusterClaim      → composes a provider-kubernetes Object that
                        materialises a cluster-identity ConfigMap. The
                        catalyst-environment-controller reads the CM to
                        template per-server cloud-init.
  - NodePoolClaim     → composes up to 100 provider-hcloud Server
                        resources. UPDATE flow: patching replicas n→m
                        flips the per-index Required-policy gate so
                        Crossplane creates/deletes Server CRs.
  - LoadBalancerClaim → composes provider-hcloud LoadBalancer +
                        LoadBalancerNetwork + up to 50
                        LoadBalancerService entries (per listener) + up
                        to 50 LoadBalancerTarget entries. UPDATE: patch
                        listeners[]/targets[] → composite controller
                        adds/removes services/targets.
  - PeeringClaim      → composes 1 or 2 provider-hcloud Route resources
                        (bidirectional flag toggles the second one
                        through a Required-policy gate).
  - NodeActionClaim   → composes a provider-kubernetes Object that
                        creates a batch/v1 Job running kubectl
                        cordon/drain (k8s-side op, not a cloud op, per
                        the task spec). action=replace additionally
                        composes a provider-hcloud Server for the
                        replacement node.

UPDATE/DELETE summary:

  - UPDATE: every mutable schema field is patched onto the underlying
    managed resource; Crossplane's composite controller drives the diff
    and provider-hcloud reconciles to the new state.
  - DELETE: every composed resource has deletionPolicy: Delete, so a
    cascade delete of the composite tears down the whole resource graph
    in dependency-safe order (Crossplane retries until deps unblock).

New tests:
  - tests/composition-validate.sh — 7 gates: helm renders cleanly,
    exactly 6 XRDs, ≥ 6 Compositions, all 6 expected claim kinds
    present, every rendered doc is valid YAML, every fixture references
    a real XRD, and (when KUBECONFIG + Crossplane CRDs available)
    server-side dry-run for every fixture.
  - tests/fixtures/<kind>-sample.yaml — one XRC fixture per kind.

Version bump:
  - platform/crossplane/chart/Chart.yaml             1.1.1 → 1.1.2
  - platform/crossplane/blueprint.yaml               1.1.1 → 1.1.2
  - clusters/_template/bootstrap-kit/04-crossplane.yaml         → 1.1.2
  - clusters/otech.omani.works/bootstrap-kit/04-crossplane.yaml → 1.1.2

Hard rules respected:
  - provider-hcloud only for cloud ops (never hcloud-go, never CLI).
  - provider-kubernetes Object for k8s-side ops (never raw kubectl).
  - No bespoke kubectl manifests for cloud resources.
  - Frontend + catalyst-api Go code untouched (sibling-owned).
  - Target state, no MVP framing — all 6 Compositions ship.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:33:38 +02:00
e3mrah
1f5c76def1
fix(platform): sync blueprint.yaml versions with Chart.yaml (#199)
* feat(ui): Playwright cosmetic + step-flow regression guards

15 regression guards in
products/catalyst/bootstrap/ui/e2e/cosmetic-guards.spec.ts that fail HARD
when each user-flagged defect class returns:

  1.  card height drift from canonical 108px
  2.  reserved right padding eating description width
  3.  logo tile drift from per-brand LOGO_SURFACE
  4.  invisible glyph (white-on-white) via luminance proxy
  5.  wizard step order Org/Topology/Provider/Credentials/Components/
      Domain/Review
  6.  legacy "Choose Your Stack" / "Always Included" tab labels
  7.  Domain step reachable before Components
  8.  CPX32 not the recommended Hetzner SKU
  9.  per-region SKU dropdown shows wrong provider catalog
  10. provision page is .html (static) not SPA route
  11. legacy bubble/edge DAG SVG markup on provision page
  12. admin sidebar drift from canonical core/console (w-56 + 7 labels)
  13. AppDetail uses tablist instead of sectioned layout
  14. job rows navigate to /job/<id> instead of expand-in-place
  15. Phase 0 banners (Hetzner infra / Cluster bootstrap) on AdminPage

Each test prints a failure message naming the canonical reference,
the source-of-truth file, and the data-testid PR needed (if any) so
the implementing agent has a precise target. No .skip() — per
INVIOLABLE-PRINCIPLES #2, missing components fail loud.

CI: .github/workflows/cosmetic-guards.yaml runs the suite on every
PR that touches products/catalyst/bootstrap/ui/** or core/console/**.

Docs: docs/UI-REGRESSION-GUARDS.md maps each test to the user's
original complaint, the canonical reference, and the green/red
semantics (5 tests intentionally RED on main today — they stay red
until the companion-agent's UI work lands).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(platform): sync blueprint.yaml versions with Chart.yaml so manifest-validation passes

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:07:55 +04:00
hatiyildiz
b1638f51ea fix(bp-* tests): skip helm dep build when charts/ already vendored
Earlier rerun failure on the CI workflow (bp-cert-manager 25120060270):

  Error: no repository definition for https://charts.jetstack.io.
  Please add the missing repos via 'helm repo add'

Root cause: blueprint-release.yaml's earlier `helm dependency build`
step (line 181) successfully resolves the upstream chart and populates
chart/charts/ — but it does NOT `helm repo add` the upstream repo
first. Helm 3.20's `helm dep build` succeeds on the first call by
falling back to direct-URL fetch from Chart.yaml `dependencies[].repository`.
A SECOND `helm dep build` (run by the test script) hits a different
code path that requires the repo to be in the helm repo cache.

Fix: tests/observability-toggle.sh now skips `helm dep build` when
chart/charts/ is already populated (which is always the case in CI
since the workflow's own `helm dependency build` step ran first). Local
dev runs from a fresh checkout still resolve subcharts.
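
A sketch of the skip logic, assuming the check is simply "does chart/charts/
contain anything". The function names are hypothetical and the real
tests/observability-toggle.sh may test for this differently.

```shell
# Succeed when <chart_dir>/charts exists and is non-empty.
charts_vendored() {
  local dir="$1/charts"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Run `helm dependency build` only when subcharts are not vendored yet.
maybe_dep_build() {
  if charts_vendored "$1"; then
    echo "skip: $1/charts already populated"
  else
    helm dependency build "$1"
  fi
}
```

In CI the first branch is taken because the workflow step has already
populated chart/charts/; on a fresh local checkout the second branch
resolves the subcharts.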

Refs #182

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:12:21 +02:00
hatiyildiz
d34facc040 fix(bp-*): observability toggles default false — break circular CRD dependency
bp-cilium@1.1.0 install fails on every fresh Sovereign with:

  no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
  — ensure CRDs are installed first

Cascades to all 10 other bp-* HelmReleases ("dep is not ready") since
bp-cilium is the root of the bootstrap dep graph. Verified live on
omantel.omani.works 2026-04-29 (issue #182).

Root cause: platform/cilium/chart/values.yaml and
platform/cert-manager/chart/values.yaml hardcoded
`serviceMonitor.enabled: true`. The monitoring.coreos.com/v1 CRDs ship
with kube-prometheus-stack — an Application-tier Blueprint that itself
depends on the bootstrap-kit. Hardcoding `true` creates a circular CRD
ordering: bp-cilium wants the CRD bp-kube-prometheus-stack provides, but
bp-kube-prometheus-stack cannot install before bp-cilium.

The `trustCRDsExist=true` mitigation only suppresses Helm's render-time
gate; the apiserver still rejects the resource at install-time.

Violates INVIOLABLE-PRINCIPLES.md #4 (never hardcode): observability
toggles MUST be operator-tunable, not chart-level constants assuming an
observability tier exists.

This commit:

A. Defaults every observability toggle false in the affected wrappers:
   - platform/cilium/chart/values.yaml:
     cilium.prometheus.enabled: false
     cilium.prometheus.serviceMonitor.enabled: false
     (trustCRDsExist removed — no longer relevant)
   - platform/cert-manager/chart/values.yaml:
     cert-manager.prometheus.enabled: false
     cert-manager.prometheus.servicemonitor.enabled: false
   - platform/crossplane/chart/values.yaml:
     crossplane.metrics.enabled: false
     (uniformity rule — does not break install but holds the invariant)

B. Bumps affected wrapper charts 1.1.0 → 1.1.1:
   - bp-cilium, bp-cert-manager, bp-crossplane (leaves)
   - bp-catalyst-platform (umbrella; deps repinned to 1.1.1 for the 3)

C. Updates clusters/_template/bootstrap-kit/* and
   clusters/omantel.omani.works/bootstrap-kit/* HelmRelease versions to
   1.1.1 so the live Sovereign picks up the fix on Flux reconcile.

D. Adds platform/<name>/chart/tests/observability-toggle.sh under each
   affected chart. Each script asserts:
     - default render produces zero monitoring.coreos.com refs
     - opt-in render with --set <toggle>=true succeeds and produces a
       ServiceMonitor (proves the toggle is wired)
     - explicit-off render succeeds and produces zero refs
   Wired into .github/workflows/blueprint-release.yaml via a new
   "Run chart integration tests" step that executes every
   chart/tests/*.sh on every publish, so a regression that re-introduces
   a hardcoded `true` fails the publish job before the OCI artifact is
   pushed.

E. Documents the rule in docs/BLUEPRINT-AUTHORING.md §11.2
   "Observability toggles must default false". References Principle #4
   and provides the canonical pattern (default off in wrapper values,
   opt-in via per-cluster overlay at clusters/<sovereign>/...).
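
The default-render assertion from item D can be sketched as below. This is an
illustrative approximation, not the shipped observability-toggle.sh; both
function names are hypothetical.

```shell
# Count lines of a rendered manifest (stdin) that mention the
# Prometheus Operator API group.
count_monitoring_refs() {
  # grep -c prints 0 but exits 1 on zero matches, so mask the exit code.
  grep -c 'monitoring\.coreos\.com' || true
}

# Succeed only when the manifest on stdin has zero such references.
assert_no_monitoring_refs() {
  local n
  n="$(count_monitoring_refs)"
  if [ "$n" -ne 0 ]; then
    echo "FAIL: $n monitoring.coreos.com refs in render"
    return 1
  fi
  echo "PASS"
}
```

A default render would be piped in as
`helm template "$CHART_DIR" | assert_no_monitoring_refs`; the opt-in case
adds `--set <toggle>=true` and expects `count_monitoring_refs` to report a
non-zero count.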

Per-chart audit table (which toggle was hardcoded → new default):

| Chart            | Toggle                                                   | Was  | Now   |
|------------------|----------------------------------------------------------|------|-------|
| bp-cilium        | cilium.prometheus.enabled                                | true | false |
| bp-cilium        | cilium.prometheus.serviceMonitor.enabled                 | true | false |
| bp-cert-manager  | cert-manager.prometheus.enabled                          | true | false |
| bp-cert-manager  | cert-manager.prometheus.servicemonitor.enabled           | true | false |
| bp-crossplane    | crossplane.metrics.enabled                               | true | false |
| bp-flux          | (no observability hardcodes)                             | n/a  | n/a   |
| bp-sealed-secrets| (no observability hardcodes)                             | n/a  | n/a   |
| bp-spire         | (no observability hardcodes)                             | n/a  | n/a   |
| bp-nats-jetstream| (no observability hardcodes)                             | n/a  | n/a   |
| bp-openbao       | (no observability hardcodes)                             | n/a  | n/a   |
| bp-keycloak      | (no observability hardcodes)                             | n/a  | n/a   |
| bp-gitea         | (no observability hardcodes)                             | n/a  | n/a   |
| bp-powerdns      | (no observability hardcodes)                             | n/a  | n/a   |
| bp-catalyst-platform | (umbrella, no values overlay)                        | n/a  | n/a   |

Local gates green:
  helm dep build      ✓ all 3 affected charts
  helm lint           ✓ all 3
  helm template       ✓ all 3 — 0 monitoring.coreos.com refs in default
  tests/observability-toggle.sh  ✓ all 9 sub-cases pass

Closes the install path for bp-cilium 1.1.1 on a fresh Sovereign;
unblocks the full bp-* dep graph.

Refs: https://github.com/openova-io/openova/issues/182

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:08:09 +02:00
hatiyildiz
43aff20254 feat(bp-*): convert all 11 bootstrap-kit charts to umbrella charts depending on upstream
Each platform/<name>/chart/Chart.yaml now declares the canonical upstream
chart as a dependencies: entry. helm dependency build pulls the upstream
payload into the OCI artifact at publish time, so Flux helm install of
bp-<name>:1.1.0 actually installs the upstream Helm release alongside the
Catalyst-curated overlays (NetworkPolicy, ServiceMonitor, ClusterIssuer,
ExternalSecret) under templates/.

Pinned upstream chart versions per platform/<name>/blueprint.yaml:
- cilium                 1.16.5  https://helm.cilium.io
- cert-manager           v1.16.2 https://charts.jetstack.io
- flux                   2.4.0   https://fluxcd-community.github.io/helm-charts
- crossplane             1.17.x  https://charts.crossplane.io/stable
- sealed-secrets         2.16.x  https://bitnami-labs.github.io/sealed-secrets
- spire                  ...     https://spiffe.github.io/helm-charts-hardened
- nats-jetstream         ...     https://nats-io.github.io/k8s/helm/charts
- openbao                ...     https://openbao.github.io/openbao-helm
- keycloak               ...     https://charts.bitnami.com/bitnami
- gitea                  ...     https://dl.gitea.com/charts
- catalyst-platform      umbrella over the 10 leaf bp-* charts via
                         helm dependency

values.yaml in each chart adopts the umbrella convention: catalystBlueprint
metadata block (provenance + version) at top level, upstream subchart
values namespaced under the dependency name.

cert-manager specifically: clusterissuer-letsencrypt-dns01.yaml gets the
helm.sh/hook: post-install,post-upgrade annotation so it applies AFTER
cert-manager controllers are running and CRDs registered (the previous
hollow-chart shape ran the ClusterIssuer at install time when CRDs
didn't exist yet, which was the omantel cluster's exact failure mode).

Wrapper chart version bumped 1.0.0 → 1.1.0 across the board (umbrella
conversion is a meaningful structural revision). Cluster manifests in
clusters/_template/bootstrap-kit/ AND
clusters/omantel.omani.works/bootstrap-kit/ updated to reference 1.1.0.

The blueprint-release.yaml workflow's helm package step needs an
explicit helm dependency build before push so the upstream subchart
bytes ship inside the OCI artifact. That CI change is a follow-up
commit on this same branch (separate file scope).
2026-04-29 17:21:36 +02:00
hatiyildiz
31b03ce02a ci(pdm)+platform(crossplane): build workflow + XDynadotPoolAllocation composition (Phase 3+4 of #163)
CI workflow (.github/workflows/pool-domain-manager-build.yaml) mirrors
the marketplace-api / catalyst-api shape:

  - Triggers on push to core/pool-domain-manager/** + workflow_dispatch
  - Runs unit tests (reserved + dynadot — the integration suite needs a
    real Postgres which the workflow does not provide; full integration
    runs in test-bootstrap-api.yaml against an ephemeral CNPG)
  - Builds and pushes ghcr.io/openova-io/openova/pool-domain-manager:<sha>
  - Cosign-signs the image via Sigstore keyless OIDC (id-token: write)
  - Emits an SBOM attestation tied to the image digest
  - Manifest deployment is intentionally NOT in this workflow — PDM
    manifests live in the openova-private repo per the issue body, so
    the Flux Kustomization there picks up the new SHA via a follow-up
    private-repo commit (Phase 6 of #163)

Crossplane composition
(platform/crossplane/compositions/xrd-pool-allocation.yaml +
composition-pool-allocation.yaml) wraps PDM as a declarative Crossplane
Resource:

  apiVersion: compose.openova.io/v1alpha1
  kind: XDynadotPoolAllocation
  spec:
    parameters:
      poolDomain:    omani.works
      subdomain:     omantel
      sovereignFQDN: omantel.omani.works
      loadBalancerIP: 1.2.3.4
      createdBy:     crossplane

The Composition uses provider-http (crossplane-contrib/provider-http) to
render the XR into a Reserve → Commit sequence of HTTP calls against
PDM's in-cluster service URL. Per docs/INVIOLABLE-PRINCIPLES.md #3 we use
provider-http rather than bespoke Go to keep the day-2 lifecycle
declarative. Operators who want to pre-allocate a name (e.g. reserve
'omantel.omani.works' for a Sovereign that hasn't been provisioned yet)
commit YAML to Git and Flux+Crossplane converge.

Refs: #163
2026-04-29 06:46:11 +02:00
hatiyildiz
046e5ebc18 feat(day2-iac): Crossplane Compositions + per-Sovereign Flux cluster tree + catalyst-dns binary
Group F deliverables — completes the day-2 IaC layer that takes over after OpenTofu's Phase 0 hand-off (per docs/SOVEREIGN-PROVISIONING.md §4).

Three artifacts:

1. platform/crossplane/compositions/ — XRDs + Compositions for canonical Hetzner resources
   under the canonical compose.openova.io/v1alpha1 group (per BLUEPRINT-AUTHORING.md §8):
   - XHetznerNetwork + composition-network.yaml — wraps hcloud_network + subnet
   - XHetznerFirewall + composition-firewall.yaml
   - XHetznerServer + composition-server.yaml
   - XHetznerLoadBalancer + composition-loadbalancer.yaml (lb11, 80→31080, 443→31443)
   - README documenting the canonical pattern

2. clusters/_template/ — the canonical per-Sovereign Flux Kustomization tree.
   Copied to clusters/<sovereign-fqdn>/ at provisioning time; cloud-init's
   GitRepository points at the result.
   - kustomization.yaml (root: flux-system + infrastructure + bootstrap-kit)
   - flux-system/ (placeholder for Flux self-config customization)
   - infrastructure/ (provider-hcloud + ProviderConfig referencing hcloud-credentials secret OpenTofu writes)
   - bootstrap-kit/ — 11 HelmRelease manifests in dependency order:
     01-cilium → 02-cert-manager → 03-flux → 04-crossplane → 05-sealed-secrets
     → 06-spire → 07-nats-jetstream → 08-openbao → 09-keycloak → 10-gitea → 11-bp-catalyst-platform
     Each pulls from oci://ghcr.io/openova-io/bp-<name>:1.0.0 — the wrapper charts published by blueprint-release CI.
     dependsOn declarations enforce the canonical install order at runtime.

3. clusters/omantel.omani.works/ — the first concrete Sovereign instance.
   Mirror of _template with SOVEREIGN_FQDN_PLACEHOLDER substituted to omantel.omani.works.
   This is what the wizard's first omantel.omani.works run will actually reconcile.

4. products/catalyst/bootstrap/api/cmd/catalyst-dns/main.go — small Go binary the
   OpenTofu module's null_resource.dns_pool invokes via local-exec at Phase-0 apply time.
   Reads DYNADOT_API_KEY/SECRET/DOMAIN/SUBDOMAIN/LB_IP env vars; calls existing dynadot.Client.AddSovereignRecords. Containerfile already builds + ships it at /usr/local/bin/catalyst-dns.

Architectural compliance (Lesson #24 closed):
- No bespoke Go cloud-API calls (Crossplane Compositions are the canonical day-2 IaC)
- No exec.Command("helm", ...) (Flux HelmReleases are the canonical install unit)
- No kubectl apply from outside (cloud-init kubectl-applies one Flux GitRepository, then Flux owns everything)

After this commit, the path is end-to-end: wizard → catalyst-api → tofu apply (with infra/hetzner/) → cloud-init installs k3s + Flux + applies GitRepository pointing at clusters/omantel.omani.works/ → Flux reconciles bootstrap-kit (11 HelmReleases in dependency order) → Crossplane adopts day-2 management.
2026-04-28 14:09:29 +02:00
hatiyildiz
62d9c7d936 fix(charts): drop dependencies block — wrappers carry values overlay only
The first 2 blueprint-release CI runs failed on `helm package` with containerd permission errors because the wrapper Chart.yaml's `dependencies:` block triggered helm to pull the upstream charts via OCI/containerd at package time, which the GitHub Actions runner blocks.

Architectural fix: each Catalyst Blueprint wrapper carries the values overlay + metadata only. The bootstrap installer reads the upstream chart reference from the wrapper's values.yaml `catalystBlueprint.upstream.{chart,version,repo}` metadata block, points `helm install` at the upstream chart's repo, and overlays our values.

This keeps:
- blueprint-release CI lightweight (no upstream pulls during package; helm package now works without containerd)
- the "bp-<name> wrapper does NOT drift from upstream" property (we ship the overlay, not a fork)
- the single Blueprint contract from BLUEPRINT-AUTHORING §1 (a wrapper is still a Catalyst-curated Helm chart published as bp-<name>:<semver>)

Changes:
- 11 platform/<name>/chart/Chart.yaml: removed dependencies block. Each is now a plain Helm chart with no remote pulls during package.
- 11 platform/<name>/chart/values.yaml: prepended catalystBlueprint.upstream.{chart,version,repo} metadata block at the top. Bootstrap installer parses it to know which upstream chart to install with these values.
- products/catalyst/bootstrap/api/internal/bootstrap/bootstrap.go: installCilium now does `helm repo add cilium https://helm.cilium.io --force-update` then `helm install cilium cilium/cilium --version 1.16.5 --values -` (the cilium/cilium upstream chart, with our overlay values piped from values.yaml). Same pattern needs propagating to the other 10 install functions in a follow-up.

After this commit, blueprint-release CI should green-build all 11 wrappers (helm package now works without containerd access since there's nothing to pull). The bootstrap installer's actual `helm install` calls in production reach upstream chart repos via the runtime k3s cluster's pod network, which has full network access.
2026-04-28 12:57:29 +02:00
hatiyildiz
8c0f76640c feat(charts): G2 wrapper Helm charts for 11 bootstrap-kit components + blueprint-release CI
Per docs/PROVISIONING-PLAN.md and tickets [F] chart. Adds Catalyst-curated wrapper Helm charts at platform/<name>/chart/ for every component the bootstrap-kit installer (introduced in commit 07b4bcf) needs. Each chart is the canonical bp-<name> source per BLUEPRINT-AUTHORING.md §1's source-location rule.

11 charts created with Chart.yaml + values.yaml + blueprint.yaml each:

Network + GitOps:
- platform/cilium/chart — wraps cilium 1.16.5; kubeProxyReplacement, WireGuard mTLS, Hubble, Gateway API
- platform/flux/chart — wraps flux 2.4.0
- platform/crossplane/chart — wraps crossplane 1.18.0 + provider-hcloud manifest

Security:
- platform/cert-manager/chart — wraps cert-manager 1.16.2 with CRDs+ServiceMonitor
- platform/sealed-secrets/chart — wraps sealed-secrets 2.16.1 (transient bootstrap-only)
- platform/spire/chart — wraps spiffe/spire 1.10.4 (5-min SVID rotation)

Catalyst control-plane services:
- platform/nats-jetstream/chart — wraps nats 2.10.22 (3-node cluster, JetStream + KV)
- platform/openbao/chart — wraps openbao 2.1.0 (3-node Raft, region-local per SECURITY §5)
- platform/keycloak/chart — wraps keycloak 25.0.6 (Bitnami flavor, edge proxy mode)
- platform/gitea/chart — wraps gitea 10.5.0 (CNPG Postgres backend, no chart-bundled valkey/redis since Catalyst control plane uses JetStream)

New platform/ folders (added per AUDIT-PROCEDURE component-count anchor — was 53, now 55):
- platform/spire/README.md — workload identity Catalyst control plane component
- platform/nats-jetstream/README.md — control-plane event spine
- platform/sealed-secrets/README.md — transient bootstrap-only

Each blueprint.yaml declares:
- catalyst.openova.io/v1alpha1 Blueprint kind (canonical CRD per BLUEPRINT-AUTHORING §3)
- visibility: unlisted (mandatory infra, auto-installed by bootstrap kit, not a marketplace card)
- manifests.chart: ./chart pointer
- depends: [] (foundational components have no Blueprint dependencies; control-plane services depend on each other implicitly via bootstrap order, not via Blueprint depends)

.github/workflows/blueprint-release.yaml:
- New CI workflow per BLUEPRINT-AUTHORING §11 (path-matrix per Blueprint folder)
- Triggers on push to main touching platform/*/chart/** or products/*/chart/**
- detect job: emits matrix of changed Blueprint folders via git diff
- build job (per chart): helm dependency build → helm package → helm push to GHCR → cosign keyless sign (GitHub OIDC) → Syft SBOM attestation
- Output: ghcr.io/openova-io/bp-<name>:<semver> with SLSA-3-style supply-chain provenance
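
A rough sketch of how the detect job's path filtering could work, assuming the
changed paths arrive one per line on stdin. `changed_blueprints` is a
hypothetical name; the actual workflow step is not reproduced here.

```shell
# Map changed file paths (stdin) to the unique Blueprint folders whose
# chart/ subtree was touched, e.g. platform/cilium/chart/values.yaml
# becomes platform/cilium.
changed_blueprints() {
  sed -n -E 's#^((platform|products)/[^/]+)/chart/.*#\1#p' | sort -u
}
```

Fed from something like `git diff --name-only "$BEFORE" "$AFTER" | changed_blueprints`
(with `BEFORE` and `AFTER` as placeholder revisions), each output line would
become one entry in the build matrix.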

Closes [F] tickets: 11 G2 charts (cilium, cert-manager, flux, crossplane, sealed-secrets, spire, nats-jetstream, openbao, keycloak, gitea, plus the umbrella products/catalyst/chart, which already exists from Pass 105). blueprint.yaml CRDs added across 11 entries. CI fan-out workflow live.

After this commit lands, the bootstrap-kit installer in commit 07b4bcf has real OCI artifacts to install. The first push to main will trigger 10 build matrix jobs (cilium was created in a separate commit earlier in this session) which produce 10 cosigned bp-<name>:<semver> artifacts on GHCR.

Component-count anchor update follows: 53 → 55 (added spire + nats-jetstream + sealed-secrets — but sealed-secrets was already conceptually counted under "supporting services"). Per AUDIT-PROCEDURE the count needs updating in CLAUDE.md, BUSINESS-STRATEGY, TECHNOLOGY-FORECAST L11. Tracked as separate ticket [K] docs.
2026-04-28 12:51:06 +02:00
hatiyildiz
67aab8f6c1 docs(pass-48): crossplane OpenTofu/XRD group drift; PERSONAS clean
platform/crossplane/README.md had three real drift items:

1. §"Terraform vs Crossplane" — Catalyst's canonical bootstrap IaC is
   OpenTofu (PTS §3.2 + SOVEREIGN-PROVISIONING §3), not Terraform.
   Renamed section to "OpenTofu vs Crossplane", added intro paragraph
   clarifying the OSS-fork rationale, updated table rows + Decision.

2. XRD CompositeResourceDefinition example used name: xdatabases.openova.io
   and group: openova.io. Per BLUEPRINT-AUTHORING §8 (Pass 42 verified
   canonical), Crossplane XRDs use compose.openova.io group — separate
   from Catalyst CRDs (catalyst.openova.io). Fixed to
   xdatabases.compose.openova.io / group: compose.openova.io with inline
   pointer to BLUEPRINT-AUTHORING §8.

3. Composition compositeTypeRef.apiVersion was openova.io/v1alpha1, fixed
   to compose.openova.io/v1alpha1. Also corrected Composition metadata.name
   to database.hcloud.compose.openova.io for naming consistency.

Pass 1's API group unification was Catalyst-CRDs-only; Pass 42 verified
the separate Crossplane group; Pass 48 catches a downstream consequence
where the crossplane README defaulted to bare `openova.io` matching
neither canonical form.
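
The group rule above can be sketched as a small check. This is a hypothetical helper, not code from the repository: the function names are assumptions, and only the group strings come from this commit message. It also relies on the Crossplane convention that an XRD's metadata.name is `<plural>.<group>`.

```python
# Hypothetical sketch: function names are illustrative, not from the repo.
CATALYST_CRD_GROUP = "catalyst.openova.io"   # Catalyst CRDs
CROSSPLANE_XRD_GROUP = "compose.openova.io"  # Crossplane XRDs


def classify_group(group: str) -> str:
    """Return which canonical form an API group matches, if any."""
    if group == CATALYST_CRD_GROUP:
        return "catalyst-crd"
    if group == CROSSPLANE_XRD_GROUP:
        return "crossplane-xrd"
    return "non-canonical"


def xrd_name_is_consistent(name: str, plural: str, group: str) -> bool:
    """Check that an XRD uses the XRD group and is named '<plural>.<group>'."""
    return group == CROSSPLANE_XRD_GROUP and name == f"{plural}.{group}"
```

Under these assumptions, the bare `openova.io` group that the README used classifies as "non-canonical", while `xdatabases.compose.openova.io` passes the name check.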

PERSONAS-AND-JOURNEYS §1-§7 deep re-scan: clean. Pass 22, 33, 39 fixes
all intact. Three-pass-touched doc reads consistently. Stable.

Banner already correctly enforces "platform plumbing, never user-facing"
per ARCHITECTURE §7.4 / GLOSSARY.
2026-04-28 00:10:48 +02:00
hatiyildiz
5834daec14 docs(pass-10): banners on 7 more components + opentofu active-active drift fix
7 more component READMEs got role-in-Catalyst banners:

- vpa, keda, reloader → per-host-cluster scaling/ops layer (§3.4).
  Reloader specifically calls out its role in Catalyst's secret-
  rotation flow (rolling deploy on K8s Secret hash change).
- external-dns → per-host-cluster DNS-sync (§3.1); pairs with k8gb
  for the GSLB zone separation.
- coraza → DMZ-block WAF on every host cluster (§3.1).
- crossplane → per-Sovereign on the management cluster (§3.2);
  banner explicitly emphasizes the agreed "never a user-facing
  surface" rule (Users don't write Compositions in Application
  configs; Blueprint authors and advanced contributors do). Cross-
  references the no-fourth-surface clause in ARCHITECTURE §4/§7
  and the Crossplane Composition section in BLUEPRINT-AUTHORING §8.
- opentofu → repositioned as Phase-0-only, runs on `catalyst-
  provisioner` only, NOT installed on host clusters at runtime.

opentofu drift fixes (uncovered by line-by-line read):
- Section 5 line 182: "Bootstrap Wizard prompts for cloud credentials"
  → "Catalyst Bootstrap (Phase 0) prompts for cloud credentials"
  (banned term).
- Same section line 186: "ESO PushSecrets sync to both regional
  OpenBao instances" was the active-active drift that Pass 7 corrected
  elsewhere but that was still present here. Replaced with "writes go
  to the primary OpenBao region only; replicas pick up via async perf
  replication".

VALIDATION-LOG: Pass 10 entry added.

Refs #37
2026-04-27 21:43:45 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)
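
A minimal sketch of the rename listed above, assuming plain literal substitution is enough. The function name is hypothetical, and a naive replace like this would also rewrite any unrelated `tenants/` substring, so it illustrates the mapping rather than a safe migration tool.

```python
def rename_placeholders(text: str) -> str:
    """Apply the two literal renames from the list above to example text."""
    # <tenant> -> <org> in YAML / IaC examples; tenants/ -> organizations/ in paths.
    return text.replace("<tenant>", "<org>").replace("tenants/", "organizations/")
```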

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add the new label as a co-label
  during the migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket
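
The co-label step described above can be sketched as follows. This is an illustrative helper under stated assumptions, not the deployed Kyverno policy: the function name is hypothetical, and only the two label keys come from this commit message.

```python
# Hypothetical sketch of the co-label step; the function name is illustrative.
OLD_LABEL = "openova.io/tenant"
NEW_LABEL = "openova.io/organization"


def with_co_label(labels: dict[str, str]) -> dict[str, str]:
    """Return a copy of labels where the new key mirrors the old one.

    The old key is kept so selectors written against it keep matching
    during the migration window. An existing new-key value is not overwritten.
    """
    result = dict(labels)
    if OLD_LABEL in result and NEW_LABEL not in result:
        result[NEW_LABEL] = result[OLD_LABEL]
    return result
```

Keeping both keys means nothing that selects on the old label breaks before the old label is eventually dropped.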

No banned-term hits remain in component READMEs (verified by grepping
for the terms in the docs/GLOSSARY.md banned-terms table).

Refs #37
2026-04-27 20:06:51 +02:00
talent-mesh
435f49738d feat: restructure platform to 52 components and 9 products
Technology forecast and strategic review restructure:
- Remove 13 components (backstage, mongodb, activemq, vitess, airflow, camel, dapr, superset, searxng, langserve, trino, lago, rabbitmq)
- Add 10 components (sigstore, syft-grype, nemo-guardrails, langfuse, reloader, matrix, ferretdb, litmus, livekit, coraza)
- Rename product: Synapse → Axon (SaaS LLM Gateway)
- Merge products: Titan + Fuse → Fabric (Data & Integration)
- New product: Relay (Communication)
- Replace Backstage with Catalyst IDP
- Replace MongoDB with FerretDB (MongoDB wire protocol on CNPG)
- Add supply chain security (Sigstore/Cosign, Syft+Grype)
- Add AI safety and observability (NeMo Guardrails, LangFuse)
- Add technology forecast 2027-2030 document
- Full verification pass: zero stale references across all docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 21:00:19 +00:00
talent-mesh
c9d04a53b4 refactor: flatten platform/ structure (41 components)
Remove hierarchical grouping (networking/, security/, etc.) and use flat
structure for all 41 platform components.

Changes:
- All components now directly under platform/ (no subfolders)
- AI Hub components moved from meta-platforms/ai-hub/components/ to platform/
- Open Banking components (lago, openmeter) moved to platform/
- meta-platforms/ now only contains README files that reference platform/
- Open Banking custom services remain in meta-platforms/open-banking/services/

Structure:
- platform/ (41 components, flat)
- meta-platforms/ai-hub/ (README only, references platform/)
- meta-platforms/open-banking/ (README + 6 custom services)

All documentation links updated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 15:19:48 +00:00