openova/products/catalyst/bootstrap/api
e3mrah 38f1f83971
fix(sovereign-dns-records): 404 fallback to FQDN-minus-first-label parent (#1529)
When an operator submits a sovereignFQDN like "t126.omani.works" without
parentDomains[] AND without sovereignPoolDomain, Validate()'s back-compat
synthesis stamps ParentDomain.Name = SovereignFQDN itself ("t126.omani.works").
The post-Phase-0 upsertSovereignParentZoneRecordsFromResult then PATCHes
zone "t126.omani.works." → PowerDNS 404 (the authoritative zone is
"omani.works") → no A records written → every console.* / auth.* /
gitea.* hostname resolves NXDOMAIN even after handoverFired.

Caught on t126 (84c0848406dd6fdd, 2026-05-16): clustermesh fully meshed
(D10 green after PRs #1525+#1528), handover JWT minted, wildcard cert
Ready=True, LB external IP assigned — but DoD D1/D2 stayed red because
the sovereign-dns-records PATCH 404'd silently with only a WARN log.

This PR adds a 404-fallback in upsertSovereignParentZoneRecordsFromResult:
when the synthesized parent equals SovereignFQDN AND the PATCH returns
status 404, retry once with parent-of-FQDN (`SovereignFQDN[i+1:]` where
i is the first `.`). Two-label FQDNs ("customer.com") skip the retry
since there is no parent to derive — preserves BYO-mode behavior.
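
A minimal sketch of the fallback described above, not the actual code in
this PR. The names `parentOfFQDN`, `patchFunc` and `patchWithFallback` are
hypothetical, and the PowerDNS call is replaced by an injected function that
returns an HTTP status code, so the retry rule can be read on its own:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parentOfFQDN (hypothetical helper) drops the first label of fqdn,
// i.e. fqdn[i+1:] where i is the index of the first '.'.
// It reports false when no usable parent exists: a single label, or a
// two-label FQDN such as "customer.com" whose parent would be a bare TLD.
func parentOfFQDN(fqdn string) (string, bool) {
	name := strings.TrimSuffix(fqdn, ".")
	i := strings.Index(name, ".")
	if i < 0 {
		return "", false
	}
	parent := name[i+1:]
	if !strings.Contains(parent, ".") {
		return "", false
	}
	return parent, true
}

// patchFunc is an assumed stand-in for the real zone PATCH; it returns
// the HTTP status code of the request.
type patchFunc func(zone string) (status int, err error)

// patchWithFallback PATCHes the given parent zone and retries exactly once
// with the parent-of-FQDN, but only when both conditions hold: the parent
// was synthesized as the FQDN itself, and the first PATCH returned 404.
// It returns the zone that was last tried along with its status.
func patchWithFallback(sovereignFQDN, parent string, patch patchFunc) (string, int, error) {
	status, err := patch(parent)
	if err != nil || status != http.StatusNotFound || parent != sovereignFQDN {
		return parent, status, err
	}
	fallback, ok := parentOfFQDN(sovereignFQDN)
	if !ok {
		// Two-label FQDN: nothing to derive, keep the original result.
		return parent, status, nil
	}
	status, err = patch(fallback)
	return fallback, status, err
}

func main() {
	// Fake DNS API for illustration: only "omani.works" exists as a zone.
	fake := func(zone string) (int, error) {
		if zone == "omani.works" {
			return http.StatusNoContent, nil
		}
		return http.StatusNotFound, nil
	}
	zone, status, err := patchWithFallback("t126.omani.works", "t126.omani.works", fake)
	fmt.Println(zone, status, err)
}
```

Deciding at the PATCH boundary, rather than in Validate(), means the retry
only fires when the API has actually reported that the zone is missing.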

The provisioner Validate() back-compat synthesis stays untouched
because TestValidate_SynthesisesPrimaryFromSovereignFQDN asserts the
exact "BYO mode keeps SovereignFQDN as parent" semantics for 3-label
apexes like "acme.openova.io" — that's a legitimate case (operator
registered the 3-label apex). The 404-fallback handles the pool-mode
case at the PATCH boundary where we actually know whether the zone
exists.

Refs DoD D1/D2. Same incident chain as PRs #1525 + #1528.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:13:26 +04:00
..
cmd feat(openova-flow): catalyst-api proxy + cloud-init thread (Agent #3 — integrator, infra-side) (#1396) 2026-05-11 16:01:09 +04:00
internal fix(sovereign-dns-records): 404 fallback to FQDN-minus-first-label parent (#1529) 2026-05-16 17:13:26 +04:00
Containerfile fix(build): unblock Build & Deploy Catalyst — Containerfile + test typing (#1172) 2026-05-09 12:28:59 +04:00
go.mod feat(epic-4): K+P+X1+G — k8s-ws-proxy + projector + WebSocket logs + Guacamole chart (#1099) (#1164) 2026-05-09 09:27:39 +04:00
go.sum feat(epic-4): K+P+X1+G — k8s-ws-proxy + projector + WebSocket logs + Guacamole chart (#1099) (#1164) 2026-05-09 09:27:39 +04:00