openova/infra
e3mrah 1dc21bfd51
fix(cloud-init): accept Hetzner DHCP routes on private NIC (use-routes: true) (#1489)
The netplan stanza for the hot-attached private NIC had
`dhcp4-overrides.use-routes: false`, which discards Hetzner DHCP's
classless static routes. Result: the interface gets `10.0.1.2/32` (host
route only) with NO route for the 10.0.0.0/8 private network. The
kernel therefore sends all return traffic for private destinations
(including the SYN-ACK to the Hetzner LB at 10.0.1.254) via eth0's
default route, i.e. out the public NIC.

The Hetzner LB's health check over the private network delivers its
SYN to the node, but the SYN-ACK leaves via the wrong NIC and Hetzner
drops it as asymmetric. The target stays `unhealthy` forever on every
service port.
Caught live on prov 6dfade27 (omani.works, 2026-05-14): all 3 region
LBs marked unhealthy on 53/80/443 — public surface blackholed despite
3-region × 45/45 HRs Ready + valid PROD cert + envoy listening on
0.0.0.0:30443.

Confirmed via tcpdump on the host:
  enp7s0 In  10.0.1.254.X > 10.0.1.2:30443 [S]   ← SYN arrives on private
  eth0   Out 10.0.1.2:30443 > 10.0.1.254.X [S.] ← SYN-ACK on wrong NIC

Fix: change to `use-routes: true`. The routes provided by Hetzner DHCP
have a higher metric than eth0's default route (metric 100), so the
public default stays intact; we only gain the per-subnet 10.0.0.0/N
route needed for symmetric routing on the private NIC.
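
The effect on route selection can be sketched with a toy lookup. This
is an illustration only, not the kernel's implementation: it assumes
longest-prefix match with the metric as tie-breaker, uses the
10.0.0.0/8 prefix mentioned above, and the metric 1024 for the
DHCP-provided route is a made-up value (the message only says it is
higher than 100). `pick_route` is a hypothetical helper.

```python
import ipaddress

def pick_route(routes, dst):
    """Pick a route for dst: longest matching prefix wins, then lowest metric.

    routes is a list of (prefix, device, metric) tuples.
    """
    addr = ipaddress.ip_address(dst)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -r[2]),
    )

LB = "10.0.1.254"  # Hetzner LB address on the private network

# use-routes: false -> only the public default route exists.
before = [("0.0.0.0/0", "eth0", 100)]

# use-routes: true -> the DHCP-provided private route is added.
# Prefix length and metric here are assumptions for illustration.
after = before + [("10.0.0.0/8", "enp7s0", 1024)]

print(pick_route(before, LB)[1])        # eth0: SYN-ACK leaves on the public NIC
print(pick_route(after, LB)[1])         # enp7s0: reply is symmetric
print(pick_route(after, "1.1.1.1")[1])  # eth0: public default is untouched
```

Because the private route is more specific than the default, it is
chosen for private destinations regardless of its metric, while all
other traffic still matches only the default route on eth0.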

Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:52:01 +04:00
cloudflare-worker-leases  feat(continuum): K-Cont-4 — Cloudflare Worker source + tofu wiring for lease witness (#1101) (#1159)  2026-05-09 08:01:44 +04:00
hetzner                   fix(cloud-init): accept Hetzner DHCP routes on private NIC (use-routes: true) (#1489)  2026-05-14 22:52:01 +04:00