Edge + serverless + model-serving batch (W2.5.C): three upstream-subchart umbrella Blueprints completing the bootstrap-kit slots for WebRTC media relay (bp-relay → bp-stunner) and the AI/ML serving stack (bp-cortex → bp-kserve → bp-knative).

Each chart follows the canonical umbrella pattern from docs/BLUEPRINT-AUTHORING.md §11.1: Chart.yaml declares the upstream chart under `dependencies:` so `helm dependency build` bundles the upstream payload into the OCI artifact, and Catalyst-curated overlay values + templates sit alongside in chart/values.yaml + chart/templates/.

Per-chart highlights:

- bp-stunner/1.0.0: wraps stunner/stunner-gateway-operator 1.1.0. Ships a Cilium-native GatewayClass (Capabilities-gated on gateway.networking.k8s.io/v1) so bp-relay (LiveKit / SFU) can claim Gateway CRs without an operator-ordering dance. Default UDP TURN port range 30000-32767 matches the range opened at the Sovereign edge firewall (Crossplane bp-firewall composition).
- bp-knative/1.0.0: wraps knative-operator v1.21.1. Ships a KnativeServing CR pre-configured for **istio-less mode** (ingress.istio.enabled=false, ingress.contour.enabled=false, ingress.kourier.enabled=false; config.network.ingress-class=cilium). Sovereign FQDN sourced from values, no hardcoded fallback per inviolable principle #4: render fails loudly if the cluster overlay doesn't set knativeOverlay.knativeServing.sovereignFqdn.
- bp-kserve/1.0.0: wraps kserve/kserve v0.16.0 (latest version published on the official OCI registry as of 2026-04-30). Default deploymentMode=RawDeployment (no Knative hop on the hot path), but bp-knative is still installed (declared as a hard dep) so the per-IS annotation `serving.kserve.io/deploymentMode: Serverless` opts in to scale-to-zero per tenant. Cilium native Gateway-API ingress (enableGatewayApi=true, className=cilium, disableIstioVirtualHost=true).

Observability discipline (issue #182): every observability toggle (ServiceMonitor, HPA, GatewayClass) defaults false and is operator-tunable via per-cluster overlay once bp-kube-prometheus-stack reconciles. Each chart ships tests/observability-toggle.sh covering default-off, opt-in (with `--api-versions monitoring.coreos.com/v1` to simulate Prometheus Operator CRDs), and explicit-off cases.

Per-chart kind summary (helm template default render):

- bp-stunner: ClusterRole, ClusterRoleBinding, ConfigMap, Dataplane, Deployment, Role, RoleBinding, Service, ServiceAccount. (+ GatewayClass when --api-versions gateway.networking.k8s.io/v1 is passed.)
- bp-knative: ClusterRole, ClusterRoleBinding, ConfigMap, CustomResourceDefinition, Deployment, KnativeServing, Role, RoleBinding, Secret, Service, ServiceAccount.
- bp-kserve: Certificate, ClusterRole, ClusterRoleBinding, ClusterServingRuntime, ClusterStorageContainer, ConfigMap, Deployment, Gateway, Issuer, MutatingWebhookConfiguration, Role, RoleBinding, Service, ServiceAccount, ValidatingWebhookConfiguration.

`helm lint` clean for all three (single INFO on missing icon; icons land with marketplace card work). `bash tests/observability-toggle.sh` green for all three (3 cases each: default-off, opt-in, explicit-off).

Closes #263 #264 #265

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# STUNner

K8s-native TURN/STUN for WebRTC NAT traversal. **Application Blueprint** (see [`docs/PLATFORM-TECH-STACK.md`](../../docs/PLATFORM-TECH-STACK.md) §4.5, Communication). Used by `bp-relay` to make LiveKit (WebRTC SFU) reachable from clients behind NATs.

**Status:** Accepted | **Updated:** 2026-04-30


---

## Blueprint chart

This folder ships an umbrella Helm chart at `chart/` that wraps the upstream `stunner/stunner-gateway-operator` chart (1.1.0) under `dependencies:`. Catalyst-curated overlay templates render alongside:

- `chart/templates/gatewayclass.yaml`: `gateway.networking.k8s.io/v1.GatewayClass` claiming the operator (`stunner.l7mp.io/gateway-operator` controller). Capabilities-gated on Gateway-API CRDs (delivered by `bp-cilium`).
- `chart/templates/networkpolicy.yaml`: locks operator + dataplane pods to the minimum ingress/egress (DEFAULT FALSE; per-Sovereign overlay opts in once consumer namespaces are pinned).
- `chart/templates/servicemonitor.yaml`: `monitoring.coreos.com/v1.ServiceMonitor` (DEFAULT FALSE per [`docs/BLUEPRINT-AUTHORING.md`](../../docs/BLUEPRINT-AUTHORING.md) §11.2; double-gated on Capabilities).
- `chart/templates/hpa.yaml`: `autoscaling/v2.HorizontalPodAutoscaler` for the dataplane Deployment (DEFAULT FALSE).

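
The "DEFAULT FALSE, double-gated on Capabilities" pattern named in the list above can be sketched as a Helm template. This is a minimal illustration under stated assumptions, not the template the chart actually ships: the values key `serviceMonitor.enabled`, the resource name, the selector label, and the port name are all hypothetical, and the sketch assumes `chart/values.yaml` defines `serviceMonitor.enabled: false`.

```yaml
{{- /* Illustrative sketch only, not the shipped servicemonitor.yaml. */ -}}
{{- /* Gate 1: the toggle must be enabled in values (hypothetical key, default false). */ -}}
{{- /* Gate 2: the Prometheus Operator CRDs must be known to the cluster. */ -}}
{{- if and .Values.serviceMonitor.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-stunner   # hypothetical name
spec:
  selector:
    matchLabels:
      app: stunner                     # hypothetical label
  endpoints:
    - port: metrics                    # hypothetical port name
      interval: 30s
{{- end }}
```

With a template shaped like this, `helm template` renders nothing for the resource unless the toggle is set and `--api-versions monitoring.coreos.com/v1` is passed (or the CRDs exist in the target cluster).
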
**Cilium-native Gateway integration**: STUNner registers a GatewayClass, and the operator dynamically materializes the dataplane Deployments that back each Gateway CR. The default UDP port range, 30000-32767, matches the range opened at the Sovereign edge firewall (Crossplane `bp-firewall` composition).

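
For orientation, the umbrella layout described at the top of this section could be declared roughly as follows. This is a sketch, not a copy of the real `chart/Chart.yaml`: the chart name and versions come from this document, while the `description` text and the `repository` URL are assumptions to verify against the upstream STUNner project.

```yaml
# chart/Chart.yaml (illustrative sketch)
apiVersion: v2
name: bp-stunner
description: Umbrella Blueprint wrapping the STUNner gateway operator   # assumed wording
type: application
version: 1.0.0
dependencies:
  # Upstream chart bundled into the artifact by `helm dependency build`.
  - name: stunner-gateway-operator
    version: 1.1.0
    repository: https://l7mp.io/stunner   # assumed upstream Helm repository URL
```
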

---

## Overview

STUNner provides WebRTC connectivity:

- Kubernetes-native STUN/TURN server
- Gateway API integration
- Scalable media relay
- NAT traversal for video/audio


---

## Architecture

```mermaid
flowchart TB
    subgraph External["External"]
        Client[WebRTC Client]
    end

    subgraph K8s["Kubernetes"]
        subgraph STUNner["STUNner"]
            GW[Gateway]
            TURN[TURN Servers]
        end

        subgraph Apps["Applications"]
            SFU[Media Server/SFU]
        end
    end

    Client -->|"STUN/TURN"| GW
    GW --> TURN
    TURN --> SFU
    Client -->|"Media"| TURN
    TURN -->|"Media"| SFU
```


---

## Why STUNner

| Factor | STUNner | Traditional TURN |
|--------|---------|------------------|
| Deployment | Kubernetes-native | Separate VMs |
| Scaling | HPA/KEDA | Manual |
| Configuration | Gateway API CRDs | Config files |
| Integration | Native K8s | External |


---

## Configuration

### Gateway

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: stunner-gateway
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
    - name: tcp-listener
      port: 3478
      protocol: TURN-TCP
```

### UDPRoute

```yaml
apiVersion: stunner.l7mp.io/v1
kind: UDPRoute
metadata:
  name: media-route
  namespace: stunner
spec:
  parentRefs:
    - name: stunner-gateway
  rules:
    - backendRefs:
        - name: media-server
          namespace: apps
```

### GatewayConfig

```yaml
apiVersion: stunner.l7mp.io/v1
kind: GatewayConfig
metadata:
  name: stunner-config
  namespace: stunner
spec:
  realm: stunner.<env>.<sovereign-domain>
  authType: longterm
  userName: stunner
  password:
    name: stunner-credentials
    namespace: stunner
    key: password
```

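
The GatewayConfig above points at a Secret named `stunner-credentials` in the `stunner` namespace, with the value stored under the key `password`. A minimal sketch of such a Secret is shown below. The placeholder value is hypothetical, and in a real Sovereign the Secret would more likely be synced from a secret manager than committed as a manifest; the exact Secret shape STUNner expects should be checked against the operator version in use.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: stunner-credentials
  namespace: stunner
type: Opaque
stringData:
  # Placeholder only: replace with a generated value, never commit a real secret.
  password: <replace-with-generated-secret>
```
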

---

## TURN Authentication

STUNner supports long-term (time-limited) TURN credentials. `authLifetime` sets how long an issued credential stays valid, in seconds:

```yaml
# Generate time-limited credentials
apiVersion: stunner.l7mp.io/v1
kind: GatewayConfig
spec:
  authType: longterm
  authLifetime: 86400 # 24 hours
```


---

## Scaling

STUNner scales with KEDA based on the number of active TURN allocations:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: stunner-scaler
  namespace: stunner
spec:
  scaleTargetRef:
    name: stunner
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://mimir.monitoring.svc:8080/prometheus
        metricName: stunner_allocations_active
        query: sum(stunner_allocations_active)
        threshold: "100"
```


---

## Monitoring

| Metric | Description |
|--------|-------------|
| `stunner_allocations_active` | Active TURN allocations |
| `stunner_bytes_received_total` | Received bytes |
| `stunner_bytes_sent_total` | Sent bytes |
| `stunner_connections_total` | Total connections |

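
As an illustration of how the metrics above might be used, the following sketch defines an alert on active TURN allocations. It assumes the Prometheus Operator CRDs are installed (for example via `bp-kube-prometheus-stack`). The rule name, alert name, threshold of 800, and `for` duration are hypothetical values chosen for the example, not settings shipped by this Blueprint.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: stunner-alerts          # hypothetical name
  namespace: stunner
spec:
  groups:
    - name: stunner
      rules:
        - alert: StunnerAllocationsHigh   # hypothetical alert name
          # Fires when cluster-wide active allocations stay above an illustrative threshold.
          expr: sum(stunner_allocations_active) > 800
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Active TURN allocations are unusually high
```

The threshold should be tuned to the per-replica target and maximum replica count configured for scaling.
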

---

*Part of [OpenOva](https://openova.io)*