# Knative

Serverless platform for Kubernetes with scale-to-zero and event-driven capabilities. **Application Blueprint** (see [`docs/PLATFORM-TECH-STACK.md`](../../docs/PLATFORM-TECH-STACK.md) §4.6, AI/ML). Used by `bp-cortex` (composite AI Hub Blueprint) as the serverless layer for KServe-managed model inference.

**Status:** Accepted | **Updated:** 2026-04-30

---
## Blueprint chart

This folder ships an umbrella Helm chart at `chart/` that wraps the upstream `knative-operator` chart (v1.21.1) under `dependencies:`. Catalyst-curated overlay templates render alongside it:

- `chart/templates/knativeserving.yaml`: `operator.knative.dev/v1beta1.KnativeServing` CR pre-configured for **istio-less mode** (Cilium native Gateway-API ingress, no Knative-Istio sidecar). The domain template is sourced from `knativeOverlay.knativeServing.sovereignFqdn`, which is REQUIRED; there is no hardcoded fallback, per [`docs/INVIOLABLE-PRINCIPLES.md`](../../docs/INVIOLABLE-PRINCIPLES.md) #4.
- `chart/templates/networkpolicy.yaml`: locks the operator namespace down (DEFAULT FALSE).
- `chart/templates/servicemonitor.yaml`: operator metrics scrape (DEFAULT FALSE per [`docs/BLUEPRINT-AUTHORING.md`](../../docs/BLUEPRINT-AUTHORING.md) §11.2; Capabilities-gated).
- `chart/templates/hpa.yaml`: operator Deployment HPA (DEFAULT FALSE; the operator is leader-elected, so an HPA rarely makes sense).

**Istio-less mode**: the KnativeServing CR ships with `ingress.istio.enabled: false`, `ingress.contour.enabled: false`, `ingress.kourier.enabled: false`, and `config.network.ingress-class: cilium.ingress.networking.knative.dev` so Knative Routes resolve to Cilium HTTPRoute / Gateway-API objects.

---
## Overview

Knative provides serverless capabilities on Kubernetes. Used independently for event-driven workloads and as the foundation for KServe model serving.

```mermaid
flowchart TB
    subgraph Knative["Knative"]
        Serving[Knative Serving]
        Eventing[Knative Eventing]
    end

    subgraph Use Cases
        KServe[KServe<br/>Model Serving]
        Functions[Serverless<br/>Functions]
        Events[Event-Driven<br/>Workloads]
    end

    Serving --> KServe
    Serving --> Functions
    Eventing --> Events
    Eventing --> KServe
```

---
## Components

| Component | Purpose |
|-----------|---------|
| **Knative Serving** | Request-driven compute, scale-to-zero |
| **Knative Eventing** | Event-driven architecture, CloudEvents |

---
## Why Knative?

| Feature | Benefit |
|---------|---------|
| Scale-to-zero | Cost savings for idle workloads |
| Auto-scaling | Handles traffic spikes automatically |
| Revisions | Traffic splitting, canary deployments |
| CloudEvents | Standard event format |
| KServe foundation | Serverless layer for KServe-managed model inference |

---
## Configuration

### Helm Values

```yaml
knative-serving:
  enabled: true
  config:
    network:
      ingress-class: "cilium"
      domain-template: "{{.Name}}.{{.Namespace}}.{{.Domain}}"
    autoscaler:
      enable-scale-to-zero: "true"
      scale-to-zero-grace-period: "30s"
      stable-window: "60s"

knative-eventing:
  enabled: true
  config:
    default-ch-webhook:
      default-ch-config: |
        clusterDefault:
          apiVersion: messaging.knative.dev/v1
          kind: InMemoryChannel
```

---
## Knative Service Example

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # min-scale "0" lets the revision scale to zero when idle
        autoscaling.knative.dev/min-scale: "0"
        # upper bound on replicas during traffic spikes
        autoscaling.knative.dev/max-scale: "10"
    spec:
      containers:
        - image: harbor.<location-code>.<sovereign-domain>/my-app:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
```

---
## Traffic Splitting

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      # explicit revision name so the traffic block can reference it
      name: my-service-v2
    spec:
      containers:
        - image: harbor.<location-code>.<sovereign-domain>/my-app:v2
  traffic:
    # keep most requests on the previous revision
    - revisionName: my-service-v1
      percent: 90
    # send a canary share to the new revision
    - revisionName: my-service-v2
      percent: 10
```
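A variation on the split above, as a minimal sketch: the new revision can be given a `tag` and 0% of the main traffic, so that it is reachable for smoke tests at its own tag-prefixed hostname before any users are shifted onto it. The tag name `canary` is a hypothetical choice, and the exact hostname depends on the cluster's tag and domain templates.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      name: my-service-v2
    spec:
      containers:
        - image: harbor.<location-code>.<sovereign-domain>/my-app:v2
  traffic:
    # all regular traffic stays on the previous revision
    - revisionName: my-service-v1
      percent: 100
    # the new revision gets no regular traffic, only a tagged URL
    - revisionName: my-service-v2
      percent: 0
      tag: canary   # hypothetical tag name
```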

---

## Event-Driven Architecture

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: default
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: my-trigger
  # a Trigger must live in the same namespace as its Broker
  namespace: default
spec:
  broker: default
  filter:
    attributes:
      # only events with this CloudEvents type are delivered
      type: my.event.type
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: my-service
```
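To exercise the Broker and Trigger above end to end, something has to produce events. A minimal sketch, assuming Knative Eventing and its built-in `PingSource` are installed: the source below posts a small JSON payload to the `default` Broker every five minutes. The names `heartbeat` and `ping-trigger` and the payload are hypothetical. Events from a `PingSource` carry the type `dev.knative.sources.ping`, so they would not match the `my.event.type` filter shown earlier; the second manifest filters on the ping type instead.

```yaml
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: heartbeat            # hypothetical name
  namespace: default
spec:
  schedule: "*/5 * * * *"    # cron syntax: every five minutes
  contentType: application/json
  data: '{"message": "ping"}'
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: ping-trigger         # hypothetical name
  namespace: default
spec:
  broker: default
  filter:
    attributes:
      type: dev.knative.sources.ping
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: my-service
```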

---

## Integration with Cilium

Knative uses Cilium as the ingress class:

```yaml
config:
  network:
    ingress-class: "cilium"
```
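For orientation, a rough sketch of where this setting sits in the operator's `KnativeServing` CR, following the istio-less description in the Blueprint chart section. The metadata name and namespace are assumptions, and the ingress class value is copied from that description rather than from the rendered chart; the real template in `chart/templates/knativeserving.yaml` may differ.

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving        # assumed name
  namespace: knative-serving   # assumed namespace
spec:
  ingress:
    # all bundled ingress implementations are switched off
    istio:
      enabled: false
    contour:
      enabled: false
    kourier:
      enabled: false
  config:
    network:
      # class value as described in the Blueprint chart section
      ingress-class: cilium.ingress.networking.knative.dev
```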

---

## Monitoring

| Metric | Query |
|--------|-------|
| Request count | `revision_request_count` |
| Request latency | `revision_request_latencies` |
| Pod count | `autoscaler_actual_pods` |
| Desired pods | `autoscaler_desired_pods` |

---
## Consequences

**Positive:**
- Scale-to-zero reduces costs
- Automatic scaling for traffic spikes
- Foundation for KServe model serving
- Event-driven architecture support
- Traffic splitting for canary deployments

**Negative:**
- Cold start latency for scale-to-zero
- Additional complexity
- Requires understanding of Knative concepts
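The cold-start cost listed above can be traded against idle spend per service. A minimal sketch, reusing the hypothetical `my-service` from the earlier example: setting `autoscaling.knative.dev/min-scale` to `"1"` keeps one replica warm, so requests never wait for a pod to start, at the price of giving up scale-to-zero for that service.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # keep one replica running at all times (no cold starts)
        autoscaling.knative.dev/min-scale: "1"
        autoscaling.knative.dev/max-scale: "10"
    spec:
      containers:
        - image: harbor.<location-code>.<sovereign-domain>/my-app:latest
          ports:
            - containerPort: 8080
```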

---

*Part of [OpenOva](https://openova.io)*