openova/platform/kserve/blueprint.yaml
e3mrah c09109a61a
feat(charts): bp-stunner + bp-knative + bp-kserve wrapper charts (closes #263 #264 #265) (#290)
Edge + serverless + model-serving batch (W2.5.C) — three upstream-
subchart umbrella Blueprints completing the bootstrap-kit slots for
WebRTC media relay (bp-relay → bp-stunner) and the AI/ML serving stack
(bp-cortex → bp-kserve → bp-knative).

Each chart follows the canonical umbrella pattern from
docs/BLUEPRINT-AUTHORING.md §11.1: Chart.yaml declares the upstream
chart under `dependencies:` so `helm dependency build` bundles the
upstream payload into the OCI artifact, and Catalyst-curated overlay
values + templates sit alongside in chart/values.yaml + chart/templates/.
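
As a rough illustration of the umbrella pattern, a wrapper Chart.yaml might
look like the sketch below. The dependency name and repository URL are
assumptions made for illustration, not copied from the chart in this commit.

```yaml
# Hypothetical chart/Chart.yaml for an umbrella Blueprint (illustrative only).
apiVersion: v2
name: bp-kserve
description: Catalyst wrapper around the upstream KServe chart
type: application
version: 1.0.0
dependencies:
  - name: kserve                              # assumed upstream chart name
    version: v0.16.0
    repository: oci://ghcr.io/kserve/charts   # assumed registry path
```

With a layout like this, `helm dependency build` pulls the upstream chart
into charts/ before packaging, and overlay values for the subchart go under
a top-level key in values.yaml that matches the dependency name.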

Per-chart highlights:
- bp-stunner/1.0.0 — wraps stunner/stunner-gateway-operator 1.1.0.
  Ships a Cilium-native GatewayClass (Capabilities-gated on
  gateway.networking.k8s.io/v1) so bp-relay (LiveKit / SFU) can claim
  Gateway CRs without an operator-ordering dance. Default UDP TURN port
  range 30000-32767 matches the range opened at the Sovereign edge
  firewall (Crossplane bp-firewall composition).
- bp-knative/1.0.0 — wraps knative-operator v1.21.1. Ships a
  KnativeServing CR pre-configured for **istio-less mode**
  (ingress.istio.enabled=false, ingress.contour.enabled=false,
  ingress.kourier.enabled=false; config.network.ingress-class=cilium).
  Sovereign FQDN sourced from values, no hardcoded fallback per
  inviolable principle #4 — render fails loudly if cluster overlay
  doesn't set knativeOverlay.knativeServing.sovereignFqdn.
- bp-kserve/1.0.0 — wraps kserve/kserve v0.16.0 (latest version
  published on the official OCI registry as of 2026-04-30). Default
  deploymentMode=RawDeployment (no Knative hop on the hot path) but
  bp-knative is still installed (declared as a hard dep) so per-IS
  annotation `serving.kserve.io/deploymentMode: Serverless` opts in to
  scale-to-zero per tenant. Cilium native Gateway-API ingress
  (enableGatewayApi=true, className=cilium, disableIstioVirtualHost=
  true).
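
The per-InferenceService opt-in mentioned for bp-kserve can be sketched as
follows. Only the annotation comes from this commit; the resource name,
namespace, model format, and storage URI are placeholders.

```yaml
# Illustrative InferenceService that opts in to Serverless mode.
# Without the annotation, the chart default (RawDeployment) would apply.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model          # placeholder
  namespace: tenant-a          # placeholder
  annotations:
    serving.kserve.io/deploymentMode: Serverless
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn          # placeholder model format
      storageUri: s3://example-bucket/models/example   # placeholder
```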

Observability discipline (issue #182): every observability toggle
(ServiceMonitor, HPA, GatewayClass) defaults to false and is operator-
tunable via per-cluster overlay once bp-kube-prometheus-stack reconciles.
Each chart ships tests/observability-toggle.sh covering default-off,
opt-in (with `--api-versions monitoring.coreos.com/v1` to simulate
Prometheus Operator CRDs), and explicit-off cases.
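
One plausible shape for such a toggle is a template that renders only when
the value is enabled and the CRD API is present. This is a sketch, not the
template shipped in these charts; the values key `serviceMonitor.enabled`,
the label selector, and the port name are assumed for illustration.

```yaml
# Hypothetical chart/templates/servicemonitor.yaml (illustrative only).
# Renders nothing unless the toggle is on AND the Prometheus Operator API
# is advertised (or simulated with --api-versions monitoring.coreos.com/v1).
{{- if and .Values.serviceMonitor.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-controller
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}
  endpoints:
    - port: metrics   # assumed Service port name
{{- end }}
```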

Per-chart kind summary (helm template default render):

  bp-stunner: ClusterRole, ClusterRoleBinding, ConfigMap, Dataplane,
              Deployment, Role, RoleBinding, Service, ServiceAccount.
              (+ GatewayClass when --api-versions
              gateway.networking.k8s.io/v1 is passed.)

  bp-knative: ClusterRole, ClusterRoleBinding, ConfigMap,
              CustomResourceDefinition, Deployment, KnativeServing,
              Role, RoleBinding, Secret, Service, ServiceAccount.

  bp-kserve:  Certificate, ClusterRole, ClusterRoleBinding,
              ClusterServingRuntime, ClusterStorageContainer,
              ConfigMap, Deployment, Gateway, Issuer,
              MutatingWebhookConfiguration, Role, RoleBinding,
              Service, ServiceAccount, ValidatingWebhookConfiguration.

`helm lint` clean for all three (single INFO on missing icon — icons
land with marketplace card work).

`bash tests/observability-toggle.sh` green for all three (3 cases each:
default-off, opt-in, explicit-off).

Closes #263 #264 #265

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 19:37:38 +04:00

apiVersion: catalyst.openova.io/v1alpha1
kind: Blueprint
metadata:
  name: bp-kserve
  labels:
    catalyst.openova.io/section: pts-4-6-ai-ml
spec:
  version: 1.0.0
  card:
    title: KServe
    summary: |
      Kubernetes-native model serving — InferenceService CRD, multi-
      framework predictors (vLLM, TorchServe, Triton, SKLearn), inference
      graphs. Wraps the upstream `kserve/kserve` chart and ships a
      Catalyst-curated `RawDeployment` mode (no per-InferenceService
      Knative Service hop by default — KServe writes Deployments
      directly), with **Knative still installed** (via bp-knative) for
      Application-tier Blueprints that opt in to scale-to-zero on a per-
      InferenceService basis. Used by bp-cortex (composite AI Hub
      Blueprint).
    icon: kserve.svg
    category: ai-ml
  visibility: unlisted  # bootstrap-kit infrastructure component
  configSchema:
    type: object
    properties:
      deploymentMode:
        type: string
        enum: [RawDeployment, Serverless]
        default: RawDeployment
        description: |
          KServe deployment mode. RawDeployment writes plain
          Deployment+Service+HPA per InferenceService (no Knative hop).
          Serverless writes a Knative Service per InferenceService
          (scale-to-zero). Catalyst defaults to RawDeployment but bp-
          knative is still installed so per-InferenceService overrides
          via annotation can opt in to Serverless without infra changes.
      ingressClass:
        type: string
        default: cilium
        description: |
          KServe ingress class — Catalyst defaults to Cilium native
          Gateway-API (istio-less).
      controllerReplicas:
        type: integer
        default: 1
        minimum: 1
        maximum: 5
        description: |
          KServe controller-manager replicas. Solo Sovereign = 1;
          multi-AZ overlays bump for HA.
      crds:
        type: object
        properties:
          create:
            type: boolean
            default: true
            description: |
              Install serving.kserve.io CRDs as part of this chart.
              CRDs ship with the upstream chart; consumers gated via
              Flux dependsOn.
  placementSchema:
    modes: [single-region, active-active]
    default: single-region
    minRegions: 1
    maxRegions: 5
  manifests:
    chart: ./chart
  depends:
    - blueprint: bp-cilium        # Cilium native Gateway-API ingress
      version: ^1
    - blueprint: bp-cert-manager  # KServe webhook TLS
      version: ^1
    - blueprint: bp-knative       # Knative still installed for per-IS Serverless opt-in
      version: ^1
  upgrades:
    from: ["0.x"]