Commit Graph

1235 Commits

Author SHA1 Message Date
Hatice Yildiz
a50176cd0e feat(catalyst): full wizard rebuild — Cosmos design, 6 new steps
- WizardLayout: fixed position shell, logo+step-rail pinned at top,
  only content area scrolls — no more vertical jumping
- StepOrg: extended org profile (industry, size, HQ, compliance tags),
  smart defaults with 'default' badge, restore-on-empty logic
- StepTopology: 5 visual topology templates (TITAN/TRIANGLE/DUAL/
  COMPACT/SOLO) with SVG cluster diagrams, bullet details on select
- StepProvider: provider cards with topology region context panel
- StepCredentials: Cosmos-themed, demo bypass preserved
- StepComponents: Linux-style package group selector — full/partial/
  empty indicators, expand to drill into individual components
- StepReview: structured review with provision CTA
- model.ts: TopologyTemplate type, ORG_DEFAULTS, DEFAULT_COMPONENT_GROUPS
- store.ts: topology, org profile, and group component actions
- shared/ui/OOLogo.tsx: extracted reusable logo component
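
The full/partial/empty indicator described for StepComponents can be derived from the set of selected components. The following is a minimal sketch, not the code from this commit: the `ComponentGroup` shape and the `groupState` name are assumptions, since the message names DEFAULT_COMPONENT_GROUPS but does not show its structure.

```typescript
// Hypothetical types: the commit does not show the real shape of a group.
type GroupState = "full" | "partial" | "empty";

interface ComponentGroup {
  id: string;
  components: string[];
}

// Derive the indicator for one group from the ids of the selected components.
function groupState(group: ComponentGroup, selected: ReadonlySet<string>): GroupState {
  const picked = group.components.filter((c) => selected.has(c)).length;
  if (picked === 0) return "empty";
  return picked === group.components.length ? "full" : "partial";
}
```

Deriving the state on each render, rather than storing it, keeps the indicator consistent with the individual component checkboxes when a group is expanded.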

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 06:58:24 +01:00
Hatice Yildiz
aa869bf6cb feat(catalyst/designs): Cosmos — lift logo above card + numbered step circles
- Move OOLogo outside glass card so it stays fixed across all steps
- Replace thin progress bars with Journey-style numbered circles
- Completed steps show checkmark, current glows with sky-blue ring
- Connector lines fill with gradient as steps complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 22:39:53 +01:00
Hatice Yildiz
ee0326af04 fix(catalyst): remove unused imports/vars in DesignShowcase — TS build fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 22:24:57 +01:00
Hatice Yildiz
d19efb866a feat(catalyst): fix logo crop + add all 6 steps to all 10 designs
- Fix OOLogo viewBox to 0 0 700 400 with strokeWidth 100 (correct aspect ratio)
- Render logo at width = height × 1.75 to prevent cropping
- Add all 6 wizard steps with real content to every design alternative
- Shared StepBody component with design-specific theming

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 22:14:39 +01:00
Hatice Yildiz
25b65fb6ad feat(catalyst): add 10 distinct wizard design concepts at /designs
2026-03-19 21:41:55 +01:00
Hatice Yildiz
ea7396974d feat(catalyst): world-class UX redesign — rich left panel, centered form
Left panel (always-dark brand panel):
- Gradient bg with ambient brand glow orbs (sky-blue + indigo)
- Large OO logo with radial glow aura behind it
- "OpenOva / Catalyst / Bootstrap Wizard" three-line branding
- Steps: numbered circles (gradient-fill completed, outlined-glow current)
  with connecting lines and current step description visible
- Trust signals at bottom with icons: runs in your cloud, no data stored,
  fully open source

Right panel (themed):
- 2px gradient progress bar (sky→indigo) across the very top, fills per step
- "STEP X OF 6" brand-colored label + hairline rule
- Big responsive heading (clamp 1.5–2rem, weight 700, tight tracking)
- Content max-width 520px, horizontally centered, properly padded
- Theme toggle + exit float top-right as clean icon buttons
- Form nav: back as ghost bordered button, continue as gradient primary
  with brand glow shadow

Input redesign:
- Height 42px (was 36px)
- 1.5px borders with smooth focus transition (brand glow ring)
- Focus/blur handled via onFocus/onBlur (not Tailwind classes)
- Hint text consistent 12px muted

Credentials step:
- Validate button inline with token field, height-matched (42px)
- Demo skip link prominently sky-blue: "No token yet? Skip →"
- How-to card: numbered circle badges instead of plain numbers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 21:30:28 +01:00
Hatice Yildiz
1bdc87bdba feat(catalyst): align UI with OpenOva design system, add light/dark toggle
- Redesign tokens: Catalyst sky-blue brand (#38BDF8), near-black surfaces
  matching openova.io (#09090B/#111113/#18181B), zinc text palette
- Add semantic CSS vars (--color-text-primary/secondary/muted/disabled)
  replacing hardcoded oklch() values across all components — required
  for proper light/dark mode support
- Add light mode with [data-theme="light"] CSS variable overrides
- Add useTheme hook with localStorage persistence (key: oo-theme)
- Add inline theme init script in index.html to prevent flash on load
- Replace OctagonAlert placeholder with OpenOva double-loop SVG logo
- Rebrand sidebar: "OpenOva" label + "Catalyst" in sky-blue
- Add Sun/Moon theme toggle in the top bar header
- Promote credentials skip link to brand-colored underlined text
  (was invisible muted grey — "Skip — explore in demo mode")
- Update button/input/StepShell/StepIndicator to use CSS vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 19:27:10 +01:00
Hatice Yildiz
cc45918740 feat(catalyst): add demo mode skip on credentials step
2026-03-19 18:56:27 +01:00
Hatice Yildiz
200e0038b3 feat(catalyst): add Go API backend, Containerfiles, and nginx SPA config
- Add catalyst-api: Chi router, SSE provisioning logs, Hetzner token
  validation, deployment lifecycle simulation (hexagonal-lite layout)
- Add catalyst-ui Containerfile: multi-stage Node→nginx build with
  VITE_APP_MODE baked in at build time
- Add nginx.conf: SPA routing, /api/ proxy to catalyst-api, SSE support,
  /healthz endpoint, K3s DNS resolver
- Wire wizard to real API: StepCredentials validates token via POST
  /api/v1/credentials/validate; StepReview POSTs to /api/v1/deployments
  and stores deploymentId in Zustand; ProvisionPage streams SSE logs
- Add deploymentId to wizard state; fix currentStep initial value (0→1)
- Add region code/countryCode fields; fix cluster context naming convention

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 18:35:11 +01:00
Eren Baysal
5ff6c17e50 fix(catalyst): wizard step navigation, naming convention in cluster context
- Fix currentStep initial value (0→1) so wizard advances correctly on first Continue
- Add `code` + `countryCode` to HETZNER_REGIONS; drop deprecated `role: primary|dr` from Region model
- Use region code in cluster context names: hz-fsn-rtz-prod (not hz-fsn1-rtz-prod)
- Derive success page URLs and kubeconfig from wizard store (orgDomain, region)
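
As an illustration of the naming change, the context name can be assembled from the region's short `code` rather than its provider location id. This is a sketch only: the `Region` fields beyond `code` and `countryCode`, the function name, and the treatment of the middle segment ("rtz") as an opaque string are assumptions; the authoritative rules are in NAMING-CONVENTION.md.

```typescript
interface Region {
  id: string;          // provider location id, e.g. "fsn1" (assumed field name)
  code: string;        // short code used in names, e.g. "fsn"
  countryCode: string; // e.g. "DE"
}

// Join the name segments with hyphens, using region.code instead of region.id.
function clusterContextName(
  providerPrefix: string,
  region: Region,
  segment: string, // opaque middle segment such as "rtz"; its meaning is not given here
  env: string,
): string {
  return [providerPrefix, region.code, segment, env].join("-");
}
```

For example, `clusterContextName("hz", { id: "fsn1", code: "fsn", countryCode: "DE" }, "rtz", "prod")` yields `hz-fsn-rtz-prod`.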

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 13:59:46 +01:00
Emrah Baysal
3e3f9428ff feat(catalyst): Bootstrap UI — full wizard, auth, dashboard, provision, success
Complete React 18 + TypeScript + Tailwind v4 UI for OpenOva Catalyst Bootstrap.

Architecture:
- Feature-Sliced Design (FSD): app/, pages/, widgets/, features/, entities/, shared/
- TanStack Router (type-safe), TanStack Query, Zustand wizard store
- React Hook Form + Zod validation on every form
- Framer Motion animated step transitions and micro-interactions
- Dark-first design system with Tailwind v4 CSS custom properties

Pages:
- Auth: Login, Signup, Forgot Password (SaaS mode)
- Dashboard: deployment list with status badges and stats
- Wizard: 6-step animated wizard (Org → Provider → Credentials →
  Infrastructure → Components → Review)
- Provision: real-time SSE log stream with phase tracker and progress bar
- Success: kubeconfig download, service URLs, next-steps guide

Shared UI:
- Button, Input, Badge, Card, Separator, Tooltip, Checkbox, Switch,
  Progress, Dialog, DropdownMenu, Select, Avatar (all owned, Radix primitives)

Widgets:
- StepIndicator with animated connectors and completion state
- CloudProviderCard with coming-soon state and tooltip

Features:
- Two runtime modes: saas (full auth) and selfhosted (direct to wizard)
- Live credential validation UX with feedback states
- Component dependency enforcement with tooltip explanations
- Cost estimate per region derived from Hetzner node size selection
- Cluster context names auto-derived from naming convention (NAMING-CONVENTION.md)

Builds clean: tsc -b + vite build, 34KB CSS, 723KB JS (pre-split)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 12:34:03 +01:00
e3mrah
2df03b7ea3 feat(axon): add V1 query() alongside V2 session pool with profile routing
- Add thinking, effort, profile fields to ChatCompletionRequest
- Add chatV1() and chatV1Stream() using query() with persistSession=false
- Route to V1 when thinking/effort params present or profile='deep'
- V2 session pool unchanged; V1 runs stateless with native systemPrompt
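
The routing rule in this message can be sketched as a single predicate. The request shape below is an assumption (only the `thinking`, `effort` and `profile` field names come from the commit), and the route labels are hypothetical.

```typescript
interface ChatCompletionRequest {
  model: string;
  thinking?: unknown; // shape not given in the commit; treated as opaque
  effort?: string;
  profile?: string;
}

type Route = "v1-query" | "v2-session-pool";

// V1 when thinking/effort is present or profile is "deep"; otherwise the V2 pool.
function selectRoute(req: ChatCompletionRequest): Route {
  const wantsV1 =
    req.thinking !== undefined || req.effort !== undefined || req.profile === "deep";
  return wantsV1 ? "v1-query" : "v2-session-pool";
}
```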

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 12:26:34 +01:00
e3mrah
5b04c6bd02 fix: increase axon-valkey probe initialDelaySeconds to survive slow RDB load
2026-03-18 16:10:21 +01:00
e3mrah
cf4c37b2df fix: use tcpSocket probes for axon-valkey (exec probes fail on k3s OCI runtime)
2026-03-18 14:02:09 +01:00
e3mrah
baf2d8445d fix(axon): persistent Valkey reconnect — never give up retryStrategy
Previous retryStrategy(times > 5) returned null, permanently destroying the
ioredis client after 5 failed reconnects. After idle, the TCP connection drops,
all 5 retries fail, and every subsequent command throws 'Connection is closed'.

Changes:
- retryStrategy now retries indefinitely (max 30s interval) — connection
  is always restored when Valkey comes back
- 'end' event handler restarts the client if ioredis somehow stops retrying
- getValkey() returns null when client.status is 'end'/'close' so callers
  skip persistence gracefully instead of throwing
- maxRetriesPerRequest: 3 kept — commands fail fast, background reconnect
  handles recovery
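
A minimal sketch of the two pieces described above, written as plain functions so it runs without ioredis installed. The 500 ms backoff step and the names `STEP_MS` and `usableClient` are assumptions; the commit only states the 30s ceiling and the 'end'/'close' check.

```typescript
const MAX_DELAY_MS = 30_000; // 30s ceiling from the commit message
const STEP_MS = 500;         // assumed backoff step; the commit does not give one

// Always returns a delay, never null, so the client keeps trying to reconnect.
function retryStrategy(times: number): number {
  return Math.min(times * STEP_MS, MAX_DELAY_MS);
}

// Mirrors the described getValkey() guard: hand out the client only while it is usable.
interface ClientLike {
  status: string;
}

function usableClient<T extends ClientLike>(client: T): T | null {
  return client.status === "end" || client.status === "close" ? null : client;
}
```

A function with this signature can be passed as the `retryStrategy` option when constructing an ioredis client; ioredis stops reconnecting only when the function returns a non-number, which is what the previous `times > 5` branch did.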

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 06:09:07 +01:00
e3mrah
63a86772f7 fix(axon): evict idle sessions older than 5 min before handing to caller
Sessions whose Claude CLI subprocess has exited (idle > MAX_IDLE_MS) are
recycled in acquire() rather than returned. This prevents all-stale-pool
scenarios that caused WriteRecsActivity/ExtractIntentActivity to fail with
'Connection is closed' after Axon sits idle overnight.

- Added lastUsed: number to PoolEntry, set on warmup and release
- acquire() skips idle entries older than 5 min, recycles each one
- release() stamps lastUsed so the TTL resets on every successful use
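
The eviction logic can be sketched as follows. This is an illustration under assumptions, not the Axon implementation: the pool is modelled as a plain array, `create` and `recycle` are hypothetical callbacks, and a stale entry is disposed and replaced only when no fresh entry remains.

```typescript
const MAX_IDLE_MS = 5 * 60 * 1000; // 5 minutes, as in the commit title

interface PoolEntry<S> {
  session: S;
  lastUsed: number; // epoch ms, stamped on warmup and release
}

// Return a fresh-enough entry, recycling every stale one encountered on the way.
function acquire<S>(
  idle: PoolEntry<S>[],
  create: () => S,
  recycle: (session: S) => void,
  now: number,
): PoolEntry<S> {
  let entry = idle.pop();
  while (entry !== undefined) {
    if (now - entry.lastUsed <= MAX_IDLE_MS) return entry;
    recycle(entry.session); // stale: dispose instead of handing it to the caller
    entry = idle.pop();
  }
  return { session: create(), lastUsed: now };
}

// Put an entry back and reset its idle timer.
function release<S>(idle: PoolEntry<S>[], entry: PoolEntry<S>, now: number): void {
  entry.lastUsed = now;
  idle.push(entry);
}
```

Passing `now` explicitly (for example `Date.now()` at the call site) keeps the idle check deterministic in tests.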

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 06:05:05 +01:00
e3mrah
5f0421e967 fix(axon): restore word-level streaming in chatStream
Re-add 2-3 word chunk splitting with 25-60ms delays that was lost during
the includePartialMessages refactor. Fixes the "10s wait then dump" UX.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:09:18 +01:00
e3mrah
bc069483af fix(axon): restore working assistant message handler, revert broken includePartialMessages
2026-03-12 15:54:23 +01:00
e3mrah
6f81cc7e79 debug(axon): log msg.type from session.stream() to diagnose empty output
2026-03-12 15:51:42 +01:00
e3mrah
3f010b4cca fix(axon): fallback to complete assistant msg if stream_event not emitted
2026-03-12 15:49:07 +01:00
e3mrah
9aba8fe80c fix(axon): cast includePartialMessages to bypass older SDK type version
2026-03-12 15:45:23 +01:00
e3mrah
5113295960 feat(axon): enable real token streaming via includePartialMessages
Set includePartialMessages: true on SDK sessions so stream() emits
SDKPartialAssistantMessage (stream_event) carrying content_block_delta
events. chatStream() now yields actual token text as it is generated
instead of waiting for the complete response and fake-streaming it
with word-splits and delays.

This gives true token-by-token TTFT (~200ms first token) rather than
the previous 3-8s wait for the full response before any text appeared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 15:34:24 +01:00
e3mrah
e783b9329c fix(axon): use XML tags in formatPrompt to prevent injection detection
The Claude Agent SDK reuses sessions across conversations. When the
full system prompt was re-sent on subsequent turns wrapped in
[System instructions] tags, Claude flagged it as a prompt injection
attempt. Switch to XML-style tags (<context>, <conversation>) that
Claude recognises as structured prompt sections. Add <new_conversation/>
boundary marker to isolate reused sessions from prior context.
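
One way the tagged prompt might be assembled is shown below. Only the tag names (`<context>`, `<conversation>`, `<new_conversation/>`) come from the commit; the function signature, the `ChatMessage` shape, the ordering of sections and the "role: content" line format are assumptions made for illustration.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Wrap the system prompt and history in XML-style sections.
function formatPrompt(
  systemPrompt: string,
  history: ChatMessage[],
  isNewConversation: boolean,
): string {
  const parts: string[] = [];
  // Boundary marker that separates a reused session from its earlier context.
  if (isNewConversation) parts.push("<new_conversation/>");
  parts.push(`<context>\n${systemPrompt}\n</context>`);
  const turns = history.map((m) => `${m.role}: ${m.content}`).join("\n");
  parts.push(`<conversation>\n${turns}\n</conversation>`);
  return parts.join("\n");
}
```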

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:26:30 +01:00
e3mrah
b911a74e7a feat(axon): progressive word-level streaming for chat completions
The Claude Agent SDK yields complete assistant messages rather than
individual token deltas. This change splits the full text into 2-3
word groups and yields them as separate SSE chunks with small random
delays (25-60ms), giving a natural typing experience on the client.
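
The chunking and pacing described here can be sketched as below. The names are hypothetical and the group-size and delay pickers are injectable so the behaviour can be checked deterministically; note that this version normalises all whitespace to single spaces, which the real implementation may not do.

```typescript
// Split text into groups of 2-3 words; `pick` chooses each group size.
function chunkWords(
  text: string,
  pick: () => number = () => (Math.random() < 0.5 ? 2 : 3),
): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  let i = 0;
  while (i < words.length) {
    const size = Math.max(1, pick()); // guard against a zero-size group
    chunks.push(words.slice(i, i + size).join(" "));
    i += size;
  }
  return chunks;
}

// Delay between chunks, an integer in the 25-60ms range.
function chunkDelayMs(rand: () => number = Math.random): number {
  return 25 + Math.floor(rand() * 36);
}

// Send each chunk through `send`, pausing between chunks.
async function emitChunks(text: string, send: (chunk: string) => void): Promise<void> {
  for (const chunk of chunkWords(text)) {
    send(chunk);
    await new Promise<void>((resolve) => setTimeout(resolve, chunkDelayMs()));
  }
}
```

In an SSE handler, `send` would write one `data:` event per chunk.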

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:05:59 +01:00
e3mrah
8fb961c897 docs: rewrite Axon README as client integration guide
SDK examples (Python, Node.js), API reference, model aliases,
streaming, conversations, self-hosting instructions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 13:09:58 +01:00
e3mrah
643cdd9d29 Revert "feat: add request tracing spans to chat completion path"
This reverts commit a2685dd158.
2026-03-04 11:39:29 +01:00
e3mrah
a2685dd158 feat: add request tracing spans to chat completion path
Traces: convLookup, formatPrompt, acquire, send, firstMsg,
stream, release, convStore — logged per request for profiling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 11:20:54 +01:00
e3mrah
023ee6d5e4 chore: increase Axon resource limits for single-node overprovisioning
Axon: 2 CPU / 2Gi memory limits (50m/128Mi requests)
Valkey: 500m CPU / 256Mi memory limits (10m/32Mi requests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 11:08:10 +01:00
e3mrah
97009daf59 fix: set HOME env and mount credentials at subpath
K8s doesn't set HOME from Dockerfile USER directive. Mount
credential file at subpath to preserve debug/ directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:41:47 +01:00
e3mrah
5ff69bd41b fix: use fixed UID 1001 for axon user in container
K8s runAsNonRoot requires numeric UID. Pin to 1001 in both
Containerfile and Helm chart deployment template.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:39:09 +01:00
e3mrah
215ba69272 fix: create .claude/debug directory in Axon container
Claude Agent SDK writes debug logs to ~/.claude/debug/ which must
exist before session creation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:26:28 +01:00
e3mrah
fe2e349246 feat: add Axon Helm chart and CI workflow
Helm chart for deploying Axon LLM gateway with Valkey backing store,
Traefik ingress with TLS, and Claude auth volume mount.

CI workflow builds container image on push to products/axon/ and pushes
SHA-pinned tags to GHCR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 09:22:54 +01:00
talent-mesh
616914cf45 feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API
Rewrite Axon SaaS LLM gateway with three core changes:

1. Session pool acquire/release pattern — sessions stay alive and are
   reused across requests instead of killed after one use. Turn counting
   with automatic recycling after 200 turns.

2. Valkey-backed conversation store — all conversation state (messages,
   metadata, TTL) lives in Valkey, not filesystem. Sessions are stateless
   workers; any session can serve any conversation.

3. 100% OpenAI /v1/chat/completions compatibility — accepts every OpenAI
   request parameter (temperature, top_p, stop, frequency_penalty,
   presence_penalty, logit_bias, logprobs, seed, tools, tool_choice,
   response_format, stream_options, max_completion_tokens, user, store,
   metadata). Response shape matches OpenAI exactly: chatcmpl-* id,
   system_fingerprint, logprobs:null, refusal:null, usage chunk in
   streaming. OpenAI model names (gpt-4o, gpt-4) auto-mapped to Claude.

Axon extension: conversation_id field for multi-turn conversations
backed by Valkey with 7-day TTL. GET /v1/conversations/:id for history.

Includes E2E test suite (67 tests, scripts/e2e-test.sh).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 18:36:26 +04:00
talent-mesh
435f49738d feat: restructure platform to 52 components and 9 products
Technology forecast and strategic review restructure:
- Remove 13 components (backstage, mongodb, activemq, vitess, airflow, camel, dapr, superset, searxng, langserve, trino, lago, rabbitmq)
- Add 10 components (sigstore, syft-grype, nemo-guardrails, langfuse, reloader, matrix, ferretdb, litmus, livekit, coraza)
- Rename product: Synapse → Axon (SaaS LLM Gateway)
- Merge products: Titan + Fuse → Fabric (Data & Integration)
- New product: Relay (Communication)
- Replace Backstage with Catalyst IDP
- Replace MongoDB with FerretDB (MongoDB wire protocol on CNPG)
- Add supply chain security (Sigstore/Cosign, Syft+Grype)
- Add AI safety and observability (NeMo Guardrails, LangFuse)
- Add technology forecast 2027-2030 document
- Full verification pass: zero stale references across all docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 21:00:19 +00:00
talent-mesh
10245dff98 feat: ecosystem expansion to 55 components with license compliance
- Replace BSL-licensed components with open-source alternatives:
  Terraform→OpenTofu (MPL 2.0), Vault→OpenBao (MPL 2.0),
  Redpanda→Strimzi/Kafka (Apache 2.0), n8n→Airflow (Apache 2.0)
- Add 14 new platform components: activemq, camel, clickhouse, dapr,
  debezium, falco, flink, iceberg, opensearch, rabbitmq, superset,
  temporal, trino, vitess
- Rename meta-platforms/ to products/ with new product names:
  Cortex (AI Hub), Fingate (Open Banking), Titan (Data Lakehouse),
  Fuse (Microservices Integration)
- Update all documentation, READMEs, and cross-references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:15:11 +00:00