Rewrite Axon SaaS LLM gateway with three core changes:
1. Session pool acquire/release pattern — sessions stay alive and are
reused across requests instead of killed after one use. Turn counting
with automatic recycling after 200 turns.
2. Valkey-backed conversation store — all conversation state (messages,
metadata, TTL) lives in Valkey, not filesystem. Sessions are stateless
workers; any session can serve any conversation.
3. 100% OpenAI /v1/chat/completions compatibility — accepts every OpenAI
request parameter (temperature, top_p, stop, frequency_penalty,
presence_penalty, logit_bias, logprobs, seed, tools, tool_choice,
response_format, stream_options, max_completion_tokens, user, store,
metadata). Response shape matches OpenAI exactly: chatcmpl-* id,
system_fingerprint, logprobs:null, refusal:null, usage chunk in
streaming. OpenAI model names (gpt-4o, gpt-4) auto-mapped to Claude.
Axon extension: conversation_id field for multi-turn conversations
backed by Valkey with 7-day TTL. GET /v1/conversations/:id for history.
Includes E2E test suite (67 tests, scripts/e2e-test.sh).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>