Rewrite Axon SaaS LLM gateway with three core changes:

1. Session pool acquire/release pattern: sessions stay alive and are
   reused across requests instead of killed after one use. Turn counting
   with automatic recycling after 200 turns.

2. Valkey-backed conversation store: all conversation state (messages,
   metadata, TTL) lives in Valkey, not filesystem. Sessions are stateless
   workers; any session can serve any conversation.

3. 100% OpenAI /v1/chat/completions compatibility: accepts every OpenAI
   request parameter (temperature, top_p, stop, frequency_penalty,
   presence_penalty, logit_bias, logprobs, seed, tools, tool_choice,
   response_format, stream_options, max_completion_tokens, user, store,
   metadata). Response shape matches OpenAI exactly: chatcmpl-* id,
   system_fingerprint, logprobs:null, refusal:null, usage chunk in
   streaming. OpenAI model names (gpt-4o, gpt-4) auto-mapped to Claude.

Axon extension: conversation_id field for multi-turn conversations backed
by Valkey with 7-day TTL. GET /v1/conversations/:id for history.

Includes E2E test suite (67 tests, scripts/e2e-test.sh).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
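The acquire/release pattern with turn counting from point 1 could be sketched roughly as follows in Python. This is an illustration only, not the gateway's actual code: the names `Session`, `SessionPool`, `factory`, and `MAX_TURNS` are hypothetical, and the choice of an in-process queue is an assumption. Only the 200-turn recycle limit comes from the commit message.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Optional

MAX_TURNS = 200  # recycle threshold stated in the commit message


@dataclass
class Session:
    """Hypothetical stand-in for a long-lived worker session."""
    session_id: int
    turns: int = 0
    alive: bool = True

    def close(self) -> None:
        self.alive = False


class SessionPool:
    """Keeps sessions alive across requests and recycles them after MAX_TURNS."""

    def __init__(self, size: int, factory: Callable[[], Session]) -> None:
        self._factory = factory
        self._idle: "queue.Queue[Session]" = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())  # pre-warm the pool

    def acquire(self, timeout: Optional[float] = None) -> Session:
        # Blocks until an idle session is available (raises queue.Empty on timeout).
        return self._idle.get(timeout=timeout)

    def release(self, session: Session) -> None:
        # Count the turn just served, then either reuse or recycle the session.
        session.turns += 1
        if session.turns >= MAX_TURNS:
            session.close()
            session = self._factory()  # replace the worn-out session with a fresh one
        self._idle.put(session)
```

A request handler would call `acquire()`, serve one turn, and call `release()` in a `finally` block so that a failed request still returns its session to the pool. Because conversation state lives in Valkey rather than in the session, recycling a session loses no history.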
18 lines
440 B
Plaintext
# Axon gateway port
AXON_PORT=3000

# Comma-separated API keys for Bearer token auth
AXON_API_KEYS=sk-dev-test

# Default Claude model when client doesn't specify
AXON_DEFAULT_MODEL=claude-sonnet-4-6

# Number of pre-warmed sessions in the pool
AXON_POOL_SIZE=3

# Valkey connection URL (conversation state + pool metrics)
AXON_VALKEY_URL=redis://localhost:6379

# Conversation TTL in seconds (default: 7 days)
AXON_CONVERSATION_TTL=604800
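As a rough illustration of how these variables might be consumed, here is a minimal Python sketch of a typed config loader. `AxonConfig` and `load_config` are hypothetical names, and using the example values as fallbacks is an assumption: the file only states that the TTL default is 7 days, and the gateway's real parsing logic is not shown in this commit.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class AxonConfig:
    port: int
    api_keys: Tuple[str, ...]
    default_model: str
    pool_size: int
    valkey_url: str
    conversation_ttl: int  # seconds


def load_config(env: Mapping[str, str] = os.environ) -> AxonConfig:
    """Parse AXON_* variables; fallbacks copy the example values (an assumption)."""
    # Split the comma-separated key list and drop empty entries and whitespace.
    raw_keys = env.get("AXON_API_KEYS", "")
    api_keys = tuple(k.strip() for k in raw_keys.split(",") if k.strip())
    return AxonConfig(
        port=int(env.get("AXON_PORT", "3000")),
        api_keys=api_keys,
        default_model=env.get("AXON_DEFAULT_MODEL", "claude-sonnet-4-6"),
        pool_size=int(env.get("AXON_POOL_SIZE", "3")),
        valkey_url=env.get("AXON_VALKEY_URL", "redis://localhost:6379"),
        conversation_ttl=int(env.get("AXON_CONVERSATION_TTL", str(7 * 24 * 3600))),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, for example `load_config({"AXON_POOL_SIZE": "5"})`.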