{"title":"Claude Code API Proxy","version":"4.7.0","updated":"2026-06-01","description":"OpenAI-compatible API proxy to Claude via Direct Anthropic API. Real-time SSE streaming, tool support, server-side tools (web_search, web_fetch, code_execution), Vision (image_url + base64), Opus 4.8 default. Uses Claude Max subscription with OAuth. Compatible with OpenAI SDK, Open WebUI, LiveKit, langchain, anthropic SDK, Cline, Cursor. v4.7.0 (01.06.2026): /v1/messages non-stream switched to direct Anthropic API (legacy SDK pool retired, fixes Unknown slot_key 500 error).","base_url":"https://claude.ai-platform.space","path_aliases":{"description":"All API endpoints work with AND without /v1/ prefix. Use whichever your client expects.","examples":[{"with_prefix":"/v1/chat/completions","without_prefix":"/chat/completions","both_work":true},{"with_prefix":"/v1/messages","without_prefix":"/messages","both_work":true},{"with_prefix":"/v1/models","without_prefix":"/models","both_work":true},{"with_prefix":"/v1/status","without_prefix":"/status","both_work":true}],"base_url_options":[{"url":"https://claude.ai-platform.space/v1","for":"Clients that add path directly (OpenClaw, some OpenAI SDK configs)"},{"url":"https://claude.ai-platform.space","for":"Clients that add /v1/ automatically (standard OpenAI SDK, LiteLLM)"}],"note":"If you get 404, check whether your client adds /v1/ prefix automatically. If yes, set base_url WITHOUT /v1. If no, set base_url WITH /v1."},"request_formats":{"description":"All endpoints auto-detect request format. 
Send in any format - it just works.","formats":[{"name":"OpenAI","example":"{\"model\": \"sonnet\", \"messages\": [{\"role\": \"system\", \"content\": \"You are helpful\"}, {\"role\": \"user\", \"content\": \"Hello\"}]}","for":"OpenAI SDK, Open WebUI, LiteLLM, LangChain, any OpenAI-compatible client"},{"name":"Anthropic","example":"{\"model\": \"claude-sonnet-4-6\", \"system\": \"You are helpful\", \"max_tokens\": 4096, \"messages\": [{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello\"}]}]}","for":"Anthropic SDK, langchain-anthropic, Claude API clients"},{"name":"Simple text","example":"{\"prompt\": \"Hello\"} or {\"content\": \"Hello\"} or {\"query\": \"Hello\"}","for":"curl, scripts, webhooks, any simple integration"},{"name":"String messages","example":"{\"messages\": \"just send a string instead of array\"}","for":"Quick prototyping, simple scripts"}],"model_aliases":{"description":"50+ model names accepted and auto-mapped to Claude equivalents","groups":[{"name":"Short","examples":"opus, sonnet, haiku"},{"name":"Claude 4.x","examples":"claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5 (+ legacy claude-opus-4-6)"},{"name":"Claude 3.x","examples":"claude-3-opus-20240229, claude-3-5-sonnet-20241022, claude-3-haiku-20240307"},{"name":"OpenAI","examples":"gpt-4, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, gpt-4.1, o3, o3-mini"},{"name":"Google","examples":"gemini-pro, gemini-2.5-pro, gemini-2.0-flash"},{"name":"LiteLLM","examples":"anthropic/claude-sonnet-4-6, anthropic/claude-opus-4-8, anthropic/claude-opus-4-7"}]},"auth_headers":{"description":"Both OpenAI and Anthropic auth headers accepted on ALL endpoints","accepted":["Authorization: Bearer KEY","x-api-key: KEY","?key=KEY"]},"encoding_tolerance":{"description":"Body decoding fallback chain. The proxy accepts the request body even if the client sends it in a legacy encoding (Windows-based integrations, older CRM systems). 
Invalid UTF-8 bytes are replaced with U+FFFD instead of rejecting the request with a 400.","fallback_order":["UTF-8 (strict)","windows-1251 (cp1251)","latin-1","UTF-8 with replace errors"],"introduced":"28.05.2026 (fix for the '0xcf at position 56' incident: the client was sending cp1251)","applies_to":["/v1/chat/completions","/v1/messages","/v1/messages/count_tokens"],"note":"If the body contains Cyrillic, sending it as UTF-8 is still recommended. The fallback is a safety net for legacy clients."}},"authentication":{"type":"API Key","methods":[{"method":"Header","format":"Authorization: Bearer YOUR_API_KEY"},{"method":"Query param","format":"?key=YOUR_API_KEY"},{"method":"Anthropic SDK","format":"x-api-key: YOUR_API_KEY"}],"note":"API keys managed via Admin Panel (/admin). Each user gets a unique sk-cc-XXXX key. Admin panel access: /admin?key=YOUR_API_KEY","internal_auth":{"description":"How the proxy authenticates with Anthropic (for reference)","method":"OAuth Bearer Token (sk-ant-oat01-*)","headers":{"Authorization":"Bearer {oauth_token}","anthropic-beta":"claude-code-20250219,oauth-2025-04-20","user-agent":"claude-cli/2.1.85 (external, cli)","x-app":"cli"},"identity_requirement":"Sonnet and Opus via OAuth require system prompt identity assertion: 'You are Claude Code, Anthropic's official CLI for Claude.' as first system block. Without this, API returns 400. Haiku works without it.","source":"GitHub Issues #35269, #40515 (discovered 06.04.2026)"}},"models":{"description":"Available Claude models","list":[{"id":"claude-opus-4-8","alias":["opus","gpt-4","gpt-4-turbo","claude-opus-4-7","claude-opus-4-6"],"description":"Most capable. Deep reasoning, complex architecture. Default Anthropic API since 28.05.2026. 
200K context (1M opt-in).","context":200000,"max_output":128000,"max_output_proxy_cap":"21000 tokens for non-streaming direct mode (use stream=true for full output)","price_input":"$5/MTok","price_output":"$25/MTok","price_cache_read":"$0.50/MTok","price_cache_write_5m":"$6.25/MTok","price_cache_write_1h":"$10/MTok","extended_thinking":true,"adaptive_thinking":true,"effort_levels":["low","medium","high","xhigh","max"],"thinking_shape":"{type: 'adaptive'} — Opus 4.7/4.8 picks its own budget. Proxy adds output_config.effort=high for effort=max."},{"id":"claude-sonnet-4-6","alias":["sonnet","gpt-4o"],"description":"Best balance of speed and quality. Default model. 200K context (1M opt-in).","context":200000,"max_output":64000,"max_output_proxy_cap":"21000 tokens for non-streaming direct mode (use stream=true for full output)","price_input":"$3/MTok","price_output":"$15/MTok","price_cache_read":"$0.30/MTok","price_cache_write_5m":"$3.75/MTok","price_cache_write_1h":"$6/MTok","extended_thinking":true,"adaptive_thinking":false,"effort_levels":["low","medium","high","max"],"thinking_shape":"{type: 'enabled', budget_tokens: N} — effort=medium→4000, high→10000, max→16000."},{"id":"claude-haiku-4-5","alias":["haiku","gpt-3.5-turbo","gpt-4o-mini"],"description":"Fastest. Simple tasks, quick answers. 200K context.","context":200000,"max_output":64000,"max_output_proxy_cap":"21000 tokens for non-streaming direct mode (use stream=true for full output)","price_input":"$1/MTok","price_output":"$5/MTok","price_cache_read":"$0.10/MTok","price_cache_write_5m":"$1.25/MTok","price_cache_write_1h":"$2/MTok","extended_thinking":true,"adaptive_thinking":false,"effort_levels":["low","medium","high","max"],"thinking_shape":"{type: 'enabled', budget_tokens: N} — same mapping as Sonnet."}],"note":"OpenAI model names auto-mapped to Claude equivalents. Model id with date suffix (claude-haiku-4-5-20251001) also accepted. 
NOTE: real Sonnet thinking is 'enabled' (not adaptive) — corrected 17.05.2026.","effort_to_thinking_mapping":{"opus (4.7)":"effort=medium/high/max → thinking={type: 'adaptive'} + output_config.effort=high (max internally mapped to high — Anthropic 4.7 output_config doesn't accept 'max')","sonnet (4.6) / haiku (4.5)":"effort=medium → 4000 tokens, high → 10000, max → 16000 (budget_tokens for type=enabled)","effort=low":"thinking is disabled for all models"}},"usage_modes":{"description":"Two ways to use the API: one-shot or conversation with memory","one_shot":{"description":"Single question -> single answer. No memory between requests. Like a standard OpenAI API call.","when_to_use":"Quick questions, code generation, one-time tasks, integrations (LiveKit, Open WebUI).","example_curl":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"messages\": [{\"role\": \"user\", \"content\": \"Explain async/await in Python\"}]}'","example_python":"from openai import OpenAI\nclient = OpenAI(api_key=\"YOUR_API_KEY\", base_url=\"https://claude.ai-platform.space/v1\")\nresponse = client.chat.completions.create(\n    model=\"sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Explain async/await in Python\"}]\n)\nprint(response.choices[0].message.content)"},"conversation":{"description":"⚠️ NOT IMPLEMENTED YET (planned). session_id is returned in response and accepted in subsequent requests, but the proxy does NOT currently resume the SDK session — every call starts fresh. Send conversation history yourself via messages[] array (standard OpenAI behavior). True server-side resume is on the roadmap.","when_to_use":"For now: include full message history in each request (standard OpenAI flow). Future: persistent server-side memory across requests.","status":"planned — session_id resume not active as of 17.05.2026","how_it_works":["1. 
Send first request normally -> response includes session_id","2. Save session_id from response","3. Send next request with session_id field -> Claude has full context from previous turns","4. Repeat as needed. Session persists on server."],"example_first_request":"# Step 1: First request (creates session)\ncurl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"messages\": [{\"role\": \"user\", \"content\": \"Analyze the auth module in this project\"}]}'\n\n# Response includes: \"session_id\": \"abc-123-def\"","example_continue":"# Step 2: Continue conversation (Claude remembers everything)\ncurl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"session_id\": \"abc-123-def\", \"messages\": [{\"role\": \"user\", \"content\": \"Now refactor it to use JWT\"}]}'\n\n# Claude already knows the auth module from step 1","example_python":"from openai import OpenAI\nclient = OpenAI(api_key=\"YOUR_API_KEY\", base_url=\"https://claude.ai-platform.space/v1\")\n\n# First message\nr1 = client.chat.completions.create(\n    model=\"sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Read and analyze auth.py\"}]\n)\nprint(r1.choices[0].message.content)\nsession_id = r1.session_id  # Save this!\n\n# Follow-up with context\nr2 = client.chat.completions.create(\n    model=\"sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Now add rate limiting to it\"}],\n    extra_body={\"session_id\": session_id}  # Continue session\n)\nprint(r2.choices[0].message.content)"},"direct_api":{"description":"Direct Anthropic Messages API call. ALL 3 models supported (haiku, sonnet, opus). Zero SDK overhead. 
Use mode=direct or scope=light/medium.","when_to_use":"Git commit messages, translations, text summarization, copywriting, simple completions. Any task where you just need text in -> text out without file access or tools.","why":"SDK persistent sessions add ~20-40K cache tokens overhead per request ($0.25-0.70) because they load Claude Code system prompt and tool definitions. Direct API sends ONLY your prompt - zero overhead. Same Claude model, same quality, 700x cheaper for simple tasks.","how_it_works":"When mode=direct (or scope=light) is set, the proxy bypasses the SDK session pool entirely and calls Anthropic Messages API directly using OAuth Bearer token. For sonnet/opus, the proxy automatically injects the required system prompt identity assertion. No persistent session, no tools, no context accumulation.","supported_models":{"haiku":{"speed":"~900ms","status":"Works without identity assertion"},"sonnet":{"speed":"~1700ms","status":"Works WITH identity assertion (auto-injected by proxy)"},"opus":{"speed":"~2700ms","status":"Works WITH identity assertion (auto-injected by proxy)"}},"identity_fix_note":"Since 06.04.2026: sonnet and opus via OAuth require system prompt identity. The proxy handles this automatically - you don't need to do anything special. 
Just pass model=sonnet or model=opus with mode=direct.","example_curl_haiku":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"haiku\", \"mode\": \"direct\", \"messages\": [{\"role\": \"user\", \"content\": \"Translate to English: Привет мир\"}]}' ","example_curl_sonnet":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"mode\": \"direct\", \"messages\": [{\"role\": \"user\", \"content\": \"Explain async/await in Python\"}]}' ","example_curl_opus":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"opus\", \"mode\": \"direct\", \"messages\": [{\"role\": \"user\", \"content\": \"Design a microservice architecture for e-commerce\"}]}' ","example_with_system":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"mode\": \"direct\", \"messages\": [{\"role\": \"system\", \"content\": \"Generate a git commit message\"}, {\"role\": \"user\", \"content\": \"diff --git a/file.py ...\"}]}' ","comparison":{"sdk_session":{"cost_per_request":"$0.25-0.70","cache_tokens":"20-40K","speed":"2-30s","tools":true,"method":"Agent SDK persistent session"},"direct_api":{"cost_per_request":"depends on tokens","cache_tokens":"0","speed":"0.7-5s","tools":false,"method":"Direct Anthropic Messages API"}},"available_models":"ALL 3: haiku (default for scope=light), sonnet, opus. 
Pass model explicitly: model=sonnet with mode=direct.","response_field":"Response includes 'mode': 'direct_api' to confirm direct API was used."},"extended_thinking":{"description":"Claude Opus 4.7/4.8 / Sonnet 4.6 / Haiku 4.5 support extended thinking — reasoning before responding. Available in BOTH Direct API and SDK session. Two ways to enable:","method_1_effort":{"description":"Easiest: just set 'effort' field. Proxy automatically picks the right thinking config for each model.","mapping":{"opus (4.7/4.8)":"effort=medium/high/xhigh/max → thinking={type:'adaptive'} + output_config.effort (Claude 4.7/4.8 decides its own budget). 'xhigh' is default for Opus 4.7, 'high' is default for Opus 4.8.","sonnet (4.6)":"effort=medium → budget_tokens=4000, effort=high → 10000, effort=max → 16000","haiku (4.5)":"same as sonnet"},"example_curl":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"opus\",\"mode\":\"direct\",\"effort\":\"max\",\"max_tokens\":2048,\"messages\":[{\"role\":\"user\",\"content\":\"Plan a migration from monolith to microservices\"}]}'"},"method_2_raw_thinking":{"description":"Explicit: pass Anthropic-native 'thinking' field. Overrides effort-mapping.","shapes":{"adaptive":"{type: 'adaptive'} — REQUIRED for Opus 4.7/4.8 (classic 'enabled' fails with 400). Claude auto-picks budget.","enabled":"{type: 'enabled', budget_tokens: N} — for Sonnet 4.6, Haiku 4.5, Opus 4.6 (legacy). N must be < max_tokens. 
Budgets >~20k usually require stream=true.","disabled":"{type: 'disabled'} — force no thinking even if model defaults to it."},"example_curl":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"sonnet\",\"mode\":\"direct\",\"thinking\":{\"type\":\"enabled\",\"budget_tokens\":8000},\"max_tokens\":2000,\"messages\":[{\"role\":\"user\",\"content\":\"Derive a regex for matching URLs\"}]}'"},"notes":["Extended thinking only engages for Direct API (mode=direct or scope=light). SDK sessions use their own --effort flag passed to Claude Code CLI.","Proxy strips ThinkingBlocks from the response and returns only the final text in 'choices[0].message.content'.","If budget_tokens >= max_tokens, proxy auto-raises max_tokens to budget+1024 to satisfy Anthropic's requirement.","Opus 4.7/4.8 does NOT accept 'enabled' type — proxy converts effort=* to 'adaptive' automatically. Old 'enabled' requests will get 400 from Anthropic.","Opus 4.8 introduces new effort level 'xhigh' (between 'high' and 'max'). Proxy passes it through."]}},"endpoints":[{"method":"POST","path":"/v1/chat/completions","description":"Send a message and get Claude's response. Fully OpenAI-compatible. Supports text, vision (multimodal), tools/function calling, structured output (json_schema), streaming, extended thinking, prompt caching.","request_body":{"model":{"type":"string","required":false,"default":"sonnet","description":"Model to use. Accepts Claude IDs (claude-sonnet-4-6), short names (sonnet), or OpenAI aliases (gpt-4o, gpt-4o-mini)."},"messages":{"type":"array | string","required":true,"description":"Array of {role, content} objects. content can be string OR array of blocks (text/image). role values: 'system', 'user', 'assistant', 'tool'. 
Also accepts plain string (shortcut).","example":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}]},"system":{"type":"string | array","required":false,"description":"Anthropic top-level system prompt. Alternative to messages[role=system]. Accepts string OR array of blocks [{type:'text', text:'...', cache_control:{...}}]. Used by Anthropic SDK clients."},"stream":{"type":"boolean","required":false,"default":false,"description":"Enable real-time SSE streaming. Tokens arrive as they are generated. Tool calls also stream incrementally (tool_use_start / input_delta / tool_use_stop)."},"max_tokens":{"type":"integer","required":false,"default":4096,"description":"Max output tokens. PASSTHROUGH: the client's max_tokens is forwarded to Anthropic (fix 01.06.2026; previously it was lost during parsing and every response was truncated at 4096). Caps by model and mode: non-stream = 21000 (SDK 10-min timeout limit). Stream: opus = 32000, sonnet/haiku = 64000 (per-model Anthropic limits). Larger values are clamped to the cap by the proxy. Use stream=true for long responses. Aliased as max_completion_tokens (OpenAI). For extended thinking: if budget_tokens >= max_tokens, the proxy auto-raises max_tokens to budget+1024."},"max_completion_tokens":{"type":"integer","required":false,"description":"Alias of max_tokens (OpenAI o-series naming). Used identically."},"temperature":{"type":"float","required":false,"description":"Sampling temperature 0.0-1.0. Lower = more deterministic, higher = more creative. Passed through to Anthropic. (Active since 17.05.2026 — earlier versions silently ignored this field.)"},"top_k":{"type":"integer","required":false,"description":"Top-K sampling (Anthropic-only param). Sample only from top K tokens at each step. (Active since 01.06.2026.) ⚠️ Opus 4.7/4.8 silently ignores this parameter; Anthropic returns 400 if it is passed. 
Works for Sonnet/Haiku."},"top_p":{"type":"float","required":false,"description":"Top-P (nucleus) sampling 0.0-1.0. Sample from smallest set of tokens whose cumulative probability >= top_p. (Active since 01.06.2026.) ⚠️ Opus 4.7/4.8 silently ignores it; Anthropic returns 400. Works for Sonnet/Haiku. Do NOT use together with temperature."},"stop_sequences":{"type":"array","required":false,"description":"Up to 4 strings; the model stops generating when it encounters one. The alias 'stop' (OpenAI naming) is also accepted. Stop sequences are not returned in the response. (Active since 01.06.2026.)"},"stop":{"type":"string | array","required":false,"description":"OpenAI alias for stop_sequences. Accepts a string or an array of strings. Converted to Anthropic stop_sequences."},"service_tier":{"type":"string","required":false,"description":"Anthropic service tier: 'auto' (default, Anthropic chooses) or 'standard_only'. (Active since 01.06.2026.) Works ONLY for non-stream requests. Stream requests always use standard. On the OAuth Max subscription both options are equivalent."},"timeout":{"type":"integer","required":false,"default":120,"description":"Request timeout in seconds. Max 300."},"effort":{"type":"string","required":false,"default":"medium","description":"Reasoning depth: 'low' (no thinking), 'medium', 'high', 'xhigh' (Opus 4.7/4.8 only), 'max'. Works in BOTH modes. For SDK session → --effort flag to Claude Code. For Direct API → maps to Anthropic extended thinking (see effort_to_thinking_mapping in models)."},"thinking":{"type":"object","required":false,"description":"Raw Anthropic extended thinking config (Direct API only). Overrides effort-mapping. 
Shapes: {type:'adaptive'} for Opus 4.7/4.8 (REQUIRED — classic 'enabled' fails with 400), {type:'enabled', budget_tokens:N} for Sonnet/Haiku/Opus 4.6, {type:'disabled'}."},"mode":{"type":"string","required":false,"default":"session","description":"Execution mode: 'direct' = Direct Anthropic Messages API (fast, supports vision/tools/thinking), 'session' = SDK persistent session (full Claude Code agent with file tools, web search, bash). Aliases: 'lightweight' (= direct), 'stream' (= direct + force streaming). Auto-forced to direct when tools, response_format, or multimodal content is present."},"scope":{"type":"string","required":false,"description":"Preset alias: 'light' = direct+haiku, 'medium' = session+sonnet+medium, 'strong' = session+opus+high, 'max' = session+opus+max. Overrides model/effort/mode."},"tools":{"type":"array","required":false,"description":"Function calling. Accepts BOTH OpenAI format [{type:'function', function:{name, description, parameters}}] AND Anthropic format [{name, description, input_schema}]. Also passes Anthropic server tools (web_search_*, bash_*, code_execution_*, text_editor_*) without conversion. Presence of tools forces mode=direct."},"tool_choice":{"type":"string | object","required":false,"description":"How model should pick tools. Values: 'auto' (default), 'none', 'required' (must call any tool), 'any' (Anthropic synonym for required), OR {type:'function', function:{name:'X'}} to force specific tool. Converted to Anthropic {type:'auto'|'any'|'tool', name:?}."},"response_format":{"type":"object","required":false,"description":"Structured output. {type:'json_object'} — appends JSON instruction to system + strips ```json``` wrapper from response. {type:'json_schema', json_schema:{name:'X', schema:{...}, description?:'...'}} — internally creates a forced tool call (Anthropic tool_use) and returns the JSON in choices[0].message.content. 
Used by LangChain with_structured_output, Pydantic-AI, Graphiti."},"cache_control":{"type":"object","required":false,"description":"Per-message or per-block Anthropic prompt cache marker. Shape: {type:'ephemeral', ttl?:'5m'|'1h'}. Can be placed on message-level (msg.cache_control) or on individual content blocks. See full prompt_caching section."},"disable_auto_cache":{"type":"boolean","required":false,"default":false,"description":"Disable proxy's automatic system_prompt caching. Equivalent to sending header 'X-No-Auto-Cache: 1'. Use when you don't want any cache_control added automatically (e.g. for testing exact cache behavior)."},"metadata":{"type":"object","required":false,"description":"Anthropic metadata object + proxy-specific extension fields. Recognized subfields: metadata.user_id (≤256 chars, opaque end-user identifier for Anthropic abuse-detection; NOT an email or other PII), metadata.prompt_type (string label for analytics; appears in /v1/usage?group_by=prompt_type). Also accepts top-level prompt_type field as shortcut for metadata.prompt_type."},"session_id":{"type":"string","required":false,"description":"⚠️ PLANNED feature. Currently returned in response but NOT yet used to resume SDK sessions. 
Include full message history yourself for now."}},"response":{"id":"chatcmpl-xxxx","object":"chat.completion","created":1773224718,"model":"haiku","choices":[{"index":0,"message":{"role":"assistant","content":"Response text"},"finish_reason":"end_turn"}],"usage":{"prompt_tokens":10,"completion_tokens":75,"total_tokens":85,"cache_read_tokens":20548,"cache_creation_tokens":1615},"session_id":"uuid-of-session","cost_usd":0.0045,"duration_ms":1072,"duration_api_ms":1033},"examples":{"curl_basic":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]}'","curl_stream":"curl -N -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"stream\": true, \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]}'","curl_direct":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"haiku\", \"mode\": \"direct\", \"messages\": [{\"role\": \"user\", \"content\": \"Say OK\"}]}'","curl_effort":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"sonnet\", \"effort\": \"high\", \"messages\": [{\"role\": \"user\", \"content\": \"Analyze this\"}]}'","python_openai":"from openai import OpenAI\n\nclient = OpenAI(\n    api_key=\"YOUR_API_KEY\",\n    base_url=\"https://claude.ai-platform.space/v1\"\n)\n\nresponse = client.chat.completions.create(\n    model=\"sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n)\nprint(response.choices[0].message.content)","python_stream":"from openai 
import OpenAI\n\nclient = OpenAI(\n    api_key=\"YOUR_API_KEY\",\n    base_url=\"https://claude.ai-platform.space/v1\"\n)\n\nstream = client.chat.completions.create(\n    model=\"sonnet\",\n    messages=[{\"role\": \"user\", \"content\": \"Write a poem\"}],\n    stream=True\n)\nfor chunk in stream:\n    if chunk.choices[0].delta.content:\n        print(chunk.choices[0].delta.content, end=\"\")","javascript_fetch":"const response = await fetch(\"https://claude.ai-platform.space/v1/chat/completions\", {\n  method: \"POST\",\n  headers: {\n    \"Authorization\": \"Bearer YOUR_API_KEY\",\n    \"Content-Type\": \"application/json\"\n  },\n  body: JSON.stringify({\n    model: \"sonnet\",\n    messages: [{ role: \"user\", content: \"Hello!\" }]\n  })\n});\nconst data = await response.json();\nconsole.log(data.choices[0].message.content);","livekit_agent":"# LiveKit Voice Agent integration\nfrom livekit.plugins import openai\nimport httpx\n\nllm = openai.LLM(\n    model=\"claude-sonnet-4-6\",\n    base_url=\"http://127.0.0.1:8092/v1\",\n    api_key=\"YOUR_API_KEY\",\n    timeout=httpx.Timeout(connect=15.0, read=120.0, write=15.0, pool=15.0),\n)"},"response_headers":{"X-Request-ID":"req-XXXXXXXXXXXX (unique id, always returned — also reflected in 500 traceback log for cross-service debugging). Clients may send their own X-Request-ID to override.","Content-Type":"application/json (non-stream) OR text/event-stream (stream=true)"},"error_responses":{"400":{"error":{"type":"invalid_request_error","message":"<reason from Anthropic>","request_id":"req-..."},"_when":"malformed body, unsupported field combinations, Anthropic 400 propagated."},"401":{"detail":"Invalid or expired API key","_when":"missing/invalid Bearer key, x-api-key, or ?key="},"402":{"error":{"type":"insufficient_balance","message":"Balance too low. Top up at /cabinet#billing","balance":-0.34,"credit_limit":1.0,"tier":"new","payments_count":0},"_when":"balance + credit_limit < estimated cost. 
credit_limit depends on credit tier (new=$1, active=$10, loyal=$20)."},"403":{"detail":"Model not allowed for this key","_when":"key's allowed_models field doesn't include requested model"},"429":{"error":"rate_limit_error","rate_limit":{"unified_status":"warning","utilization_7d_pct":95.2,"fallback_percentage":95.2},"_when":"Daily limit (max_requests_per_day) OR OAuth pool unified rate limit (5h/7d windows)"},"500":{"error":{"type":"internal_server_error","message":"...","request_id":"req-..."},"_when":"Unhandled exception. Use X-Request-ID to correlate with /var/log/nginx/...5xx.log and server_500.log. Added 15.05.2026."},"503":{"detail":"Service is in maintenance mode","_when":"admin enabled maintenance flag"},"504":{"detail":"Gateway Timeout","_when":"Anthropic API upstream timeout (rare, retried automatically)"}},"billing_note":"Each request costs tokens from your balance. $5 bonus on signup. Credit tier extends overdraft limit ($1/$10/$20). Pricing: /billing/pricing"},{"method":"POST","path":"/v1/messages","description":"Anthropic Messages API compatible endpoint. Native /v1/messages format for langchain-anthropic, anthropic Python SDK, Cline, Cursor, Claude Desktop connectors.","auth_note":"Accepts Authorization: Bearer and x-api-key. NOTE: ?key= query param is NOT supported on this endpoint (only on /v1/chat/completions). Use header.","request_body":{"model":{"type":"string","required":true,"description":"Claude model id or alias (claude-sonnet-4-6, sonnet, opus, haiku)."},"messages":{"type":"array","required":true,"description":"Anthropic format: [{role, content}]. content can be string OR array of blocks ({type:'text'|'image'|'tool_use'|'tool_result'})."},"system":{"type":"string | array","required":false,"description":"System prompt. String or array of {type:'text', text, cache_control?}. Multi-block supported with caching."},"max_tokens":{"type":"integer","required":true,"description":"Max output tokens. Required for this endpoint, per the Anthropic spec. 
PASSTHROUGH (fix 01.06.2026): the client's max_tokens is forwarded. Caps: non-stream = 21000 (SDK 10-min timeout), stream opus = 32000, stream sonnet/haiku = 64000. Larger values are clamped to the cap by the proxy."},"stream":{"type":"boolean","required":false,"default":false,"description":"SSE in Anthropic format: message_start / content_block_start / content_block_delta / content_block_stop / message_delta / message_stop / ping events (keepalive every 15s)."},"temperature":{"type":"float","required":false,"description":"0.0-1.0 sampling temperature. (Active since 17.05.2026 fix.)"},"top_k":{"type":"integer","required":false,"description":"Anthropic Top-K sampling. (Active since 01.06.2026.) ⚠️ Opus 4.7/4.8 silently ignores it; Anthropic returns 400."},"top_p":{"type":"float","required":false,"description":"Anthropic Top-P (nucleus) sampling 0.0-1.0. (Active since 01.06.2026.) ⚠️ Opus 4.7/4.8 silently ignores it. Do NOT use together with temperature."},"stop_sequences":{"type":"array","required":false,"description":"Up to 4 strings. The model stops when it encounters one. They are not returned in the response. (Active since 01.06.2026.)"},"service_tier":{"type":"string","required":false,"description":"'auto' (default) or 'standard_only'. (Active since 01.06.2026.) Non-stream only. On the OAuth Max subscription both options are equivalent."},"tools":{"type":"array","required":false,"description":"Anthropic native tools format: [{name, description, input_schema}]. Also accepts OpenAI tools — auto-converted."},"tool_choice":{"type":"object","required":false,"description":"{type:'auto'|'any'|'tool'|'none', name?:'X'}"},"effort":{"type":"string","required":false,"description":"Proxy extension: 'low'/'medium'/'high'/'xhigh'/'max'. Auto-maps to thinking config."},"thinking":{"type":"object","required":false,"description":"Anthropic extended thinking. 
{type:'adaptive'} (Opus 4.7/4.8), {type:'enabled', budget_tokens:N} (Sonnet/Haiku/Opus 4.6)."},"metadata":{"type":"object","required":false,"description":"Anthropic metadata + proxy fields. Recognized subfields: metadata.user_id (≤256 chars opaque end-user id for Anthropic abuse-detection), metadata.prompt_type (analytics label)."}},"response":{"id":"msg_xxxx","type":"message","role":"assistant","model":"claude-sonnet-4-6","content":[{"type":"text","text":"Response"}],"stop_reason":"end_turn","usage":{"input_tokens":12,"output_tokens":45,"cache_creation_input_tokens":0,"cache_read_input_tokens":0}},"examples":{"python_anthropic":"from anthropic import Anthropic\n\nclient = Anthropic(\n    api_key=\"YOUR_API_KEY\",  # sk-cc-...\n    base_url=\"https://claude.ai-platform.space\"\n)\nmessage = client.messages.create(\n    model=\"claude-sonnet-4-6\",\n    max_tokens=4096,\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n)\nprint(message.content[0].text)","python_stream":"from anthropic import Anthropic\nclient = Anthropic(api_key=\"YOUR_API_KEY\", base_url=\"https://claude.ai-platform.space\")\nwith client.messages.stream(\n    model=\"sonnet\", max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Tell me a joke\"}]\n) as stream:\n    for text in stream.text_stream:\n        print(text, end=\"\", flush=True)","curl_basic":"curl -X POST https://claude.ai-platform.space/v1/messages \\\n  -H \"x-api-key: YOUR_API_KEY\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"claude-sonnet-4-6\",\"max_tokens\":1024,\"messages\":[{\"role\":\"user\",\"content\":\"Hello!\"}]}'"},"error_responses":{"401":{"detail":"Invalid or expired API key"},"402":{"error":{"type":"insufficient_balance","message":"Balance too low. 
Top up at /cabinet#billing"},"_note":"Balance check added 17.05.2026; earlier versions skipped this on /v1/messages."},"403":{"detail":"Model not allowed for this key"},"429":{"detail":"Daily request limit exceeded"}},"billing_note":"Each request costs tokens from your balance. Same billing as /v1/chat/completions."},{"method":"POST","path":"/v1/messages/count_tokens","description":"Anthropic Messages Token Counting API. Returns the exact input_tokens count for a given payload WITHOUT generating a response. Useful for LangChain/LlamaIndex/Pydantic-AI: estimating cost before sending, checking the context limit, budgeting. FREE (not charged against your balance). Request/response shape is identical to api.anthropic.com.","auth_note":"Accepts Authorization: Bearer and x-api-key. The anthropic-version: 2023-06-01 header is recommended (not required).","request_body":{"model":{"type":"string","required":true,"description":"Claude model id or alias (claude-sonnet-4-6, sonnet, opus, haiku). Tokens are counted with that model's tokenizer."},"messages":{"type":"array","required":true,"description":"Anthropic format: [{role, content}]. content can be a string OR an array of blocks (text/image/tool_use/tool_result)."},"system":{"type":"string | array","required":false,"description":"System prompt. 
String or array of {type:'text', text, cache_control?}."},"tools":{"type":"array","required":false,"description":"If present, the tools schema is also counted (tools consume input_tokens)."},"tool_choice":{"type":"object","required":false,"description":"Tool choice (counted)."}},"response":{"input_tokens":12345},"examples":{"curl_basic":"curl -X POST https://claude.ai-platform.space/v1/messages/count_tokens \\\n  -H \"x-api-key: YOUR_API_KEY\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"claude-sonnet-4-6\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello, world!\"}]}'","python_anthropic":"from anthropic import Anthropic\n\nclient = Anthropic(api_key=\"sk-cc-YOUR_KEY\", base_url=\"https://claude.ai-platform.space\")\nresult = client.messages.count_tokens(\n    model=\"claude-sonnet-4-6\",\n    messages=[{\"role\": \"user\", \"content\": \"How many tokens is this?\"}],\n)\nprint(result.input_tokens)","langchain":"# LangChain uses count_tokens internally for get_num_tokens_from_messages()\nfrom langchain_anthropic import ChatAnthropic\nfrom langchain_core.messages import HumanMessage\n\nllm = ChatAnthropic(\n    model=\"claude-sonnet-4-6\",\n    anthropic_api_url=\"https://claude.ai-platform.space\",\n    anthropic_api_key=\"sk-cc-YOUR_KEY\",\n)\nn_tokens = llm.get_num_tokens_from_messages([HumanMessage(\"Hello\")])"},"response_headers":{"X-Request-ID":"req-XXX","Content-Type":"application/json"},"error_responses":{"400":{"error":{"type":"invalid_request_error","message":"<reason>"},"_when":"invalid request shape, unsupported model"},"401":{"detail":"Invalid or expired API key"}},"billing_note":"FREE: token counting is not charged against your balance. 
Anthropic counts tokens with its own tokenizer, without calling the model."},{"method":"GET","path":"/v1/models","description":"List available models for your API key.","request_body":null,"response":{"object":"list","data":[{"id":"claude-opus-4-8","object":"model","owned_by":"anthropic"},{"id":"claude-opus-4-7","object":"model","owned_by":"anthropic"},{"id":"claude-sonnet-4-6","object":"model","owned_by":"anthropic"},{"id":"claude-haiku-4-5","object":"model","owned_by":"anthropic"}]},"examples":{"curl":"curl https://claude.ai-platform.space/v1/models \\\n  -H \"Authorization: Bearer YOUR_API_KEY\""}},{"method":"GET","path":"/v1/status","description":"Check API status, version, OAuth account pool health.","request_body":null,"response":{"status":"running","version":"4.7.0","mode":"direct-api","default_model":"sonnet","pool":{"ready_count":0,"healthy":true}},"examples":{"curl":"curl \"https://claude.ai-platform.space/v1/status?key=YOUR_API_KEY\""}},{"method":"GET","path":"/health","description":"Health check. Returns 200 OK if the proxy is up. No authentication required.","request_body":null,"response":{"status":"ok","service":"claude-api-sdk-proxy","version":"4.7.0","mode":"direct-api","pool":{"pool_size":0,"ready_count":0,"healthy":true}},"examples":{"curl":"curl https://claude.ai-platform.space/health"}},{"method":"GET","path":"/rate-limits","description":"Monitor unified rate limits from OAuth subscriptions. 
Shows 5-hour, 7-day and overage utilization.","request_body":null,"response":{"unified_status":"normal","representative_claim":"opus","five_hour":{"status":"normal","utilization_pct":35.2,"resets_in_sec":12400},"seven_day":{"status":"normal","utilization_pct":12.8,"resets_in_sec":504000},"overage":{"status":"normal","in_use":false,"utilization_pct":0},"fallback_percentage":35.2,"last_model":"opus","last_check_ago_sec":45},"examples":{"curl":"curl https://claude.ai-platform.space/rate-limits \\\n  -H \"Authorization: Bearer YOUR_API_KEY\""}},{"method":"GET","path":"/v1/usage","description":"Per-client usage statistics with flexible grouping. Returns aggregated tokens/cost/requests for the authenticated user.","request_body":{"from":{"type":"string (query)","required":false,"description":"Start date YYYY-MM-DD. Default: 7 days ago."},"to":{"type":"string (query)","required":false,"description":"End date YYYY-MM-DD. Default: today."},"group_by":{"type":"string (query)","required":false,"default":"date,model","description":"Comma-separated dimensions: 'date', 'model', 'prompt_type'. E.g. group_by=date,model returns per-day per-model breakdown. 
Allowed values: date, model, prompt_type."}},"response":{"period":{"from":"2026-04-17","to":"2026-05-17"},"total_requests":1247,"total_input_tokens":482553,"total_output_tokens":95210,"total_cache_read_tokens":312000,"total_cache_creation_tokens":18420,"total_cost_usd":4.82,"by_date":[{"date":"2026-05-17","requests":42,"cost_usd":0.18}],"by_model":[{"model":"sonnet","requests":900,"cost_usd":3.6}],"by_prompt_type":[{"prompt_type":"agent_chat","requests":500,"cost_usd":2.1}]},"examples":{"curl_today":"curl \"https://claude.ai-platform.space/v1/usage?from=2026-05-17&to=2026-05-17&group_by=model\" \\\n  -H \"Authorization: Bearer YOUR_API_KEY\"","curl_7d_by_prompt_type":"curl \"https://claude.ai-platform.space/v1/usage?from=2026-05-10&to=2026-05-17&group_by=date,prompt_type\" \\\n  -H \"Authorization: Bearer YOUR_API_KEY\""}}],"auth_endpoints":{"description":"User authentication and registration. Session-token based auth.","base_path":"/auth","endpoints":[{"method":"POST","path":"/auth/register","description":"Register new user. $5 bonus on signup. Rate limit: 3 per hour.","request_body":{"email":{"type":"string","required":true,"description":"Valid email (temp-email domains blocked)"},"password":{"type":"string","required":true,"description":"Min 8 characters"},"lang":{"type":"string","required":false,"default":"en","description":"Interface language: en, uk, ru"}},"response":{"success":true,"user":{"id":1,"username":"user_abc","email":"user@example.com","role":"user"},"api_key":"sk-cc-xxxx","balance":5.0,"session_token":"tok_xxxx"}},{"method":"POST","path":"/auth/login","description":"Login with email and password. 
Rate limit: 5 attempts per 15 min.","request_body":{"email":{"type":"string","required":true,"description":"User email"},"password":{"type":"string","required":true,"description":"User password"}},"response":{"success":true,"user":{"id":1,"username":"user_abc","email":"user@example.com","role":"user"},"balance":12.5,"session_token":"tok_xxxx"}},{"method":"POST","path":"/auth/logout","description":"Invalidate current session token. Requires authentication.","request_body":null,"response":{"success":true}},{"method":"GET","path":"/auth/me","description":"Get current authenticated user profile.","request_body":null,"response":{"user":{"id":1,"username":"user_abc","email":"user@example.com","role":"user","balance":12.5,"credit_limit":10.0,"tier_name":"active","bonus_claimed":true,"email_verified":true}}},{"method":"POST","path":"/auth/forgot-password","description":"Request password reset email. Always returns success to prevent email enumeration. Rate limit applies.","request_body":{"email":{"type":"string","required":true}},"response":{"success":true,"message":"If the email exists, a reset link was sent"}},{"method":"POST","path":"/auth/reset-password","description":"Apply password reset using token from email.","request_body":{"token":{"type":"string","required":true,"description":"Reset token from email link"},"new_password":{"type":"string","required":true,"description":"Min 8 characters"}},"response":{"success":true}},{"method":"POST","path":"/auth/resend-verify","description":"Resend email verification message. Rate limited.","request_body":null,"response":{"success":true}},{"method":"GET","path":"/auth/verify-email","description":"Apply email verification token (link from email). Query: ?token=...","request_body":null,"response":{"success":true,"message":"Email verified"}},{"method":"GET","path":"/auth/unsubscribe","description":"Unsubscribe from notification emails via signed link. 
Query: ?token=...","request_body":null,"response":{"success":true,"category":"marketing"}}]},"cabinet_endpoints":{"description":"User cabinet API - dashboard, keys, logs, balance, settings. All endpoints require session token authentication.","base_path":"/cabinet/api","endpoints":[{"method":"GET","path":"/cabinet/api/dashboard","description":"User dashboard with today's stats, lifetime totals, chart data (30 days)."},{"method":"GET","path":"/cabinet/api/keys","description":"List user's API keys (values masked). Max 10 keys per user."},{"method":"POST","path":"/cabinet/api/keys","description":"Create new API key. Body: {name}. Returns full key value (shown once)."},{"method":"DELETE","path":"/cabinet/api/keys/{key_id}","description":"Revoke/deactivate an API key."},{"method":"GET","path":"/cabinet/api/logs","description":"User's request logs. Params: model, status, limit (max 100), offset."},{"method":"GET","path":"/cabinet/api/logs/{request_id}","description":"Full details of a specific request."},{"method":"GET","path":"/cabinet/api/balance","description":"Balance and transaction history. Params: limit, offset."},{"method":"GET","path":"/cabinet/api/payments","description":"Payment history (separate from balance transactions). Params: limit, offset, status."},{"method":"GET","path":"/cabinet/api/usage","description":"Usage stats by model and day. Params: days (default 30)."},{"method":"GET","path":"/cabinet/api/settings","description":"User profile settings: email, daily spend limit, notifications, credit tier info."},{"method":"PUT","path":"/cabinet/api/settings","description":"Update settings. Body: {email, daily_spend_limit, notification_settings}."},{"method":"PUT","path":"/cabinet/api/settings/password","description":"Change password. Body: {current_password, new_password}. Invalidates other sessions."},{"method":"GET","path":"/cabinet/api/pricing","description":"Pricing for all active models."}]},"billing_endpoints":{"description":"Billing and payments. 
Top-up balance, check payment status.","base_path":"/billing","endpoints":[{"method":"GET","path":"/billing/pricing","description":"Public pricing for all models with cache token prices. No auth required.","request_body":null,"response":{"models":[{"model":"claude-sonnet-4-6","display_name":"Sonnet","price_input_per_mtok":3.0,"price_output_per_mtok":15.0,"price_cache_read_per_mtok":0.3,"price_cache_write_per_mtok":3.75}]}},{"method":"POST","path":"/billing/topup","description":"Initiate balance top-up. Amount: $5-$1000. Returns payment order reference.","request_body":{"amount":{"type":"float","required":true,"description":"Amount in USD ($5-$1000)"}},"response":{"order_reference":"CC-xxxx","amount":10.0,"status":"pending","message":"Payment initiated"}},{"method":"POST","path":"/billing/webhook","description":"Payment provider webhook callback. Idempotent, HMAC verification.","request_body":null,"response":{"orderReference":"CC-xxxx","status":"accept"}},{"method":"GET","path":"/billing/status/{order_ref}","description":"Check payment status by order reference.","request_body":null,"response":{"status":"completed","amount_usd":10.0,"created_at":"2026-04-06T12:00:00","completed_at":"2026-04-06T12:01:00"}},{"method":"POST","path":"/billing/card/verify","description":"Initiate card binding for recurring charges (WayForPay 3DS flow). Body: {amount?:1.0}. Returns redirect URL."},{"method":"GET","path":"/billing/cards","description":"List saved cards. Returns masked PAN + default flag + expiry."},{"method":"POST","path":"/billing/cards/{id}/default","description":"Set card as default for charges/autotopup."},{"method":"DELETE","path":"/billing/cards/{id}","description":"Remove saved card from WayForPay vault."},{"method":"POST","path":"/billing/charge","description":"Charge saved default card. Body: {amount}. Used by autotopup cron and manual top-up via stored card."},{"method":"POST","path":"/billing/autotopup","description":"Enable/disable auto top-up. 
Body: {enabled, threshold_usd, amount_usd}. When balance < threshold, default card is charged."},{"method":"GET","path":"/billing/wfp-balance","description":"WayForPay merchant balance (admin view via user endpoint)."},{"method":"GET","path":"/billing/wfp-transactions","description":"WayForPay transaction log filtered by user. Params: limit, offset."},{"method":"GET","path":"/billing/subscription-recommendations","description":"Suggested subscription plans based on user's last 30d spend."},{"method":"GET","path":"/billing/subscriptions","description":"List user's active subscriptions."},{"method":"POST","path":"/billing/subscribe","description":"Create new subscription. Body: {plan_id, payment_method?}. Uses default card."},{"method":"GET","path":"/billing/subscriptions/{id}","description":"Subscription details + upcoming payment date."},{"method":"POST","path":"/billing/subscriptions/{id}/pause","description":"Pause subscription. Resume preserves remaining days."},{"method":"POST","path":"/billing/subscriptions/{id}/resume","description":"Resume paused subscription."},{"method":"POST","path":"/billing/subscriptions/{id}/cancel","description":"Cancel subscription. Active until end of paid period."},{"method":"POST","path":"/billing/subscriptions/{id}/change","description":"Change subscription amount/plan (WFP CHANGE flow). Body: {new_amount}."},{"method":"POST","path":"/billing/subscriptions/{id}/sync","description":"Force sync subscription state with WayForPay (use after webhook lag)."}]},"admin_endpoints":[{"method":"GET","path":"/admin/dashboard","description":"Dashboard metrics, charts data, top users, errors."},{"method":"GET","path":"/admin/users","description":"List all users with stats."},{"method":"POST","path":"/admin/users","description":"Create new user. Body: {username, password, email, role, max_requests_per_day, max_tokens_per_day, allowed_models, notes}. 
Returns: {user_id, api_key, username}"},{"method":"PUT","path":"/admin/users/{id}","description":"Update user fields."},{"method":"DELETE","path":"/admin/users/{id}","description":"Delete user and all their keys."},{"method":"GET","path":"/admin/users/{id}/keys","description":"List API keys for user."},{"method":"POST","path":"/admin/users/{id}/keys","description":"Create new API key. Body: {name, expires_at}"},{"method":"PUT","path":"/admin/keys/{id}","description":"Update key (name, is_active, expires_at)."},{"method":"DELETE","path":"/admin/keys/{id}","description":"Delete API key."},{"method":"GET","path":"/admin/logs","description":"Request logs with filters. Params: limit, offset, user_id, model, status, date_from, date_to, search."},{"method":"GET","path":"/admin/logs/{request_id}","description":"Full log detail including raw JSON."},{"method":"GET","path":"/admin/sessions","description":"List Claude sessions. Params: limit, user_id."},{"method":"GET","path":"/admin/sessions/{session_id}/messages","description":"All messages in a session with full details (prompt, response, tokens, cost)."},{"method":"GET","path":"/admin/stats/daily","description":"Daily aggregated stats. Params: days, user_id."},{"method":"GET","path":"/admin/stats/models","description":"Stats by model. Params: days."},{"method":"GET","path":"/admin/stats/users","description":"Stats by user. Params: days."},{"method":"GET","path":"/admin/stats/cache","description":"Cache hit/miss ratio. Params: days."},{"method":"GET","path":"/admin/settings","description":"Get all settings."},{"method":"PUT","path":"/admin/settings","description":"Update settings. Body: {key: value, ...}"},{"method":"GET","path":"/admin/billing/overview","description":"Billing overview: revenue, bonuses, consumed, user balances."},{"method":"POST","path":"/admin/billing/adjust","description":"Adjust user balance. 
Body: {user_id, amount, description}"},{"method":"GET","path":"/admin/billing/pricing","description":"Get pricing for all models."},{"method":"PUT","path":"/admin/billing/pricing/{model}","description":"Update pricing for a model."},{"method":"GET","path":"/api/docs/json","description":"This documentation in JSON format."},{"method":"GET","path":"/admin/accounts","description":"List OAuth account pool with stats and rate limit info."},{"method":"POST","path":"/admin/accounts","description":"Add OAuth account. Body: {email, credentials_file, notes}."},{"method":"PUT","path":"/admin/accounts/{id}","description":"Update account (priority, is_active, status, notes, email)."},{"method":"POST","path":"/admin/accounts/probe","description":"Probe all accounts with light haiku request to update rate limit headers."},{"method":"POST","path":"/admin/accounts/set-active/{id}","description":"Force given account active in pool (sticky selection)."},{"method":"POST","path":"/admin/billing/refund/{payment_id}","description":"Refund a payment via WayForPay/Lava. Body: {amount?, reason?}."},{"method":"GET","path":"/admin/billing/payments","description":"Admin payments list with filters. Params: status, user_id, date_from, date_to, limit, offset."},{"method":"GET","path":"/admin/billing/statements","description":"Financial statements by month. Params: year, month."},{"method":"GET","path":"/admin/billing/statements/csv","description":"Export statements as CSV."},{"method":"GET","path":"/admin/email/templates","description":"List email templates."},{"method":"PUT","path":"/admin/email/templates/{id}","description":"Update email template body/subject."},{"method":"GET","path":"/admin/email/log","description":"Email send log with status (sent/bounced/failed)."},{"method":"POST","path":"/admin/email/send","description":"Send transactional email. Body: {user_id, template_id, vars}."},{"method":"POST","path":"/admin/email/broadcast","description":"Broadcast to user segment. 
Body: {segment, template_id, vars}."},{"method":"GET","path":"/admin/email/stats","description":"Email delivery stats: sent, bounced, open rate."},{"method":"GET","path":"/admin/cache-report","description":"Prompt cache statistics dashboard. Params: period (24h|7d|30d|since-fix), since=YYYY-MM-DD. Reflects auto-cache effectiveness per user/model."},{"method":"GET","path":"/admin/usage-today","description":"Real-time usage stats for today (all users, by model)."},{"method":"GET","path":"/admin/subscriptions","description":"Admin subscriptions list. Params: status, user_id."},{"method":"POST","path":"/admin/subscriptions/{id}/sync","description":"Force WFP sync for a subscription."},{"method":"POST","path":"/admin/subscriptions/{id}/cancel","description":"Admin cancel subscription."},{"method":"POST","path":"/admin/subscriptions/{id}/change","description":"Admin change subscription amount."},{"method":"GET","path":"/admin/subscriptions/{id}/payments","description":"Subscription payments history."},{"method":"POST","path":"/admin/oauth/init","description":"OAuth provisioning: initiate flow (returns Anthropic OAuth URL + state)."},{"method":"POST","path":"/admin/oauth/exchange","description":"OAuth provisioning: exchange code for tokens (callback handler)."},{"method":"POST","path":"/admin/oauth/provision","description":"OAuth provisioning: finalize, add account to pool."},{"method":"GET","path":"/admin/oauth/status","description":"OAuth provisioning current state (in-progress flows)."},{"method":"POST","path":"/admin/oauth/cancel","description":"Cancel current OAuth provisioning flow."},{"method":"POST","path":"/admin/oauth/resume","description":"Resume interrupted provisioning."},{"method":"POST","path":"/admin/oauth/invites/create","description":"Create OAuth invite token (self-service onboarding for OAuth account owner). 
Body: {note, expires_in_hours}."},{"method":"GET","path":"/admin/oauth/invites","description":"List active invites."},{"method":"POST","path":"/admin/oauth/invites/revoke","description":"Revoke an invite. Body: {token}."},{"method":"POST","path":"/public/oauth/init","description":"PUBLIC invite-token endpoint. Initiate OAuth flow for invited user (no admin auth, only invite token)."},{"method":"POST","path":"/public/oauth/exchange","description":"PUBLIC: exchange code (callback for invitee)."},{"method":"POST","path":"/public/oauth/provision","description":"PUBLIC: finalize OAuth provisioning via invite."},{"method":"GET","path":"/public/oauth/status","description":"PUBLIC: invite flow state."}],"telemetry_fields":{"description":"Every request logs 25+ fields for full observability","fields":[{"name":"input_tokens","description":"Input tokens sent to model"},{"name":"output_tokens","description":"Tokens generated by model"},{"name":"cache_read_input_tokens","description":"Tokens read from prompt cache (cost savings)"},{"name":"cache_creation_input_tokens","description":"Tokens written to prompt cache"},{"name":"cost_usd","description":"Estimated cost in USD"},{"name":"duration_ms","description":"Total request time"},{"name":"duration_api_ms","description":"Pure API call time"},{"name":"num_turns","description":"Number of tool-use turns"},{"name":"stop_reason","description":"Why Claude stopped (end_turn, max_tokens)"},{"name":"session_id","description":"Session ID for conversation continuity"},{"name":"stream_mode","description":"Whether request used SSE streaming"}]},"rate_limits":{"description":"Per-user configurable limits + automatic failover between OAuth accounts on 429/401","fields":[{"name":"max_requests_per_day","default":100,"description":"Maximum API requests per day per user"},{"name":"max_tokens_per_day","default":500000,"description":"Maximum tokens per day per user 
(tracked)"},{"name":"allowed_models","default":"haiku,sonnet,opus","description":"Comma-separated list of allowed models. All 3 models available by default."}],"pool_failover":{"description":"Transparent multi-account failover. The proxy maintains a pool of 4+ OAuth accounts (Claude Max subscriptions). On a rate limit or auth failure on the current account, it automatically switches to the next account by priority and retries the request. The client does NOT see 429/401 as long as the pool has a free account.","endpoints_covered":["/v1/chat/completions (stream + non-stream)","/v1/messages (stream + non-stream)"],"triggers":["429 Rate Limit from Anthropic (5h or 7d window exhausted on a specific account)","401 Authentication Error (OAuth token expired; auto-refresh usually catches this first)"],"max_retries":"5 attempts per request. Each retry goes to a DIFFERENT account (tracked via the _tried_accounts set, no repeats).","rotation_strategy":"try_next_account: sort by priority ASC + id, skip overloaded accounts (≥95% utilization), take the first available one. Sticky: after a successful switch, subsequent requests stay on that account until it hits a limit.","stream_behavior":"For stream requests: if the 429/401 arrives BEFORE the first SSE event, the retry is transparent (a new stream from the start). If it arrives mid-stream, a retry is impossible (part of the data has already been sent to the client) and an error event is yielded.","transparency":"The client sees increased latency (~0.5-2s per retry) instead of an error. If ALL 4 accounts are in 429 at the same time, the client receives a final 429 with unified_status / utilization_7d_pct in the body.","history":"Stream retry was enabled on 01.06.2026. Before that only non-stream requests had retry; stream returned 429 immediately even if 3 other accounts were free.","monitoring":"GET /rate-limits shows the current active account + utilization. GET /admin/accounts gives the full picture of the pool with per-account metrics."}},"vision":{"description":"Vision (multimodal): send an image along with text and Claude will describe/analyze it. 
The proxy accepts ALL known block formats from any client (OpenAI Chat, OpenAI Responses API, Anthropic native, LangChain, LibreChat, n8n) and normalizes them to Anthropic native before sending. The universal converter runs on the proxy side (no client-side changes needed).","supported_input_formats":{"description":"16+ accepted image block variations, all normalized automatically","variants":[{"name":"OpenAI Chat Completions standard","example":{"type":"image_url","image_url":{"url":"data:image/png;base64,..."}}},{"name":"OpenAI Chat Completions with a public URL","example":{"type":"image_url","image_url":{"url":"https://example.com/cat.jpg"}}},{"name":"OpenAI with detail (ignored)","example":{"type":"image_url","image_url":{"url":"https://...","detail":"high"}}},{"name":"image_url as a string (legacy/shorthand)","example":{"type":"image_url","image_url":"data:image/jpeg;base64,..."}},{"name":"Flat url at block level","example":{"type":"image_url","url":"https://example.com/x.png"}},{"name":"OpenAI Responses API input_image","example":{"type":"input_image","image_url":{"url":"data:image/webp;base64,..."}}},{"name":"OpenAI Responses API + string","example":{"type":"input_image","image_url":"data:image/png;base64,..."}},{"name":"OpenAI Responses API + flat url","example":{"type":"input_image","url":"https://..."}},{"name":"Flat Anthropic-like","example":{"type":"image","url":"https://..."}},{"name":"Mix: image + image_url","example":{"type":"image","image_url":"https://..."}},{"name":"Anthropic native base64 (canonical)","example":{"type":"image","source":{"type":"base64","media_type":"image/png","data":"..."}}},{"name":"Anthropic native URL (canonical)","example":{"type":"image","source":{"type":"url","url":"https://..."}}},{"name":"data URL without ;base64 (urlencoded)","example":{"type":"image_url","image_url":{"url":"data:image/svg+xml,%3Csvg/%3E"}}},{"name":"data URL with extra 
parameters (charset etc.)","example":{"type":"image_url","image_url":{"url":"data:image/png;charset=utf-8;base64,..."}}},{"name":"Bare base64 with a media_type hint","example":{"type":"image_url","image_url":{"url":"iVBORw0KGgo...","media_type":"image/png"}}},{"name":"Bare base64 without a hint (magic-bytes detection)","example":{"type":"image_url","image_url":"iVBORw0KGgo..."}},{"name":"image/jpg alias → image/jpeg","example":{"type":"image_url","image_url":{"url":"data:image/jpg;base64,..."}}}]},"supported_text_formats":{"description":"Text blocks are normalized too: 'input_text' and 'output_text' (OpenAI Responses API) → 'text' (Anthropic).","variants":[{"type":"text","note":"Anthropic / OpenAI Chat Completions standard"},{"type":"input_text","note":"OpenAI Responses API → normalized to text"},{"type":"output_text","note":"OpenAI Responses API (assistant) → normalized to text"}]},"supported_media_types":{"description":"The Anthropic Claude API natively accepts 4 image formats: image/jpeg, image/png, image/gif, image/webp. The proxy extends this list through auto-conversion on its side.","native":[{"media_type":"image/jpeg","aliases":["image/jpg","image/pjpeg"],"max_size":"5MB (base64) / 5MB (url)"},{"media_type":"image/png","aliases":["image/x-png"],"max_size":"5MB"},{"media_type":"image/gif","aliases":[],"max_size":"5MB"},{"media_type":"image/webp","aliases":[],"max_size":"5MB"}],"auto_converted_by_proxy":[{"media_type":"image/svg+xml","converts_to":"image/png","tool":"rsvg-convert (librsvg)","note":"SVG is automatically rasterized to PNG on the proxy side before being sent to Anthropic. 
Transparent to the client."}],"not_supported":[{"media_type":"image/bmp","reason":"Anthropic does not accept it; conversion to PNG is possible but not yet implemented"},{"media_type":"image/tiff","reason":"Anthropic does not accept it; conversion to PNG is possible but not yet implemented"},{"media_type":"image/heic","reason":"Anthropic does not accept it; primarily an iOS format"}],"official_anthropic_docs":"https://docs.claude.com/en/docs/build-with-claude/vision"},"size_and_resolution_limits":{"max_file_size":"5 MB per image (base64 OR url), an Anthropic API limit","max_dimension":"8000×8000 pixels per image; 2000×2000 when a request contains more than 20 images","max_megapixels":"Recommended: about 1.15 megapixels with no side longer than 1568 px for optimal quality; larger images are auto-resized","max_images_per_request":"100 images per request (per Anthropic Vision docs)","tokens_per_image":"Approx (width×height) / 750. A 1000×1000 image ≈ 1333 tokens. Counted in billed input_tokens."},"examples":{"curl_data_url":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"haiku\",\n    \"mode\": \"direct\",\n    \"max_tokens\": 100,\n    \"messages\": [{\n      \"role\": \"user\",\n      \"content\": [\n        {\"type\": \"text\", \"text\": \"What do you see?\"},\n        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,iVBORw0KGgo...\"}}\n      ]\n    }]\n  }'","curl_http_url":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"sonnet\",\n    \"mode\": \"direct\",\n    \"messages\": [{\n      \"role\": \"user\",\n      \"content\": [\n        {\"type\": \"text\", \"text\": \"Describe in 5 words\"},\n        {\"type\": \"image_url\", \"image_url\": {\"url\": \"https://images.unsplash.com/photo-1574158622682-e40e69881006?w=400\"}}\n      ]\n    }]\n  }'","curl_svg_auto_convert":"curl -X POST 
https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"haiku\",\n    \"mode\": \"direct\",\n    \"messages\": [{\n      \"role\": \"user\",\n      \"content\": [\n        {\"type\": \"text\", \"text\": \"What color?\"},\n        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/svg+xml;base64,PHN2Zy4uLg==\"}}\n      ]\n    }]\n  }'  # the proxy auto-converts SVG to PNG","python_openai_sdk":"from openai import OpenAI\nimport base64\n\nclient = OpenAI(\n    api_key=\"YOUR_API_KEY\",\n    base_url=\"https://claude.ai-platform.space/v1\"\n)\n\nwith open('image.png', 'rb') as f:\n    img_b64 = base64.b64encode(f.read()).decode()\n\nresponse = client.chat.completions.create(\n    model=\"sonnet\",\n    extra_body={\"mode\": \"direct\"},\n    messages=[{\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\": \"text\", \"text\": \"Describe this image\"},\n            {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:image/png;base64,{img_b64}\"}}\n        ]\n    }]\n)\nprint(response.choices[0].message.content)","langchain":"from langchain_openai import ChatOpenAI\nfrom langchain_core.messages import HumanMessage\n\nllm = ChatOpenAI(\n    base_url=\"https://claude.ai-platform.space/v1\",\n    api_key=\"YOUR_API_KEY\",\n    model=\"claude-sonnet-4-6\",\n    extra_body={\"mode\": \"direct\"},\n)\n\nmsg = HumanMessage(content=[\n    {\"type\": \"text\", \"text\": \"What is in this image?\"},\n    {\"type\": \"image_url\", \"image_url\": {\"url\": \"https://example.com/x.png\"}},\n])\nresponse = llm.invoke([msg])\nprint(response.content)"},"tips":["Use mode=direct for vision: faster and cheaper than an SDK session.","The 5 MB limit applies to the size of the base64 data. If the file is larger, shrink or compress it first.","HTTP URL variant: Anthropic fetches the image itself (not our proxy). 
For private URLs, use base64.","image/svg+xml is automatically rasterized to PNG (rsvg-convert on the VPS). Transparent to the client.","All 3 models support vision: haiku ($0.0001/image), sonnet ($0.0003), opus ($0.0015). These are typical values for a 100×100 px image.","Idempotency: if you send the Anthropic native format, the proxy does NOT touch it (zero overhead).","Equivalent text block: 'input_text' (OpenAI Responses API) is automatically converted to 'text', so the client does not need to know about SDK differences."]},"prompt_caching":{"description":"Anthropic prompt caching support — reuse a stable prefix (system prompt, tools, RAG context) at 10% of normal input cost. Available in mode=direct only.","how_to_use":"Add cache_control to the system message (or any text block in Anthropic-style content array). Two TTL options: '5m' (default, 1.25× write cost) and '1h' (extended cache, 2× write cost).","example":{"request":{"model":"haiku","mode":"direct","max_tokens":100,"messages":[{"role":"system","content":"...your large stable system prompt (>=4500 chars for sonnet, >=17000 for haiku/opus)...","cache_control":{"type":"ephemeral","ttl":"1h"}},{"role":"user","content":"Your dynamic question"}]},"response_first_call":{"usage":{"prompt_tokens":9,"completion_tokens":30,"cache_creation_input_tokens":6166,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":6166}},"cost_usd":0.009889},"response_repeat_call":{"usage":{"prompt_tokens":9,"completion_tokens":30,"cache_creation_input_tokens":0,"cache_read_input_tokens":6166,"cache_creation":{}},"cost_usd":0.000572,"note":"94% cheaper than first call thanks to cache hit"}},"ttl_options":[{"value":"5m","default_when":"ttl omitted","write_multiplier":"1.25× base input price","use_for":"bursty workloads, conversations within minutes"},{"value":"1h","default_when":"explicit","write_multiplier":"2× base input price","use_for":"long pauses (3-6h), patient flows, batch 
processing","beta_header":"extended-cache-ttl-2025-04-11 (auto-applied by proxy)"}],"minimum_size":[{"model":"sonnet","min_tokens":1024,"approx_chars":4500,"note":"Anthropic published threshold"},{"model":"haiku","min_tokens":4096,"approx_chars":17000,"note":"Below this, Anthropic returns cache_creation=0 silently"},{"model":"opus","min_tokens":4096,"approx_chars":17000,"note":"Below this, Anthropic returns cache_creation=0 silently"}],"pricing":[{"model":"haiku","display":"Haiku 4.5","input_per_mtok":1.0,"cache_read_per_mtok":0.1,"cache_write_5m_per_mtok":1.25,"cache_write_1h_per_mtok":2.0,"output_per_mtok":5.0},{"model":"sonnet","display":"Sonnet 4.6","input_per_mtok":3.0,"cache_read_per_mtok":0.3,"cache_write_5m_per_mtok":3.75,"cache_write_1h_per_mtok":6.0,"output_per_mtok":15.0},{"model":"opus","display":"Opus 4.8","input_per_mtok":5.0,"cache_read_per_mtok":0.5,"cache_write_5m_per_mtok":6.25,"cache_write_1h_per_mtok":10.0,"output_per_mtok":25.0}],"auto_caching":"If you don't pass cache_control, the proxy auto-marks system_prompt as ephemeral with ttl='1h' (CHANGED 18.05.2026) when system_prompt char length exceeds the proxy threshold: sonnet ≥ 4500 chars, haiku/opus ≥ 17000 chars. Reason: Anthropic silently dropped default TTL from 1h to 5min on 06.03.2026 — explicit ttl='1h' restores prior behavior. Below threshold, the proxy does NOT add cache_control (Anthropic would silently return cache_creation=0).","proxy_thresholds":{"sonnet":"4500 chars (~1024 tokens — matches Anthropic published minimum)","haiku":"17000 chars (~4096 tokens)","opus":"17000 chars (~4096 tokens)","note":"These are PROXY thresholds (CACHE_MIN_CHARS in session_pool.py). Anthropic's official min for haiku/opus is also 4096 tokens, so below 17000 chars proxy correctly skips cache."},"disable_auto_caching":{"description":"Three ways to disable proxy's automatic system_prompt cache_control:","methods":["1. Request body field: {'disable_auto_cache': true}","2. 
HTTP header: X-No-Auto-Cache: 1","3. Send your own cache_control on the system message — proxy then respects yours (does not double-add)"],"when_to_disable":"Testing exact cache behavior; intentionally avoiding cache writes; A/B benchmarking; prompts that change every request anyway."},"session_mode_note":"Legacy mode=session (CLI subprocess pool) is deprecated. All /v1/messages and /v1/chat/completions traffic now goes through direct Anthropic API. cache_control is fully honored.","tips":["Keep dynamic content (timestamps, user IDs) AFTER cache_control breakpoint — anything before invalidates the cache.","Watch usage.cache_creation.ephemeral_1h_input_tokens to confirm 1h cache is active.","Cache key is per OAuth account in our pool; warming up is fast since the pool clusters requests by user.","5min TTL resets every read — high-traffic prompts stay warm indefinitely.","Use disable_auto_cache=true if you want raw input billing (e.g. for cost comparison benchmarking)."]},"tools_and_function_calling":{"description":"Function calling / tool use. The proxy accepts BOTH OpenAI tools format and Anthropic native tools format on /v1/chat/completions and /v1/messages. Auto-converts as needed. 
Since 01.06.2026 all non-stream /v1/messages requests use the same direct Anthropic API path as streaming and tools — no SDK pool roundtrip.","openai_format":{"description":"Standard OpenAI tools array used by langchain.tools, OpenAI SDK, LiteLLM, Pydantic-AI.","example":{"tools":[{"type":"function","function":{"name":"get_weather","description":"Get current weather for a city","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}],"tool_choice":"auto"}},"anthropic_format":{"description":"Anthropic native tools — used by anthropic Python SDK.","example":{"tools":[{"name":"get_weather","description":"Get current weather for a city","input_schema":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}]}},"tool_choice_values":{"auto":"Default — model decides whether to call tools or respond directly","none":"Force model to respond without using any tool","required / any":"Force model to call at least one tool (any of them)","{type:'function', function:{name:'X'}}":"OpenAI: force specific tool — converted to Anthropic {type:'tool', name:'X'}","{type:'tool', name:'X'}":"Anthropic native: same as above"},"server_tools":{"description":"Native Anthropic server-side tools — passed through without conversion. Useful for agents that need built-in capabilities.","supported":["web_search_*","bash_*","code_execution_*","text_editor_*"],"note":"These run on Anthropic's infrastructure, not on our proxy. Subject to Anthropic Tool Use beta headers."},"structured_output":{"description":"Two response_format shapes for forcing JSON output:","json_object":{"request":{"response_format":{"type":"json_object"}},"behavior":"Proxy appends 'Respond ONLY with valid JSON' to system + strips ```json...``` markdown wrappers from response. 
No schema validation."},"json_schema":{"request":{"response_format":{"type":"json_schema","json_schema":{"name":"user_profile","description":"User profile data","schema":{"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}},"required":["name","age"]}}}},"behavior":"Proxy creates a synthetic forced tool_use (Anthropic tool_choice={type:'tool', name:'user_profile'}). Tool_use.input is then returned as JSON string in choices[0].message.content. Used by LangChain with_structured_output, Pydantic-AI, Graphiti, Instructor."}},"tool_history_in_messages":{"description":"Multi-turn tool use — include prior assistant tool_calls and role=tool results in messages array.","openai_shape":[{"role":"user","content":"Weather in Kyiv?"},{"role":"assistant","content":null,"tool_calls":[{"id":"call_1","type":"function","function":{"name":"get_weather","arguments":"{\"city\":\"Kyiv\"}"}}]},{"role":"tool","tool_call_id":"call_1","content":"{\"temp\":18}"},{"role":"user","content":"Convert to F"}],"anthropic_shape":[{"role":"user","content":"Weather in Kyiv?"},{"role":"assistant","content":[{"type":"tool_use","id":"toolu_1","name":"get_weather","input":{"city":"Kyiv"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"toolu_1","content":"{\"temp\":18}"}]}],"note":"Both shapes accepted. Proxy converts OpenAI → Anthropic via convert_openai_messages_to_anthropic()."},"tool_calls_streaming":{"description":"stream=true with tools: tool_calls arrive incrementally.","openai_events":["data: {... delta: {tool_calls: [{index:0, id:'call_1', function:{name:'get_weather'}}]}}","data: {... delta: {tool_calls: [{index:0, function:{arguments:'{\"city\":'}}]}}","data: {... delta: {tool_calls: [{index:0, function:{arguments:'\"Kyiv\"}'}}]}}","data: {... finish_reason: 'tool_calls'}","data: [DONE]"],"note":"OpenAI index ↔ Anthropic content block index mapping preserved. 
Multiple parallel tool calls supported."},"example_curl_basic_tools":"curl -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\":\"sonnet\",\n    \"messages\":[{\"role\":\"user\",\"content\":\"What is the weather in Kyiv?\"}],\n    \"tools\":[{\n      \"type\":\"function\",\n      \"function\":{\n        \"name\":\"get_weather\",\n        \"description\":\"Get weather for a city\",\n        \"parameters\":{\"type\":\"object\",\"properties\":{\"city\":{\"type\":\"string\"}},\"required\":[\"city\"]}\n      }\n    }],\n    \"tool_choice\":\"auto\"\n  }' ","example_langchain_structured":"from langchain_openai import ChatOpenAI\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nllm = ChatOpenAI(\n    base_url=\"https://claude.ai-platform.space/v1\",\n    api_key=\"YOUR_API_KEY\",\n    model=\"claude-sonnet-4-6\",\n)\nstructured = llm.with_structured_output(User)\nresult = structured.invoke(\"John Doe is 32 years old\")\nprint(result)  # User(name='John Doe', age=32)","tips":["Tools always use direct Anthropic API path (SDK pool is deprecated for /v1/messages and /v1/chat/completions).","Empty arguments string ('{}') is valid for OpenAI — proxy converts to empty Anthropic input{}.","Streaming tool_calls: aggregate by index, not by id (id is sent once at start).","If you need both tools AND extended thinking — use Sonnet/Haiku with thinking={type:'enabled', budget_tokens:N}. Opus 4.7/4.8 adaptive thinking + tools works too.","response_format=json_schema is the most reliable way to get structured JSON — no markdown wrapper to strip."]},"server_side_tools":{"description":"Anthropic-executed tools: model decides when to call, Anthropic runs them server-side, returns results. NO client implementation needed — unlike classic function calling. Supported via Bearer OAuth subscription (Claude Max). 
Added 01.06.2026.","available_tools":{"web_search":{"tool_type":"web_search_20260209 (recommended, with dynamic filtering) OR web_search_20250305 (legacy, no filtering)","description":"Model searches the web during generation and returns answers with citations.","client_side_implementation":"NONE — pass tools=[{type: 'web_search_20260209', name: 'web_search', max_uses?: 5, allowed_domains?, blocked_domains?, user_location?}].","price":"$10 / 1000 searches + standard tokens for retrieved content. Failed searches NOT billed.","models_supported":["claude-opus-4-8","claude-opus-4-7","claude-sonnet-4-6","claude-haiku-4-5"],"response_blocks":["server_tool_use (model's search query)","web_search_tool_result (results array)","text (with citations)"],"multi_turn":"Re-send encrypted_content/encrypted_index back in subsequent messages for citations to keep working."},"web_fetch":{"tool_type":"web_fetch_20260209 (recommended) OR web_fetch_20250910 (legacy)","description":"Model fetches full content from a specific URL (HTML, PDF). For PDFs auto-extract base64.","client_side_implementation":"NONE — pass tools=[{type: 'web_fetch_20260209', name: 'web_fetch', max_uses?: 5, allowed_domains?, blocked_domains?, citations?: {enabled: true}, max_content_tokens?: 100000}].","price":"FREE (no extra charges, only token cost of fetched content).","models_supported":["claude-opus-4-8","claude-opus-4-7","claude-sonnet-4-6","claude-haiku-4-5"],"url_validation":"Can ONLY fetch URLs that already appeared in conversation context (user messages, prior tool results). No arbitrary URLs."},"code_execution":{"tool_type":"code_execution_20250825 (Bash + file ops) OR code_execution_20260120 (latest, REPL state, Opus 4.5+/Sonnet 4.5+ only)","description":"Model writes Python/Bash code and executes in Anthropic sandbox container (200MB RAM, files, matplotlib graphs).","client_side_implementation":"NONE — pass tools=[{type: 'code_execution_20250825', name: 'code_execution'}]. 
Requires anthropic-beta: code-execution-2025-08-25 (already in proxy default headers since 01.06.2026).","price":"FREE when used together with web_search/web_fetch. Otherwise $0.05/h per container (min 5 minutes), 1550 free hours/month per org.","models_supported":["claude-opus-4-8","claude-opus-4-7","claude-opus-4-6","claude-sonnet-4-6","claude-haiku-4-5"],"response_block_types":["server_tool_use (action: bash_code_execution or text_editor_code_execution)","bash_code_execution_tool_result (with stdout, stderr, return_code)","text_editor_code_execution_tool_result (file ops)"],"rate_limits_note":"OAuth subscription may hit too_many_requests on code_execution faster than API-key tier. Proxy returns the error in result.content.error_code."}},"response_format_openai":{"description":"/v1/chat/completions returns server tools via extension fields (OpenAI standard doesn't define them):","fields":{"server_tool_uses":"list of {id, name, input, result: {type, content, is_error?, error_code?}}","citations":"list of {type: 'web_search_result_location'|'char_location', url?, title?, cited_text?, encrypted_index?, ...}","usage.server_tool_use":"{web_search_requests: N, web_fetch_requests: N, code_execution_requests: N}","cost_usd":"includes $0.01 × web_search_requests (web_fetch/code_exec free or token-only)"}},"response_format_anthropic":{"description":"/v1/messages returns full Anthropic native blocks in content[]:","blocks":["text (with optional citations array)","server_tool_use","web_search_tool_result","web_fetch_tool_result","bash_code_execution_tool_result","text_editor_code_execution_tool_result"]},"streaming_sse_events":{"description":"Streaming SSE flow for /v1/chat/completions when server tools fire:","sequence":["data: {delta: {role: 'assistant', content: ''}} — init","data: {delta: {server_tool_use: {index, id, name, status: 'start'}}} — tool block started","data: {delta: {server_tool_use: {index, status: 'stop', input: {query}}}} — full input ready","data: 
{delta: {server_tool_result: {tool_use_id, type: 'web_search_tool_result', content: [...]}}} — results back","data: {delta: {citation: {url, title, cited_text, ...}}} — model cites a source","data: {delta: {content: 'text chunk'}} — actual model response","data: {delta: {}, finish_reason: 'stop'} — end","data: [DONE]"]},"curl_examples":{"web_search_haiku":"curl https://claude.ai-platform.space/v1/chat/completions \\\n  -H 'Authorization: Bearer YOUR_KEY' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n    \"model\": \"claude-haiku-4-5\",\n    \"mode\": \"direct\",\n    \"max_tokens\": 512,\n    \"messages\": [{\"role\":\"user\",\"content\":\"What time is it in Tokyo?\"}],\n    \"tools\": [{\"type\": \"web_search_20250305\", \"name\": \"web_search\", \"max_uses\": 1}]\n  }'","web_fetch_sonnet":"curl https://claude.ai-platform.space/v1/messages \\\n  -H 'x-api-key: YOUR_KEY' \\\n  -H 'anthropic-version: 2023-06-01' \\\n  -d '{\n    \"model\": \"claude-sonnet-4-6\",\n    \"max_tokens\": 256,\n    \"messages\": [{\"role\":\"user\",\"content\":\"Fetch https://example.com and summarize\"}],\n    \"tools\": [{\"type\": \"web_fetch_20250910\", \"name\": \"web_fetch\", \"max_uses\": 1}]\n  }'"},"billing_logic":"Per-request: input_tokens × model_input_price + output_tokens × model_output_price + $0.01 × web_search_requests. web_fetch and code_execution don't add direct charges (already covered by tokens).","notes":["OAuth subscription verified working with web_search/web_fetch/code_execution (tested 01.06.2026).","Server tools can be combined with custom function calling — both work in same request.","Multi-turn: re-send encrypted_content and encrypted_index from prior responses to keep citations valid.","Errors come back in *_tool_result.content.error_code (too_many_requests, max_uses_exceeded, etc.) — HTTP status stays 200."]},"streaming":{"description":"Real-time SSE streaming for both /v1/chat/completions (OpenAI format) and /v1/messages (Anthropic format). 
Since 06.05.2026 all streaming requests go through native Anthropic SDK (client.messages.stream). Since 01.06.2026 non-streaming /v1/messages also goes through direct Anthropic API — the legacy SDK pool path is fully retired.","openai_format":{"description":"Standard OpenAI Chat Completions SSE — text deltas.","event_sequence":["data: {chatcmpl-..., delta: {role:'assistant', content:''}}  # first chunk = role marker","data: {chatcmpl-..., delta: {content:'Hello'}}","data: {chatcmpl-..., delta: {content:' world'}}","data: {chatcmpl-..., delta: {}, finish_reason: 'stop'}","data: [DONE]"],"tool_calls_in_stream":"When tools fire, deltas include {tool_calls:[{index:0, id?, function:{name?, arguments:'partial json'}}]}. Accumulate arguments by index.","example_curl":"curl -N -X POST https://claude.ai-platform.space/v1/chat/completions \\\n  -H \"Authorization: Bearer YOUR_API_KEY\" \\\n  -d '{\"model\":\"sonnet\",\"stream\":true,\"messages\":[{\"role\":\"user\",\"content\":\"Count to 5\"}]}'"},"anthropic_format":{"description":"Native Anthropic SSE for /v1/messages.","event_sequence":["event: message_start  →  {message: {id, role, content:[], model, ...}}","event: content_block_start  →  {index:0, content_block: {type:'text', text:''}}","event: content_block_delta  →  {index:0, delta: {type:'text_delta', text:'Hello'}}","event: content_block_delta  →  {index:0, delta: {type:'text_delta', text:' world'}}","event: content_block_stop  →  {index:0}","event: message_delta  →  {delta: {stop_reason:'end_turn'}, usage:{output_tokens:42}}","event: message_stop","event: ping  →  (keepalive every 15s while generating)"],"tool_use_blocks":"Tool calls stream as content_block_start (type:'tool_use', id, name), then content_block_delta (type:'input_json_delta', partial_json:'...'), then content_block_stop.","example_python_sdk":"from anthropic import Anthropic\nclient = Anthropic(api_key=\"YOUR_API_KEY\", base_url=\"https://claude.ai-platform.space\")\nwith 
client.messages.stream(model=\"sonnet\", max_tokens=512,\n    messages=[{\"role\":\"user\",\"content\":\"Tell a story\"}]) as s:\n    for txt in s.text_stream:\n        print(txt, end=\"\", flush=True)"},"implementation_notes":["ALL streaming requests → native Anthropic SDK path (session_pool.direct_api_stream). Non-stream /v1/messages → session_pool.direct_api_call. SDK pool path is retired (01.06.2026).","Keepalive ping every 15 seconds for /v1/messages prevents nginx 504 on long thinking.","Cache tokens and cost are computed AFTER stream ends from final usage. Reflected in /v1/usage and request_logs.","If first yield in stream generator throws (e.g. auth check passes but Anthropic rejects) — proxy emits a single SSE error event + [DONE] (see STREAM_PRE_YIELD_FAIL in server_500.log)."]},"headers":{"description":"HTTP request/response headers supported by the proxy.","request_headers":[{"name":"Authorization","value":"Bearer sk-cc-...","required":"one of three auth methods","description":"Standard OpenAI auth. Works on ALL endpoints."},{"name":"x-api-key","value":"sk-cc-...","required":"one of three auth methods","description":"Anthropic SDK auth. Works on ALL endpoints."},{"name":"?key= (query)","value":"sk-cc-...","required":"one of three auth methods","description":"Query param fallback. Works on /v1/chat/completions, /v1/status, /v1/models. NOT supported on /v1/messages — use header."},{"name":"Content-Type","value":"application/json","required":"for POST","description":"Standard."},{"name":"anthropic-version","value":"2023-06-01","required":"for /v1/messages (some SDKs)","description":"Native Anthropic header. Most SDKs send automatically."},{"name":"X-Request-ID","value":"any string","required":false,"description":"If sent, proxy uses your id for correlation logs + reflects it in response header. Otherwise auto-generated (req-<12hex>). 
Added 16.05.2026."},{"name":"X-No-Auto-Cache","value":"1","required":false,"description":"Disable proxy's automatic system_prompt cache_control. Equivalent to body field disable_auto_cache:true."},{"name":"X-Prompt-Type","value":"any string label","required":false,"description":"Analytics label for this request. Appears in /v1/usage?group_by=prompt_type. Examples: 'agent_chat', 'summarize', 'code_review'."}],"response_headers":[{"name":"X-Request-ID","description":"Always returned (echoed or generated). Use this to correlate with logs/server_500.log and nginx 5xx.log when reporting issues."},{"name":"Content-Type","description":"application/json (non-stream), text/event-stream (stream=true), text/plain (some health responses)."},{"name":"Access-Control-Allow-Origin","description":"CORS — claude.ai-platform.space + localhost:8092 only."}],"debugging_tip":"When reporting a 500 error to support, include the X-Request-ID from response headers. Engineers can grep server_500.log for the full traceback in seconds."},"client_integrations":{"description":"Drop-in replacement instructions for popular Claude clients. All clients work by pointing them at https://claude.ai-platform.space instead of api.anthropic.com, using your sk-cc-... API key.","claude_code_cli":{"name":"Claude Code CLI (Anthropic official `claude` command)","tested_version":"2.1.154","status":"Fully working: text, tool use (Read/Write/Edit/Bash), streaming, extended thinking, multi-turn","setup_macos_linux":{"description":"Export TWO env vars before running `claude`. 
Both ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN are required, in addition to ANTHROPIC_BASE_URL: Claude Code 2.1.x falls back to OAuth flow if only ANTHROPIC_API_KEY is set.","commands":["export ANTHROPIC_BASE_URL=https://claude.ai-platform.space","export ANTHROPIC_API_KEY=sk-cc-YOUR_KEY","export ANTHROPIC_AUTH_TOKEN=sk-cc-YOUR_KEY","claude --print 'Hello, world'"],"persist_in_shell":"Add the 3 export lines to ~/.zshrc or ~/.bashrc to persist across sessions."},"setup_windows_powershell":{"commands":["$env:ANTHROPIC_BASE_URL = 'https://claude.ai-platform.space'","$env:ANTHROPIC_API_KEY = 'sk-cc-YOUR_KEY'","$env:ANTHROPIC_AUTH_TOKEN = 'sk-cc-YOUR_KEY'","claude --print 'Hello, world'"]},"common_issues":[{"symptom":"Failed to authenticate. API Error: 401 Invalid or expired API key","cause":"Only ANTHROPIC_API_KEY is set; Claude Code attempts internal OAuth refresh and gets 401.","fix":"Also set ANTHROPIC_AUTH_TOKEN to the SAME value (sk-cc-...). Both auth env vars are required."},{"symptom":"Hangs after first response, no follow-up tool calls","cause":"Multi-turn requests stuck on auth (same as above).","fix":"Set ANTHROPIC_AUTH_TOKEN."}]},"cursor_ide":{"name":"Cursor IDE","tested_version":"0.45.x","status":"Working in chat mode. Agent/Composer features remain on Cursor's backend (cannot be redirected).","setup":{"description":"Settings → Models → API Keys section. Enable 'Override Anthropic Base URL' AND select a Claude model.","steps":["1. Open Cursor → Settings (⌘,) → Models","2. Scroll to 'Anthropic API Key' (NOT OpenAI section — Cursor sends Anthropic-native payload to /v1/messages)","3. Enter your sk-cc-... key as the Anthropic API Key","4. Toggle 'Override Anthropic Base URL' ON","5. Set base URL: https://claude.ai-platform.space","6. Pick a Claude model (Sonnet 4.6 recommended for code)","7. 
Verify with a test chat: 'Hi'"]},"limitations":["Cursor Agent/Composer always use Cursor's own backend regardless of base URL — only chat is proxied","Tab autocomplete uses Cursor's own model, not proxy","Tool definitions must match Anthropic-native format (proxy auto-converts OpenAI format)"]},"langchain_python":{"name":"langchain-anthropic (Python)","setup_code":"from langchain_anthropic import ChatAnthropic\nllm = ChatAnthropic(\n    model='claude-sonnet-4-6',\n    api_key='sk-cc-YOUR_KEY',\n    anthropic_api_url='https://claude.ai-platform.space',\n)\nresult = llm.invoke('Hello!')\nprint(result.content)","notes":"Full support: streaming, tool calling, structured output, multimodal (base64 images)."},"anthropic_sdk_python":{"name":"Official Anthropic Python SDK","setup_code":"from anthropic import Anthropic\nclient = Anthropic(\n    api_key='sk-cc-YOUR_KEY',\n    base_url='https://claude.ai-platform.space',\n)\nresponse = client.messages.create(\n    model='claude-sonnet-4-6',\n    max_tokens=1024,\n    messages=[{'role': 'user', 'content': 'Hello!'}]\n)\nprint(response.content[0].text)"},"openai_sdk_compat":{"name":"OpenAI SDK (compatibility mode)","description":"Works via /v1/chat/completions. 
Auto-converts OpenAI → Anthropic and back.","setup_code":"from openai import OpenAI\nclient = OpenAI(\n    api_key='sk-cc-YOUR_KEY',\n    base_url='https://claude.ai-platform.space/v1',\n)\nresp = client.chat.completions.create(\n    model='sonnet',  # short alias OK\n    messages=[{'role': 'user', 'content': 'Hi'}]\n)\nprint(resp.choices[0].message.content)"},"cline_continue":{"name":"Cline / Continue.dev / VS Code extensions","description":"Use 'Anthropic' provider, set Anthropic API Key = sk-cc-..., set Base URL = https://claude.ai-platform.space","tested":"Cline 3.x verified working with /v1/messages streaming + tool use"},"what_works":["Text generation (haiku, sonnet, opus)","Streaming SSE (real-time deltas)","Tool use / function calling (Anthropic native + OpenAI tools format)","Multi-turn conversation with tool results","Vision (image_url + base64 in content blocks)","Extended thinking (Sonnet/Haiku enabled, Opus 4.7/4.8 adaptive)","Prompt caching (cache_control with 5min and 1h TTL)","Server-side tools: web_search, web_fetch, code_execution","All anthropic-beta flags forwarded (interleaved-thinking, prompt-caching-scope, etc.)"],"what_does_not_work":["/v1/files (Files API) — returns 501. Anthropic OAuth Max subscription has no access. Use base64 attachments in content blocks instead.","/v1/messages/batches (Batches API) — returns 501. OAuth token lacks user:batch scope.","Cursor Composer/Agent features (use Cursor's own backend, not proxied)","Claude Code's `--resume` of conversations stored on Anthropic side (we don't proxy conversation storage endpoints)"]},"credit_tiers":{"description":"Overdraft / credit limit system. New users can go slightly negative without payment friction — the lower the trust, the smaller the overdraft. Tier upgrades happen automatically based on successful payment count.","tiers":[{"name":"new","credit_limit_usd":1.0,"criteria":"0 successful payments","use_case":"Free $5 bonus on signup + $1 overdraft buffer. 
Lets a single oversized request finish even if balance drops below 0."},{"name":"active","credit_limit_usd":10.0,"criteria":"1+ successful payment","use_case":"User has proven willingness to pay. Allows larger spikes without interruption — top-up has time to land."},{"name":"loyal","credit_limit_usd":20.0,"criteria":"2+ successful payments","use_case":"Returning paying customer. Most of agent-style workloads (multi-step research) fit within this buffer."}],"how_it_works":["1. Each request: proxy estimates cost (input_tokens × price + safety margin).","2. If balance + credit_limit < estimated_cost → 402 insufficient_balance.","3. After completion, real cost is debited. Balance may go negative up to -credit_limit.","4. Auto-topup (if enabled in cabinet) charges the saved card when balance < threshold.","5. Tier recomputed on each successful payment via WayForPay/Lava webhook."],"response_field_in_error":{"402":{"error":{"type":"insufficient_balance","message":"...","balance":-0.45,"credit_limit":1.0,"tier":"new","payments_count":0}}},"see_also":["GET /auth/me — returns user's tier_name, credit_limit, balance","POST /billing/topup — increase balance + advance tier"]},"errors":{"description":"Unified error response shapes across all endpoints.","formats":{"standard":{"error":{"type":"<error_type>","message":"<human-readable>","request_id":"req-..."}},"fastapi_legacy":{"detail":"<message>"},"rate_limit":{"error":"rate_limit_error","rate_limit":{"unified_status":"warning|critical","utilization_5h_pct":0,"utilization_7d_pct":95.2}},"balance":{"error":{"type":"insufficient_balance","message":"...","balance":0.0,"credit_limit":1.0,"tier":"new","payments_count":0}}},"common_codes":{"400":"invalid_request_error — malformed body, unsupported field combo, Anthropic 400 propagated","401":"Invalid or expired API key","402":"insufficient_balance — see credit_tiers section","403":"Model not allowed for this key (check allowed_models in /admin/users)","429":"Rate limit — either 
daily user limit or OAuth pool unified limit (see /rate-limits)","500":"Unhandled exception — X-Request-ID in headers for debugging (added 15.05.2026)","502":"Bad Gateway — only on nginx layer (upstream Python down). Watcher alerts to TG.","503":"Service in maintenance mode","504":"Gateway timeout — SDK session execute() exceeded timeout"},"correlation":"All 500-level responses include X-Request-ID header. Engineering team can grep /var/log/nginx/domains/claude.ai-platform.space.5xx.log and /var/www/claude-api/logs/server_500.log for full traceback in seconds. When reporting bugs — always include X-Request-ID.","monitoring":"claude_5xx_watcher cron */2 min triggers TG alert in 'Claude Тех' chat when ≥3 5xx errors hit within 5 minutes. Dedup 15 min by status-signature. Watch logs proactively — don't wait for client complaints."},"performance":{"description":"Proxy latency profile after optimizations deployed 18.05.2026. TTFB (Time To First Byte) is the dominant metric for voice/realtime — measured from client send to first SSE chunk received. 
Use mode=direct + stream=true for minimum TTFB.","ttfb_benchmarks_haiku":{"description":"Real measurements after deploy 18.05.2026 (haiku model, ~70 chars system prompt)","non_stream":{"ttlb_avg_5req":"~1.10s (range 0.97-1.33s)","before_optimization":"~3.0s","improvement":"2.7x faster"},"stream_cold":{"ttfb":"~0.65s (first request after pool gone cold)","before_optimization":"~3.0s","improvement":"4.6x faster","when":"After >30s idle (TCP keepalive expires)"},"stream_warm":{"ttfb":"~0.17s (requests 2+)","before_optimization":"~3.0s","improvement":"17x faster","when":"Within 30s of previous request","use_for":"Voice agents, realtime chat, calls — keep connection warm"}},"what_was_optimized":["Singleton httpx.AsyncClient with keepalive_expiry=30s (vs default 5s) — TLS+TCP reused between requests, no handshake on warm pool","asyncio.gather for parallel SQL: maintenance_check + authenticate_api_key in phase 1, check_rate_limit + get_user_credit_info in phase 2","Fire-and-forget log_api_request via asyncio.create_task — client gets response immediately, MySQL INSERT happens in background (semaphore 50 for DoS protection)","In-memory pricing cache TTL 60s — no MySQL SELECT for each calculate_cost","Explicit cache_control ttl='1h' in auto-mode (Anthropic dropped default from 1h to 5min on 06.03.2026)","All sync PyMySQL operations wrapped in asyncio.to_thread — no event loop blocking"],"tips_for_minimum_latency":["Use mode=direct + stream=true for voice/realtime — TTFB ~170ms warm","Keep a long-lived connection: same client SDK instance, don't recreate per call (otherwise TLS handshake adds ~150-200ms)","For repeated system prompts: use cache_control with ttl='1h' — 85% cost reduction + faster response on cache hit","Choose haiku for shortest TTFB (~600ms upstream) vs sonnet (~1200ms) vs opus (~2700ms)","Send Authorization: Bearer header (not query param) — query parsing is slightly slower"],"where_the_remaining_time_goes":{"anthropic_inference":"60-65% (1000-1200ms 
upstream — outside our control, model-dependent)","network_rtt":"5-10% (Kyiv → US anycast Anthropic ~80-110ms one-way)","tls_handshake":"0% on warm pool, ~150-200ms on cold (mitigated by keepalive 30s)","proxy_overhead":"5-10% (auth, billing, normalization — async parallel where possible)","mysql_writes":"0% on response path (moved to background via fire-and-forget)"},"future_deferred":["Background keepalive ping every 30-60s to guarantee warm pool","Auto-fallback to Haiku 4.5 for short contexts in voice scenarios (~597ms TTFT upstream)","LiveKit Anthropic plugin for direct client→Anthropic bypass (eliminates proxy for voice)","VPS in US-East would save ~80-100ms RTT, but breaks 60+ other services on same machine — not pursued"]},"how_it_works":{"description":"Architecture overview - two execution paths","two_paths":{"direct_api":{"when":"mode=direct or scope=light/medium","description":"Direct call to Anthropic Messages API via OAuth. ALL 3 models: haiku (~900ms), sonnet (~1700ms), opus (~2700ms). Zero SDK overhead. Proxy auto-injects identity assertion for sonnet/opus.","flow":["1. Client sends request with mode=direct (or scope=light)","2. Proxy authenticates, checks limits","3. Proxy calls Anthropic Messages API directly with OAuth Bearer + identity headers","4. For sonnet/opus: system prompt identity auto-injected","5. Response returned immediately","6. Logged to MySQL"],"speed":"0.7-2.7s (model dependent)","cost":"$0 (subscription)","supported_models":"haiku, sonnet, opus (all 3)","use_for":"Git commits, translations, summaries, copywriting, any text generation without tools"},"sdk_session":{"when":"scope=medium/strong/max or default without scope","description":"Request goes through persistent Agent SDK session. Claude has access to tools (Read, Write, Bash, Web) and conversation memory.","flow":["1. Client sends request","2. Proxy authenticates, checks limits","3. Proxy routes to pre-warmed SDK session pool","4. 
Claude may use tools and multi-turn reasoning","5. Response streams as SSE or returns as JSON","6. Logged to MySQL"],"speed":"2-6s","cost":"$0.01-0.70 (cache tokens from SDK system prompt)","use_for":"Code analysis, file operations, complex reasoning, chat with memory"}},"tech_stack":["FastAPI + uvicorn (Python) - API server","Anthropic Python SDK - direct API calls for lightweight requests","Claude Agent SDK (Python) - persistent sessions for full agent mode","MySQL 8.4 (PyMySQL+DBUtils) - logging, billing and user management","nginx (HestiaCP) - SSL termination and reverse proxy"],"session_pool":{"description":"12 persistent SDK sessions across 4 scopes + Direct API for all 3 models","scopes":{"light":"Direct API: haiku (~900ms), sonnet (~1700ms), opus (~2700ms) - zero SDK overhead","medium":"sonnet/medium, 2 SDK slots - balanced speed and quality with tools","strong":"opus/high, 1 SDK slot - deep analysis with tools","max":"opus/max, 1 SDK slot - maximum reasoning depth with tools"},"oauth_accounts":"Multi-account OAuth pool (4 accounts). Smart fallback: if one account hits its rate limit, the proxy tries the next one.","benefits":["Direct API for ALL 3 models: haiku/sonnet/opus with zero cache overhead","Real token-by-token SSE streaming for SDK sessions","Session context preserved between requests (optional, via session_id)","Automatic health monitoring and session recreation","Multi-account OAuth pool with smart fallback on 429","Unified rate limit monitoring via /rate-limits endpoint"],"limits":["SDK sessions: 1-2 parallel requests per scope (queued if busy)","Sessions auto-recreate every 100 requests (context window protection)","Effort is fixed per scope (medium=medium, strong=high, max=max)","OAuth unified rate limits: 5-hour and 7-day windows per subscription"]}}}