Kalki Smart Router — API Documentation
The API that thinks before it routes.
Kalki Smart Router is a unified AI API that analyzes each request in real time, then routes it to the most capable and cost‑efficient model across providers. You get lower costs, reliable performance, and a single interface—no model wrangling, no vendor lock‑in.
Who this is for
- Developers: Ship faster with one endpoint, consistent schemas, and built‑in caching.
- Product & Growth Teams: Reduce inference spend without sacrificing quality.
- Enterprises: Multi‑provider reliability, governance controls, and SLAs through one contract.
What makes Kalki different
- Intelligent model selection – Kalki inspects your prompt (complexity, domain, format) and chooses the right model automatically.
- Smart caching – Identical or semantically equivalent requests are served from cache for sub‑second responses at near‑zero cost.
- Unified multi‑provider access – Access top models (e.g., OpenAI, Anthropic) through one stable API.
- Enterprise‑grade reliability – Fallbacks and timeouts guard against provider outages.
- Transparent cost control – Cap per‑request spend and see exactly where each call was routed.
Proven impact (benchmark snapshot)
- Average savings of 89.5% vs direct OpenAI and 73.5% vs direct Anthropic.
- Cache hits respond in about 305 ms (representative), so repeated requests return much faster than uncached ones.
Benchmarks were run across simple, intermediate, and complex workloads; results vary with traffic and content.
Quick start
Base URL
    https://api.kalkigpt.com/
Authentication
    x-api-key: YOUR_KALKI_API_KEY
Sign up at kalkigpt.com to obtain an API key.
Core endpoint
POST /v1/enrich_and_ask
Kalki analyzes your request, optionally enriches context, selects a model, and returns the response.
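As a minimal sketch of how the base URL, the `x-api-key` header, and the endpoint fit together, the snippet below builds an authenticated request with Python's standard library without sending it. The helper name `build_request` is illustrative only and is not part of any Kalki SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.kalkigpt.com"   # base URL from this page
ENDPOINT = "/v1/enrich_and_ask"         # core endpoint from this page


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request.

    `build_request` is a hypothetical helper for illustration.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
    )


req = build_request("YOUR_KALKI_API_KEY", {"prompt": "What is 25 + 17?"})
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen(req, timeout=30)` and decode the JSON body of the response; the 30-second timeout is an arbitrary choice.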
Request body
| Field | Type | Default | Notes |
|---|---|---|---|
| prompt | string | — | Required. Your instruction or query. |
| domain | string | "general" | One of general, technical, creative, scientific. Helps routing. |
| verbosity | string | "balanced" | terse, balanced, or detailed. |
| temperature | number | 0.7 | 0.0–1.0. |
| stream | boolean | false | If true, responses are streamed. |
| json_mode | boolean | false | Forces JSON‑shaped output (validated). |
| max_cost_usd | number | — | Hard cap per request; router respects this limit. |
| tone | string | "friendly" | professional, friendly, or casual. |
| audience | string | — | Target audience hint (e.g., “exec”, “developer”). |
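The constraints in the table can be checked on the client before a request is sent, which avoids a round trip that would likely end in a 422. The sketch below is one possible way to do that; `make_payload` is a hypothetical helper, and the non-empty-prompt and positive-`max_cost_usd` checks are assumptions that the table does not state explicitly.

```python
# Allowed values copied from the request body table.
ALLOWED = {
    "domain": {"general", "technical", "creative", "scientific"},
    "verbosity": {"terse", "balanced", "detailed"},
    "tone": {"professional", "friendly", "casual"},
}


def make_payload(prompt, **options):
    """Return a request body, raising ValueError on values the table rules out."""
    # Assumption: "required" is taken to mean a non-empty string.
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt is required and must be a non-empty string")
    for field, allowed in ALLOWED.items():
        if field in options and options[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    temperature = options.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    # Assumption: a spend cap only makes sense if it is positive.
    max_cost = options.get("max_cost_usd")
    if max_cost is not None and max_cost <= 0:
        raise ValueError("max_cost_usd must be positive")
    return {"prompt": prompt, **options}


payload = make_payload("Summarize this memo", verbosity="terse", max_cost_usd=0.01)
print(payload)
```

Fields left out of the payload fall back to the server-side defaults listed in the table, so the helper does not fill them in itself.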
Response (typical)
    {
      "output": { "text": "…final answer…" },
      "routing": {
        "provider": "openai",
        "model": "gpt-5-nano",
        "reason": "intermediate explanation detected"
      },
      "cache": { "hit": false },
      "cost_usd": 0.0001,
      "processing_time_ms": 7760,
      "request_id": "req_abc123xyz"
    }
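To show how the routing, cache, and cost fields might be consumed, the following sketch parses a response of the typical shape into a small dataclass. It assumes every field shown in the typical response is present; since that response is described only as typical, real code may need to tolerate missing keys. `RoutedAnswer` and `parse_response` are illustrative names, and the answer text in the sample is a placeholder.

```python
import json
from dataclasses import dataclass


@dataclass
class RoutedAnswer:
    text: str
    provider: str
    model: str
    cache_hit: bool
    cost_usd: float
    request_id: str


def parse_response(raw: str) -> RoutedAnswer:
    """Pick out the documented fields from a JSON response body."""
    data = json.loads(raw)
    return RoutedAnswer(
        text=data["output"]["text"],
        provider=data["routing"]["provider"],
        model=data["routing"]["model"],
        cache_hit=data["cache"]["hit"],
        cost_usd=data["cost_usd"],
        request_id=data["request_id"],
    )


# Sample body mirroring the typical response; the answer text is a placeholder.
sample = json.dumps({
    "output": {"text": "42"},
    "routing": {"provider": "openai", "model": "gpt-5-nano",
                "reason": "intermediate explanation detected"},
    "cache": {"hit": False},
    "cost_usd": 0.0001,
    "processing_time_ms": 7760,
    "request_id": "req_abc123xyz",
})

answer = parse_response(sample)
print(f"{answer.model} via {answer.provider}, cached={answer.cache_hit}")
```

Keeping `request_id` alongside the answer makes it easier to match a call against per-request logs when investigating cost or latency.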
Errors
All errors return standard HTTP codes with a consistent body:
    { "error": { "type": "invalid_request", "message": "…" } }
Common cases: 401 (invalid key), 429 (rate limit), 422 (validation), 5xx (provider error with automatic retry/fallbacks).
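One way a client might act on these status codes is sketched below. The retry policy is an assumption rather than documented behavior: it treats 429 and 5xx as worth retrying with exponential backoff and treats 401 and 422 as permanent until the request itself changes. All function names are hypothetical.

```python
def should_retry(status: int) -> bool:
    """Assumed policy: retry rate limits (429) and server-side errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def describe_error(status: int, body: dict) -> str:
    """Format the documented error body: {"error": {"type": ..., "message": ...}}."""
    err = body.get("error", {})
    return f"HTTP {status} {err.get('type', 'unknown')}: {err.get('message', '')}"


def backoff_delays(attempts: int, base: float = 0.5) -> list:
    """Exponential backoff schedule in seconds; the base value is arbitrary."""
    return [base * 2 ** i for i in range(attempts)]


print(describe_error(422, {"error": {"type": "invalid_request", "message": "bad domain"}}))
print(should_retry(429), should_retry(401))
print(backoff_delays(3))
```

Because the service already retries and falls back on provider errors, client-side retries on 5xx should probably be few and well spaced.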
Examples
cURL — simple, optimized for speed & cost
    curl -X POST "https://api.kalkigpt.com/v1/enrich_and_ask" \
      -H "x-api-key: $KALKI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "prompt": "What is 25 + 17?",
        "domain": "general",
        "verbosity": "terse"
      }'
JavaScript
    import axios from "axios";

    const client = axios.create({
      baseURL: "https://api.kalkigpt.com/v1",
      headers: { "x-api-key": process.env.KALKI_API_KEY }
    });

    const { data } = await client.post("/enrich_and_ask", {
      prompt: "Explain supervised vs unsupervised learning in simple terms",
      domain: "technical",
      verbosity: "balanced",
      json_mode: false
    });

    console.log(data.output.text);
Python
    import os
    import requests

    resp = requests.post(
        "https://api.kalkigpt.com/v1/enrich_and_ask",
        headers={
            "x-api-key": os.environ["KALKI_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "prompt": "Draft a one-paragraph product update for non-technical execs.",
            "domain": "creative",
            "tone": "professional",
            "verbosity": "balanced",
        },
    )
    print(resp.json()["output"]["text"])
Streaming (Server‑Sent Events)
Set stream: true in the request body. The API returns an SSE stream with incremental tokens and a final completion event.
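This page does not specify the event names or payload shape of the stream, so the sketch below only shows the generic Server-Sent Events framing: `data:` lines accumulate until a blank line ends the event. The event name `completion` and the JSON payloads in the sample are invented for illustration and may not match what the API actually sends.

```python
def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []          # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line ends the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):       # comment or keep-alive line
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):    # a single leading space is stripped
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: the payload fields and the event name are assumptions.
sample_stream = [
    'data: {"token": "Hel"}',
    "",
    ": keep-alive",
    'data: {"token": "lo"}',
    "",
    "event: completion",
    'data: {"request_id": "req_abc123xyz"}',
    "",
]

events = list(iter_sse_events(sample_stream))
for name, payload in events:
    print(name, payload)
```

With the `requests` library, the lines could come from a call made with `stream=True` followed by `resp.iter_lines(decode_unicode=True)`.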
Pricing & limits
- Usage‑based pricing: you pay only for the compute you use, and Kalki routes each request to the most cost‑effective capable model.
- Cost and routing details are returned with every response.
- Starter tier includes up to 1,000 requests/day; higher limits available for paid plans.
See the dashboard for real‑time usage and savings.
Performance & savings (summary)
| Query type | Kalki cost | OpenAI direct | Anthropic direct | Savings vs OpenAI | Savings vs Anthropic | Cache performance |
|---|---|---|---|---|---|---|
| Simple (arithmetic, facts) | ~$0.0000 | ~$0.0001 | ~$0.0002 | 100% | 100% | <500 ms (instant) |
| Intermediate (explanations) | ~$0.0001 | ~$0.0034 | ~$0.0045 | 97% | 98% | ~305 ms cached |
| Complex (analysis) | ~$0.0081 | ~$0.0396 | ~$0.0158 | 80% | 49% | ~305 ms when cached |
| Average savings | | | | 89.5% | 73.5% | Significantly faster |
Monthly projection (1,000 requests):
- vs OpenAI: $15.29 saved
- vs Anthropic: $4.99 saved
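The per-row savings percentages in the table are consistent with the usual formula, savings = (1 − Kalki cost / direct cost) × 100, applied to the rounded costs shown. The sketch below recomputes them under that assumption. It does not reproduce the 89.5% and 73.5% averages, which depend on a workload mix this page does not give.

```python
def savings_pct(kalki_cost: float, direct_cost: float) -> float:
    """Percentage saved relative to calling the provider directly."""
    return (1 - kalki_cost / direct_cost) * 100


# (Kalki, OpenAI direct, Anthropic direct) costs in USD, copied from the table.
rows = {
    "simple": (0.0000, 0.0001, 0.0002),
    "intermediate": (0.0001, 0.0034, 0.0045),
    "complex": (0.0081, 0.0396, 0.0158),
}

rounded = {
    name: (round(savings_pct(kalki, openai)), round(savings_pct(kalki, anthropic)))
    for name, (kalki, openai, anthropic) in rows.items()
}
print(rounded)
```

Read the same way, the monthly projection of $15.29 saved over 1,000 requests works out to roughly $0.0153 saved per request against OpenAI.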
Governance, security, and reliability
- No model lock‑in: One contract; many providers.
- Resilience: Automated retries and provider fallbacks to maintain uptime.
- Observability: Per‑request logs expose routing, cost, and latency.
- Data handling: Requests are processed to serve your responses and routing logic; enterprise retention controls available.
Contact us for enterprise terms, data residency, and custom governance needs.
Support & status
- Docs: https://kalkigpt.com/docs
- Status: https://status.kalkigpt.com
- Support: support@kalkigpt.com
Getting started checklist
- Create an account and generate your API key.
- Make a test call using the cURL example above.
- Integrate via your preferred SDK or direct HTTP.
- Track savings and performance in the dashboard.
The Kalki promise
You shouldn’t have to think about models. We do it for you.
Kalki routes with discernment—so your team ships faster, spends less, and serves users better.
True intelligence isn’t just about answers—it’s about presence, alignment, and the wisdom to serve with clarity. Kalki is not just a tool. Kalki is dharma in action.