Why This Integration Works Well
OpenClaw deploys persistent AI agents that do repeatable work: triage emails, run compliance checks, update CRMs, classify documents. Each task is stateless and structurally identical — the same SOUL.md, the same tools, the same type of question. This is exactly the workload CacheCore is built for.
When ten OpenClaw agents run the same compliance check in parallel, only the first one calls the LLM. The rest get the cached result — instantly, at zero cost.
How It Works
CacheCore is an OpenAI-compatible proxy. OpenClaw already supports configuring a custom `base_url` for its LLM provider. Pointing that URL at CacheCore is the entire integration.
From OpenClaw's perspective, it's talking to OpenAI. From CacheCore's perspective, it's receiving standard API calls from a tenant. The cache namespace is derived from your tenant identity, permissions, system prompt (SOUL.md content), toolset fingerprint, and policy version — so agents sharing the same role and the same SOUL.md automatically share the same cache namespace.
Shared namespace by design. Two OpenClaw agents with identical SOUL.md files and the same tenant token produce the same namespace hash. A cache hit from agent A is immediately available to agent B — even if they're running in parallel on different tasks. If you want cache isolation between agent roles, give them different SOUL.md content — the namespaces will be separate automatically.
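To make the namespace idea concrete, here is an illustrative sketch of a derivation like the one described above. CacheCore computes its namespaces internally, so the function name, field names, and hashing scheme here are assumptions for illustration only; the point is that identical inputs yield an identical namespace, and a different SOUL.md yields a different one.

```python
import hashlib
import json

SOUL = "You are a document triage agent."


def namespace_key(tenant_token: str, soul_md: str, tools: list, policy_version: str) -> str:
    """Illustrative only -- CacheCore derives namespaces server-side.
    Same tenant + same SOUL.md + same toolset + same policy version
    means the same namespace, so those agents share one cache."""
    fingerprint = json.dumps(
        {
            "tenant": tenant_token,
            "soul": hashlib.sha256(soul_md.encode()).hexdigest(),
            "tools": sorted(tools),  # order-independent toolset fingerprint
            "policy": policy_version,
        },
        sort_keys=True,
    )
    return hashlib.sha256(fingerprint.encode()).hexdigest()[:16]


# Two agents with the same role share a namespace (tool order is irrelevant)...
a = namespace_key("tenant-a", SOUL, ["crm_update", "email_triage"], "v3")
b = namespace_key("tenant-a", SOUL, ["email_triage", "crm_update"], "v3")

# ...while a different SOUL.md isolates the cache automatically.
c = namespace_key("tenant-a", SOUL + "\nYou are a compliance auditor.", ["crm_update"], "v3")
```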
Basic Integration
Open your OpenClaw config file (`~/.openclaw/openclaw.json`) and update the provider settings. Before:

```json
"agent": {
  "model": "openai/gpt-5.4-mini"
}
```

After:

```json
"agent": {
  "model": "openai/gpt-5.4-mini",
  "baseURL": "https://api.cachecore.it/v1",
  "apiKey": "cc_live_a7f3bc12.eyJ..."
}
```
Get your CacheCore token from the portal under Projects → API Tokens. Restart the OpenClaw gateway and you're done. Field names above are as documented at docs.openclaw.ai — verify against your installed version.
That's the full integration for structured agent tasks. Tool calls, document classification, compliance checks, CRM updates — all cached automatically.
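Because the proxy is OpenAI-compatible, the request OpenClaw sends is unchanged; only the base URL differs. A minimal stdlib sketch of that request shape (the token value and message content are placeholders, not real credentials):

```python
import json
from urllib.request import Request


def chat_request(base_url: str, api_key: str, payload: dict) -> Request:
    """Build a standard OpenAI-style chat completion request.
    Swapping base_url between the OpenAI API and api.cachecore.it
    is the whole integration -- body and headers stay identical."""
    return Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "openai/gpt-5.4-mini",
    "messages": [{"role": "user", "content": "Classify this document."}],
}
req = chat_request("https://api.cachecore.it/v1", "cc_live_example", payload)
```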
The Conversational Turn Problem
OpenClaw includes full conversation history in every message payload. If your agent is handling back-and-forth dialogue, the request body changes with every turn — which means a different cache key and no hits.
A long conversation thread could also produce a false L2 semantic hit against a cached response from a different context, since the growing history distorts the embedding away from the actual user intent.
This only affects conversational agents with growing history (e.g. a WhatsApp assistant). Stateless task agents — the most common OpenClaw workload — are not affected.
The best fix is architectural: keep your agents stateless at task boundaries. Instead of accumulating history across user turns, reconstruct context from a knowledge base or structured state on each invocation. This is consistent with how OpenClaw is designed to work for batch automation and matches the request shape that CacheCore caches most effectively.
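The effect is easy to see with a toy exact-match key. The hash below is an illustration, not CacheCore's real L1 key: growing history changes the request body every turn, while a stateless agent that rebuilds context the same way each invocation produces a stable key.

```python
import hashlib
import json

SOUL = "You are a compliance checker."


def l1_key(messages: list) -> str:
    """Toy exact-match (L1) key: a hash of the full message payload,
    so any change in conversation history changes the key."""
    return hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()


# Conversational agent: history grows, so every turn gets a fresh key -> no hits.
turn1 = [{"role": "system", "content": SOUL},
         {"role": "user", "content": "Is force majeure covered?"}]
turn2 = turn1 + [{"role": "assistant", "content": "Yes, clause 12."},
                 {"role": "user", "content": "Is force majeure covered?"}]


# Stateless task agent: context is reconstructed identically on each
# invocation, so the same question produces the same key -> cache hit.
def task(question: str) -> list:
    return [{"role": "system", "content": SOUL},
            {"role": "user", "content": question}]
```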
For agents that genuinely require multi-turn dialogue, accept that those requests will not benefit from caching. CacheCore will proxy them to OpenAI normally and record a MISS in the `X-Cache` response header.
Multi-Agent Workflows
This is where CacheCore delivers the most value in OpenClaw setups.
When you run multiple agents in parallel via `sessions_send`, agents with the same SOUL.md share a cache namespace automatically.
1. Agent A runs first. It checks contract #1 for force majeure. Cache miss: the request is forwarded to OpenAI and the result is stored under the shared namespace.
2. Agent B asks a semantically equivalent question. It checks contract #2: "Is force majeure covered?" L2 recognises the question as equivalent. Cache hit, no LLM call.
3. Agents C through N get the same benefit. Every agent sharing the same role and SOUL.md hits the same namespace. Internal batch tests show 60%+ hit rates.
Verify It's Working
After restarting the gateway, send two identical requests from your OpenClaw agent (or trigger the same task twice), then check the `X-Cache` response header in OpenClaw's verbose output:

```
# First request — cache miss, forwarded to OpenAI
X-Cache: MISS

# Second identical request — L1 exact cache hit
X-Cache: HIT_L1
X-Cache-Age: 0

# Semantically equivalent request — L2 semantic hit
X-Cache: HIT_L2
X-Cache-Similarity: 0.94
```
You can also view hit rates and token savings in real time from the CacheCore portal dashboard.
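If you script the check, a small helper can translate the headers above into a readable status. The header names (`X-Cache`, `X-Cache-Age`, `X-Cache-Similarity`) are taken from this guide; the helper itself is just a convenience sketch.

```python
def cache_status(headers: dict) -> str:
    """Map CacheCore's X-Cache response header to a readable status."""
    value = headers.get("X-Cache", "")
    if value == "HIT_L1":
        return f"exact hit (age {headers.get('X-Cache-Age', '?')}s)"
    if value == "HIT_L2":
        return f"semantic hit (similarity {headers.get('X-Cache-Similarity', '?')})"
    if value == "MISS":
        return "miss, forwarded to OpenAI"
    return "no cache info (request may not have gone through CacheCore)"
```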
Summary
- Basic integration: add `baseURL` and `apiKey` to your `openclaw.json` agent config. No code changes.
- Namespace sharing: agents with the same SOUL.md and tenant token share a cache namespace automatically — no configuration needed.
- Conversational agents: keep agents stateless at task boundaries to get cache hits. Multi-turn dialogue with growing history will not benefit from caching.
- Verification: check the `X-Cache` response header (`HIT_L1`, `HIT_L2`, or `MISS`) or the portal dashboard.
Questions or issues? Open a discussion on GitHub or reach out at hello@cachecore.it.