Cut AI costs with response caching
Skip repeat Claude calls by caching results in a sheet — same prompt, never billed twice.
Published Dec 13, 2025
When Northwind runs an AI script across a spreadsheet, the same prompt often comes up again and again — the same product description, the same boilerplate question, the same row that gets reprocessed on the next run. Every one of those repeats is a paid API call for an answer you already have.
This is a drop-in cache layer. Instead of calling callClaude directly, your
script calls cachedClaude, which checks two stores before it spends anything:
a fast in-memory cache that lasts six hours, and a persistent sheet that holds
answers forever. Identical prompts are answered instantly and for free; only a
genuinely new prompt reaches the API. It is a wrapper, not a rewrite — the rest
of your code does not change.
What you’ll need
- A
CacheGoogle Sheet — a single empty tab is enough. The script appends rows to it as[hash, answer, timestamp]; you never edit it by hand. - An Anthropic API key saved as
ANTHROPIC_API_KEYin Script Properties — see Store API keys and secrets securely. - Any existing AI script — swap its
callClaude(prompt)calls forcachedClaude(prompt)and nothing else changes.
The cache layer
// The sheet that stores answers permanently, one per row.
const CACHE_SHEET_ID = '1abcCacheId';
// How long the fast in-memory cache holds an answer, in seconds (6 hours).
const MEMO_TTL_SECONDS = 6 * 60 * 60;
/**
* Drop-in replacement for callClaude that checks two caches first.
* Identical prompts are answered for free; only new prompts hit the API.
*/
function cachedClaude(prompt) {
// 1. Hash the prompt — a short, stable key for both cache layers.
const hash = Utilities.base64Encode(
Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, prompt)
);
// 2. Fast layer: the in-memory script cache (lasts MEMO_TTL_SECONDS).
const memo = CacheService.getScriptCache();
const hit = memo.get(hash);
if (hit) return hit;
// 3. Slow layer: the persistent sheet. Find the row for this hash.
const sheet = SpreadsheetApp.openById(CACHE_SHEET_ID).getSheets()[0];
const found = sheet.createTextFinder(hash).findNext();
if (found) {
// 4. Sheet hit — read the answer and warm the fast cache with it.
const cached = sheet.getRange(found.getRow(), 2).getValue();
memo.put(hash, cached, MEMO_TTL_SECONDS);
return cached;
}
// 5. Miss on both layers — pay for one API call, then store the answer
// in the fast cache and append it to the sheet for next time.
const answer = callClaude(prompt);
memo.put(hash, answer, MEMO_TTL_SECONDS);
sheet.appendRow([hash, answer, new Date()]);
return answer;
}
/**
* Minimal Anthropic API call. The key lives in Script Properties — it
* is never pasted into the code.
*/
function callClaude(prompt) {
const key = PropertiesService.getScriptProperties()
.getProperty('ANTHROPIC_API_KEY');
const res = UrlFetchApp.fetch('https://api.anthropic.com/v1/messages', {
method: 'post',
contentType: 'application/json',
headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01' },
payload: JSON.stringify({
model: 'claude-haiku-4-5-20251001',
max_tokens: 400,
messages: [{ role: 'user', content: prompt }],
}),
});
return JSON.parse(res.getContentText()).content[0].text.trim();
}
How it works
cachedClaudehashes the prompt with SHA-1 and base64-encodes it. The hash is a short, fixed-length key — the same prompt always produces the same key, a different prompt almost never collides.- It checks the fast layer first:
CacheServicekeeps answers in memory forMEMO_TTL_SECONDS. A hit here returns instantly with no spreadsheet read. - On a memory miss it checks the persistent layer.
createTextFinderlocates the row whose first cell holds the hash. - If the sheet has the answer, it reads it and writes it back into the fast cache, so the next call in the same six-hour window skips the spreadsheet too.
- Only when both layers miss does it call the API. The fresh answer is put into
the fast cache and appended to the sheet as
[hash, answer, timestamp], so that exact prompt is never billed again. - Because the function signature matches
callClaude, any existing script adopts caching just by changing which function it calls.
Example run
Suppose a script classifies 500 feedback rows and 180 of them repeat an earlier comment word for word.
| Run | API calls without cache | API calls with cachedClaude |
|---|---|---|
| First run (cold cache) | 500 | 320 (180 duplicates served from memory) |
| Same script re-run later | 500 | 0 (every answer in the sheet) |
The first pass already skips the in-run duplicates; a re-run is free entirely,
because every prompt is now sitting in the Cache sheet.
Use it
This is a library, not a stand-alone job — you call it from your own scripts:
- Add the
Cachesheet and setCACHE_SHEET_IDto its ID. - Find every
callClaude(prompt)in your AI scripts and change it tocachedClaude(prompt). - Run your script as normal. The first run fills the cache; later runs read from it.
To inspect what is cached, open the Cache sheet — each row shows the hash, the
stored answer and when it was first saved.
Watch out for
- Cache only deterministic prompts. If a prompt embeds today’s date, a random ID or a timestamp, every call produces a new hash and nothing is ever reused. Strip the volatile bits before caching.
- The cache never expires on its own. The sheet keeps answers forever, which is the point — but if a prompt’s correct answer can change over time, you must delete its row by hand to force a refresh.
CacheServicevalues are capped at 100 KB and the whole script cache at around 10 MB. Very long answers will not fit the fast layer — they still cache in the sheet, just without the in-memory speed-up.createTextFinderscans the sheet on every memory miss. That stays quick into the low thousands of rows; past that, consider keying off a second sheet or archiving old entries.- Two different prompts could in theory share a SHA-1 hash. The odds are vanishingly small for normal text, but it is the reason to hash the full prompt and never a truncated version.
Related
Generate and test email subject lines
A/B test AI-written Northwind subject lines for open rate — outputs ranked by past performance.
Updated Mar 3, 2026
Build retrieval-augmented Q&A over your data
Answer Northwind questions grounded in your own Sheet data — pass relevant rows as context.
Updated Feb 27, 2026
Build an AI weekly-report narrator
Turn Northwind metrics into a written executive summary — numbers in, prose out.
Updated Feb 23, 2026
Build a multi-step AI agent workflow
Chain Claude prompts to complete a Northwind task end to end — research → draft → critique → finalise.
Updated Feb 11, 2026
Adapt marketing copy per region
Localise Northwind tone and references by market with AI — same message, regional flavour.
Updated Jan 30, 2026