Cut AI costs with response caching

When Northwind runs an AI script across a spreadsheet, the same prompt often comes up again and again — the same product description, the same boilerplate question, the same row that gets reprocessed on the next run. Every one of those repeats is a paid API call for an answer you already have.

This is a drop-in cache layer. Instead of calling callClaude directly, your script calls cachedClaude, which checks two stores before it spends anything: a fast in-memory cache that lasts six hours, and a persistent sheet that holds answers forever. Identical prompts are answered instantly and for free; only a genuinely new prompt reaches the API. It is a wrapper, not a rewrite — the rest of your code does not change.

What you’ll need

A Cache Google Sheet — a single empty tab is enough. The script appends rows to it as [hash, answer, timestamp]; you never edit it by hand.
An Anthropic API key saved as ANTHROPIC_API_KEY in Script Properties — see Store API keys and secrets securely.
Any existing AI script — swap its callClaude(prompt) calls for cachedClaude(prompt) and nothing else changes.

The cache layer

// The sheet that stores answers permanently, one per row.
const CACHE_SHEET_ID = '1abcCacheId';

// How long the fast in-memory cache holds an answer, in seconds (6 hours).
const MEMO_TTL_SECONDS = 6 * 60 * 60;

/**
 * Drop-in replacement for callClaude that checks two caches first.
 * Identical prompts are answered for free; only new prompts hit the API.
 */
function cachedClaude(prompt) {
  // 1. Hash the prompt — a short, stable key for both cache layers.
  const hash = Utilities.base64Encode(
    Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, prompt)
  );

  // 2. Fast layer: the in-memory script cache (lasts MEMO_TTL_SECONDS).
  const memo = CacheService.getScriptCache();
  const hit = memo.get(hash);
  if (hit) return hit;

  // 3. Slow layer: the persistent sheet. Find the row for this hash.
  const sheet = SpreadsheetApp.openById(CACHE_SHEET_ID).getSheets()[0];
  const found = sheet.createTextFinder(hash).findNext();
  if (found) {
    // 4. Sheet hit — read the answer and warm the fast cache with it.
    const cached = sheet.getRange(found.getRow(), 2).getValue();
    memo.put(hash, cached, MEMO_TTL_SECONDS);
    return cached;
  }

  // 5. Miss on both layers — pay for one API call, then store the answer
  //    in the fast cache and append it to the sheet for next time.
  const answer = callClaude(prompt);
  memo.put(hash, answer, MEMO_TTL_SECONDS);
  sheet.appendRow([hash, answer, new Date()]);
  return answer;
}

/**
 * Minimal Anthropic API call. The key lives in Script Properties — it
 * is never pasted into the code.
 */
function callClaude(prompt) {
  const key = PropertiesService.getScriptProperties()
    .getProperty('ANTHROPIC_API_KEY');
  const res = UrlFetchApp.fetch('https://api.anthropic.com/v1/messages', {
    method: 'post',
    contentType: 'application/json',
    headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01' },
    payload: JSON.stringify({
      model: 'claude-haiku-4-5-20251001',
      max_tokens: 400,
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  return JSON.parse(res.getContentText()).content[0].text.trim();
}

How it works

cachedClaude hashes the prompt with SHA-1 and base64-encodes it. The hash is a short, fixed-length key — the same prompt always produces the same key, a different prompt almost never collides.
It checks the fast layer first: CacheService keeps answers in memory for MEMO_TTL_SECONDS. A hit here returns instantly with no spreadsheet read.
On a memory miss it checks the persistent layer. createTextFinder locates the row whose first cell holds the hash.
If the sheet has the answer, it reads it and writes it back into the fast cache, so the next call in the same six-hour window skips the spreadsheet too.
Only when both layers miss does it call the API. The fresh answer is put into the fast cache and appended to the sheet as [hash, answer, timestamp], so that exact prompt is never billed again.
Because the function signature matches callClaude, any existing script adopts caching just by changing which function it calls.

Example run

Suppose a script classifies 500 feedback rows and 180 of them repeat an earlier comment word for word.

Run	API calls without cache	API calls with `cachedClaude`
First run (cold cache)	500	320 (180 duplicates served from memory)
Same script re-run later	500	0 (every answer in the sheet)

The first pass already skips the in-run duplicates; a re-run is free entirely, because every prompt is now sitting in the Cache sheet.

Use it

This is a library, not a stand-alone job — you call it from your own scripts:

Add the Cache sheet and set CACHE_SHEET_ID to its ID.
Find every callClaude(prompt) in your AI scripts and change it to cachedClaude(prompt).
Run your script as normal. The first run fills the cache; later runs read from it.

To inspect what is cached, open the Cache sheet — each row shows the hash, the stored answer and when it was first saved.

Watch out for

Cache only deterministic prompts. If a prompt embeds today’s date, a random ID or a timestamp, every call produces a new hash and nothing is ever reused. Strip the volatile bits before caching.
The cache never expires on its own. The sheet keeps answers forever, which is the point — but if a prompt’s correct answer can change over time, you must delete its row by hand to force a refresh.
CacheService values are capped at 100 KB and the whole script cache at around 10 MB. Very long answers will not fit the fast layer — they still cache in the sheet, just without the in-memory speed-up.
createTextFinder scans the sheet on every memory miss. That stays quick into the low thousands of rows; past that, consider keying off a second sheet or archiving old entries.
Two different prompts could in theory share a SHA-1 hash. The odds are vanishingly small for normal text, but it is the reason to hash the full prompt and never a truncated version.

Cut AI costs with response caching

What you’ll need

The cache layer

How it works

Example run

Use it

Watch out for

Related

Generate and test email subject lines

Build retrieval-augmented Q&A over your data

Build an AI weekly-report narrator

Build a multi-step AI agent workflow

Adapt marketing copy per region