Build retrieval-augmented Q&A over your data
Answer Northwind questions grounded in your own Sheet data — pass relevant rows as context.
Published Feb 27, 2026
Northwind keeps a lot of useful facts in spreadsheets — pricing tiers, project histories, supplier contacts, policy notes. The trouble is that nobody can remember which tab holds what, and a general-purpose chatbot will happily invent an answer rather than admit it does not know. What you want is an assistant that answers only from your own data and says so plainly when the data falls short.
That pattern is called retrieval-augmented generation: instead of asking the model to recall facts, you retrieve the rows that look relevant and hand them over as context. This script does the simplest honest version of it — a keyword match to pull candidate rows, then Claude to read those rows and answer. No vector database, no embeddings; just a Sheet and a prompt that forbids guessing.
What you’ll need
- A Google Sheet acting as your knowledge base, with a header row and one fact per row. Column names can be anything — the script reads them all.
- An Anthropic API key saved as ANTHROPIC_API_KEY in Script Properties (see Store API keys and secrets securely).
- Nothing else. Retrieval happens in plain JavaScript.
The script
// The Sheet that holds your knowledge base, one fact per row.
const KNOWLEDGE_SHEET_ID = '1abcKnowledgeId';

// How many matching rows to pass to Claude as context. Enough to be
// useful, capped so the prompt stays within a sensible token budget.
const MAX_CONTEXT_ROWS = 20;

/**
 * Answers a question using only the Northwind knowledge Sheet.
 * Retrieves candidate rows by keyword, then asks Claude to answer
 * from that context alone.
 */
function answer(question) {
  // 1. Bail out early on an empty question.
  if (!question || !question.trim()) {
    return 'Ask a question and I will look it up.';
  }

  // 2. Load every row of the knowledge base as keyed objects.
  const data = readSheet(KNOWLEDGE_SHEET_ID);
  if (!data.length) {
    return 'The knowledge base is empty — nothing to search.';
  }

  // 3. Naive retrieval: split the question into words, keep any row
  //    whose serialised text contains at least one of them.
  const words = question.toLowerCase().split(/\W+/).filter(Boolean);
  const relevant = data
    .filter((r) => words.some((w) => JSON.stringify(r).toLowerCase().includes(w)))
    .slice(0, MAX_CONTEXT_ROWS);

  // 4. If nothing matched, do not waste an API call.
  if (!relevant.length) {
    return 'No rows in the knowledge base mention that — needs a human.';
  }

  // 5. Build a grounded prompt: the data is the only allowed source.
  const prompt =
    'Answer this question for Northwind using ONLY the data below. ' +
    'If the data does not cover it, say so.\n\nData:\n' +
    JSON.stringify(relevant) + '\n\nQuestion: ' + question;

  // 6. Sonnet reads the rows and writes the answer.
  return callClaude(prompt, 'claude-sonnet-4-6', 600);
}

/**
 * Reads the first tab of a Sheet and returns each row as an object
 * keyed by the header cells.
 */
function readSheet(id) {
  const [h, ...rows] = SpreadsheetApp.openById(id)
    .getSheets()[0]
    .getDataRange()
    .getValues();
  return rows.map((r) => Object.fromEntries(h.map((k, i) => [k, r[i]])));
}
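/**
 * Minimal Anthropic API call. The key lives in Script Properties; it
 * is never pasted into the code.
 */
function callClaude(prompt, model = 'claude-haiku-4-5-20251001', maxTokens = 400) {
  const key = PropertiesService.getScriptProperties()
    .getProperty('ANTHROPIC_API_KEY');
  if (!key) {
    throw new Error('ANTHROPIC_API_KEY is not set in Script Properties.');
  }
  const res = UrlFetchApp.fetch('https://api.anthropic.com/v1/messages', {
    method: 'post',
    contentType: 'application/json',
    headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01' },
    payload: JSON.stringify({
      model,
      max_tokens: maxTokens,
      messages: [{ role: 'user', content: prompt }],
    }),
    muteHttpExceptions: true,
  });
  // muteHttpExceptions stops fetch from throwing on 4xx/5xx responses, so
  // check the status here: an error body has no content array to read.
  if (res.getResponseCode() !== 200) {
    throw new Error(
      'Anthropic API error ' + res.getResponseCode() + ': ' + res.getContentText());
  }
  return JSON.parse(res.getContentText()).content[0].text.trim();
}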
How it works
- The answer function first checks that the question is not blank: an empty string gets a friendly nudge instead of a pointless API call.
- readSheet loads the whole knowledge base and turns each row into an object keyed by the header, so a row reads like { topic: 'Refunds', detail: '...' }.
- Retrieval is deliberately simple: the question is split into words, and any row whose serialised JSON contains one of those words as a substring is kept. The serialised text includes the header names, so those can match too. It is crude, but for a few hundred rows it is fast and good enough.
- The matches are capped at MAX_CONTEXT_ROWS so the prompt never balloons, and if nothing matched at all the script returns early.
- The prompt hands Claude the matched rows as the only permitted source and instructs it to admit when the data does not cover the question.
- Claude Sonnet reads the context and returns a grounded answer, one that draws on your data rather than its training.
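The retrieval and prompt steps in the list above can be tried outside Apps Script as plain JavaScript. This is a minimal sketch under assumptions: the helper names toObjects, retrieve and buildPrompt are hypothetical (the script does this work inline inside readSheet and answer), and the two sample rows stand in for a real Sheet.

```javascript
// Sample data in the shape getValues() returns: a header row followed
// by one fact per row. These rows are illustrative only.
const values = [
  ['topic', 'detail'],
  ['Rush delivery', 'Rush orders ship next business day for a 15% surcharge.'],
  ['Warranty', 'Hardware carries a 2-year warranty; software is sold as-is.'],
];

// Same conversion as readSheet: each row becomes an object keyed by header.
function toObjects(values) {
  const [h, ...rows] = values;
  return rows.map((r) => Object.fromEntries(h.map((k, i) => [k, r[i]])));
}

// Same filter as step 3 of answer: keep rows whose JSON text contains at
// least one word of the question, capped at maxRows.
function retrieve(data, question, maxRows) {
  const words = question.toLowerCase().split(/\W+/).filter(Boolean);
  return data
    .filter((r) => words.some((w) => JSON.stringify(r).toLowerCase().includes(w)))
    .slice(0, maxRows);
}

// Same prompt text as step 5 of answer.
function buildPrompt(relevant, question) {
  return (
    'Answer this question for Northwind using ONLY the data below. ' +
    'If the data does not cover it, say so.\n\nData:\n' +
    JSON.stringify(relevant) + '\n\nQuestion: ' + question
  );
}

const data = toObjects(values);
const relevant = retrieve(data, 'warranty length', 20);
console.log(relevant.map((r) => r.topic)); // only the Warranty row matches
console.log(buildPrompt(relevant, 'warranty length'));
```

Printing the prompt this way is a cheap check on what Claude will actually see before any API call is spent.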
Example run
Say the knowledge Sheet holds rows like these:
| topic | detail |
|---|---|
| Refund window | Northwind refunds within 30 days of delivery, no questions asked. |
| Rush delivery | Rush orders ship next business day for a 15% surcharge. |
| Warranty | Hardware carries a 2-year warranty; software is sold as-is. |
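Calling answer('how long do I have to return something?') matches all three
rows, because short words such as “I” and “to” occur inside almost any text.
Claude reads them, finds the answer in the “Refund window” row, and returns
something like:
Northwind accepts returns within 30 days of delivery, with no questions asked.
Calling answer('do you ship internationally?') also retrieves rows (“ship”
appears in the “Rush delivery” row), but none of them covers international
shipping, so Claude should reply that the data does not cover it. The early
return, “No rows in the knowledge base mention that — needs a human.”, fires
only when no word of the question appears anywhere in the Sheet. Either way the
assistant declines instead of guessing, which is exactly the behaviour you want
from a grounded assistant.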
Run it
This is an on-demand function — call it whenever you have a question:
- In the Apps Script editor, add a small wrapper function that calls answer('your question here') and logs the result, then run the wrapper. The editor cannot pass arguments to a function it runs directly.
- Approve the authorisation prompt the first time.
- Read the logged answer in the execution log.
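A wrapper for the first step might look like the sketch below. The name testAnswer and the sample question are hypothetical, and the sketch assumes answer from the script above lives in the same project. Logger.log writes to the execution log, which is where the answer shows up.

```javascript
// Hypothetical wrapper: pick testAnswer in the editor's function
// dropdown and press Run. Assumes answer() from the script above is
// defined in the same Apps Script project.
function testAnswer() {
  const question = 'what is the surcharge for rush delivery?';
  const reply = answer(question);
  // A function's return value is not displayed by the editor, so log it.
  Logger.log('Q: ' + question + '\nA: ' + reply);
  return reply;
}
```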
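To make it usable from a Sheet, wrap it as a custom function so colleagues can
type =ASK("when do refunds expire?") in a cell. Keep in mind that custom
functions run without user authorisation and must return within 30 seconds.
They can call UrlFetchApp and read Script Properties, but they cannot open
another spreadsheet with SpreadsheetApp.openById, which readSheet relies on. A
custom-function version therefore has to live in a script bound to the
knowledge Sheet and read it through the active spreadsheet. Test the plain
answer first.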
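One possible shape for that custom function is sketched below. It assumes the script is bound to the knowledge Sheet (opened from Extensions > Apps Script inside that Sheet), so it reads the active spreadsheet instead of opening one by ID, and that callClaude from the script above is in the same project. ASK and readActiveSheet are hypothetical names, and the 10-row cap and the "no match" message are arbitrary choices.

```javascript
// Hypothetical variant of readSheet for a script bound to the knowledge
// Sheet: reads the first tab of the active spreadsheet.
function readActiveSheet() {
  const [h, ...rows] = SpreadsheetApp.getActiveSpreadsheet()
    .getSheets()[0]
    .getDataRange()
    .getValues();
  return rows.map((r) => Object.fromEntries(h.map((k, i) => [k, r[i]])));
}

/**
 * Answers a question from the knowledge Sheet.
 * @param {string} question The question to answer.
 * @return {string} The grounded answer.
 * @customfunction
 */
function ASK(question) {
  const text = String(question || '').trim();
  if (!text) return 'Ask a question and I will look it up.';

  // Same naive keyword match as answer(), with a smaller cap.
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const relevant = readActiveSheet()
    .filter((r) => words.some((w) => JSON.stringify(r).toLowerCase().includes(w)))
    .slice(0, 10);
  if (!relevant.length) return 'No matching rows in the knowledge base.';

  const prompt =
    'Answer this question for Northwind using ONLY the data below. ' +
    'If the data does not cover it, say so.\n\nData:\n' +
    JSON.stringify(relevant) + '\n\nQuestion: ' + text;
  return callClaude(prompt, 'claude-sonnet-4-6', 600);
}
```

It is probably safest to put the =ASK(...) cells on a tab other than the first, so the formula is not read back as part of the knowledge base.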
Watch out for
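- Keyword retrieval is literal. “Return” will not match a row that only says “refund”, so a question and its facts can miss each other. It is also a substring match, so short words such as “to” or “I” hit almost every row and dilute the context. If recall matters, add synonyms to your rows or move to embeddings-based retrieval.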
- The whole knowledge base is read on every call. That is fine for hundreds of rows; for thousands, cache the parsed data or narrow the search first.
- Grounding depends on the prompt holding. Claude is told to use only the supplied data — keep that instruction, and never loosen it to “be helpful”, or it will start filling gaps from memory.
- Stale rows produce confident wrong answers. The script trusts the Sheet completely, so a knowledge base is only as good as its last edit.
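The first caveat can be softened without leaving plain JavaScript. The sketch below is one possible variation, not what the script above does: it drops very short and common words, compares whole words against cell values only, and expands the question through a small synonym map. The names STOP_WORDS, SYNONYMS, keywords and retrieveStrict, and the word lists themselves, are assumptions to adapt to your own data.

```javascript
// Hypothetical word lists; tune them to your own Sheet.
const STOP_WORDS = new Set(['the', 'and', 'for', 'you', 'how', 'have', 'does', 'what', 'something']);
const SYNONYMS = new Map([['return', ['refund', 'refunds']]]);

// Split a question into search terms: lower-case, drop stop words and
// words shorter than three letters, then add any listed synonyms.
function keywords(question) {
  const words = question.toLowerCase().split(/\W+/)
    .filter((w) => w.length >= 3 && !STOP_WORDS.has(w));
  const expanded = words.flatMap((w) => [w, ...(SYNONYMS.get(w) || [])]);
  return [...new Set(expanded)];
}

// Keep rows where some cell value contains a search term as a whole word.
// Header names are ignored, so a column called "detail" matches nothing.
function retrieveStrict(data, question, maxRows) {
  const terms = new Set(keywords(question));
  return data
    .filter((row) =>
      Object.values(row).some((cell) =>
        String(cell).toLowerCase().split(/\W+/).some((w) => terms.has(w))))
    .slice(0, maxRows);
}

const rows = [
  { topic: 'Refund window', detail: 'Northwind refunds within 30 days of delivery, no questions asked.' },
  { topic: 'Rush delivery', detail: 'Rush orders ship next business day for a 15% surcharge.' },
  { topic: 'Warranty', detail: 'Hardware carries a 2-year warranty; software is sold as-is.' },
];

const q = 'how long do I have to return something?';
console.log(keywords(q)); // [ 'long', 'return', 'refund', 'refunds' ]
console.log(retrieveStrict(rows, q, 20).map((r) => r.topic)); // [ 'Refund window' ]
```

The trade-off is maintenance: a stop list and synonym map have to be kept in step with the Sheet, which is the point where embeddings-based retrieval starts to look attractive.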