Auto-tag a content library to a taxonomy
Apply consistent tags to Northwind's untagged articles from a controlled vocabulary.
Published Sep 8, 2025
Northwind’s content library has grown faster than anyone has tagged it. The
articles are all listed in a sheet, but the tags column is half empty — and
the tags that do exist are inconsistent, because three people tagged things
their own way over two years. Filtering the library by topic is unreliable, and
nobody wants to spend a day reading old articles to fix it.
This script tags the library against a fixed taxonomy. You define a short, controlled list of allowed tags; the script sends each untagged article’s title and excerpt to Claude and asks for one to three tags drawn only from that list. Because every article is judged against the same vocabulary, the tags come out consistent — and only blank rows are touched, so existing work is left alone.
What you’ll need
- An articles sheet with a header row including the columns
title,excerpt, andtags. - An Anthropic API key saved as
ANTHROPIC_API_KEYin Script Properties — see Store API keys and secrets securely. - A taxonomy: the short list of tags you want the library classified into,
set in the
TAXONOMYconstant.
The script
// The articles sheet to tag.
const ARTICLES_SHEET_ID = '1abcArticlesId';
// The controlled vocabulary. Claude may only pick from this list —
// keep it short and meaningful so tags stay consistent.
const TAXONOMY = [
'design',
'engineering',
'business',
'process',
'tools',
'people',
];
/**
* Reads every article row, and for any row with an empty tags cell,
* asks Claude to pick 1-3 tags from the taxonomy and writes them back.
*/
function autoTagContent() {
const sheet = SpreadsheetApp.openById(ARTICLES_SHEET_ID).getSheets()[0];
// 1. Read the whole sheet; split the header off from the data rows.
const [header, ...rows] = sheet.getDataRange().getValues();
if (!rows.length) {
Logger.log('No article rows to tag — nothing to do.');
return;
}
// 2. Map header names to column indexes so we can read by name.
const col = Object.fromEntries(header.map((name, i) => [name, i]));
let tagged = 0;
// 3. Walk each row, skipping any that are already tagged.
rows.forEach((row, i) => {
if (row[col.tags]) return; // already has tags — leave it alone
// 4. Build a prompt that pins the answer to the taxonomy.
const prompt =
'Pick 1-3 tags from this list for the article. ' +
'Return comma-separated, tags only, nothing else.\n' +
'Tags: ' + TAXONOMY.join(', ') + '\n\n' +
'Title: ' + row[col.title] + '\n' +
'Excerpt: ' + row[col.excerpt];
// 5. Ask Claude and write the result into the tags cell.
// The +2 accounts for the header row and 1-based row numbers.
const reply = callClaude(prompt);
sheet.getRange(i + 2, col.tags + 1).setValue(reply);
tagged++;
});
Logger.log('Tagged ' + tagged + ' article(s).');
}
/**
* Minimal Anthropic API call. The key lives in Script Properties — it
* is never pasted into the code.
*/
function callClaude(prompt, model = 'claude-haiku-4-5-20251001', maxTokens = 400) {
const key = PropertiesService.getScriptProperties()
.getProperty('ANTHROPIC_API_KEY');
const res = UrlFetchApp.fetch('https://api.anthropic.com/v1/messages', {
method: 'post',
contentType: 'application/json',
headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01' },
payload: JSON.stringify({
model,
max_tokens: maxTokens,
messages: [{ role: 'user', content: prompt }],
}),
muteHttpExceptions: true,
});
return JSON.parse(res.getContentText()).content[0].text.trim();
}
How it works
autoTagContentopens the articles sheet and reads everything in one call, destructuring the first row off asheaderand the rest asrows.- It builds a
collookup from header name to column index, so the rest of the code can sayrow[col.title]rather than relying on a hard-coded position. - It walks each row and skips any where the
tagscell already holds a value — so existing tags, however inconsistent, are never overwritten. - For each untagged row it builds a prompt that lists the taxonomy and tells Claude to return only comma-separated tags from that list. Constraining the vocabulary is what keeps the tagging consistent.
- It calls Claude with the cheap, fast Haiku model — tagging is a simple
classification, not a reasoning task — and writes the reply straight into the
tagscell. Thei + 2offset converts the zero-based loop index into a 1-based sheet row, allowing for the header.
Example run
Two untagged rows in the articles sheet:
| title | excerpt | tags |
|---|---|---|
| Naming things in Figma | A system for layer and component names… | |
| Hiring without a recruiter | How we ran our last three searches in-house… |
After a run:
| title | excerpt | tags |
|---|---|---|
| Naming things in Figma | A system for layer and component names… | design, process, tools |
| Hiring without a recruiter | How we ran our last three searches in-house… | people, process, business |
Every tag is drawn from the six-word taxonomy, so filtering the library by
design or people now returns a complete, reliable set.
Run it
This is a clean-up job you run on demand:
- In the Apps Script editor, select
autoTagContentand click Run. - Approve the authorisation prompt the first time.
- Check the
tagscolumn — only the previously blank rows will have changed.
To re-tag the whole library against a revised taxonomy, clear the tags column
first, then run again.
Watch out for
- The script only tags blank rows. To re-tag everything after changing the
TAXONOMY, clear the existingtagscolumn before running. - Claude is asked for tags only, but a model can still occasionally add a stray
word. If you see one, tighten the prompt — and consider validating each
returned tag against
TAXONOMYbefore writing it back. - One API call per row. A library of a few thousand articles is a few thousand calls — watch your Anthropic usage, and note Apps Script’s 6-minute runtime limit. For a large backlog, tag in batches or run on a trigger that resumes where it left off.
UrlFetchApphas a daily quota (50,000 calls on consumer accounts). A one-off library clean-up is well within it, but a repeatedly re-run job is not free.- Keep the taxonomy short. A list of 30 tags gives the model too many near-synonyms to choose between, and consistency collapses — six to a dozen clear categories works far better.
Related
Build an AI keyword-clustering tool
Group Northwind's tracked search terms into topic clusters — for SEO content planning.
Updated Feb 19, 2026
Build an AI customer-churn predictor
Flag at-risk Northwind accounts from behavioural signals — usage, support tickets, billing.
Updated Feb 15, 2026
Build a context-aware AI data validator
Catch values that look wrong in context — '£10' for a Northwind retainer is suspicious.
Updated Feb 7, 2026
Auto-categorize a photo library
Tag Northwind Drive images by visual content — product, team, event, behind-the-scenes.
Updated Feb 3, 2026
Build an AI bug-triage system
Categorise and prioritise Northwind's reported issues automatically — type, severity, owner.
Updated Jan 22, 2026