appscript.dev
Automation Intermediate Sheets

Auto-tag a content library to a taxonomy

Apply consistent tags to Northwind's untagged articles from a controlled vocabulary.

Published Sep 8, 2025

Northwind’s content library has grown faster than anyone has tagged it. The articles are all listed in a sheet, but the tags column is half empty — and the tags that do exist are inconsistent, because three people tagged things their own way over two years. Filtering the library by topic is unreliable, and nobody wants to spend a day reading old articles to fix it.

This script tags the library against a fixed taxonomy. You define a short, controlled list of allowed tags; the script sends each untagged article’s title and excerpt to Claude and asks for one to three tags drawn only from that list. Because every article is judged against the same vocabulary, the tags come out consistent — and only blank rows are touched, so existing work is left alone.

What you’ll need

  • An articles sheet with a header row including the columns title, excerpt, and tags.
  • An Anthropic API key saved as ANTHROPIC_API_KEY in Script Properties — see Store API keys and secrets securely.
  • A taxonomy: the short list of tags you want the library classified into, set in the TAXONOMY constant.

The script

// The articles sheet to tag.
const ARTICLES_SHEET_ID = '1abcArticlesId';

// The controlled vocabulary. Claude may only pick from this list —
// keep it short and meaningful so tags stay consistent.
const TAXONOMY = [
  'design',
  'engineering',
  'business',
  'process',
  'tools',
  'people',
];

/**
 * Reads every article row, and for any row with an empty tags cell,
 * asks Claude to pick 1-3 tags from the taxonomy and writes them back.
 */
function autoTagContent() {
  const sheet = SpreadsheetApp.openById(ARTICLES_SHEET_ID).getSheets()[0];

  // 1. Read the whole sheet; split the header off from the data rows.
  const [header, ...rows] = sheet.getDataRange().getValues();
  if (!rows.length) {
    Logger.log('No article rows to tag — nothing to do.');
    return;
  }

  // 2. Map header names to column indexes so we can read by name.
  const col = Object.fromEntries(header.map((name, i) => [name, i]));

  let tagged = 0;

  // 3. Walk each row, skipping any that are already tagged.
  rows.forEach((row, i) => {
    if (row[col.tags]) return; // already has tags — leave it alone

    // 4. Build a prompt that pins the answer to the taxonomy.
    const prompt =
      'Pick 1-3 tags from this list for the article. ' +
      'Return comma-separated, tags only, nothing else.\n' +
      'Tags: ' + TAXONOMY.join(', ') + '\n\n' +
      'Title: ' + row[col.title] + '\n' +
      'Excerpt: ' + row[col.excerpt];

    // 5. Ask Claude and write the result into the tags cell.
    //    The +2 accounts for the header row and 1-based row numbers.
    const reply = callClaude(prompt);
    sheet.getRange(i + 2, col.tags + 1).setValue(reply);
    tagged++;
  });

  Logger.log('Tagged ' + tagged + ' article(s).');
}

/**
 * Minimal Anthropic API call. The key lives in Script Properties — it
 * is never pasted into the code.
 */
function callClaude(prompt, model = 'claude-haiku-4-5-20251001', maxTokens = 400) {
  const key = PropertiesService.getScriptProperties()
    .getProperty('ANTHROPIC_API_KEY');
  const res = UrlFetchApp.fetch('https://api.anthropic.com/v1/messages', {
    method: 'post',
    contentType: 'application/json',
    headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01' },
    payload: JSON.stringify({
      model,
      max_tokens: maxTokens,
      messages: [{ role: 'user', content: prompt }],
    }),
    muteHttpExceptions: true,
  });
  return JSON.parse(res.getContentText()).content[0].text.trim();
}

How it works

  1. autoTagContent opens the articles sheet and reads everything in one call, destructuring the first row off as header and the rest as rows.
  2. It builds a col lookup from header name to column index, so the rest of the code can say row[col.title] rather than relying on a hard-coded position.
  3. It walks each row and skips any where the tags cell already holds a value — so existing tags, however inconsistent, are never overwritten.
  4. For each untagged row it builds a prompt that lists the taxonomy and tells Claude to return only comma-separated tags from that list. Constraining the vocabulary is what keeps the tagging consistent.
  5. It calls Claude with the cheap, fast Haiku model — tagging is a simple classification, not a reasoning task — and writes the reply straight into the tags cell. The i + 2 offset converts the zero-based loop index into a 1-based sheet row, allowing for the header.

Example run

Two untagged rows in the articles sheet:

titleexcerpttags
Naming things in FigmaA system for layer and component names…
Hiring without a recruiterHow we ran our last three searches in-house…

After a run:

titleexcerpttags
Naming things in FigmaA system for layer and component names…design, process, tools
Hiring without a recruiterHow we ran our last three searches in-house…people, process, business

Every tag is drawn from the six-word taxonomy, so filtering the library by design or people now returns a complete, reliable set.

Run it

This is a clean-up job you run on demand:

  1. In the Apps Script editor, select autoTagContent and click Run.
  2. Approve the authorisation prompt the first time.
  3. Check the tags column — only the previously blank rows will have changed.

To re-tag the whole library against a revised taxonomy, clear the tags column first, then run again.

Watch out for

  • The script only tags blank rows. To re-tag everything after changing the TAXONOMY, clear the existing tags column before running.
  • Claude is asked for tags only, but a model can still occasionally add a stray word. If you see one, tighten the prompt — and consider validating each returned tag against TAXONOMY before writing it back.
  • One API call per row. A library of a few thousand articles is a few thousand calls — watch your Anthropic usage, and note Apps Script’s 6-minute runtime limit. For a large backlog, tag in batches or run on a trigger that resumes where it left off.
  • UrlFetchApp has a daily quota (50,000 calls on consumer accounts). A one-off library clean-up is well within it, but a repeatedly re-run job is not free.
  • Keep the taxonomy short. A list of 30 tags gives the model too many near-synonyms to choose between, and consistency collapses — six to a dozen clear categories works far better.

Related