Audit Docs for word count and readability

Report the length and readability score of every Northwind Doc in a single audit Sheet.

Published Nov 23, 2025

Northwind keeps a folder of client-facing Docs — proposals, guides, case studies — and nobody has a clear view of how they compare. Some are sprawling and dense, others are thin. Opening each one to eyeball its length and tone is the kind of audit that gets put off indefinitely.

This script does the audit in one pass. It walks every Doc in a folder, counts the words and sentences, scores each one with a readability formula, and writes a row per Doc to an audit Sheet. The result is a single sortable table showing which Docs are too long, too dense, or just right.

What you’ll need

  • A Drive folder containing the Google Docs you want to audit. The script only reads files, so it is safe to point at a live folder.
  • A Google Sheet to receive the audit. The script clears the first tab and rewrites it on every run, so use a Sheet dedicated to this report.
  • The folder ID and the Sheet ID, both taken from their URLs — pass them in as arguments when you run the function.
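
If copying the IDs out of the URLs by hand feels error-prone, a small helper can do it. This is a minimal sketch, not part of the audit script: extractDriveId is a hypothetical name, and it assumes the usual URL shapes, where a folder URL contains /folders/<id> and a Sheet URL contains /d/<id>.

```javascript
// Hypothetical helper: pulls the ID out of a Drive folder URL or a
// Google Sheets URL. Returns null if no ID-like segment is found.
function extractDriveId(url) {
  // Assumed shapes: .../drive/folders/<id> and .../spreadsheets/d/<id>/edit.
  // IDs are made of letters, digits, '-' and '_'.
  const match = url.match(/\/(?:folders|d)\/([A-Za-z0-9_-]+)/);
  return match ? match[1] : null;
}

// Example with placeholder IDs:
const folderId = extractDriveId('https://drive.google.com/drive/folders/1abcFolderId?usp=sharing');
const sheetId = extractDriveId('https://docs.google.com/spreadsheets/d/1abcAuditSheetId/edit#gid=0');
console.log(folderId, sheetId);
```

Anything after the ID, such as ?usp=sharing or /edit#gid=0, is ignored.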

The script

// Flesch Reading Ease constants. The full formula also subtracts
// 84.6 x (syllables per word); this script keeps only the
// words-per-sentence term, so it needs no syllable counting. Scores are
// therefore relative and sit well above the standard 0-100 Flesch scale.
const FLESCH_BASE = 206.835;
const FLESCH_SENTENCE_WEIGHT = 1.015;

/**
 * Audits every Google Doc in a folder and writes a length and
 * readability report to the first tab of a Sheet.
 *
 * @param {string} folderId  ID of the Drive folder to scan.
 * @param {string} sheetId   ID of the Sheet to write the report into.
 */
function auditDocsInFolder(folderId, sheetId) {
  // 1. Grab an iterator over the Google Docs in the folder.
  const files = DriveApp.getFolderById(folderId).getFilesByType(MimeType.GOOGLE_DOCS);

  // 2. Walk each Doc, measuring length and scoring readability.
  const rows = [];
  while (files.hasNext()) {
    const file = files.next();
    const text = DocumentApp.openById(file.getId()).getBody().getText();

    // Count words as runs of non-whitespace characters.
    const words = (text.match(/\S+/g) || []).length;

    // Count sentences as runs of terminal punctuation. Guard against
    // zero so the division below never blows up.
    const sentences = (text.match(/[.!?]+/g) || []).length || 1;

    // Higher score = shorter sentences = easier to read. Without the
    // syllable term, compare Docs with each other, not with Flesch bands.
    const readability = FLESCH_BASE - FLESCH_SENTENCE_WEIGHT * (words / sentences);

    rows.push([file.getName(), words, sentences, Math.round(readability), file.getUrl()]);
  }

  // 3. Rebuild the audit Sheet from scratch with a fresh header.
  const sheet = SpreadsheetApp.openById(sheetId).getSheets()[0];
  sheet.clear();
  sheet.getRange(1, 1, 1, 5)
    .setValues([['Doc', 'Words', 'Sentences', 'Readability', 'Link']]);

  // 4. Write one row per Doc beneath the header, if there were any.
  if (rows.length) {
    sheet.getRange(2, 1, rows.length, 5).setValues(rows);
  }
  Logger.log('Audited ' + rows.length + ' Doc(s).');
}

How it works

  1. auditDocsInFolder opens the folder and gets an iterator filtered to Google Docs only, so other file types in the folder are ignored.
  2. For each Doc it pulls the full body text in one call, then derives three numbers: word count from runs of non-whitespace, sentence count from runs of terminal punctuation (., !, ?), and a readability score.
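  3. The score is a simplified Flesch Reading Ease: FLESCH_BASE minus FLESCH_SENTENCE_WEIGHT times the average sentence length. A higher number means shorter sentences and easier reading. Because the syllable term is left out, scores land well above the standard Flesch scale (where roughly 60-70 is plain English), so use them to rank Docs against each other rather than against published thresholds.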
  4. Each Doc becomes a five-column row, including a clickable link back to the file so you can jump straight to anything that scores badly.
  5. The audit Sheet is cleared and rebuilt with a fresh header on every run, so the table always reflects the current contents of the folder.
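
The measuring step can be tried on a plain string before pointing the script at a folder. The sketch below repeats the loop body's three calculations in a hypothetical helper, measureText, which is not part of the script above. With only the sentence-length term, a short sample lands at 204, far above the usual 0-100 Flesch range.

```javascript
// Hypothetical helper: the same measurements as the loop body in
// auditDocsInFolder, applied to a plain string instead of a Doc.
function measureText(text) {
  const FLESCH_BASE = 206.835;
  const FLESCH_SENTENCE_WEIGHT = 1.015;

  // Words are runs of non-whitespace characters.
  const words = (text.match(/\S+/g) || []).length;

  // Sentences are runs of terminal punctuation; fall back to 1 so the
  // division below is always defined.
  const sentences = (text.match(/[.!?]+/g) || []).length || 1;

  const readability = FLESCH_BASE - FLESCH_SENTENCE_WEIGHT * (words / sentences);
  return { words, sentences, readability: Math.round(readability) };
}

const sample = 'Northwind ships fast. Orders arrive in two days! Need help? Ask us.';
console.log(measureText(sample)); // { words: 12, sentences: 4, readability: 204 }
```

Twelve words over four sentences gives an average of 3, so the score is 206.835 - 1.015 × 3 = 203.79, which rounds to 204.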

Example run

Pointed at a folder of four client Docs, the audit Sheet comes back like this:

Doc                        Words   Sentences   Readability   Link
Onboarding guide           1,240   96          194           (link)
Q1 proposal — Harbour Co   3,810   142         180           (link)
Service overview           620     51          194           (link)
Case study — Lumen Ltd     2,050   110         188           (link)

Sort by Readability and the Q1 proposal jumps out at 180, the lowest score: at about 27 words per sentence, it is hard going. The service overview and the onboarding guide, both at 194, average 12 to 13 words per sentence and read cleanly. That is the cue for where editing effort should go.

Run it

This is an on-demand audit, not a scheduled one. Run it from the editor:

  1. Add a small wrapper that passes your real IDs, since the editor cannot call a function with arguments directly:
function runAudit() {
  auditDocsInFolder('1abcFolderId', '1abcAuditSheetId');
}
  2. Select runAudit and click Run.
  3. Approve the Drive, Docs, and Sheets authorisation prompt the first time.
  4. Open the audit Sheet to read the report.

Watch out for

  • The readability score is a simplification. The true Flesch formula also subtracts 84.6 times the average syllables per word; this version uses only sentence length, so it ignores word difficulty and its scores land well above the standard scale. Treat it as a relative guide for ranking Docs, not an exact grade.
  • Sentence counting is naive. Every run of terminal punctuation counts, so “e.g.” adds two false sentence breaks and “Inc.” adds one, as do decimals and URLs. That slightly inflates the readability score for Docs full of them.
  • Word counting includes everything in the body — headings, captions, table cells — but not text inside the header, footer, or footnotes, which getBody().getText() does not return.
  • Opening every Doc in a large folder is slow. A few hundred Docs can push the run past the six-minute execution limit; audit in sub-folders if you hit it.
  • The audit Sheet is cleared on every run. Keep this report in its own Sheet so nothing else gets wiped.
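
If relative scores are not enough, the missing syllable term can be approximated. The sketch below is one possible approach, not the method the script uses: estimateSyllables and fullFlesch are hypothetical names, and the syllable count is a rough vowel-group heuristic for English that will miscount some words. The weights 206.835, 1.015, and 84.6 are those of the standard Flesch Reading Ease formula.

```javascript
// Hypothetical helper: rough English syllable estimate based on groups
// of consecutive vowels. A heuristic, not a dictionary lookup.
function estimateSyllables(word) {
  const cleaned = word.toLowerCase().replace(/[^a-z]/g, '');
  if (!cleaned) return 0; // token had no letters (e.g. a lone dash)

  // Drop a trailing silent 'e' ("guide" -> "guid") but keep "-le"
  // endings, where the 'e' usually marks a syllable ("table").
  const silentE = cleaned.length > 2 && cleaned.endsWith('e') && !cleaned.endsWith('le');
  const trimmed = silentE ? cleaned.slice(0, -1) : cleaned;

  const groups = trimmed.match(/[aeiouy]+/g) || [];
  return Math.max(1, groups.length);
}

// Hypothetical helper: Flesch Reading Ease with both terms, using the
// same word and sentence counting as the audit script.
function fullFlesch(text) {
  const tokens = text.match(/\S+/g) || [];
  const words = tokens.length || 1; // avoid dividing by zero on empty text
  const sentences = (text.match(/[.!?]+/g) || []).length || 1;
  const syllables = tokens.reduce((sum, t) => sum + estimateSyllables(t), 0);
  return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words);
}

console.log(fullFlesch('The cat sat.').toFixed(2)); // 119.19
```

Scores from this version sit on the familiar scale, where roughly 60-70 is plain English, at the cost of one extra pass over every word in each Doc.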
