appscript.dev
Automation Advanced Sheets

Build a data-quality scorecard

Grade any sheet on completeness, validity, and freshness — surface gaps as a single score.

Published Dec 31, 2025

A spreadsheet rots quietly. Blank cells creep in, someone types n/a instead of leaving a field empty, and the “last contacted” dates slide further into the past. Northwind’s clients sheet looked fine at a glance, but nobody could say how trustworthy it actually was — and so nobody fixed it.

This script gives the sheet a grade. It scores three things you can measure without knowing the business: completeness (how few cells are blank), validity (how few cells are filler like none or -), and freshness (how recent the date columns are). Three numbers between 0 and 1 turn a vague worry into something you can track week to week.

What you’ll need

  • A Google Sheet whose first tab holds the data you want to grade, with a header row and one record per row.
  • Nothing else — the script reads the sheet and returns scores. It does not write back unless you add the dashboard step yourself.

The script

// The sheet to grade by default. Pass a different ID to scoreDataQuality.
const DEFAULT_SHEET_ID = '1abcClientsId';

// A date is "stale" if it is older than this many days.
const DEFAULT_FRESH_DAYS = 30;

// Milliseconds in one day — used for the freshness comparison.
const MS_PER_DAY = 86400000;

// Strings that count as filler rather than real data.
const FILLER_PATTERN = /^(none|null|n\/a|-)$/i;

/**
 * Grades a sheet on completeness, validity, and freshness.
 * Each score is a fraction from 0 (bad) to 1 (perfect).
 *
 * @param {string} sheetId    The spreadsheet to grade.
 * @param {number} freshDays  Age in days past which a date is stale.
 * @return {{completeness:number, validity:number, freshness:number}}
 */
function scoreDataQuality(sheetId = DEFAULT_SHEET_ID, freshDays = DEFAULT_FRESH_DAYS) {
  const sheet = SpreadsheetApp.openById(sheetId).getSheets()[0];
  const [header, ...rows] = sheet.getDataRange().getValues();

  // Guard: an empty sheet is perfect by default — there is nothing to fault.
  if (!rows.length) {
    return { completeness: 1, validity: 1, freshness: 1 };
  }

  // 1. Completeness: count every blank cell across all rows.
  const totalCells = rows.length * header.length;
  const emptyCells = rows.flat()
    .filter((value) => value === '' || value === null)
    .length;

  // 2. Validity: count cells holding filler text instead of real data.
  const fillerCells = rows.flat()
    .filter((value) => typeof value === 'string' && FILLER_PATTERN.test(value))
    .length;

  // 3. Find which columns are date columns — every non-blank cell is a Date.
  const dateCols = header
    .map((_, i) => rows.every((r) => r[i] instanceof Date || !r[i]) ? i : -1)
    .filter((i) => i >= 0);

  // 4. Freshness: count date cells older than the freshness window.
  const staleCells = dateCols.reduce((sum, ci) => {
    const cutoff = Date.now() - freshDays * MS_PER_DAY;
    return sum + rows.filter((r) =>
      r[ci] instanceof Date && r[ci].getTime() < cutoff).length;
  }, 0);

  return {
    completeness: totalCells ? 1 - emptyCells / totalCells : 1,
    validity: totalCells ? 1 - fillerCells / totalCells : 1,
    // Divide by the number of date cells, not all cells, so non-date
    // sheets are not penalised for having no dates.
    freshness: 1 - staleCells / (rows.length * Math.max(dateCols.length, 1)),
  };
}

/**
 * Convenience wrapper: grades the default sheet and logs the result.
 */
function logQuality() {
  const score = scoreDataQuality();
  Logger.log(JSON.stringify(score, null, 2));
}

How it works

  1. scoreDataQuality opens the first tab of the target sheet and splits the header row from the data rows.
  2. If there are no data rows it returns a perfect score for all three measures — an empty sheet has no faults to count.
  3. Completeness flattens every data cell into one list and counts the blanks ('' or null), then expresses the rest as a fraction of all cells.
  4. Validity scans the same flattened cells for filler strings — none, null, n/a, or - in any case — and treats those as not-real-data.
  5. To score freshness it first detects the date columns: a column is a date column if every non-blank cell in it is a Date object.
  6. Freshness counts date cells older than freshDays and divides by the number of date cells, so a sheet with no dates scores a clean 1 rather than being punished for it.
  7. logQuality is a thin wrapper that grades the default sheet and prints the three scores to the execution log.

Example run

A clients sheet with 50 rows and 6 columns — 14 blank cells, 6 cells holding n/a, and a Last contacted column where 9 dates are over 30 days old — produces:

{
  "completeness": 0.953,
  "validity": 0.98,
  "freshness": 0.82
}

Read as percentages: the sheet is 95% complete, 98% clean, but only 82% fresh — so the date column is where the next clean-up should go.

Trigger it

Run it weekly so the scores form a trend rather than a one-off snapshot:

  1. In the Apps Script editor, open Triggers (the clock icon).
  2. Add a trigger for logQuality, Time-driven, Week timer.
  3. Pick a quiet hour, such as Monday 06:00.

To turn the log into a tracked metric, append the three scores plus a timestamp to a Scorecard tab inside logQuality, and the weekly run becomes a history you can chart.

Watch out for

  • Date-column detection is heuristic. A column counts as a date column only if every filled cell is a real Date — a single text value like TBC will exclude the whole column from the freshness score.
  • Validity only catches the four filler strings in FILLER_PATTERN. It will not spot a malformed email or a number stored as text. Add patterns or a per-column check if you need stricter validation.
  • Numbers stored as text count as filled, not blank, so completeness will not flag them. Completeness measures presence, not correctness.
  • getDataRange reads the whole used range. On a very large sheet, flattening every cell twice is memory-heavy — score a sample of rows instead.
  • The scores are unweighted. If freshness matters more than validity for your use, combine them into a single headline number with weights of your choosing.

Related