Build a logging and monitoring system

A script that runs on a trigger does its work with nobody watching. When it succeeds you never think about it; when it fails — or worse, when it stops running at all — there is no error message on a screen, because there is no screen. The first sign of trouble is often someone asking why a report is a day late.

A logging and monitoring system turns that invisible work into a record you can read. Logging answers “what happened on this run”; monitoring answers “is the job still running at all”. Together they mean you find out about a problem because an alert told you, not because a colleague did. This guide builds both on top of a plain spreadsheet, which needs no extra setup and stays readable for anyone.

Logging vs monitoring

The two jobs are related but distinct, and a complete system needs both.

Concern	Logging	Monitoring
Question answered	What happened?	Is it still alive?
When you read it	After a failure	Continuously
Triggered by	The script itself	A separate watchdog
Output	Rows of events	An alert when silent

Logging without monitoring leaves you blind to a job that never starts. Monitoring without logging tells you something broke but not what. Build them together.

A simple sheet logger

A spreadsheet makes a fine log store: it is sortable, filterable, and readable by non-developers. Each call appends one timestamped row.

// Append one structured row to the log sheet. Every call writes to the same
// known spreadsheet (opened by ID) and its first tab, so there is one place to look.
function log(level, fn, message, extra = '') {
  SpreadsheetApp.openById('1abcLogId').getSheets()[0]
    // Columns: when, severity, which function, what happened, detail.
    .appendRow([new Date(), level, fn, message, extra]);
}

// Thin wrappers so call sites read clearly and levels stay consistent.
function info(fn, msg) { log('info', fn, msg); }
function warn(fn, msg) { log('warn', fn, msg); }
function error(fn, msg, stack) { log('error', fn, msg, stack); }

Use a fixed set of levels — info, warn, error — so you can filter the sheet later. Put the stack trace in the extra column for errors; it is the single most useful thing to have when debugging after the fact.
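One way to keep the level set fixed is to validate it before a row is built. The sketch below is an illustration rather than part of the logger above: LEVELS, MAX_DETAIL_CHARS, and buildRow are hypothetical names, and the 5,000-character cap on the detail column is an arbitrary value chosen for the example.

```javascript
// Hypothetical helper: the only severities the log sheet should ever contain.
const LEVELS = ['info', 'warn', 'error'];

// Assumed cap on the detail column so a huge stack trace cannot bloat a cell.
const MAX_DETAIL_CHARS = 5000;

// Build one log row in the column order used by log():
// when, severity, which function, what happened, detail.
function buildRow(level, fn, message, extra = '', now = new Date()) {
  const normalised = String(level).toLowerCase();
  if (!LEVELS.includes(normalised)) {
    throw new Error('Unknown log level: ' + level);
  }
  const detail = String(extra).slice(0, MAX_DETAIL_CHARS);
  return [now, normalised, fn, String(message), detail];
}
```

log() could pass the result of buildRow(level, fn, message, extra) to appendRow(), so that a typo such as 'eror' fails loudly instead of creating a level nobody filters on.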

Wrap entry points

Calling info() and error() by hand everywhere is tedious and easy to forget. Instead, wrap each entry-point function once so logging is automatic.

// Decorate a function so every run logs start, success, and failure.
function safe(fn, name) {
  return (...args) => {
    info(name, 'start');           // mark the run beginning
    try {
      const r = fn(...args);
      info(name, 'ok');            // mark a clean finish
      return r;
    } catch (e) {
      // Capture the message and stack before re-throwing.
      error(name, e.message, e.stack);
      throw e;                     // still fail visibly — do not swallow
    }
  };
}

Now the function returned by safe(syncOrders, 'syncOrders') writes a start and an ok row on every healthy run, and an error row with a stack trace on every failure, with no logging code inside syncOrders itself.
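Triggers are attached to a function by name, so the wrapped function needs a named top-level entry point. A minimal sketch, assuming a job called syncOrders: syncOrdersJob is a hypothetical name, the body of syncOrders is invented for the example, and the in-memory info() and error() are stand-ins for the sheet-backed versions so the example runs outside Apps Script.

```javascript
// Stand-ins for the sheet-backed info() and error(): collect rows in memory.
const rows = [];
function info(fn, msg) { rows.push(['info', fn, msg, '']); }
function error(fn, msg, stack) { rows.push(['error', fn, msg, stack]); }

// safe() as defined above, repeated so this sketch is self-contained.
function safe(fn, name) {
  return (...args) => {
    info(name, 'start');
    try {
      const r = fn(...args);
      info(name, 'ok');
      return r;
    } catch (e) {
      error(name, e.message, e.stack);
      throw e;
    }
  };
}

// Hypothetical job body with no logging code of its own.
function syncOrders() {
  return 3; // e.g. the number of orders synced
}

// Named entry point: this is the function a trigger would be pointed at.
function syncOrdersJob() {
  return safe(syncOrders, 'syncOrders')();
}
```

Pointing the trigger at syncOrdersJob rather than syncOrders keeps the job body free of logging code while every triggered run still passes through safe().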

Heartbeats

A log only grows when the script runs. If a trigger is deleted or the project loses authorisation, the script stops entirely and the log simply goes quiet — no error row, just silence.

Monitoring catches this. For every critical job, an ok row is a heartbeat: a signal the job is alive. A separate watchdog checks that a heartbeat has arrived recently and alerts when one has not.

// Run this on its own trigger, independent of the jobs it watches.
function alertOnStaleHeartbeats() {
  // Find the most recent successful run across the log.
  const recent = readLog()
    .filter((r) => r.level === 'info' && r.message === 'ok')
    .sort((a, b) => b.timestamp - a.timestamp)[0];

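  // No 'ok' in the last 6 hours, or no 'ok' row at all, means the job has
  // gone silent. Checking for a missing row keeps the watchdog from crashing.
  if (!recent || Date.now() - recent.timestamp > 6 * 3600_000) {
    GmailApp.sendEmail(
      '[email protected]',
      'Cron is silent',
      'Last OK: ' + (recent ? recent.timestamp : 'never')
    );
  }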
}
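
The watchdog calls readLog(), which this guide does not define, and it compares against the newest ok row from any job, so one healthy job can hide a silent one. The sketch below is one possible shape for both pieces, under stated assumptions: the column order matches log(), rows without a real date in the first column (such as a header) are skipped, and toEntries, lastOkByJob, and findSilentJobs are hypothetical names.

```javascript
// Convert raw sheet values into objects with the fields the watchdog reads.
// Rows whose first cell is not a Date (for example a header row) are skipped.
function toEntries(values) {
  const isDate = (v) => Object.prototype.toString.call(v) === '[object Date]';
  return values
    .filter((row) => isDate(row[0]))
    .map(([timestamp, level, fn, message, extra]) =>
      ({ timestamp, level, fn, message, extra }));
}

// Assumed implementation of readLog(): read the whole first tab in one call.
function readLog() {
  const sheet = SpreadsheetApp.openById('1abcLogId').getSheets()[0];
  return toEntries(sheet.getDataRange().getValues());
}

// Map each job name to the time (ms since epoch) of its newest 'ok' row.
function lastOkByJob(entries) {
  const latest = new Map();
  for (const e of entries) {
    if (e.level !== 'info' || e.message !== 'ok') continue;
    const t = e.timestamp.getTime();
    if (!latest.has(e.fn) || t > latest.get(e.fn)) latest.set(e.fn, t);
  }
  return latest;
}

// Return the jobs from `jobs` with no 'ok' row newer than maxSilenceMs.
// A job that has never logged 'ok' counts as silent.
function findSilentJobs(entries, jobs, maxSilenceMs, now = Date.now()) {
  const latest = lastOkByJob(entries);
  return jobs.filter((job) =>
    !latest.has(job) || now - latest.get(job) > maxSilenceMs);
}
```

A watchdog built on this would call findSilentJobs(readLog(), ['syncOrders'], 6 * 3600_000) and send one alert naming every job in the result.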

The watchdog must run on its own independent trigger. If it shared a trigger with the job it watches, the same failure would silence both.
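
Giving the watchdog its own trigger can be done once from code. A sketch using the ScriptApp trigger builder: the hourly interval and the name installWatchdogTrigger are assumptions for illustration, and the check for an existing trigger is there so that running it twice does not create duplicates.

```javascript
// Create an hourly time-driven trigger for the watchdog, unless one exists.
// Returns true if a trigger was created, false if one was already installed.
function installWatchdogTrigger() {
  const handler = 'alertOnStaleHeartbeats';
  const exists = ScriptApp.getProjectTriggers()
    .some((t) => t.getHandlerFunction() === handler);
  if (exists) return false;

  ScriptApp.newTrigger(handler)
    .timeBased()
    .everyHours(1) // assumed interval; keep it shorter than the silence window
    .create();
  return true;
}
```

Run installWatchdogTrigger() once by hand from the editor. The jobs keep their own triggers, so deleting or breaking one of those leaves the watchdog's schedule untouched.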

Keep the log healthy

A log sheet that grows forever eventually slows down and creeps toward the per-spreadsheet cell limit in Google Sheets (10 million cells at the time of writing). Trim it on a schedule.

Keep a fixed window — the last 30 days, or the last 5,000 rows.
Delete old rows in one batched range operation, not row by row.
Consider a second sheet for error rows only, so failures are never trimmed away with routine info noise.
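
The trimming rules above could look like the following sketch. It assumes rows are in chronological order (which appendRow() gives you), one header row at the top, and the log spreadsheet ID used earlier; HEADER_ROWS, rowsOverLimit, rowsOlderThan, and trimLog are hypothetical names. It applies both limits and trims by whichever removes more, so drop one if you only want the other.

```javascript
const HEADER_ROWS = 1;   // assumption: row 1 holds column titles; use 0 if not
const MAX_ROWS = 5000;   // keep at most this many log rows
const MAX_AGE_DAYS = 30; // and drop anything older than this

// How many of the oldest rows must go to get back under maxRows.
function rowsOverLimit(dataRowCount, maxRows) {
  return Math.max(0, dataRowCount - maxRows);
}

// How many leading rows are older than the cutoff. Rows are chronological,
// so counting stops at the first row that is new enough to keep.
function rowsOlderThan(timestamps, cutoffMs) {
  let n = 0;
  while (n < timestamps.length &&
         new Date(timestamps[n]).getTime() < cutoffMs) {
    n++;
  }
  return n;
}

// Delete the stale block at the top of the sheet in a single call.
// Returns the number of rows removed.
function trimLog() {
  const sheet = SpreadsheetApp.openById('1abcLogId').getSheets()[0];
  const dataRows = sheet.getLastRow() - HEADER_ROWS;
  if (dataRows <= 0) return 0;

  const timestamps = sheet
    .getRange(HEADER_ROWS + 1, 1, dataRows, 1)
    .getValues()
    .map((row) => row[0]);
  const cutoff = Date.now() - MAX_AGE_DAYS * 24 * 3600 * 1000;
  let stale = Math.max(
    rowsOverLimit(dataRows, MAX_ROWS),
    rowsOlderThan(timestamps, cutoff)
  );
  // Always leave one data row: Sheets can refuse to delete every
  // non-frozen row in a sheet.
  stale = Math.min(stale, dataRows - 1);
  if (stale > 0) sheet.deleteRows(HEADER_ROWS + 1, stale);
  return stale;
}
```

Because the oldest rows sit together at the top, one deleteRows() call removes the whole stale block instead of issuing one service call per row.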

Common mistakes

Logging only errors. Without start/ok rows you have no heartbeat and no way to confirm a job ran at all.
Putting the watchdog on the same trigger as the monitored job — one failure then disables both the job and its alarm.
Calling log() inside a tight loop. Each appendRow() is a service call; collect rows in an array and write them once.
Swallowing the error after logging it. safe() must re-throw so the run is still marked failed in the executions panel.
Never trimming the sheet. A log with hundreds of thousands of rows becomes slow to append to and eventually unusable.
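
For the tight-loop case, rows can be collected in memory and written with a single setValues() call. A sketch under assumptions: pending, bufferLog, and flushLogs are hypothetical names, and the sheet is passed in as an argument so the same code can be exercised with a stand-in object.

```javascript
// Rows waiting to be written; each has the same five columns as log().
const pending = [];

// Queue a row instead of calling appendRow() straight away.
function bufferLog(level, fn, message, extra = '') {
  pending.push([new Date(), level, fn, message, extra]);
}

// Write every queued row in one range operation, then empty the queue.
// Returns the number of rows written.
function flushLogs(sheet) {
  if (pending.length === 0) return 0;
  const rows = pending.splice(0, pending.length);
  sheet
    .getRange(sheet.getLastRow() + 1, 1, rows.length, rows[0].length)
    .setValues(rows);
  return rows.length;
}
```

Call flushLogs() from a finally block so buffered rows are still written when the loop throws. appendRow() is documented as atomic, while this read-then-write is not, so the pattern suits jobs whose runs do not overlap.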