appscript.dev
Automation Intermediate Gmail Sheets

Extract order numbers from confirmation emails

Pull structured fields out of receipts and write them to a tracking sheet.

Published Sep 30, 2025

When Northwind orders kit and software from suppliers, every purchase comes back as a confirmation email with an order number buried somewhere in the body. Weeks later, when Awadesh needs to chase a delivery or query an invoice, that number is the key to the whole conversation — and it is sitting in an email he can no longer find.

This script keeps a running log so the inbox is never the system of record. It checks recent confirmation emails, extracts the order number with a regex, labels the thread so it is never processed twice, and appends a row to an Orders sheet. The result is one searchable list of every order number, who sent it, and when.

What you’ll need

  • A Google Sheet for the log. The script writes to the first tab; a header row of Captured at, From, Order number, Subject is recommended so the columns read clearly.
  • A Gmail account that receives the supplier confirmations.
  • Nothing else. The script creates the orders/captured label itself the first time it runs.

The script

// The spreadsheet that holds the order log.
const ORDERS_SHEET_ID = '1abcOrdersSheetId';

// The label applied to threads already logged, so they are skipped next time.
const CAPTURED_LABEL = 'orders/captured';

// Matches "Order #ABC123" or "Order number: ABC-12345". Order IDs are
// assumed to be at least six characters of letters, digits and hyphens.
const ORDER_REGEX = /Order\s+(?:#|number[:\s]+)([A-Z0-9-]{6,})/i;

/**
 * Finds recent confirmation emails, extracts the order number from each,
 * appends a row to the Orders sheet, and labels the thread as captured.
 */
function captureOrders() {
  // 1. Find confirmation threads from the last day that we have not
  //    already captured.
  const threads = GmailApp.search(
    'subject:(order OR confirmation) newer_than:1d -label:' + CAPTURED_LABEL
  );
  if (!threads.length) {
    Logger.log('No new confirmation emails to process.');
    return;
  }

  // 2. Get (or create) the label used to mark processed threads.
  const captured = GmailApp.getUserLabelByName(CAPTURED_LABEL)
    || GmailApp.createLabel(CAPTURED_LABEL);

  // 3. Build one row per email that contains a recognisable order number.
  const rows = threads.map((thread) => {
    const msg = thread.getMessages()[0];
    const body = msg.getPlainBody();
    const match = body.match(ORDER_REGEX);

    // No order number found — leave the thread unlabelled so a later,
    // smarter run could still pick it up.
    if (!match) return null;

    // Mark this thread so the next run skips it.
    thread.addLabel(captured);
    return [new Date(), msg.getFrom(), match[1], msg.getSubject()];
  }).filter(Boolean);

  // 4. Append all the new rows in one write.
  if (rows.length) {
    const sheet = SpreadsheetApp.openById(ORDERS_SHEET_ID).getSheets()[0];
    sheet.getRange(sheet.getLastRow() + 1, 1, rows.length, 4).setValues(rows);
    Logger.log('Logged ' + rows.length + ' order(s).');
  } else {
    Logger.log('No order numbers matched in this batch.');
  }
}

How it works

  1. captureOrders searches Gmail for threads whose subject mentions “order” or “confirmation”, limited to the last day and excluding anything already carrying the orders/captured label. If there is nothing new, it stops.
  2. It fetches the orders/captured label, creating it on the first run so the script is self-setting-up.
  3. For each thread it reads the first message, runs ORDER_REGEX over the plain-text body, and pulls out the captured group — the order number itself.
  4. A thread with no match returns null and is filtered out. Crucially, it is not labelled, so if you later improve the regex a future run can still capture it.
  5. A thread with a match is labelled immediately, and a four-field row is built: the capture time, the sender, the order number, and the subject.
  6. All the rows are appended to the sheet in a single setValues call below the last used row, which is faster than appending one row at a time.

Example run

A confirmation email arrives:

From: [email protected]
Subject: Your Acme order confirmation

Thank you for your purchase.
Order number: ACME-48217
Estimated delivery: 3 working days

After the next run, the Orders sheet has gained a row:

Captured atFromOrder numberSubject
2026-05-25 09:32[email protected]ACME-48217Your Acme order confirmation

The thread is now labelled orders/captured, so the following run leaves it alone.

Trigger it

This is a polling job, so run it on a time-driven trigger:

  1. In the Apps Script editor, open Triggers (the clock icon).
  2. Click Add trigger.
  3. Choose function captureOrders, event source Time-driven, type Minutes timer, every 30 minutes.

Thirty minutes is a sensible default — fast enough that the log stays current, slow enough to stay well within Gmail quota.

Tighten the regex

Every vendor formats order IDs differently, and a single generic pattern will miss some and mangle others. If you have a handful of repeat senders, keep a per-vendor map and pick the pattern by the sender address before falling back to the generic regex:

// Sender-specific patterns. Checked before ORDER_REGEX as a fallback.
const VENDOR_PATTERNS = {
  '[email protected]': /Receipt #(\d+)/,
  '[email protected]': /Invoice ([A-F0-9-]+)/i,
};

/**
 * Returns the order number for a message, preferring a vendor-specific
 * pattern and falling back to the generic ORDER_REGEX.
 */
function extractOrderNumber(from, body) {
  for (const [sender, pattern] of Object.entries(VENDOR_PATTERNS)) {
    if (from.includes(sender)) {
      const m = body.match(pattern);
      if (m) return m[1];
    }
  }
  const generic = body.match(ORDER_REGEX);
  return generic ? generic[1] : null;
}

Watch out for

  • The regex only reads the first message. If a vendor sends the order number in a follow-up rather than the confirmation, loop over thread.getMessages() instead of taking [0].
  • Plain-text only. getPlainBody strips HTML, which is what you want, but a few senders put the order number in an image or a button label — there is no text for the regex to match.
  • Unmatched threads stay unlabelled by design. That is deliberate, but it means every run re-scans them until they age past newer_than:1d. That is fine at this volume; just be aware of it.
  • Subject-line false positives. A reply or a marketing email containing the word “order” will be scanned. It costs nothing — if the body has no matching number, the thread is simply skipped.
  • Duplicate order numbers. The script does not deduplicate. If the same confirmation is somehow processed twice, you will get two rows; the label normally prevents this, but a manual re-run before labelling could not.

Related