appscript.dev
Automation Intermediate Docs Sheets

Extract tables from Docs into a Sheet

Pull tabular content out of a Doc into a Sheet for analysis or downstream processing.

Published Nov 9, 2025

Northwind’s reports arrive as Docs, and the numbers inside them arrive as tables — a pricing grid here, a headcount breakdown there. A table in a Doc looks fine, but you cannot sort it, sum it, or chart it. The moment someone wants to actually analyse those numbers, they end up retyping the table into a spreadsheet cell by cell.

This script does that transfer for you. It walks a Doc, finds every table, reads each one row by row, and writes them all into a Sheet — each table labelled and separated by a few blank rows. From there the data is real spreadsheet data: sortable, summable, ready for a formula or a chart.

What you’ll need

  • A Google Doc that contains one or more tables. The script reads it by ID.
  • A Google Sheet to write into. The script clears its first tab and rebuilds it, so use a dedicated Sheet rather than one with data you want to keep.
  • Nothing else — no API key, no add-ons.

The script

// The Doc to read tables from — the long string from its URL.
const SOURCE_DOC_ID = '1abcDocId';

// The Sheet to write the extracted tables into.
const TARGET_SHEET_ID = '1abcSheetId';

// Blank rows left between one table block and the next.
const GAP_ROWS = 3;

/**
 * Finds every table in the source Doc and writes them into the first tab
 * of the target Sheet, each table labelled and spaced apart.
 */
function extractTables() {
  const body = DocumentApp.openById(SOURCE_DOC_ID).getBody();

  // 1. Walk the Doc and collect every table as a 2D array of cell text.
  const tables = [];
  for (let i = 0; i < body.getNumChildren(); i++) {
    const c = body.getChild(i);
    if (c.getType() !== DocumentApp.ElementType.TABLE) continue;

    const table = c.asTable();
    const rows = [];
    for (let r = 0; r < table.getNumRows(); r++) {
      const row = [];
      for (let cl = 0; cl < table.getRow(r).getNumCells(); cl++) {
        row.push(table.getCell(r, cl).getText());
      }
      rows.push(row);
    }
    tables.push(rows);
  }

  if (!tables.length) {
    Logger.log('No tables found in the Doc — nothing to extract.');
    return;
  }

  // 2. Clear the target tab so the output reflects only this run.
  const sheet = SpreadsheetApp.openById(TARGET_SHEET_ID).getSheets()[0];
  sheet.clear();

  // 3. Write each table as a labelled block, leaving a gap between blocks.
  let rowOffset = 1;
  tables.forEach((t, i) => {
    sheet.getRange(rowOffset, 1)
      .setValue(`Table ${i + 1}`)
      .setFontWeight('bold');
    sheet.getRange(rowOffset + 1, 1, t.length, t[0].length).setValues(t);
    rowOffset += t.length + GAP_ROWS;
  });

  Logger.log('Extracted ' + tables.length + ' tables into the Sheet.');
}

How it works

  1. extractTables opens the source Doc and loops through every child element of its body in document order.
  2. When it finds a TABLE, it reads it into a 2D array — one inner array per row, one string per cell — using getNumRows, getRow().getNumCells(), and getCell().getText().
  3. It collects every table into the tables array. If the Doc has no tables, it logs a message and stops.
  4. It opens the target Sheet’s first tab and clears it, so the output always reflects the current Doc rather than stacking on old data.
  5. It writes each table as a block: a bold Table N label, then the table data via a single setValues call. After each block it advances rowOffset by the table height plus GAP_ROWS blank rows, so the blocks stay visually separate.

Example run

Say the Doc contains two tables — a pricing grid and a headcount table. After a run, the Sheet’s first tab looks like this:

ABC
Table 1
PlanMonthlyAnnual
Starter990
Pro29290
Table 2
TeamHeadcountLead
Design6Priya
Engineering11Sam

Each table keeps its own shape, sits under a bold label, and is separated by blank rows — ready to sort, sum, or chart.

Run it

This is an on-demand job — run it whenever a report Doc lands:

  1. In the Apps Script editor, set SOURCE_DOC_ID and TARGET_SHEET_ID, select extractTables, and click Run.
  2. Approve the authorisation prompt the first time.
  3. Open the Sheet to see the extracted tables.

Watch out for

  • It clears the target tab on every run. Anything already on the first tab is wiped — point it at a dedicated Sheet, not a working one.
  • Every cell comes out as text. Numbers like 29 arrive as the string "29", not a number, so a SUM may not work until you coerce them. Run a multiply-by-1 paste or VALUE() pass, or convert in code before writing.
  • Ragged tables can throw. The script sizes each setValues block from the first row’s width. A table where rows have different cell counts — from merged cells — can mismatch and error. Tidy merged cells in the Doc first.
  • Nested tables are read as their text. A table inside a table cell is returned by getText() as a flattened string, not as its own block. Avoid nested tables in source Docs you plan to extract.
  • It reads tables in document order and numbers them Table 1, Table 2, and so on. If you reorder tables in the Doc, the numbering shifts with them.

Related