Split a multi-page PDF into separate files

Break a scanned Northwind batch into individual docs — one PDF per page.

Published Sep 30, 2025

Northwind’s scanner produces one fat PDF per batch — twenty delivery notes scanned in a single pass, all stuck inside Batch 0312.pdf. That is fine for the scanner and useless for everyone else: you cannot file note seven on its own, or email note twelve to a client, without first prising it out.

The catch is that Apps Script cannot split a PDF natively — it can read and write the bytes, but it has no idea where one page ends and the next begins. The fix is the same as for merging: hand the PDF to a small splitting service over HTTP, get the individual pages back, and let Apps Script do what it is good at — filing each page neatly in Drive.

What you’ll need

  • A PDF-splitting endpoint that accepts a PDF body and returns each page as a base64 string. This can be a hosted service or a tiny function you deploy yourself; the script only cares about the request and response shape.
  • If the endpoint needs an API key, store it in Script Properties rather than pasting it into the code — see Store API keys and secrets securely.
  • An output folder in Drive for the split pages.
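
If the endpoint does need a key, the request options might be built along these lines. This is a minimal sketch: the property name SPLIT_API_KEY, the helper name buildSplitOptions and the Bearer header scheme are all assumptions, so check your service's documentation for the header it actually expects.

```javascript
/**
 * Builds UrlFetchApp options for the split request, adding an API key
 * from Script Properties when one is stored.
 * SPLIT_API_KEY and the Bearer scheme are assumptions for illustration.
 */
function buildSplitOptions(pdfBytes) {
  const options = {
    method: 'post',
    contentType: 'application/pdf',
    payload: pdfBytes,
    muteHttpExceptions: true,
  };
  // getProperty returns null when the property has not been set.
  const key = PropertiesService.getScriptProperties().getProperty('SPLIT_API_KEY');
  if (key) {
    options.headers = { Authorization: 'Bearer ' + key };
  }
  return options;
}
```

In splitPdf, the options object passed to UrlFetchApp.fetch could then be replaced with buildSplitOptions(file.getBlob().getBytes()).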

The script

// PDF-splitting endpoint. It takes a PDF body and returns
// { "pages": ["<base64>", "<base64>", ...] }, one entry per page.
const SPLIT_API = 'https://your-pdf-service.example/split';

/**
 * Splits a multi-page PDF into one file per page and saves each page
 * into the output folder.
 *
 * @param {string} fileId The Drive ID of the multi-page PDF.
 * @param {string} outputFolderId The folder to write the split pages into.
 */
function splitPdf(fileId, outputFolderId) {
  const file = DriveApp.getFileById(fileId);

  // 1. Send the PDF's raw bytes to the splitting service.
  const res = UrlFetchApp.fetch(SPLIT_API, {
    method: 'post',
    contentType: 'application/pdf',
    payload: file.getBlob().getBytes(),
    muteHttpExceptions: true,
  });

  // 2. Guard against a failed call before trying to parse the body.
  if (res.getResponseCode() !== 200) {
    throw new Error('Split service returned ' + res.getResponseCode());
  }

  // 3. The service returns one base64 string per page.
  const pages = JSON.parse(res.getContentText()).pages;
  if (!pages || !pages.length) {
    Logger.log('Service returned no pages — nothing to save.');
    return;
  }

  // 4. Decode each page into a PDF blob and file it in the output folder.
  const folder = DriveApp.getFolderById(outputFolderId);
  const baseName = file.getName().replace(/\.pdf$/i, '');
  pages.forEach((page, i) => {
    const blob = Utilities.newBlob(
      Utilities.base64Decode(page),
      'application/pdf',
      `${baseName} - page ${i + 1}.pdf`
    );
    folder.createFile(blob);
  });

  Logger.log('Split ' + file.getName() + ' into ' + pages.length + ' file(s).');
}

How it works

  1. splitPdf opens the source PDF by ID so it can read its bytes and reuse its name later.
  2. It POSTs the raw PDF bytes to SPLIT_API. muteHttpExceptions stops UrlFetchApp from throwing when the response code indicates failure, so the script can inspect the response itself.
  3. It checks the response code first. A splitting service that is down or rejecting the file should fail loudly here, not produce half a result.
  4. It parses the JSON body, expecting a pages array of base64 strings — one per page. If the array is missing or empty, it logs and stops.
  5. For each page it decodes the base64 back into bytes, wraps them in a PDF blob, and names it <original> - page N.pdf.
  6. folder.createFile saves each page into the output folder, and the script logs how many files it produced.
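
Step 4 assumes the body is valid JSON in the expected shape. A stricter version of that step might look like the sketch below. parsePages is a hypothetical helper, not part of the script above, and it throws instead of logging, so a malformed response fails as loudly as a bad status code.

```javascript
/**
 * Parses the service response and returns the array of base64 pages.
 * Throws if the body is not JSON or does not match { "pages": [...] }.
 * Hypothetical helper for illustration.
 */
function parsePages(bodyText) {
  let body;
  try {
    body = JSON.parse(bodyText);
  } catch (err) {
    throw new Error('Split service did not return JSON: ' + err.message);
  }
  const pages = body && body.pages;
  if (!Array.isArray(pages) || pages.length === 0) {
    throw new Error('Split service returned no pages.');
  }
  if (!pages.every((p) => typeof p === 'string' && p.length > 0)) {
    throw new Error('Split service returned a page that is not a base64 string.');
  }
  return pages;
}
```

With this in place, the parsing line in splitPdf would become const pages = parsePages(res.getContentText()); and the empty-array check after it could go.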

Example run

Say Batch 0312.pdf is a three-page scan. After running splitPdf against it with an output folder set:

Input                      Output files
Batch 0312.pdf (3 pages)   Batch 0312 - page 1.pdf
                           Batch 0312 - page 2.pdf
                           Batch 0312 - page 3.pdf

The log reads Split Batch 0312.pdf into 3 file(s). The original batch PDF is left in place, so you can re-run if a page comes out wrong.

Run it

This runs on demand, whenever a new batch needs breaking up:

  1. Set SPLIT_API to your splitting endpoint.
  2. In the Apps Script editor, add a small wrapper function that calls splitPdf with the batch’s file ID and your output folder ID, then select that wrapper and click Run. The Run button calls functions without arguments, so the IDs have to be set inside the wrapper.
  3. Approve the authorisation prompt the first time.
  4. Check the output folder for the numbered pages.
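
The function that calls splitPdf in step 2 can be as small as the sketch below. Both IDs are placeholders to replace with your own, and the name runSplit is only a suggestion.

```javascript
// Placeholder IDs: replace with the ID of your batch PDF and output folder.
// An ID is the long string in the file's or folder's Drive URL.
const BATCH_FILE_ID = 'PASTE_BATCH_PDF_ID_HERE';
const OUTPUT_FOLDER_ID = 'PASTE_OUTPUT_FOLDER_ID_HERE';

/** Runs splitPdf (defined in the script above) with the IDs set here. */
function runSplit() {
  splitPdf(BATCH_FILE_ID, OUTPUT_FOLDER_ID);
}
```

Select runSplit in the editor's function dropdown and click Run.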

To split batches automatically as they land, wrap splitPdf in a function that loops over a “to split” folder and run that on a time-driven trigger.
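
One way that wrapper and its trigger might look is sketched below. The three folder IDs and both function names are assumptions, and the "done" folder is an addition here so that a batch is not split twice.

```javascript
// Placeholder folder IDs (assumptions): an inbox of unsplit batches,
// a folder for the pages, and a folder for batches already processed.
const INBOX_FOLDER_ID = 'PASTE_TO_SPLIT_FOLDER_ID_HERE';
const PAGES_FOLDER_ID = 'PASTE_PAGES_FOLDER_ID_HERE';
const DONE_FOLDER_ID = 'PASTE_DONE_FOLDER_ID_HERE';

/** Splits every PDF in the inbox, then moves it to the done folder. */
function splitPendingBatches() {
  const inbox = DriveApp.getFolderById(INBOX_FOLDER_ID);
  const done = DriveApp.getFolderById(DONE_FOLDER_ID);

  // Collect the files first so moving them does not disturb the iterator.
  const pending = [];
  const files = inbox.getFilesByType(MimeType.PDF);
  while (files.hasNext()) {
    pending.push(files.next());
  }

  pending.forEach((file) => {
    try {
      splitPdf(file.getId(), PAGES_FOLDER_ID);
      file.moveTo(done); // Moved only if splitPdf did not throw.
    } catch (err) {
      // The file stays in the inbox and is retried on the next run.
      Logger.log('Could not split ' + file.getName() + ': ' + err.message);
    }
  });
}

/** Run once by hand to schedule splitPendingBatches every hour. */
function installSplitTrigger() {
  ScriptApp.newTrigger('splitPendingBatches')
    .timeBased()
    .everyHours(1)
    .create();
}
```

Note that splitPdf returns quietly when the service sends back no pages, so such a batch would still be moved to the done folder. If that matters, have splitPdf throw in that case instead of logging.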

Watch out for

  • The split happens off-platform. Whatever service you point SPLIT_API at receives the full PDF — only use an endpoint you trust, and prefer one you control for anything confidential.
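  • getBytes() loads the whole PDF into memory, and the base64 response is roughly a third larger than the pages it carries. Very large batches can hit Apps Script’s memory or runtime limits; scan those as several smaller batches, and increase the batch size gradually until you know what runs reliably.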
  • The page count comes entirely from the service. If it returns fewer pages than the PDF actually has, the script will happily save the short result — spot-check the output folder after the first few runs.
  • Drive allows several files with the same name in one folder, so re-running on the same PDF creates a second set of page N files rather than overwriting the first. Move the source PDF out of the way once it is split, or clear the output folder first.
  • Names are derived from the original file. If two batches share a name, their pages get identical names in the same output folder and become hard to tell apart; give each batch its own subfolder if that is a risk.
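
The last two points, duplicate pages on re-runs and clashing names between batches, can be softened with a per-batch subfolder and a skip-if-present check. A minimal sketch: getBatchFolder and saveIfMissing are hypothetical helpers that would stand in for the folder lookup and the folder.createFile call in splitPdf.

```javascript
/**
 * Returns the subfolder named after the batch, creating it if needed.
 * Hypothetical helper: parent is the Drive output folder.
 */
function getBatchFolder(parent, batchName) {
  const matches = parent.getFoldersByName(batchName);
  return matches.hasNext() ? matches.next() : parent.createFolder(batchName);
}

/**
 * Saves the blob unless the folder already holds a file with that name.
 * Returns true if a file was created, false if it was skipped.
 */
function saveIfMissing(folder, blob) {
  if (folder.getFilesByName(blob.getName()).hasNext()) {
    return false;
  }
  folder.createFile(blob);
  return true;
}
```

Batches that share a name would still share a subfolder. Appending the source file's ID to batchName is one way to keep them apart.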
