appscript.dev
Automation Advanced

Build a generic paginated-API fetcher

Handle cursors and pages for any large dataset — Northwind's standard pull pattern.

Published Dec 6, 2025

Most APIs Northwind pulls from — the CRM, the billing platform, the helpdesk — never hand back a whole dataset in one response. They page it: a chunk of records plus a cursor token for the next chunk. Writing a fresh paging loop for each integration means the same fiddly cursor logic copy-pasted everywhere, each copy with its own off-by-one bug waiting to happen.

This builds one reusable helper, fetchAllPages, that follows the cursor for you and returns the complete dataset as a single flat array. Point it at any endpoint that uses page tokens, tell it which JSON keys to look for, and it walks every page until there are none left.

What you’ll need

  • An API endpoint that paginates with a cursor token — it returns a token in the response when there is another page, and omits it on the last page.
  • The names of the two JSON keys that endpoint uses: the one holding the records array, and the one holding the next-page token. Defaults are data and nextPageToken; override them per API.
  • Any auth headers the endpoint requires (a bearer token, an API key header), ready to pass in.

The helper

// Hard ceiling on pages fetched in one run. A safety net against an
// API that never stops returning a token — see "Watch out for".
const MAX_PAGES = 100;

/**
 * Fetches every page of a cursor-paginated endpoint and returns all
 * records as one flat array.
 * @param {string} url - The base endpoint URL.
 * @param {Object} [opts] - Options.
 * @param {string} [opts.tokenKey] - JSON key holding the next-page token.
 * @param {string} [opts.dataKey] - JSON key holding the records array.
 * @param {Object} [opts.headers] - Request headers (e.g. auth).
 * @returns {Array} Every record across every page.
 */
function fetchAllPages(url, {
  tokenKey = 'nextPageToken',
  dataKey = 'data',
  headers = {},
} = {}) {
  const all = [];
  let token = null;
  let pages = 0;

  do {
    // 1. Append the page token to the URL, picking ? or & correctly.
    const sep = url.includes('?') ? '&' : '?';
    const pagedUrl = url + (token ? sep + 'pageToken=' + token : '');

    // 2. Fetch and parse this page.
    const res = JSON.parse(
      UrlFetchApp.fetch(pagedUrl, { headers }).getContentText());

    // 3. Collect the records from this page.
    all.push(...(res[dataKey] || []));

    // 4. Read the cursor for the next page — undefined ends the loop.
    token = res[tokenKey];
    pages++;

    // 5. Stop if the API never stops handing back a token.
    if (pages >= MAX_PAGES) {
      Logger.log('Hit MAX_PAGES (' + MAX_PAGES + ') — stopping.');
      break;
    }
  } while (token);

  return all;
}

How it works

  1. fetchAllPages takes the base URL and an options object with sensible defaults, so a simple endpoint needs nothing but the URL.
  2. Inside the loop it builds the request URL: on the first pass there is no token, and on later passes it appends pageToken= plus the cursor — using & if the URL already has a query string, ? if not.
  3. It fetches the page, parses the JSON, and pushes that page’s records — read from dataKey — onto the running all array. || [] guards against a page that has no records key at all.
  4. It reads the next-page token from tokenKey. While that token exists the do…while loops again; when the API omits it, the loop ends.
  5. A MAX_PAGES counter caps the run. If an endpoint ever returns a token forever, the helper logs a message and stops rather than hanging.
  6. When the loop ends, it returns every record from every page as one array.

Example run

Call it with just a URL for a simple endpoint, or with options for one that uses different key names:

// Simple case — defaults (data / nextPageToken) apply.
const customers = fetchAllPages('https://api.example/customers?limit=100');
Logger.log(customers.length + ' customers fetched.');

// Custom key names and an auth header.
const tickets = fetchAllPages('https://api.helpdesk/tickets', {
  dataKey: 'results',
  tokenKey: 'cursor',
  headers: { Authorization: 'Bearer ' + token },
});

If the customers endpoint returns 100 records per page across 7 pages, the first call makes 7 requests behind the scenes and customers ends up holding all 650-odd records — no paging code in your own script.

Run it

This is a helper, not a standalone job. Drop fetchAllPages into any script and call it wherever you would otherwise fetch a list — typically inside an on-demand sync function or a time-driven trigger that refreshes a sheet.

Watch out for

  • UrlFetchApp has a daily call quota and each page is one call. A 50-page pull burns 50 calls — for very large datasets, fetch on a schedule and cache the result rather than re-pulling everything on demand.
  • MAX_PAGES is a deliberate safety net. If a real dataset genuinely exceeds it, raise the limit — but first check you are reading the right tokenKey, since a misnamed key can make the loop think there is always another page.
  • Pulling everything into one array uses memory. Apps Script has a script runtime limit (around 6 minutes); a huge dataset can blow it. Process pages as you go, or fetch in scheduled batches, if the dataset is enormous.
  • The helper assumes the token goes in a pageToken query parameter. APIs that expect the cursor in a header, the body, or a differently named parameter need that line adjusted.
  • There is no retry on a failed request. Wrap the UrlFetchApp.fetch call in a try/catch with muteHttpExceptions if the API is flaky, so one bad page does not lose the whole pull.

Related