appscript.dev
Automation Intermediate Sheets

Parse messy mixed-format dates

Normalise inconsistently formatted strings into real date values with a single formula.

Published Aug 2, 2025

Northwind pulls timesheets from three contractors, and every one of them writes dates differently. One vendor exports 12/05/2025, another 2025-05-12, the third May 12, 2025. When those land in the same column, Sheets treats most of them as plain text — they will not sort, will not subtract, and will not feed a chart. Fixing them by hand, row by row, is exactly the kind of dull job that never quite gets done.

A custom function turns the mess into real dates. =PARSE_DATE() recognises each of the three formats, works out the year, month, and day, and returns a genuine Date value Sheets can sort and calculate with. Point it at the messy column, format the result as a date, and the contractors’ habits stop being your problem.

What you’ll need

  • A Google Sheet with a column of date strings in mixed formats.
  • A header row, so the formula starts on row 2.
  • An empty column for the cleaned output, formatted as Date (Format → Number → Date).
  • Nothing else — the function is pure JavaScript, with no API key or external call.

The script

// Each supported format, paired with a label so the parser can tell
// which capture groups mean what. Order does not matter — a string is
// tested against every pattern until one matches.
const DATE_FORMATS = [
  { name: 'iso', re: /^(\d{4})-(\d{2})-(\d{2})$/ },          // 2025-05-12
  { name: 'slash', re: /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/ },  // 12/05/2025 (DD/MM)
  { name: 'long', re: /^([A-Za-z]+)\s+(\d{1,2}),?\s+(\d{4})$/ }, // May 12, 2025
];

// Month names mapped to their zero-based index, matched on the first
// three letters so "May", "may", and "May." all resolve.
const MONTHS = {
  jan: 0, feb: 1, mar: 2, apr: 3, may: 4, jun: 5,
  jul: 6, aug: 7, sep: 8, oct: 9, nov: 10, dec: 11,
};

/**
 * Normalises a mixed-format date string into a real Date value.
 *
 * @param {string|Date} text The cell value to parse.
 * @return {Date|string} A Date when a format matches, otherwise ''.
 * @customfunction
 */
function PARSE_DATE(text) {
  // 1. Already a real date (or an empty cell) — nothing to parse.
  if (text instanceof Date) return text;
  if (text === '' || text === null || text === undefined) return '';

  const s = String(text).trim();

  // 2. Test the string against each known format in turn.
  for (const format of DATE_FORMATS) {
    const m = s.match(format.re);
    if (!m) continue;

    // 3. Build a Date from the right capture groups. Month is zero-based,
    //    so subtract one from any numeric month.
    if (format.name === 'iso') {
      return new Date(+m[1], +m[2] - 1, +m[3]);
    }
    if (format.name === 'slash') {
      // Capture order is day, month, year — the DD/MM convention.
      return new Date(+m[3], +m[2] - 1, +m[1]);
    }
    if (format.name === 'long') {
      const month = MONTHS[m[1].slice(0, 3).toLowerCase()];
      if (month === undefined) return '';
      return new Date(+m[3], month, +m[2]);
    }
  }

  // 4. No format matched — return blank so a bad row is easy to spot.
  return '';
}

How it works

  1. PARSE_DATE is tagged @customfunction, so Sheets exposes it as a formula. If the cell already holds a real Date, or is empty, it returns straight away — there is nothing to normalise.
  2. The value is converted to a string and trimmed of stray whitespace.
  3. The string is tested against each pattern in DATE_FORMATS. Each pattern has a name so the parser knows which capture group is the year, the month, and the day.
  4. For the iso format the groups are already year-month-day. For slash the capture order is day-month-year — Northwind treats 12/05/2025 as the 12th of May, the DD/MM convention. For long the month name is matched on its first three letters against the MONTHS table.
  5. JavaScript months are zero-based, so every numeric month has one subtracted before the Date is built.
  6. If no pattern matches, the function returns an empty string. A blank cell in the output column is an obvious flag that a row needs a human eye.

Example run

Say column A holds dates from all three contractors:

Raw date (A)=PARSE_DATE(A2)
2025-05-1212/05/2025
12/05/202512/05/2025
May 12, 202512/05/2025
12 May 2025(blank — unrecognised format)

The first three rows all resolve to the same real date, displayed in your sheet’s date format. The fourth is left blank because that format is not in the list — a clear signal to either fix the row or add a new pattern.

Use it

In the output column, on the first data row:

=PARSE_DATE(A2)

Fill it down the column. The result is a real date value, but Sheets may show it as a serial number until you format the column: select it, then Format → Number → Date. Once formatted, the cleaned column will sort, subtract, and feed charts like any normal date.

Watch out for

  • 12/05/2025 is read as DD/MM — the 12th of May, not the 5th of December. If a contractor sends US-style MM/DD dates, this function will silently mangle them. Confirm each vendor’s convention before trusting the output.
  • Only the three listed formats are supported. Anything else — 12 May 2025, 2025/05/12, two-digit years — returns blank. Add a new entry to DATE_FORMATS for each format you need, and a matching branch in the loop.
  • The slash pattern requires a four-digit year. A value like 12/5/25 will not match, so two-digit years must be cleaned up before parsing or handled with an extra format.
  • The result is a date with no time component, set at local midnight. If your source strings ever carry a time, it is dropped. For timesheet dates that is usually what you want, but be aware of it before doing time-of-day maths.

Related