fix(cron): prevent spin loop when job completes within scheduled second (#17821)

When a cron job fires and completes within the same wall-clock second it
was scheduled for, the next-run computation could return undefined or the
same second, causing the scheduler to re-trigger the job hundreds of
times in a tight loop.
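
The failure mode can be reproduced with a toy scheduler on a virtual clock.
This is only an illustrative sketch: buggyNextRunAtMs and
countFiresInScheduledSecond are hypothetical names, and the "bug" is
simplified to always returning the start of the current second.

```typescript
// Hypothetical buggy next-run computation (assumption for illustration):
// returns the start of the current second rather than a future time.
function buggyNextRunAtMs(nowMs: number): number {
  return Math.floor(nowMs / 1000) * 1000;
}

// Counts how often a job re-fires inside its scheduled second when each
// run takes runDurationMs on a simulated clock.
function countFiresInScheduledSecond(
  scheduledAtMs: number,
  runDurationMs: number,
): number {
  let nowMs = scheduledAtMs;
  let nextRunAtMs = scheduledAtMs;
  let fires = 0;
  const endOfSecondMs = scheduledAtMs + 1000;
  // The job is "due" whenever nextRunAtMs is not in the future.
  while (nowMs < endOfSecondMs && nextRunAtMs <= nowMs) {
    fires += 1;
    nowMs += runDurationMs; // the job completes
    nextRunAtMs = buggyNextRunAtMs(nowMs); // still the same second
  }
  return fires;
}

console.log(countFiresInScheduledSecond(60_000, 5)); // 200
```

A run that takes 5 ms re-fires 200 times before the clock leaves the
scheduled second; a run that takes a full second fires only once.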

Two-layer fix:

1. computeJobNextRunAtMs: When computeNextRunAtMs returns undefined for a
   cron-kind schedule (edge case where floored nowSecondMs matches the
   schedule), retry with the start of the next second (floored second
   plus 1s) as reference time.  This is intended to always yield the
   next valid occurrence.

2. applyJobResult: Add MIN_REFIRE_GAP_MS (2s) safety net for cron-kind
   jobs.  After a successful run, nextRunAtMs is clamped to at least
   2s after the run's end time (result.endedAt).  This breaks any
   remaining spin-loop edge cases without affecting normal daily/hourly
   schedules (where the natural next run is hours/days away).
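
The two layers can be sketched in isolation as follows.  This is a
minimal sketch, not the project's code: the Schedule type, the toy
computeNextRunAtMs (which simulates the undefined edge case) and
nextRunAfterSuccess are assumptions; only the retry and the clamp
mirror the change described above.

```typescript
// Hypothetical schedule shape; the real CronJob type is not shown here.
type Schedule = { kind: "cron" | "every"; periodMs: number };

const MIN_REFIRE_GAP_MS = 2_000;

// Toy stand-in (assumption): fires on multiples of periodMs and returns
// undefined when the floored second of nowMs is itself a fire time,
// which simulates the reported edge case.
function computeNextRunAtMs(s: Schedule, nowMs: number): number | undefined {
  const nowSecondMs = Math.floor(nowMs / 1000) * 1000;
  if (nowSecondMs % s.periodMs === 0) return undefined;
  // First multiple of periodMs at or after nowMs.
  return Math.ceil(nowMs / s.periodMs) * s.periodMs;
}

// Layer 1: retry from the start of the next second for cron schedules.
function computeJobNextRunAtMs(s: Schedule, nowMs: number): number | undefined {
  const next = computeNextRunAtMs(s, nowMs);
  if (next === undefined && s.kind === "cron") {
    const nextSecondMs = (Math.floor(nowMs / 1000) + 1) * 1000;
    return computeNextRunAtMs(s, nextSecondMs);
  }
  return next;
}

// Layer 2: clamp the next run to at least MIN_REFIRE_GAP_MS after endedAt.
function nextRunAfterSuccess(s: Schedule, endedAt: number): number | undefined {
  const naturalNext = computeJobNextRunAtMs(s, endedAt);
  if (s.kind !== "cron") return naturalNext;
  const minNext = endedAt + MIN_REFIRE_GAP_MS;
  return naturalNext !== undefined ? Math.max(naturalNext, minNext) : minNext;
}

const everyMinute: Schedule = { kind: "cron", periodMs: 60_000 };
console.log(nextRunAfterSuccess(everyMinute, 120_250)); // 180000

const everySecond: Schedule = { kind: "cron", periodMs: 1_000 };
console.log(nextRunAfterSuccess(everySecond, 5_250)); // 7250
```

In the first call the retry from 121000 finds the next minute boundary,
so the clamp is a no-op.  In the second call the retry also returns
undefined in this toy model, and the safety net falls back to
endedAt + MIN_REFIRE_GAP_MS.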

Fixes #17821
Marcus Widing
2026-02-16 14:09:16 +01:00
committed by Peter Steinberger
parent eed806ce58
commit 8af4712c40
3 changed files with 96 additions and 2 deletions


@@ -96,7 +96,18 @@ export function computeJobNextRunAtMs(job: CronJob, nowMs: number): number | und
       : null;
     return atMs !== null ? atMs : undefined;
   }
-  return computeNextRunAtMs(job.schedule, nowMs);
+  const next = computeNextRunAtMs(job.schedule, nowMs);
+  // Guard against the scheduler returning a time within the same second as
+  // nowMs. When a cron job completes within the same wall-clock second it
+  // was scheduled for, some croner versions/timezone combinations may return
+  // the current second (or computeNextRunAtMs may return undefined, which
+  // triggers recomputation). Advancing to the next second and retrying
+  // ensures we always land on the *next* occurrence. (See #17821)
+  if (next === undefined && job.schedule.kind === "cron") {
+    const nextSecondMs = (Math.floor(nowMs / 1000) + 1) * 1000;
+    return computeNextRunAtMs(job.schedule, nextSecondMs);
+  }
+  return next;
 }
 
 /** Maximum consecutive schedule errors before auto-disabling a job. */


@@ -15,6 +15,15 @@ import { ensureLoaded, persist } from "./store.js";
 const MAX_TIMER_DELAY_MS = 60_000;
 
+/**
+ * Minimum gap between consecutive fires of the same cron job. This is a
+ * safety net that prevents spin-loops when `computeJobNextRunAtMs` returns
+ * a value within the same second as the just-completed run. The guard
+ * is intentionally generous (2 s) so it never masks a legitimate schedule
+ * but always breaks an infinite re-trigger cycle. (See #17821)
+ */
+const MIN_REFIRE_GAP_MS = 2_000;
+
 /**
  * Maximum wall-clock time for a single job execution. Acts as a safety net
  * on top of the per-provider / per-agent timeouts to prevent one stuck job
@@ -107,7 +116,18 @@ function applyJobResult(
         "cron: applying error backoff",
       );
     } else if (job.enabled) {
-      job.state.nextRunAtMs = computeJobNextRunAtMs(job, result.endedAt);
+      const naturalNext = computeJobNextRunAtMs(job, result.endedAt);
+      if (job.schedule.kind === "cron") {
+        // Safety net: ensure the next fire is at least MIN_REFIRE_GAP_MS
+        // after the current run ended. Prevents spin-loops when the
+        // schedule computation lands in the same second due to
+        // timezone/croner edge cases (see #17821).
+        const minNext = result.endedAt + MIN_REFIRE_GAP_MS;
+        job.state.nextRunAtMs =
+          naturalNext !== undefined ? Math.max(naturalNext, minNext) : minNext;
+      } else {
+        job.state.nextRunAtMs = naturalNext;
+      }
     } else {
       job.state.nextRunAtMs = undefined;
     }