feat(cron): add failure destination support to failed cron jobs (#31059)

* feat(cron): add failure destination support with webhook mode and bestEffort handling

Extends PR #24789 failure alerts with features from PR #29145:
- Add webhook delivery mode for failure alerts (mode: 'webhook')
- Add accountId support for multi-account channel configurations
- Add bestEffort handling to skip alerts when job has bestEffort=true
- Add separate failureDestination config (global + per-job in delivery)
- Add duplicate prevention (skip the failure alert when its target matches the primary delivery target)
- Add CLI flags: --failure-alert-mode, --failure-alert-account-id
- Add UI fields for new options in web cron editor

* fix(cron): merge failureAlert mode/accountId and preserve failureDestination on updates

- Fix mergeCronFailureAlert to merge mode and accountId fields
- Fix mergeCronDelivery to preserve failureDestination on updates
- Fix isSameDeliveryTarget to use 'announce' as default instead of 'none'
  to properly detect duplicates when delivery.mode is undefined
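
A minimal sketch of the duplicate check described above, not the actual implementation. The `DeliveryTarget` shape and the `isSameDeliveryTargetSketch` name are hypothetical; only the 'announce' default for an undefined mode comes from this commit.

```typescript
// Hypothetical shapes for illustration; the real cron types may differ.
type DeliveryMode = "none" | "announce" | "webhook";

interface DeliveryTarget {
  mode?: DeliveryMode;
  channel?: string;
  to?: string;
  accountId?: string;
}

// An undefined mode is treated as "announce" on both sides, so an implicit
// primary delivery still compares equal to an explicit announce destination.
function isSameDeliveryTargetSketch(primary: DeliveryTarget, failure: DeliveryTarget): boolean {
  const primaryMode = primary.mode ?? "announce";
  const failureMode = failure.mode ?? "announce";
  return (
    primaryMode === failureMode &&
    primary.channel === failure.channel &&
    primary.to === failure.to &&
    primary.accountId === failure.accountId
  );
}
```

With 'none' as the default, a primary delivery that omits mode would never match an announce failure destination, and the alert would be sent twice to the same place.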

* fix(cron): validate webhook mode requires URL in resolveFailureDestination

When mode is 'webhook' but no 'to' URL is provided, return null
instead of creating an invalid plan that silently fails later.
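
The validation can be sketched roughly as follows. The config and plan shapes and the `resolveFailureDestinationSketch` name are assumptions made for illustration; the only behavior taken from this commit is that webhook mode without a 'to' URL resolves to null.

```typescript
type FailureMode = "announce" | "webhook";

// Hypothetical input shape, loosely following the failureDestination fields.
interface FailureDestinationConfig {
  mode?: FailureMode;
  channel?: string;
  to?: string;
  accountId?: string;
}

// Hypothetical resolved plan handed to the sender.
interface FailureDestinationPlan {
  mode: FailureMode;
  channel: string | undefined;
  to: string | undefined;
  accountId: string | undefined;
}

function resolveFailureDestinationSketch(
  config: FailureDestinationConfig | undefined,
): FailureDestinationPlan | null {
  if (!config) return null;
  const mode: FailureMode = config.mode ?? "announce";
  // Treat an empty or whitespace-only 'to' as missing (an assumption).
  const to = config.to?.trim() || undefined;
  // Webhook delivery has nowhere to go without a URL, so produce no plan.
  if (mode === "webhook" && !to) return null;
  return { mode, channel: config.channel, to, accountId: config.accountId };
}
```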

* fix(cron): fail closed on webhook mode without URL and make failureDestination fields clearable

- sendCronFailureAlert: fail closed when mode is webhook but URL is missing
- mergeCronDelivery: use per-key presence checks so callers can clear
  nested failureDestination fields via cron.update
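
A sketch of what per-key presence checks could look like for the nested failureDestination object. The function name and the convention that a present key with value undefined means "clear" are assumptions; the real mergeCronDelivery may signal clearing differently.

```typescript
type FailureMode = "announce" | "webhook";

interface FailureDestination {
  mode?: FailureMode;
  channel?: string;
  to?: string;
  accountId?: string;
}

// Assumed patch convention: key absent = keep, key present with undefined = clear.
type FailureDestinationPatch = {
  mode?: FailureMode | undefined;
  channel?: string | undefined;
  to?: string | undefined;
  accountId?: string | undefined;
};

function mergeFailureDestinationSketch(
  existing: FailureDestination | undefined,
  patch: FailureDestinationPatch,
): FailureDestination {
  const next: FailureDestination = { ...existing };
  // Check key presence rather than value truthiness, so a caller can clear a field.
  if ("mode" in patch) {
    if (patch.mode === undefined) delete next.mode;
    else next.mode = patch.mode;
  }
  if ("channel" in patch) {
    if (patch.channel === undefined) delete next.channel;
    else next.channel = patch.channel;
  }
  if ("to" in patch) {
    if (patch.to === undefined) delete next.to;
    else next.to = patch.to;
  }
  if ("accountId" in patch) {
    if (patch.accountId === undefined) delete next.accountId;
    else next.accountId = patch.accountId;
  }
  return next;
}
```

A merge based on `patch.to ?? existing.to` cannot distinguish "not mentioned" from "clear this", which is why the check is on key presence.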

Note: protocol:check shows missing internalEvents in Swift models - this is
a pre-existing issue unrelated to these changes (upstream sync needed).

* fix(cron): use separate schema for failureDestination and fix type cast

- Create CronFailureDestinationSchema excluding after/cooldownMs fields
- Fix type cast in sendFailureNotificationAnnounce to use CronMessageChannel

* fix(cron): merge global failureDestination with partial job overrides

When job has partial failureDestination config, fall back to global
config for unset fields instead of treating it as a full override.

* fix(cron): avoid forcing announce mode and clear inherited 'to' on mode change

- UI: only include mode in patch if explicitly set to non-default
- delivery.ts: clear inherited 'to' when job overrides mode, since URL
  semantics differ between announce and webhook modes

* fix(cron): preserve explicit 'to' on mode override and always include mode in UI patches

- delivery.ts: preserve job-level explicit 'to' when overriding mode
- UI: always include mode in failureAlert patch so users can switch between announce/webhook

* fix(cron): allow clearing accountId and treat undefined global mode as announce

- UI: always include accountId in patch so users can clear it
- delivery.ts: treat undefined global mode as announce when comparing for clearing inherited 'to'
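
Taken together, the last few delivery.ts fixes (field-wise fallback to the global config, clearing an inherited 'to' when the job changes mode, keeping an explicit job-level 'to', and treating an undefined global mode as announce) can be sketched like this. The function name and types are hypothetical, and the actual code in delivery.ts may be structured differently.

```typescript
type FailureMode = "announce" | "webhook";

interface FailureDestination {
  mode?: FailureMode;
  channel?: string;
  to?: string;
  accountId?: string;
}

function resolveEffectiveFailureDestinationSketch(
  globalConfig: FailureDestination | undefined,
  jobConfig: FailureDestination | undefined,
): FailureDestination | undefined {
  if (!globalConfig && !jobConfig) return undefined;

  // Start from the global config so unset job fields fall back to it.
  const merged: FailureDestination = { ...globalConfig };

  // An undefined global mode counts as "announce" for this comparison.
  const globalMode: FailureMode = globalConfig?.mode ?? "announce";
  if (jobConfig?.mode !== undefined && jobConfig.mode !== globalMode) {
    // The inherited 'to' was written for the other mode (channel target
    // versus webhook URL), so it is dropped when the job switches mode.
    delete merged.to;
  }

  // Job-level values win whenever they are set, including an explicit 'to'.
  if (jobConfig?.mode !== undefined) merged.mode = jobConfig.mode;
  if (jobConfig?.channel !== undefined) merged.channel = jobConfig.channel;
  if (jobConfig?.to !== undefined) merged.to = jobConfig.to;
  if (jobConfig?.accountId !== undefined) merged.accountId = jobConfig.accountId;

  return merged;
}
```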

* Cron: harden failure destination routing and add regression coverage

* Cron: resolve failure destination review feedback

* Cron: drop unrelated timeout assertions from conflict resolution

* Cron: format cron CLI regression test

* Cron: align gateway cron test mock types

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
Evgeny Zislis
2026-03-02 17:27:41 +02:00
committed by GitHub
parent a905b6dabc
commit 4b4ea5df8b
22 changed files with 993 additions and 89 deletions

@@ -138,10 +138,33 @@ export const CronPayloadPatchSchema = Type.Union([
   cronAgentTurnPayloadSchema({ message: Type.Optional(NonEmptyString) }),
 ]);
 
+export const CronFailureAlertSchema = Type.Object(
+  {
+    after: Type.Optional(Type.Integer({ minimum: 1 })),
+    channel: Type.Optional(Type.Union([Type.Literal("last"), NonEmptyString])),
+    to: Type.Optional(Type.String()),
+    cooldownMs: Type.Optional(Type.Integer({ minimum: 0 })),
+    mode: Type.Optional(Type.Union([Type.Literal("announce"), Type.Literal("webhook")])),
+    accountId: Type.Optional(NonEmptyString),
+  },
+  { additionalProperties: false },
+);
+
+export const CronFailureDestinationSchema = Type.Object(
+  {
+    channel: Type.Optional(Type.Union([Type.Literal("last"), NonEmptyString])),
+    to: Type.Optional(Type.String()),
+    accountId: Type.Optional(NonEmptyString),
+    mode: Type.Optional(Type.Union([Type.Literal("announce"), Type.Literal("webhook")])),
+  },
+  { additionalProperties: false },
+);
+
 const CronDeliverySharedProperties = {
   channel: Type.Optional(Type.Union([Type.Literal("last"), NonEmptyString])),
   accountId: Type.Optional(NonEmptyString),
   bestEffort: Type.Optional(Type.Boolean()),
+  failureDestination: Type.Optional(CronFailureDestinationSchema),
 };
 
 const CronDeliveryNoopSchema = Type.Object(
@@ -188,16 +211,6 @@ export const CronDeliveryPatchSchema = Type.Object(
   { additionalProperties: false },
 );
 
-export const CronFailureAlertSchema = Type.Object(
-  {
-    after: Type.Optional(Type.Integer({ minimum: 1 })),
-    channel: Type.Optional(Type.Union([Type.Literal("last"), NonEmptyString])),
-    to: Type.Optional(Type.String()),
-    cooldownMs: Type.Optional(Type.Integer({ minimum: 0 })),
-  },
-  { additionalProperties: false },
-);
-
 export const CronJobStateSchema = Type.Object(
   {
     nextRunAtMs: Type.Optional(Type.Integer({ minimum: 0 })),