
AWS Lambda Scheduled Jobs in 2026: EventBridge Cron Rules, Error Handling, and Terraform

Run AWS Lambda on a schedule with EventBridge: cron expressions, rate expressions, overlap prevention, error handling with DLQ, idempotency, and Terraform configuration.

Viprasol Tech Team
February 22, 2027
13 min read


Every SaaS product needs scheduled jobs: send the weekly digest email, retry failed payments, generate monthly invoices, clean up expired sessions. On AWS, the standard pattern is EventBridge Scheduler (or EventBridge Rules) triggering Lambda. It's fully managed, has no servers to maintain, and costs almost nothing for typical cron workloads.

This post covers EventBridge cron and rate expressions, the Terraform configuration, Lambda handler patterns for scheduled work, overlap prevention using DynamoDB, DLQ for failed invocations, and idempotency for safe retries.


EventBridge vs EventBridge Scheduler

Two options in 2026:

| Feature | EventBridge Rules | EventBridge Scheduler |
|---|---|---|
| Schedule types | cron + rate | cron + rate + one-time |
| Timezone support | UTC only | ✅ Any timezone |
| Flexible windows | No | ✅ (run within X minutes of schedule) |
| Retry on failure | No (need DLQ) | ✅ Built-in retry + DLQ |
| Cost | Free (first 5M/month) | $1.00/M invocations |
| Use for | Simple recurring jobs | Timezone-aware jobs, one-time jobs, built-in retry |

For most SaaS cron needs, EventBridge Scheduler is the better choice in 2026 — built-in retry, timezone support, and flexible windows.
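One thing Rules cannot do at all is one-time schedules. With Scheduler, a follow-up job (say, an invoice reminder at a specific future moment) is a single CreateSchedule call. Here is a sketch of the request shape; the field names follow the CreateScheduleCommand input in @aws-sdk/client-scheduler, and ActionAfterCompletion plus the at() expression are the parts to verify against the SDK docs before relying on them:

```typescript
// Sketch: input for EventBridge Scheduler's CreateSchedule API (one-time run).
// Field names follow @aws-sdk/client-scheduler's CreateScheduleCommand input;
// the ARNs passed in are placeholders.
function oneTimeScheduleInput(
  name: string,
  runAt: string,      // local wall-clock time, "yyyy-mm-ddThh:mm:ss"
  timezone: string,   // IANA name, e.g. "America/New_York"
  lambdaArn: string,
  roleArn: string
) {
  return {
    Name: name,
    ScheduleExpression: `at(${runAt})`,   // one-time schedules use at(...)
    ScheduleExpressionTimezone: timezone,
    FlexibleTimeWindow: { Mode: "OFF" },
    Target: { Arn: lambdaArn, RoleArn: roleArn },
    ActionAfterCompletion: "DELETE",      // clean up the schedule after it fires
  };
}
```

Pass the object to `new CreateScheduleCommand(...)` and send it with a `SchedulerClient`; as with the Terraform setup in the next section, the role must allow `lambda:InvokeFunction` on the target.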


Terraform: EventBridge Scheduler

# terraform/scheduled-jobs.tf

# IAM role for EventBridge Scheduler to invoke Lambda
resource "aws_iam_role" "scheduler" {
  name = "${var.name}-${var.environment}-scheduler"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "scheduler.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "scheduler_invoke" {
  name = "invoke-lambdas"
  role = aws_iam_role.scheduler.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "lambda:InvokeFunction"
      Resource = [
        aws_lambda_function.weekly_digest.arn,
        aws_lambda_function.payment_retry.arn,
        aws_lambda_function.session_cleanup.arn,
      ]
    }]
  })
}

# Weekly digest: every Monday at 9am US/Eastern
resource "aws_scheduler_schedule" "weekly_digest" {
  name       = "${var.name}-${var.environment}-weekly-digest"
  group_name = aws_scheduler_schedule_group.app.name

  flexible_time_window {
    mode                      = "FLEXIBLE"
    maximum_window_in_minutes = 15  # Run within 15 min of 9am (lets AWS spread invocation load)
  }

  schedule_expression          = "cron(0 9 ? * MON *)"
  schedule_expression_timezone = "America/New_York"

  target {
    arn      = aws_lambda_function.weekly_digest.arn
    role_arn = aws_iam_role.scheduler.arn

    input = jsonencode({
      jobType = "weekly_digest"
      env     = var.environment
    })

    retry_policy {
      maximum_retry_attempts       = 2
      maximum_event_age_in_seconds = 3600  # Give up after 1 hour
    }

    dead_letter_config {
      arn = aws_sqs_queue.scheduler_dlq.arn
    }
  }
}

# Payment retry: every hour
resource "aws_scheduler_schedule" "payment_retry" {
  name       = "${var.name}-${var.environment}-payment-retry"
  group_name = aws_scheduler_schedule_group.app.name

  flexible_time_window {
    mode = "OFF"  # Must run exactly on schedule (payments are time-sensitive)
  }

  schedule_expression          = "rate(1 hour)"
  schedule_expression_timezone = "UTC"

  target {
    arn      = aws_lambda_function.payment_retry.arn
    role_arn = aws_iam_role.scheduler.arn

    retry_policy {
      maximum_retry_attempts       = 1
      maximum_event_age_in_seconds = 300  # Give up after 5 min
    }

    dead_letter_config {
      arn = aws_sqs_queue.scheduler_dlq.arn
    }
  }
}

# Session cleanup: daily at 3am UTC
resource "aws_scheduler_schedule" "session_cleanup" {
  name       = "${var.name}-${var.environment}-session-cleanup"
  group_name = aws_scheduler_schedule_group.app.name

  flexible_time_window {
    mode                      = "FLEXIBLE"
    maximum_window_in_minutes = 60  # Run any time within the 3am hour
  }

  schedule_expression          = "cron(0 3 * * ? *)"
  schedule_expression_timezone = "UTC"

  target {
    arn      = aws_lambda_function.session_cleanup.arn
    role_arn = aws_iam_role.scheduler.arn

    retry_policy {
      maximum_retry_attempts       = 2
      maximum_event_age_in_seconds = 7200  # Give up after 2 hours
    }

    dead_letter_config {
      arn = aws_sqs_queue.scheduler_dlq.arn
    }
  }
}

resource "aws_scheduler_schedule_group" "app" {
  name = "${var.name}-${var.environment}"
  tags = var.common_tags
}

# DLQ for failed scheduler invocations
resource "aws_sqs_queue" "scheduler_dlq" {
  name                      = "${var.name}-${var.environment}-scheduler-dlq"
  message_retention_seconds = 1209600  # 14 days

  tags = var.common_tags
}

# Alert when DLQ has messages
resource "aws_cloudwatch_metric_alarm" "scheduler_dlq" {
  alarm_name          = "${var.name}-${var.environment}-scheduler-dlq-depth"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "ApproximateNumberOfMessagesVisible"
  namespace           = "AWS/SQS"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "Scheduled job failed and hit DLQ"
  alarm_actions       = [aws_sns_topic.alerts.arn]

  dimensions = { QueueName = aws_sqs_queue.scheduler_dlq.name }
}


Cron Expression Reference

EventBridge cron format: cron(Minutes Hours Day-of-month Month Day-of-week Year)

cron(0 9 ? * MON *)      → Every Monday at 9:00 AM
cron(0 9 1 * ? *)        → 1st of every month at 9:00 AM
cron(0 0 L * ? *)        → Last day of every month at midnight
cron(0 */6 * * ? *)      → Every 6 hours
cron(0 9 ? * MON-FRI *)  → Weekdays at 9:00 AM
cron(15 14 1 1 ? *)      → January 1st at 2:15 PM (annual report)

rate expressions:
rate(1 minute)
rate(5 minutes)
rate(1 hour)
rate(7 days)

Note: Day-of-month and Day-of-week cannot both be specified; use ? for the one not used.
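That ? rule is easy to miss in review. As an illustration (a hypothetical helper, not part of any AWS API), a sanity check that an expression has six fields and exactly one of the day fields set to ?:

```typescript
// Hypothetical sanity check for EventBridge's six-field cron format:
// Minutes Hours Day-of-month Month Day-of-week Year, with exactly one
// of day-of-month / day-of-week set to "?".
function validEventBridgeCron(expr: string): boolean {
  const m = expr.match(/^cron\(([^)]+)\)$/);
  if (!m) return false;
  const fields = m[1].trim().split(/\s+/);
  if (fields.length !== 6) return false;
  const [, , dayOfMonth, , dayOfWeek] = fields;
  // EventBridge rejects expressions where both day fields are concrete
  // (and where both are "?"), so exactly one must be "?".
  return (dayOfMonth === "?") !== (dayOfWeek === "?");
}
```

This only covers the structural rule; it does not validate field values, so treat it as a first-pass lint rather than a full parser.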

Lambda Handler: Scheduled Job Pattern

// handlers/weekly-digest.ts
import type { ScheduledEvent, Context } from "aws-lambda";
import { db } from "@/lib/db";
import { sendDigestEmail } from "@/lib/emails/digest";

export async function handler(event: ScheduledEvent, context: Context) {
  const jobId = "weekly-digest";
  const executionId = context.awsRequestId;

  console.log(`[${jobId}] Starting execution ${executionId}`);

  // Idempotency: skip if already ran for this period
  const weekKey = getWeekKey(); // e.g., "2027-W08"
  const alreadyRan = await db.jobExecution.findUnique({
    where: { jobId_periodKey: { jobId, periodKey: weekKey } },
  });

  if (alreadyRan) {
    console.log(`[${jobId}] Already ran for ${weekKey}, skipping`);
    return { skipped: true, reason: "already_ran_for_period" };
  }

  // Record execution. The unique constraint on (jobId, periodKey) guards
  // against two concurrent invocations both passing the check above.
  try {
    await db.jobExecution.create({
      data: {
        jobId,
        periodKey: weekKey,
        executionId,
        status: "running",
        startedAt: new Date(),
      },
    });
  } catch (err: any) {
    if (err.code === "P2002") {
      // Another invocation won the race; let it do the work
      return { skipped: true, reason: "concurrent_execution" };
    }
    throw err;
  }

  const stats = { sent: 0, skipped: 0, failed: 0 };

  try {
    // Process in batches to avoid Lambda timeout
    let cursor: string | undefined;

    do {
      const workspaces = await db.workspace.findMany({
        where: {
          subscription: { status: "active" },
          settings: { weeklyDigest: true },
          ...(cursor && { id: { gt: cursor } }),
        },
        take: 50,
        orderBy: { id: "asc" },
        select: { id: true, name: true },
      });

      if (workspaces.length === 0) break;

      await Promise.allSettled(
        workspaces.map(async (workspace) => {
          try {
            await sendDigestEmail(workspace.id);
            stats.sent++;
          } catch (err) {
            console.error(`Failed to send digest for workspace ${workspace.id}:`, err);
            stats.failed++;
          }
        })
      );

      cursor = workspaces[workspaces.length - 1].id;

      // Check remaining Lambda time (leave a 30s buffer). A production version
      // would persist the cursor and re-invoke itself to finish the remainder
      // instead of marking the run completed with partial stats.
      const remainingMs = context.getRemainingTimeInMillis();
      if (remainingMs < 30_000) {
        console.warn(`[${jobId}] Running low on time (${remainingMs}ms), stopping`);
        break;
      }
    } while (true);

    await db.jobExecution.update({
      where: { jobId_periodKey: { jobId, periodKey: weekKey } },
      data: { status: "completed", completedAt: new Date(), result: stats },
    });

    console.log(`[${jobId}] Completed:`, stats);
    return { success: true, stats };
  } catch (err) {
    await db.jobExecution.update({
      where: { jobId_periodKey: { jobId, periodKey: weekKey } },
      data: { status: "failed", completedAt: new Date(), error: String(err) },
    });
    throw err; // Re-throw for EventBridge Scheduler retry
  }
}

function getWeekKey(): string {
  // ISO 8601 week key, e.g. "2027-W08". The ISO week-year can differ from the
  // calendar year around New Year, so derive both from the same shifted date.
  const now = new Date();
  const d = new Date(Date.UTC(now.getFullYear(), now.getMonth(), now.getDate()));
  const dayNum = d.getUTCDay() || 7;          // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - dayNum);  // Shift to the Thursday of this ISO week
  const yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  const week = Math.ceil(((d.getTime() - yearStart.getTime()) / 86400000 + 1) / 7);
  return `${d.getUTCFullYear()}-W${week.toString().padStart(2, "0")}`;
}
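The same period-key idempotency works for the other cadences in this post. A hypothetical helper (not in the codebase above) covering daily and monthly jobs:

```typescript
// Hypothetical helper: build an idempotency period key for a given cadence,
// mirroring the weekly getWeekKey() pattern. UTC-based for determinism.
function getPeriodKey(cadence: "daily" | "monthly", date: Date = new Date()): string {
  const y = date.getUTCFullYear();
  const m = (date.getUTCMonth() + 1).toString().padStart(2, "0");
  if (cadence === "monthly") return `${y}-${m}`;           // e.g. "2027-02"
  const d = date.getUTCDate().toString().padStart(2, "0");
  return `${y}-${m}-${d}`;                                 // e.g. "2027-02-22"
}
```

A monthly invoicing job would then look up `jobExecution` by `("monthly-invoices", getPeriodKey("monthly"))` exactly as the digest handler does with its week key.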


Overlap Prevention (Long-Running Jobs)

For jobs that could overlap if the previous run is still active:

// lib/jobs/lock.ts — DynamoDB-based distributed lock
import { DynamoDBClient, PutItemCommand, DeleteItemCommand } from "@aws-sdk/client-dynamodb";

const dynamo = new DynamoDBClient({ region: process.env.AWS_REGION });
const TABLE = process.env.JOB_LOCKS_TABLE!;

export async function acquireJobLock(jobId: string, ttlSeconds: number = 3600): Promise<boolean> {
  const expiresAt = Math.floor(Date.now() / 1000) + ttlSeconds;

  try {
    await dynamo.send(new PutItemCommand({
      TableName: TABLE,
      Item: {
        jobId:     { S: jobId },
        expiresAt: { N: String(expiresAt) },
        acquiredAt: { N: String(Math.floor(Date.now() / 1000)) },
      },
      // Only succeed if item doesn't exist (or has expired)
      ConditionExpression: "attribute_not_exists(jobId) OR expiresAt < :now",
      ExpressionAttributeValues: {
        ":now": { N: String(Math.floor(Date.now() / 1000)) },
      },
    }));
    return true; // Lock acquired
  } catch (err: any) {
    if (err.name === "ConditionalCheckFailedException") {
      return false; // Lock held by another execution
    }
    throw err;
  }
}

export async function releaseJobLock(jobId: string): Promise<void> {
  // Note: this deletes unconditionally. If a run outlives its TTL and another
  // execution re-acquires the lock, this would delete the new holder's lock;
  // storing an owner token and deleting with a ConditionExpression avoids that.
  await dynamo.send(new DeleteItemCommand({
    TableName: TABLE,
    Key: { jobId: { S: jobId } },
  }));
}
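Wiring the lock into a handler is a few lines of acquire/try/finally. A sketch of a reusable wrapper, with the acquire/release functions passed in (so the wrapper itself has no DynamoDB dependency; the functions above would be the real arguments):

```typescript
// Sketch: run work() only if the job lock can be acquired; always release
// afterwards. acquire/release are injected (e.g. acquireJobLock/releaseJobLock).
async function withJobLock<T>(
  jobId: string,
  acquire: (jobId: string) => Promise<boolean>,
  release: (jobId: string) => Promise<void>,
  work: () => Promise<T>
): Promise<T | { skipped: true }> {
  if (!(await acquire(jobId))) {
    console.log(`[${jobId}] Previous run still holds the lock, skipping`);
    return { skipped: true };
  }
  try {
    return await work();
  } finally {
    await release(jobId); // Release even when work() throws
  }
}
```

In the session-cleanup handler this would look like `return withJobLock("session-cleanup", acquireJobLock, releaseJobLock, runCleanup)`, where `runCleanup` is a hypothetical function containing the actual work.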

Cost Estimation

| Schedule | Invocations/Month | EventBridge Scheduler Cost |
|---|---|---|
| Hourly | 720 | ~$0.001 |
| Daily | 30 | ~$0.00003 |
| Weekly | 4 | ~$0.000004 |
| Every minute | 43,200 | ~$0.04 |
| All of the above | ~44,000 | ~$0.04 + Lambda execution |

Lambda execution: roughly $0.0000042 per second of runtime at 256 MB (the published on-demand rate is about $0.0000167 per GB-second, plus $0.20 per million requests). Cron jobs are effectively free at typical volumes.
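The table's arithmetic generalizes to any schedule. A back-of-envelope model, using the public on-demand rates at the time of writing (verify against current AWS pricing; this ignores the per-request charge and free tiers):

```typescript
// Back-of-envelope monthly cost for a scheduled Lambda job.
// Rates are assumptions: verify against current AWS pricing pages.
const SCHEDULER_PER_INVOCATION = 1.0 / 1_000_000; // $1.00 per million invocations
const LAMBDA_PER_GB_SECOND = 0.0000166667;        // x86 on-demand compute rate

function monthlyCronCost(
  invocations: number,     // invocations per month
  avgDurationSec: number,  // average Lambda runtime in seconds
  memoryMb: number         // configured memory
): number {
  const scheduler = invocations * SCHEDULER_PER_INVOCATION;
  const lambda = invocations * avgDurationSec * (memoryMb / 1024) * LAMBDA_PER_GB_SECOND;
  return scheduler + lambda;
}
```

For the hourly job (720 invocations, ~1s at 256 MB) this comes out to well under a cent per month, matching the table above.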




Working With Viprasol

We design and implement AWS scheduled job systems for SaaS products — from simple hourly cleanup jobs through complex multi-step batch workflows with overlap prevention and failure monitoring. Our cloud team has shipped scheduled systems processing millions of records per run.

What we deliver:

  • EventBridge Scheduler setup with cron expressions and timezone support
  • Lambda handler with idempotency keys and batch cursor pagination
  • DynamoDB-based overlap prevention for long-running jobs
  • DLQ configuration with CloudWatch alarm
  • Terraform module for the complete scheduled job stack

See our cloud infrastructure services or contact us to build your scheduled job system.
