
Data Engineering With Dagster Part Five – Automating With Schedules

Dagster finally earns its “orchestrator” title - this part dives into jobs, asset selection, cron expressions, and how to wire everything into automated schedules.

Finally Automating: Dagster Schedules

Let’s be honest - if you’re still manually triggering pipeline runs in 2025, your orchestrator is more like a glorified button clicker.

In this post, we’ll hook Dagster up with schedules to automate jobs and run selected assets on our terms - daily, weekly, monthly, you name it.


What Is a Schedule?

Think: “Run this job every Monday at midnight.” That’s what a schedule is - it defines when a Dagster job should execute.

You’ve probably seen similar setups using cron in Linux or crontab files to schedule shell scripts.

Dagster just gives it all a Pythonic polish and tucks it neatly into your orchestration layer.

Anatomy of a Dagster Schedule

To build a schedule, you need:

  • Job (what to run)
  • Cron expression (when to run it)

Other optional components:

  • Tags
  • Run configuration
  • Execution time evaluation

But for now, we’re keeping it lean and practical: Job + Cron = Schedule.

Step 1: Define Your Job

When working with a big asset graph, you don’t always want to materialize everything in one go. Jobs let you slice up your graph and selectively execute parts of it.

Here’s how to isolate an asset using AssetSelection and define a custom job:

# jobs.py
import dagster as dg

trips_by_week = dg.AssetSelection.assets("trips_by_week")

trip_update_job = dg.define_asset_job(
    name="trip_update_job",
    selection=dg.AssetSelection.all() - trips_by_week
)
💡 We exclude trips_by_week from this job because it has its own separate schedule. Think of this as the "everything else" job.

Mini Excursus: What Are Dagster Jobs, Really?

A job in Dagster is a reusable, named way to trigger a set of asset materializations.

Why care?

  • You can create different jobs for different subsets of assets.
  • Run one job in a K8s pod and another in-process.
  • Schedule them differently.
  • Add custom configs or tags.

Best practice: keep job definitions in a dedicated module, like jobs.py.


Step 2: Add a Second Job for Our Weekly Asset

Let’s create a job that only runs the trips_by_week asset:

# jobs.py
weekly_update_job = dg.define_asset_job(
    name="weekly_update_job",
    selection=trips_by_week
)

Simple and scoped. Clean separation of logic.

Step 3: Cron Expressions 101

Cron is that weird 5-part string format that looks like keyboard spam but secretly controls most of automation:

15 5 * * 1-5

That means: every weekday (Mon–Fri) at 5:15 AM.

Dagster uses the same syntax. Example:

# schedules.py
import dagster as dg

from dagster_essentials.jobs import trip_update_job

trip_update_schedule = dg.ScheduleDefinition(
    job=trip_update_job,
    cron_schedule="0 0 5 * *",  # At midnight on the 5th of every month
)

Mini Excursus: Understanding Cron Syntax

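| Field   | Allowed values | Example | Notes            |
|---------|----------------|---------|------------------|
| Minute  | 0–59           | 0       | Top of the hour  |
| Hour    | 0–23           | 0       | Midnight         |
| Day     | 1–31           | 5       | 5th of the month |
| Month   | 1–12           | *       | Every month      |
| Weekday | 0–6 (Sun–Sat)  | 1       | Every Monday     |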

Want help building cron expressions? Use Crontab Guru - it’s simple, accurate, and shows examples live.
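If you want to see the five fields at work without leaving Python, here is a small standard-library sketch. It is not how Dagster evaluates schedules internally; the helper names (expand_field, matches, next_run) are made up for this example, and it only understands `*`, single numbers, ranges, comma lists, and steps on `*` or ranges.

```python
from datetime import datetime, timedelta

# Allowed (min, max) per field: minute, hour, day of month, month, weekday
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field such as "*", "5", "1-5", "*/15" or "1,3" into a set."""
    values: set[int] = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        if start < lo or end > hi:
            raise ValueError(f"{part!r} is outside {lo}-{hi}")
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def matches(cron: str, when: datetime) -> bool:
    """Return True if the datetime falls on a minute the expression selects."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError("expected exactly five fields")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    # Python counts Monday as 0, cron counts Sunday as 0
    cron_weekday = (when.weekday() + 1) % 7
    dom_ok = when.day in dom
    dow_ok = cron_weekday in dow
    # Classic cron rule: if both day fields are restricted, either one may match
    if fields[2] != "*" and fields[4] != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        when.minute in minute
        and when.hour in hour
        and when.month in month
        and day_ok
    )


def next_run(cron: str, after: datetime) -> datetime:
    """Walk forward minute by minute until the expression matches (demo only)."""
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):
        if matches(cron, candidate):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError("no matching time within a year")
```

For example, matches("15 5 * * 1-5", datetime(2025, 1, 6, 5, 15)) is True because 6 January 2025 is a Monday, and next_run("0 0 5 * *", datetime(2025, 1, 6)) lands on 5 February 2025 at 00:00. The minute-by-minute walk is slow on purpose: it keeps the logic obvious rather than clever.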


Step 4: Schedule the Weekly Job

# schedules.py
from dagster_essentials.jobs import weekly_update_job

weekly_update_schedule = dg.ScheduleDefinition(
    job=weekly_update_job,
    cron_schedule="0 0 * * 1",  # Every Monday at midnight
)

Step 5: Plug Jobs into Definitions

Head to definitions.py and tell Dagster about your jobs and schedules:

# definitions.py
import dagster as dg

from dagster_essentials.jobs import trip_update_job, weekly_update_job
from dagster_essentials.schedules import trip_update_schedule, weekly_update_schedule

# trip_assets, metric_assets, and database_resource are assumed to be
# defined already in this file from the earlier parts of the series.
all_jobs = [trip_update_job, weekly_update_job]
all_schedules = [trip_update_schedule, weekly_update_schedule]

defs = dg.Definitions(
    assets=[*trip_assets, *metric_assets],
    resources={"database": database_resource},
    jobs=all_jobs,
    schedules=all_schedules,
)

Boom. Now it’s all hooked in.


Testing It in the UI

Spin up the Dagster UI (dagster dev) and check:

  • Jobs: Overview > Jobs
  • Schedules: Overview > Schedules

You’ll see:

| Field            | Description                           |
|------------------|---------------------------------------|
| Job Name         | trip_update_job / weekly_update_job   |
| Schedules Linked | One or more                           |
| Last Run         | Timestamp of most recent execution    |
| Enabled?         | Yes/No toggle (click to toggle state) |

You can test schedules using the "Test Schedule" button - simulate ticks and preview runs before they’re real.


Mini Excursus: dagster-daemon

When you run dagster dev, you also spawn dagster-daemon under the hood.

This background process handles:

  • Running schedules
  • Polling sensors
  • Handling automatic run retries

Without the daemon, schedules won’t tick. So always make sure it’s up when testing automation.
