Resource Management During Waits
When a task needs to wait (for time, an event, or child results), how does Hatchet handle the worker slot? The answer depends on which pattern you’re using.
Task Eviction
When a durable task enters a wait, whether from SleepFor, WaitForEvent, or WaitFor, Hatchet evicts the task from the worker. The worker slot is released, the task’s progress is persisted in the durable event log, and the task does not consume slots or hold resources while it is idle.
This is what makes durable tasks fundamentally different from regular tasks: a regular task consumes a slot for the entire duration of execution, even if it’s just sleeping. A durable task gives the slot back the moment it starts waiting.
How eviction works
- Task reaches a wait. The durable task calls SleepFor, WaitForEvent, or WaitFor.
- Checkpoint is written. Hatchet records the current progress in the durable event log.
- Worker slot is freed. The task is evicted from the worker. The slot is immediately available for other tasks.
- Wait completes. When the sleep expires or the expected event arrives, Hatchet re-queues the task.
- Task resumes on any available worker. A worker picks up the task, replays the event log to the last checkpoint, and continues execution from where it left off.
The resumed task does not need to run on the same worker that originally started it. Any worker that has registered the task can pick it up.
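The lifecycle above can be sketched as a toy model in plain Python. This is not the Hatchet SDK: `DurableContext`, `Evicted`, `sleep_for`, and `run_until_done` are hypothetical names invented for illustration, and the model assumes that a resume re-runs the task function from the top while waits already recorded in the log return immediately.

```python
class Evicted(Exception):
    """Raised to unwind the task when it reaches a wait that has not completed."""


class DurableContext:
    """Hypothetical stand-in for a durable task context (not the Hatchet API)."""

    def __init__(self, log):
        self.log = log      # persisted event log: names of completed waits
        self.cursor = 0     # replay position within the log

    def sleep_for(self, name):
        # A wait already in the log is replayed: return without suspending.
        if self.cursor < len(self.log):
            assert self.log[self.cursor] == name, "checkpoint mismatch"
            self.cursor += 1
            return
        # A new wait suspends the task so the caller can free the slot.
        raise Evicted(name)


def task(ctx, trace):
    trace.append("step A")
    ctx.sleep_for("sleep-1")
    trace.append("step B")
    ctx.sleep_for("sleep-2")
    trace.append("step C")
    return "done"


def run_until_done(task_fn):
    log = []          # survives across evictions
    runs = []         # what each pickup executed, for inspection
    free_slots = 1
    while True:
        free_slots -= 1                  # some worker picks the task up
        ctx = DurableContext(log)        # fresh context: any worker could do this
        trace = []
        try:
            result = task_fn(ctx, trace)
            free_slots += 1              # task finished, slot returned
            runs.append(trace)
            return result, runs, free_slots
        except Evicted as wait:
            free_slots += 1              # slot is freed while the task waits
            runs.append(trace)
            log.append(str(wait))        # the wait completes; task is re-queued


result, runs, free_slots = run_until_done(task)
```

In this sketch the task is picked up three times, and the only state carried between pickups is the log, which is why any worker with the task registered could perform the resume.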
Why eviction matters
Without eviction, a task that sleeps for 24 hours would consume a slot for the entire duration, wasting capacity that could be running other work. With eviction, the slot is freed immediately.
This is especially important for:
- Long waits — Tasks that sleep for hours or days should not hold slots.
- Human-in-the-loop — Waiting for a human to approve or respond could take minutes or weeks. Eviction ensures no resources are held in the meantime.
- Large fan-outs — A parent task that spawns thousands of children and waits for results can release its slot while the children run, preventing deadlocks where the parent holds resources that the children need.
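To put a number on the 24-hour example, the following back-of-the-envelope sketch compares how long a slot is held with and without eviction. The helper `slot_seconds_held` and the 4 seconds of actual work are assumptions chosen for illustration, not measurements.

```python
DAY_SECONDS = 24 * 60 * 60


def slot_seconds_held(work_seconds, wait_seconds, evict_on_wait):
    """Seconds a worker slot is occupied by one task run."""
    if evict_on_wait:
        # The slot is only held while the task is actually executing.
        return work_seconds
    # Without eviction the slot is held through the wait as well.
    return work_seconds + wait_seconds


held_without_eviction = slot_seconds_held(4, DAY_SECONDS, evict_on_wait=False)
held_with_eviction = slot_seconds_held(4, DAY_SECONDS, evict_on_wait=True)
ratio = held_without_eviction / held_with_eviction
```

Under these assumed numbers the non-evicted task holds its slot for 86,404 seconds against 4 seconds for the evicted one, so the same slot could serve roughly 21,600 times as many such tasks.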
Separate slot pools
Durable tasks draw slots from a pool that is separate from the one regular tasks use. This prevents a common deadlock: if durable and regular tasks shared the same pool, a durable task waiting on child tasks could hold the very slot those children need to execute.
By isolating slot pools, Hatchet ensures that durable tasks waiting on children never starve the workers that need to run those children.
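The deadlock can be illustrated with a small counting model. It deliberately assumes the worst case, in which each waiting parent still holds its slot, and the function name and pool sizes are hypothetical; it does not reflect how Hatchet schedules work internally.

```python
def children_that_can_start(parents, parent_pool_size, child_pool_size, shared):
    """Return (running_parents, startable_children) for a toy scheduler.

    Each parent occupies one slot while it waits on one child.
    """
    if shared:
        # One pool: parents may occupy every slot, leaving none for children.
        total = parent_pool_size + child_pool_size
        running_parents = min(parents, total)
        free_for_children = total - running_parents
    else:
        # Separate pools: parents can never take slots reserved for children.
        running_parents = min(parents, parent_pool_size)
        free_for_children = child_pool_size
    return running_parents, min(running_parents, free_for_children)


shared_result = children_that_can_start(4, 2, 2, shared=True)
isolated_result = children_that_can_start(4, 2, 2, shared=False)
```

With four parents and four slots in total, the shared pool ends up with all four parents waiting and no child able to start, while the isolated pools admit two parents and still leave two slots for their children.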
Eviction and determinism
Because a task may be evicted and resumed on a different worker at any time, the code between checkpoints must be deterministic. On resume, Hatchet replays the event log; it does not re-execute completed operations. If the code has changed between the original run and the replay, the checkpoint sequence may not match, leading to unexpected behavior.
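A minimal sketch of why this matters, again as a toy model rather than Hatchet code: `ReplayContext`, `wait_for`, and `NonDeterminismError` are hypothetical names, waits are assumed to complete immediately, and the strict mismatch check is an assumption about how a replay engine might react, not documented Hatchet behavior.

```python
class NonDeterminismError(Exception):
    """The replayed code asked for a different checkpoint than the log holds."""


class ReplayContext:
    def __init__(self, log):
        self.log = log      # checkpoints recorded by earlier runs
        self.cursor = 0

    def wait_for(self, key):
        if self.cursor < len(self.log):
            recorded = self.log[self.cursor]
            if recorded != key:
                raise NonDeterminismError(
                    f"log has {recorded!r}, code asked for {key!r}"
                )
            self.cursor += 1
            return "replayed"
        self.log.append(key)    # first time through: record the checkpoint
        self.cursor += 1
        return "recorded"


def stable_task(ctx, attempt):
    # Same checkpoint sequence on every run.
    ctx.wait_for("approval")
    ctx.wait_for("payment")


def unstable_task(ctx, attempt):
    # The branch depends on something that differs between the original run
    # and the replay (standing in for changed code, a clock, or a random value).
    first = "approval" if attempt == 0 else "payment"
    ctx.wait_for(first)


def replays_cleanly(task_fn):
    log = []
    try:
        task_fn(ReplayContext(log), attempt=0)   # original run
        task_fn(ReplayContext(log), attempt=1)   # resume on another worker
        return True
    except NonDeterminismError:
        return False


stable_ok = replays_cleanly(stable_task)
unstable_ok = replays_cleanly(unstable_task)
```

The stable task replays without complaint, while the unstable one requests "payment" where the log recorded "approval", which is the kind of checkpoint mismatch described above.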