Prometheus Metrics for Hatchet

This document provides an overview of the Prometheus metrics exposed by Hatchet, setup instructions for the metrics endpoint, and example PromQL queries to analyze them.

Setup

To enable and configure the Prometheus metrics endpoint in your Hatchet server, set the following environment variables (bound to Viper keys as shown):

  • SERVER_PROMETHEUS_ENABLED (prometheus.enabled)
    • Type: boolean
    • Default: false
    • Description: Enables or disables the Prometheus metrics HTTP server.
  • SERVER_PROMETHEUS_ADDRESS (prometheus.address)
    • Type: string
    • Default: ":9090"
    • Description: The network address and port to bind the Prometheus metrics server to.
  • SERVER_PROMETHEUS_PATH (prometheus.path)
    • Type: string
    • Default: "/metrics"
    • Description: The HTTP path at which metrics will be exposed.

Example environment setup:

export SERVER_PROMETHEUS_ENABLED=true
export SERVER_PROMETHEUS_ADDRESS=":9999"
export SERVER_PROMETHEUS_PATH="/custom-metrics"

Restart your Hatchet server after setting these variables to apply the changes.
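
As a quick check that the endpoint is serving metrics after the restart, the following Python sketch derives a scrape URL from the address and path settings and lists the `hatchet_`-prefixed metric names it finds. The helper names (`metrics_url`, `parse_metric_names`, `fetch_metric_names`) are hypothetical and not part of Hatchet. The sketch assumes plain HTTP and that the server is reachable on `localhost` when the address has no host part.

```python
import urllib.request


def metrics_url(address: str, path: str, default_host: str = "localhost") -> str:
    """Build a scrape URL from SERVER_PROMETHEUS_ADDRESS and SERVER_PROMETHEUS_PATH values."""
    host, _, port = address.rpartition(":")
    host = host or default_host  # an address like ":9999" has no host part
    if not path.startswith("/"):
        path = "/" + path
    # Assumption: the metrics server speaks plain HTTP.
    return f"http://{host}:{port}{path}"


def parse_metric_names(body: str) -> set:
    """Return the hatchet_* sample names found in a Prometheus text exposition."""
    names = set()
    for line in body.splitlines():
        # Comment lines (# HELP, # TYPE) start with "#", so they are skipped here.
        if line.startswith("hatchet_"):
            # Sample lines look like `name{labels} value` or `name value`.
            names.add(line.split("{")[0].split(" ")[0])
    return names


def fetch_metric_names(url: str, timeout: float = 5.0) -> set:
    """Fetch the endpoint and return the hatchet_* sample names it exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metric_names(resp.read().decode("utf-8"))
```

With the example environment above, `metrics_url(":9999", "/custom-metrics")` gives `http://localhost:9999/custom-metrics`, which can then be passed to `fetch_metric_names`.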


Metrics

| Metric Name | Type | Description |
| --- | --- | --- |
| hatchet_queue_invocations_total | Counter | The total number of invocations of the queuer function |
| hatchet_created_tasks_total | Counter | The total number of tasks created |
| hatchet_retried_tasks_total | Counter | The total number of tasks retried |
| hatchet_succeeded_tasks_total | Counter | The total number of tasks that succeeded |
| hatchet_failed_tasks_total | Counter | The total number of tasks that failed (in a final state, not including retries) |
| hatchet_skipped_tasks_total | Counter | The total number of tasks that were skipped |
| hatchet_cancelled_tasks_total | Counter | The total number of tasks cancelled |
| hatchet_assigned_tasks_total | Counter | The total number of tasks assigned to a worker |
| hatchet_scheduling_timed_out_total | Counter | The total number of tasks that timed out while waiting to be scheduled |
| hatchet_rate_limited_total | Counter | The total number of tasks that were rate limited |
| hatchet_queued_to_assigned_total | Counter | The total number of unique tasks that were queued and later assigned to a worker |
| hatchet_queued_to_assigned_seconds | Histogram | Buckets of time (in seconds) spent in the queue before being assigned to a worker |

Example PromQL Queries

1. Rate of calls to the queuer function

rate(hatchet_queue_invocations_total[5m])

2. Average queue time in milliseconds

# Calculates average queue time over the past 5 minutes, converted to ms
rate(hatchet_queued_to_assigned_seconds_sum[5m])
  / rate(hatchet_queued_to_assigned_seconds_count[5m])
  * 1e3
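
To make the arithmetic behind this query concrete: because both `rate()` calls divide by the same window, the per-second factor cancels, and the result is approximately the increase in `_sum` divided by the increase in `_count`. The Python sketch below reproduces that calculation from two scrapes. The function name and the sample values are illustrative assumptions, not Hatchet output.

```python
def average_queue_time_ms(sum_start, sum_end, count_start, count_end):
    """Mean queue time in ms for tasks assigned between two scrapes.

    sum_*   : values of hatchet_queued_to_assigned_seconds_sum
    count_* : values of hatchet_queued_to_assigned_seconds_count
    """
    assigned = count_end - count_start
    if assigned <= 0:
        # No tasks were assigned in the window, or the counter was reset.
        return None
    return (sum_end - sum_start) / assigned * 1e3


# Hypothetical scrapes: 40 tasks assigned, 6 seconds of total queue time.
avg_ms = average_queue_time_ms(12.0, 18.0, 100, 140)
print(avg_ms)
```

With these made-up numbers, 6 seconds spread over 40 tasks is 0.15 seconds per task, or about 150 ms.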

3. Success and failure rates

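# Successful tasks per second (run as its own query)
rate(hatchet_succeeded_tasks_total[5m])
# Failed tasks per second (run as its own query)
rate(hatchet_failed_tasks_total[5m])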
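
The two rates are often combined into a single failure ratio. The Python sketch below shows one possible definition, failed tasks divided by succeeded plus failed tasks over the same window. Both the function name and the choice to ignore skipped and cancelled tasks are assumptions made for illustration.

```python
def failure_ratio(succeeded_delta: float, failed_delta: float):
    """Fraction of finished tasks that failed within a window.

    succeeded_delta : increase of hatchet_succeeded_tasks_total over the window
    failed_delta    : increase of hatchet_failed_tasks_total over the window
    """
    finished = succeeded_delta + failed_delta
    if finished <= 0:
        # Nothing finished in the window, so the ratio is undefined.
        return None
    return failed_delta / finished


# Hypothetical window: 95 successes and 5 final failures.
ratio = failure_ratio(95, 5)
print(ratio)
```

Returning `None` for an empty window avoids a division by zero, which is the same case in which the equivalent PromQL division yields no usable value.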

4. Queue time distribution (histogram)

sum by (le) (
  rate(hatchet_queued_to_assigned_seconds_bucket[5m])
)
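
The per-`le` series returned by this query are cumulative, which is the input PromQL's `histogram_quantile` function expects, e.g. `histogram_quantile(0.95, sum by (le) (rate(hatchet_queued_to_assigned_seconds_bucket[5m])))`. The Python sketch below illustrates roughly how such an estimate is derived by linear interpolation within a bucket. The bucket boundaries and counts are made up, since the actual boundaries Hatchet uses are not listed here, and the function name is hypothetical.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
             with the last entry being the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # assume the first bucket starts at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative counts for bounds 0.1s, 0.5s, 1s and +Inf.
example = [(0.1, 20), (0.5, 80), (1.0, 95), (math.inf, 100)]
median = estimate_quantile(0.5, example)
print(median)
```

In this made-up example the median rank (50) falls halfway through the 0.1 s to 0.5 s bucket, so the estimate is about 0.3 seconds. The accuracy of any such estimate depends on how finely the buckets are spaced around the quantile of interest.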

5. Rate of tasks created vs. retried

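# Tasks created per second (run as its own query)
rate(hatchet_created_tasks_total[5m])
# Tasks retried per second (run as its own query)
rate(hatchet_retried_tasks_total[5m])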