Workflows

Robust workflows are explicit, observable, and resilient to retries and partial failures. This page covers practical design principles, error handling, and scaling guidance.

Design principles

  • Small steps: Break long operations into independently retryable units.
  • Stable contracts: Version payloads and avoid breaking field changes.
  • Idempotency: Allow safe re-execution without duplicate side effects.
  • Observability: Include trace IDs and execution IDs in every log and event.
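
As an illustration of the idempotency principle, here is a minimal sketch that records each step's result under an idempotency key, so a retried step returns the stored result instead of repeating its side effect. The `IdempotentExecutor` class and the key format are hypothetical, and the in-memory dictionary stands in for whatever durable store a real workflow engine would use.

```python
from typing import Callable, Dict


class IdempotentExecutor:
    """Runs a step at most once per idempotency key (hypothetical, in-memory sketch)."""

    def __init__(self) -> None:
        # Assumption: a real system would persist this in a durable store.
        self._results: Dict[str, object] = {}

    def run(self, key: str, step: Callable[[], object]) -> object:
        # A repeated key means the step already ran: return the stored result.
        if key in self._results:
            return self._results[key]
        result = step()
        self._results[key] = result
        return result


# Usage: the second call with the same key does not charge again.
charges = []


def charge_card() -> str:
    charges.append("order-1")  # the side effect we must not duplicate
    return "charged"


executor = IdempotentExecutor()
first = executor.run("order-1:charge", charge_card)
second = executor.run("order-1:charge", charge_card)
```

Deriving the key from the execution ID and the step name is one possible choice. It keeps retries of the same step safe while still letting different steps run independently.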

Error handling

  • Use retry policies for transient failures such as timeouts.
  • Route terminal failures to a dead-letter queue for manual triage.
  • Emit structured events for each step transition.
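
The three points above can be sketched together as follows. This is an illustrative outline rather than a specific library's API: `TransientError`, `make_event`, and `run_with_retry` are hypothetical names, plain lists stand in for the event stream and the dead-letter queue, and exponential backoff is an assumed retry policy.

```python
import json
import time
from typing import Any, Callable, Dict, List


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, such as timeouts."""


def make_event(execution_id: str, step: str, status: str, attempt: int) -> str:
    """Serialize one step transition as a structured JSON line."""
    return json.dumps(
        {"execution_id": execution_id, "step": step, "status": status, "attempt": attempt},
        sort_keys=True,
    )


def run_with_retry(
    step_name: str,
    step: Callable[[Dict[str, Any]], Any],
    payload: Dict[str, Any],
    execution_id: str,
    events: List[str],
    dead_letters: List[Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry transient failures with backoff; dead-letter everything else.

    Returns the step result, or None when the payload was dead-lettered.
    """
    for attempt in range(1, max_attempts + 1):
        events.append(make_event(execution_id, step_name, "started", attempt))
        try:
            result = step(payload)
        except TransientError:
            events.append(make_event(execution_id, step_name, "retryable_failure", attempt))
            if attempt < max_attempts:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
            continue
        except Exception as exc:
            # Terminal failure: retrying will not help, so hand off for triage.
            events.append(make_event(execution_id, step_name, "terminal_failure", attempt))
            dead_letters.append({"payload": payload, "error": repr(exc)})
            return None
        events.append(make_event(execution_id, step_name, "succeeded", attempt))
        return result
    # Every attempt failed with a transient error.
    dead_letters.append({"payload": payload, "error": "retries exhausted"})
    return None
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays. A production version would likely also add jitter to the backoff and a trace ID to each event.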

Scaling guidance

!!! note
    Scale based on queue depth and processing latency, not only on CPU usage.

Measure queue depth, processing latency, and worker saturation before adding capacity. Scale workers gradually and validate downstream service limits.
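
One way to turn these measurements into a scaling decision is sketched below. The formula (backlog work divided by a target drain time), the `desired_workers` function, and all parameter names and defaults are assumptions for illustration, not a prescribed policy.

```python
import math


def desired_workers(
    current: int,
    queue_depth: int,
    avg_processing_seconds: float,
    target_drain_seconds: float,
    max_step: int = 2,
    downstream_limit: int = 20,
) -> int:
    """Estimate the next worker count from queue depth and processing latency."""
    # Workers needed to drain the current backlog within the target time.
    needed = math.ceil(queue_depth * avg_processing_seconds / target_drain_seconds)
    needed = max(needed, 1)  # always keep at least one worker
    # Scale gradually: move at most max_step workers per decision.
    stepped = min(needed, current + max_step)
    stepped = max(stepped, current - max_step)
    # Never exceed what downstream services are assumed to tolerate.
    return min(stepped, downstream_limit)


# Usage: 1200 queued items at 0.5 s each, to be drained within 60 s.
next_count = desired_workers(
    current=4, queue_depth=1200, avg_processing_seconds=0.5, target_drain_seconds=60
)
```

In this example the backlog alone would call for 10 workers, but the step limit moves the pool from 4 to 6, leaving room to check downstream limits before the next increase.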