Request Handling and Retries

Queuing & Retries

Implementing queuing and retry logic is the most fundamental pillar of a resilient Alloy integration. There are two reasons to maintain this logic in your own infrastructure, regardless of Alloy's built-in features:

  • Outage independence: In the rare event of an Alloy outage, Alloy cannot orchestrate retries on your behalf. You will need to manage that independently.
  • Pipeline resilience: Your own ingestion and processing pipelines can also experience failures that require retry logic regardless of Alloy's availability.

Note: Alloy also offers configurable Retry Nodes within Journey Orchestration as a complementary option. See Policy Design for Service Failures for details.
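As a starting point, the queuing half of this pattern can be sketched as a small in-process retry queue. This is illustrative only: the `RetryQueue` class and its `submit` callable are hypothetical names, and a production version would back the queue with durable storage (a database table or message broker) so pending submissions survive restarts and outages.

```python
from collections import deque

class RetryQueue:
    """Minimal in-memory queue of submissions awaiting (re)delivery."""

    def __init__(self, submit, max_attempts=5):
        self.submit = submit          # callable that sends one application to Alloy
        self.max_attempts = max_attempts
        self.pending = deque()        # (payload, attempts_so_far) pairs

    def enqueue(self, payload):
        self.pending.append((payload, 0))

    def drain(self):
        """Attempt every queued submission once; requeue failures."""
        failed = []
        for _ in range(len(self.pending)):
            payload, attempts = self.pending.popleft()
            try:
                self.submit(payload)
            except Exception:
                if attempts + 1 < self.max_attempts:
                    failed.append((payload, attempts + 1))
        self.pending.extend(failed)
```

A scheduler (cron job, worker loop) would call `drain()` periodically, so submissions that failed while Alloy or your own pipeline was unavailable are replayed automatically.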

When to Retry

HTTP Status Code Errors

  • Retryable: Retry on 429 (rate limit) and 5xx (server error) responses. Honor the retry-after header value before resubmitting.
  • Non-retryable: Do not retry on 4xx errors (except 429) without first addressing the underlying cause — these indicate a client-side issue such as malformed data or invalid parameters.
  • Backoff strategy: Use exponential backoff with jitter when retrying to avoid thundering herd problems, especially during partial outages.
  • Observability: Log all retry attempts with their reason and outcome to detect systematic failures versus isolated transient events.
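The rules above can be combined into a single retry loop. The sketch below assumes a hypothetical `send` callable that returns a status code and response headers; the retryable status set, delay bounds, and full-jitter backoff are illustrative defaults, not Alloy requirements.

```python
import random
import time

RETRYABLE = {429, 500, 502, 503, 504}

def request_with_retries(send, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Retry on 429/5xx, honoring retry-after; back off with full jitter otherwise."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status < 400:
            return status
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"giving up after status {status}")
        retry_after = headers.get("retry-after")
        if retry_after is not None:
            delay = float(retry_after)                     # server told us how long to wait
        else:
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        # Log each retry with its reason and planned delay for observability.
        print(f"attempt {attempt + 1}: status {status}, retrying in {delay:.2f}s")
        time.sleep(delay)
```

Note that 4xx responses other than 429 fall straight through to the `raise`, matching the non-retryable rule above.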

Rate Limit Headers

For 429 errors, Alloy returns the following headers you can use to build intelligent retry logic:

Header                       Definition
alloy-ratelimit-remaining    Number of requests remaining in the current window
alloy-ratelimit-reset        Unix timestamp when the rate limit window resets
alloy-ratelimit-limit        Total requests allowed per window
retry-after                  Seconds to wait before retrying
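One way to use these headers is to compute a pause before the next request. A minimal sketch, using the header names from the table; the fallback order (retry-after first, then the reset timestamp) is an assumption about a sensible client, not documented Alloy behavior.

```python
import time

def seconds_until_ok(headers, now=None):
    """Return how long to pause before the next request, given rate-limit headers."""
    now = time.time() if now is None else now
    remaining = int(headers.get("alloy-ratelimit-remaining", 1))
    if remaining > 0:
        return 0.0                                   # still have budget in this window
    if "retry-after" in headers:
        return float(headers["retry-after"])         # explicit wait from the server
    reset = float(headers.get("alloy-ratelimit-reset", now))
    return max(0.0, reset - now)                     # wait until the window resets
```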

Journey Application State Errors

Journey Application state errors are more nuanced and aren't always appropriate for automatic retries.

  • data_request_evaluation status — These statuses are mostly preventable with strong data validation before submitting to Alloy. If encountered, resolve via the Update Journey Application endpoint or by rerunning with corrected data.
  • error status — This status is rare and unlikely to resolve automatically. Reach out to [email protected] for help diagnosing the problem, especially if errors are sudden and persistent across multiple applications. Once the underlying issue is fixed, rerun the Journey Application via the Rerun Journey Application endpoint or through the Alloy dashboard.
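The decision logic above can be summarized as a simple status router. The status strings come from this section; the action names returned here are purely illustrative placeholders for whatever your pipeline does next.

```python
def route_status(status):
    """Map a Journey Application status to a next action (action names illustrative)."""
    if status == "data_request_evaluation":
        return "fix_data_and_update"        # correct inputs, then Update Journey Application
    if status == "error":
        return "alert_and_contact_support"  # escalate; do not auto-retry blindly
    return "proceed"                        # other statuses continue normal handling
```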

Timeouts & Async Processing

Alloy's Journey Application API endpoints are designed to return a response within about 25 seconds of application processing. In most cases Alloy responds much faster, but if a third-party data service is slow or your Journey includes many sequential vendor calls, Alloy may return a response while the Journey status is still running.

If you encounter a running status, Alloy was still processing when the 25-second timeout was reached. Alloy will send a webhook once the Journey Application reaches its next state.

A related status, pending_workflow_service, indicates that Alloy is waiting on an asynchronous callback from a third-party vendor before the application can continue. Once Alloy begins waiting, the Journey Application moves into pending_workflow_service until the vendor webhook is received.
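A client can treat both of these statuses the same way: record the application as in-flight and wait for the webhook rather than polling or retrying. The sketch below is illustrative; the in-memory `AWAITING` map and handler names are assumptions standing in for your own storage and webhook endpoint.

```python
# Statuses for which the final outcome arrives later via webhook.
IN_FLIGHT = {"running", "pending_workflow_service"}

AWAITING = {}  # application token -> last known status (use durable storage in production)

def handle_submit_response(token, status):
    """Process the synchronous response from submitting a Journey Application."""
    if status in IN_FLIGHT:
        AWAITING[token] = status   # park it; do not retry the submission
        return "pending"
    return status                  # terminal outcome available immediately

def handle_webhook(token, new_status):
    """Process the webhook delivering the application's next state."""
    AWAITING.pop(token, None)
    return new_status
```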