Request Handling and Retries
Queuing & Retries
Implementing queuing and retry logic is the most fundamental pillar of a resilient Alloy integration. There are two reasons to maintain this logic in your own infrastructure, regardless of Alloy's built-in features:
- Outage independence: In the rare event Alloy experiences an outage, Alloy cannot handle retry orchestration on your behalf. You will need to manage that independently.
- Pipeline resilience: Your own ingestion and processing pipelines can also experience failures that require retry logic regardless of Alloy's availability.
Note: Alloy also offers configurable Retry Nodes within Journey Orchestration as a complementary option. See Policy Design for Service Failures for details.
When to Retry
HTTP Status Code Errors
- Retryable: Retry on 429 (rate limit) and 5xx (server error) responses. Honor the retry-after header value before resubmitting.
- Non-retryable: Do not retry on 4xx errors (except 429) without first addressing the underlying cause; these indicate a client-side issue such as malformed data or invalid parameters.
- Backoff strategy: Use exponential backoff with jitter when retrying to avoid thundering herd problems, especially during partial outages.
- Observability: Log all retry attempts with their reason and outcome to detect systematic failures versus isolated transient events.
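The retry guidance above can be sketched as follows. This is a minimal illustration, not Alloy's SDK: `send_with_retries`, the `send` callable, and its `(status, headers, body)` return shape are all hypothetical names introduced here for the example.

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter: a random delay in
    [0, min(cap, base * 2**attempt)] to avoid thundering herds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def send_with_retries(send, max_attempts=5, base=1.0):
    """Call the hypothetical `send()` and retry only on 429/5xx.

    Non-retryable 4xx responses are returned to the caller immediately
    so the underlying data issue can be fixed before resubmitting.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 and status < 500:
            return status, body  # success or non-retryable client error
        if status == 429 and "retry-after" in headers:
            # Honor the retry-after header before resubmitting.
            time.sleep(float(headers["retry-after"]))
        else:
            time.sleep(backoff_delay(attempt, base=base))
    return status, body  # exhausted retries; surface the last response
```

In production you would also log each attempt's status and delay, per the observability bullet above, to distinguish systematic failures from isolated transient ones.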
Rate Limit Headers
For 429 errors, Alloy returns the following headers you can use to build intelligent retry logic:
| Header | Definition |
|---|---|
| alloy-ratelimit-remaining | Number of requests remaining in the current window |
| alloy-ratelimit-reset | Unix timestamp when the rate limit window resets |
| alloy-ratelimit-limit | Total requests allowed per window |
| retry-after | Seconds to wait before retrying |
Journey Application State Errors
Journey Application state errors are more nuanced and aren't always appropriate for automatic retries.
- data_request_evaluation status: Data request statuses are mostly preventable with strong data validation before submitting to Alloy. If encountered, resolve via the Update Journey Application endpoint or by rerunning with corrected data.
- error status: This status is rare but unlikely to resolve automatically. Reach out to [email protected] for help diagnosing the problem, especially if errors are sudden and persistent across multiple applications. Once the underlying issue is fixed, rerun the Journey Application via the Rerun Journey Application endpoint or through the Alloy dashboard.
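The status handling above can be expressed as a simple dispatch. The status names come from this page; the action labels and function name are illustrative, not part of Alloy's API:

```python
# Hedged sketch: map a Journey Application status to your pipeline's
# next action. Action labels here are illustrative placeholders.
NEEDS_DATA_FIX = {"data_request_evaluation"}  # fix data, then Update/Rerun
NON_RETRYABLE = {"error"}                     # escalate to [email protected]
ASYNC_WAIT = {"running", "pending_workflow_service"}  # wait for webhook

def next_action(status):
    """Route a Journey Application status to a follow-up action."""
    if status in NEEDS_DATA_FIX:
        return "fix_data_and_update"
    if status in NON_RETRYABLE:
        return "escalate_to_support"
    if status in ASYNC_WAIT:
        return "await_webhook"
    return "proceed"  # terminal outcomes (e.g. approved/denied)
```

The key design point is that neither status belongs in an automatic retry loop: one needs corrected data, the other needs human diagnosis.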
Timeouts & Async Processing
Alloy's Journey Application API endpoints are designed to return a response within about 25 seconds of application processing. In most cases Alloy responds much faster, but if a third-party data service is slow or your Journey includes many sequential vendor calls, Alloy may return a response while the Journey status is still running.
If you encounter a running status, Alloy was still processing when the 25-second timeout was reached. Alloy will send a webhook once the Journey Application reaches its next state.
A related status, pending_workflow_service, indicates that Alloy is waiting on an asynchronous callback from a third-party vendor before the application can continue. Once Alloy begins waiting, the Journey Application moves into pending_workflow_service until the vendor webhook is received.
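One way to handle these async states is to record the application as pending and resolve it when the webhook arrives, rather than polling or retrying. This is a minimal sketch; the in-memory dict, function names, and the webhook payload field names (`journey_application_token`, `status`) are assumptions for illustration:

```python
# Hedged sketch: track applications still processing asynchronously
# and resolve them when Alloy's webhook arrives. Payload field names
# below are assumed, not taken from Alloy's webhook schema.
pending = {}

def on_submit_response(application_id, status):
    """Called with the API response: park async statuses, don't retry them."""
    if status in ("running", "pending_workflow_service"):
        pending[application_id] = status  # wait for the webhook instead

def on_webhook(payload):
    """Called when Alloy's webhook delivers the next state."""
    app_id = payload.get("journey_application_token")  # assumed field name
    pending.pop(app_id, None)  # no longer waiting on this application
    return payload.get("status")
```

In a real integration the pending set would live in durable storage (a database or queue) so that in-flight applications survive a process restart.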