Conversation
Signed-off-by: Cassandra Coyle <cassie@diagrid.io>
There was a problem hiding this comment.
Pull request overview
Removes the SDK’s implicit 60s gRPC deadline so long-running operations (e.g., workflows/LLM calls) aren’t cut off before Dapr resiliency policies and component timeouts can apply.
Changes:
- Set
DAPR_API_TIMEOUT_SECONDSdefault toNone(no deadline unless explicitly configured). - Update sync and async gRPC timeout interceptors to only apply a deadline when
DAPR_API_TIMEOUT_SECONDSis configured, and to pass it as afloat.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
dapr/conf/global_settings.py |
Removes the hardcoded default gRPC timeout by setting DAPR_API_TIMEOUT_SECONDS to None. |
dapr/clients/grpc/interceptors.py |
Applies a global gRPC timeout only when explicitly configured via settings. |
dapr/aio/clients/grpc/interceptors.py |
Mirrors the explicit-only timeout behavior for the asyncio gRPC client. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| # Only apply a deadline when DAPR_API_TIMEOUT_SECONDS is explicitly configured. | ||
| # Without an explicit setting there is no SDK-level default deadline: Dapr's own | ||
| # resiliency policies and component timeouts act as the authoritative bounds for | ||
| # long-running operations such as LLM calls and workflow activities. | ||
| if settings.DAPR_API_TIMEOUT_SECONDS is not None and client_call_details.timeout is None: | ||
| new_client_call_details = _ClientCallDetails( |
There was a problem hiding this comment.
New behavior adds a branch where client_call_details.timeout is None but settings.DAPR_API_TIMEOUT_SECONDS is also None, so no timeout should be applied. Existing tests cover the configured-timeout and per-call-timeout cases, but there’s no test ensuring the interceptor leaves the call details unchanged when the global setting is unset; please add coverage for this to prevent regressions back to an implicit default deadline.
| # Only apply a deadline when DAPR_API_TIMEOUT_SECONDS is explicitly configured. | ||
| # Without an explicit setting there is no SDK-level default deadline: Dapr's own | ||
| # resiliency policies and component timeouts act as the authoritative bounds for | ||
| # long-running operations such as LLM calls and workflow activities. | ||
| if settings.DAPR_API_TIMEOUT_SECONDS is not None and client_call_details.timeout is None: | ||
| new_client_call_details = _ClientCallDetailsAsync( |
There was a problem hiding this comment.
New behavior adds a branch where client_call_details.timeout is None but settings.DAPR_API_TIMEOUT_SECONDS is also None, so no timeout should be applied. Existing tests cover the configured-timeout and per-call-timeout cases, but there’s no test ensuring the interceptor leaves the call details unchanged when the global setting is unset; please add coverage for this to prevent regressions back to an implicit default deadline.
Remove hardcoded 60s default timeout. Only apply if explicitly configured.
I was running a long running agent/wf and it was being killed by this and it was not respecting the resiliency configured. An SDK-level default deadline that fires before resiliency can act prevents retries from ever being attempted...