-
Notifications
You must be signed in to change notification settings - Fork 88
automatic time-skipping #740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
1feee4f
012f767
6a6aed9
78e6bfb
525ec42
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -873,6 +873,9 @@ message WorkflowExecutionOptionsUpdatedEventAttributes { | |||
| // Priority override upserted in this event. Represents the full priority; not just partial fields. | ||||
| // Ignored if nil. | ||||
| temporal.api.common.v1.Priority priority = 6; | ||||
| // Time skipping configuration upserted in this event. | ||||
| // It means the time-skipping config is changed, and the full latest config is upserted. | ||||
| temporal.api.workflow.v1.TimeSkippingConfig time_skipping_config = 7; | ||||
| } | ||||
|
|
||||
| // Not used anywhere. Use case is replaced by WorkflowExecutionOptionsUpdatedEventAttributes | ||||
|
|
@@ -965,6 +968,24 @@ message WorkflowExecutionUnpausedEventAttributes { | |||
| string request_id = 3; | ||||
| } | ||||
|
|
||||
| // Attributes for an event indicating that virtual time was advanced for a workflow execution, | ||||
|
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ⭐ highlight the change after review meeting
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should also add to the comment here that api/temporal/api/history/v1/message.proto Line 1127 in 2547d30
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added but I remember an early proposal was to break the workers and remind them to upgrade. So I guess our strategy changed?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I meant we should add this to the comment here, not on the field. Otherwise that field will have an ever-increasing list in the comment. And I think that was before we realized workers don't need to handle this at all. Even old SDKs will be able to successfully replay histories with time-skipping, which could be useful if users want to try this without needing to upgrade SDKs.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess I located here correctly this time? |
||||
| // either via a manual time-skipping API call or automatic time skipping. | ||||
| // The worker_may_ignore field in HistoryEvent should always be set true for this event. | ||||
| message WorkflowExecutionTimeSkippedEventAttributes { | ||||
|
|
||||
| // The virtual time after time skipping is applied. | ||||
| google.protobuf.Timestamp target_time = 1; | ||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The timestamp of this event will be the workflow (virtual) time before or after the skip? I guess it's before? and the difference between the event time and the target time is the duration skipped?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. emm this is after, right now at sometime for example 4:00 virtual time (which is 3:00 in real time) we are going to skip 1hour to 5:00 and the target_time is 5:00
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||
|
|
||||
| // when true, automatic time-skipping was enabled with a bound | ||||
| // and now that bound has been reached and time-skipping is disabled automatically. | ||||
| // (-- api-linter: core::0140::prepositions=disabled | ||||
| // aip.dev/not-precedent: "after" is used to indicate temporal ordering. --) | ||||
| bool disabled_after_bound = 2; | ||||
|
|
||||
| // The wall-clock time when the time-skipping event happened. | ||||
| google.protobuf.Timestamp wall_clock_time = 3; | ||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am trying to understand this field v.s. the event time of the event. I mean I understand that this is always wal clock and event time can be a virtual time, but want to see if you any use case where this field will help make it easier.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this was discussed in the review-meeting only for better visibility for debugging purpose. for example, I am an user, and I am debugging something with my business system built on temporal, now I need to align temporal history (which is virtual time) with my own log (which is real time) definitely, it will be better but heavier if we add all wallclock to all following events |
||||
| } | ||||
|
|
||||
| // Event marking that an operation was scheduled by a workflow via the ScheduleNexusOperation command. | ||||
| message NexusOperationScheduledEventAttributes { | ||||
| // Endpoint name, must exist in the endpoint registry. | ||||
|
|
@@ -982,6 +1003,8 @@ message NexusOperationScheduledEventAttributes { | |||
| // Calls are retried internally by the server. | ||||
| // (-- api-linter: core::0140::prepositions=disabled | ||||
| // aip.dev/not-precedent: "to" is used to indicate interval. --) | ||||
| // (-- api-linter: core::0142::time-field-names=disabled | ||||
| // aip.dev/not-precedent: "timeout" is an acceptable suffix for duration fields in this API. --) | ||||
| google.protobuf.Duration schedule_to_close_timeout = 5; | ||||
| // Header to attach to the Nexus request. Note these headers are not the same as Temporal headers on internal | ||||
| // activities and child workflows, these are transmitted to Nexus operations that may be external and are not | ||||
|
|
@@ -1199,6 +1222,7 @@ message HistoryEvent { | |||
| NexusOperationCancelRequestFailedEventAttributes nexus_operation_cancel_request_failed_event_attributes = 62; | ||||
| WorkflowExecutionPausedEventAttributes workflow_execution_paused_event_attributes = 63; | ||||
| WorkflowExecutionUnpausedEventAttributes workflow_execution_unpaused_event_attributes = 64; | ||||
| WorkflowExecutionTimeSkippedEventAttributes workflow_execution_time_skipped_event_attributes = 65; | ||||
| } | ||||
| } | ||||
|
|
||||
|
|
||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -566,6 +566,53 @@ message WorkflowExecutionOptions { | |
|
|
||
| // If set, overrides the workflow's priority sent by the SDK. | ||
| temporal.api.common.v1.Priority priority = 2; | ||
|
|
||
| // Time-skipping configuration for this workflow execution. | ||
| // If not set, the time-skipping conf will not get updated upon request, | ||
| // i.e. the existing time-skipping conf will be preserved. | ||
| TimeSkippingConfig time_skipping_config = 3; | ||
| } | ||
|
|
||
|
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Sushisource @yycptt updated this part especially the bound part according to discussions. With the new fine-grained bound, I think the 1h-signal can be easily implemented by setting For server, a EVENT_TYPE_WORKFLOW_EXECUTION_TIME_SKIPPED will be created to indicate time-skipping is disabled by bound even if no duration skipped. For SDK can poll this event, but I think there are two things to consider right now
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think for me the confusing part is those two options didn't say it will, like, create a new timer at the specified bound. and I am under the impression that time skipping will only skip to existing timers within the bound.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes it only skip timers, but we allow this to be turned off automatically without finding anything to skip |
||
| // Configuration for automatic time skipping during a workflow execution. | ||
| // When enabled, virtual time advances automatically whenever there are | ||
| // no in-flight activities, child workflows, or Nexus operations. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if we need to block auto skipping when there's pending signal/cancel external workflow operations in the transfer queue as well.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think these are good scenarios for testing, I think I can record these in testing planning. For desgin, |
||
| message TimeSkippingConfig { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we need to add this in to child workflow command?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Specifically, I don't imagine we will need to set childWorkflow time-skipping config thru command when users use SDK testing env. And in general, I think we may keep this as complex as the current version and only consider adding more when SDK really needs it.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: Wonder if we need to call this
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am kind of yes and no for different reasons. Guts-feeling is we will not get there that adding manual skipping.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On second thought, I’m leaning against renaming it. In my mind, 'TimeSkipping' implies an automatic process. Let’s keep it simple and only add a 'Manual' prefix when we need to distinguish the manual version? |
||
|
|
||
| // Enables or disables automatic time skipping for this workflow execution. | ||
| // By default, this field is propagated to transitively related workflows (child workflows/start-as-new/reset) | ||
| // at the time they are started. | ||
| // Changes made after a transitively related workflow has started are not propagated. | ||
| bool enabled = 1; | ||
|
|
||
| // If set, the enabled field is not propagated to transitively related workflows. | ||
| bool disable_propagation = 2; | ||
|
|
||
| // Optional bound that limits how long time skipping remains active. | ||
| // Once the bound is reached, time skipping is automatically disabled. | ||
| // It can later be re-enabled via UpdateWorkflowExecutionOptions. | ||
| // | ||
| // This is particularly useful in testing scenarios where workflows | ||
| // are expected to receive signals, updates, or other events while | ||
| // timers are in progress. | ||
| // | ||
| // This bound is not propagated to transitively related workflows. | ||
| // When bound is also needed for transitively related workflows, | ||
| // it is recommended to set disable_propagation to true | ||
| // and configure TimeSkippingConfig explicitly for transitively related workflows. | ||
| oneof bound { | ||
|
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ⭐ highlight the change after review meeting
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. looks great!
|
||
|
|
||
| // Maximum total virtual time that can be skipped. | ||
| google.protobuf.Duration max_skipped_duration = 4; | ||
|
|
||
| // Maximum elapsed time since time skipping was enabled. | ||
| // This includes both skipped time and real time elapsing. | ||
| // (-- api-linter: core::0142::time-field-names=disabled --) | ||
| google.protobuf.Duration max_elapsed_duration = 5; | ||
|
|
||
| // Absolute virtual timestamp at which time skipping is disabled. | ||
| // Time skipping will not advance beyond this point. | ||
| google.protobuf.Timestamp max_target_time = 6; | ||
| } | ||
| } | ||
|
|
||
| // Used to override the versioning behavior (and pinned deployment version, if applicable) of a | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -197,6 +197,8 @@ message StartWorkflowExecutionRequest { | |
| temporal.api.common.v1.Priority priority = 27; | ||
| // Deployment Options of the worker who will process the eager task. Passed when `request_eager_execution=true`. | ||
| temporal.api.deployment.v1.WorkerDeploymentOptions eager_worker_deployment_options = 28; | ||
| // Automatic time-skipping configuration. If not set, automatic time-skipping is disabled. | ||
| temporal.api.workflow.v1.TimeSkippingConfig time_skipping_config = 29; | ||
| } | ||
|
|
||
| message StartWorkflowExecutionResponse { | ||
|
|
@@ -836,6 +838,8 @@ message SignalWithStartWorkflowExecutionRequest { | |
| temporal.api.workflow.v1.VersioningOverride versioning_override = 25; | ||
| // Priority metadata | ||
| temporal.api.common.v1.Priority priority = 26; | ||
| // Time skipping configuration. If not set, automatic time-skipping is disabled. | ||
| temporal.api.workflow.v1.TimeSkippingConfig time_skipping_config = 27; | ||
| } | ||
|
|
||
| message SignalWithStartWorkflowExecutionResponse { | ||
|
|
@@ -2951,4 +2955,4 @@ message DeleteActivityExecutionRequest { | |
| } | ||
|
|
||
| message DeleteActivityExecutionResponse { | ||
| } | ||
| } | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: add new line
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. emm it seems not to be a new line >? |
||

There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know this just occurred to me and I don't think we've talked about it - if time skipping is enabled on a workflow with an old SDK version processing it, it's going to break on these new event types.
That's fine I think, we will tell people they need to upgrade, but we need to make sure that is clear in the docs & GTM material we create.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get it. the docs & GTM material mainly refer to API comments (here) and https://docs.temporal.io/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we always set
worker_may_ignorefor TimeSkippingEvents, it won't break workers right? because in the demo, I setworker_may_ignore = trueand used current SDK which has no time-skipping feature and didnt break itThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for sdk version old enough, it doesn't even know about the
worker_may_ignoreflag, hence the comment. but yeah this is a new feature so we can make it clear re. the sdk version required.