What is a Temporal Workflow?
This guide provides a comprehensive overview of Temporal Workflows.
A Temporal Workflow defines the overall flow of the application. Conceptually, a Workflow is a sequence of steps written in a general-purpose programming language. With Temporal, those steps are defined by writing code, known as a Workflow Definition, and are carried out by running that code, which results in a Workflow Execution.
In day-to-day conversations, the term Workflow might refer to a Workflow Type, a Workflow Definition, or a Workflow Execution. Temporal documentation aims to be explicit and differentiate between them.
What is a Workflow Definition?
A Workflow Definition is the code that defines the Workflow. It is written with a programming language and corresponding Temporal SDK. Depending on the programming language, it's typically implemented as a function or an object method and encompasses the end-to-end series of steps of a Temporal application.
Below are different ways to develop a basic Workflow Definition.
- Go
- Java
- PHP
- Python
- TypeScript
- .NET
Workflow Definition in Go
func YourBasicWorkflow(ctx workflow.Context) error {
    // ...
    return nil
}
Workflow Definition in Java (Interface)
// Workflow interface
@WorkflowInterface
public interface YourBasicWorkflow {
    @WorkflowMethod
    String workflowMethod(Arguments args);
}
Workflow Definition in Java (Implementation)
// Workflow implementation
public class YourBasicWorkflowImpl implements YourBasicWorkflow {
    // ...
}
Workflow Definition in PHP (Interface)
#[WorkflowInterface]
interface YourBasicWorkflow {
    #[WorkflowMethod]
    public function workflowMethod(Arguments $args);
}
Workflow Definition in PHP (Implementation)
class YourBasicWorkflowImpl implements YourBasicWorkflow {
    // ...
}
Workflow Definition in Python
@workflow.defn
class YourWorkflow:
    @workflow.run
    async def YourBasicWorkflow(self, input: str) -> str:
        # ...
        ...
Workflow Definition in TypeScript
type BasicWorkflowArgs = {
  param: string;
};
export async function WorkflowExample(
  args: BasicWorkflowArgs,
): Promise<{ result: string }> {
  // ...
}
Workflow Definition in C# and .NET
[Workflow]
public class YourBasicWorkflow {
    [WorkflowRun]
    public async Task<string> workflowExample(string param) {
        // ...
    }
}
A Workflow Definition may also be referred to as a Workflow Function. In Temporal's documentation, a Workflow Definition refers to the source for the instance of a Workflow Execution, while a Workflow Function refers to the source for the instance of a Workflow Function Execution.
A Workflow Execution effectively executes once to completion, while a Workflow Function Execution occurs many times during the life of a Workflow Execution.
We strongly recommend that you write a Workflow Definition in a language that has a corresponding Temporal SDK.
Deterministic constraints
A critical aspect of developing Workflow Definitions is ensuring they exhibit certain deterministic traits – that is, making sure that the same Commands are emitted in the same sequence, whenever a corresponding Workflow Function Execution (instance of the Function Definition) is re-executed.
The execution semantics of a Workflow Execution include the re-execution of a Workflow Function, which is called a Replay. The use of Workflow APIs in the function is what generates Commands. Commands tell the Temporal Service which Events to create and add to the Workflow Execution's Event History. When a Workflow Function executes, the Commands that are emitted are compared with the existing Event History. If a corresponding Event already exists within the Event History that maps to the generation of that Command in the same sequence, and some specific metadata of that Command matches with some specific metadata of the Event, then the Function Execution progresses.
For example, using an SDK's "Execute Activity" API generates the ScheduleActivityTask Command. When this API is called upon re-execution, that Command is compared with the Event that is in the same location within the sequence. The Event in the sequence must be an ActivityTaskScheduled Event, where the Activity name is the same as what is in the Command.
If a generated Command doesn't match what it needs to in the existing Event History, then the Workflow Execution returns a non-deterministic error.
The following are the two reasons why a Command might be generated out of sequence or the wrong Command might be generated altogether:
- Code changes are made to a Workflow Definition that is in use by a running Workflow Execution.
- There is intrinsic non-deterministic logic (such as inline random branching).
Code changes can cause non-deterministic behavior
The Workflow Definition can change in very limited ways once there is a Workflow Execution depending on it. To alleviate non-deterministic issues that arise from code changes, we recommend using Workflow Versioning.
For example, let's say we have a Workflow Definition that defines the following sequence:
- Start and wait on a Timer/sleep.
- Spawn and wait on an Activity Execution.
- Complete.
We start a Worker and spawn a Workflow Execution that uses that Workflow Definition. The Worker would emit the StartTimer Command and the Workflow Execution would become suspended.
Before the Timer is up, we change the Workflow Definition to the following sequence:
- Spawn and wait on an Activity Execution.
- Start and wait on a Timer/sleep.
- Complete.
When the Timer fires, the next Workflow Task will cause the Workflow Function to re-execute. The first Command the Worker sees would be the ScheduleActivityTask Command, which wouldn't match up to the expected TimerStarted Event.
The Workflow Execution would fail and return a non-deterministic error.
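The mismatch in this example can be modelled with a small, self-contained sketch. It is a toy model and not the Temporal SDK: `Command`, `Event`, `check_replay`, and the `EXPECTED_EVENT` mapping are hypothetical names, and the history holds only Command-generated Events. It compares each replayed Command with the Event recorded at the same position, including the Activity name.

```python
from dataclasses import dataclass


class NonDeterminismError(Exception):
    """Raised when a replayed Command does not match the recorded Event."""


@dataclass(frozen=True)
class Command:
    kind: str        # e.g. "StartTimer" or "ScheduleActivityTask"
    name: str = ""   # e.g. the Activity name; empty for Timers


@dataclass(frozen=True)
class Event:
    kind: str        # e.g. "TimerStarted" or "ActivityTaskScheduled"
    name: str = ""


# Which Event type each Command type is expected to have produced.
EXPECTED_EVENT = {
    "StartTimer": "TimerStarted",
    "ScheduleActivityTask": "ActivityTaskScheduled",
}


def check_replay(commands, history):
    """Compare replayed Commands, in order, with the recorded Events."""
    for position, (command, event) in enumerate(zip(commands, history)):
        if EXPECTED_EVENT[command.kind] != event.kind or command.name != event.name:
            raise NonDeterminismError(
                f"position {position}: {command.kind} does not match {event.kind}"
            )


# History recorded by the original definition: the Timer was started first.
history = [Event("TimerStarted")]

# The original definition replays cleanly.
check_replay([Command("StartTimer")], history)

# The changed definition schedules the Activity first and fails the check.
changed = [Command("ScheduleActivityTask", "your_activity")]
try:
    check_replay(changed, history)
    mismatch_detected = False
except NonDeterminismError:
    mismatch_detected = True
```

In this model the reordered definition is rejected at position 0, which mirrors the failure described above.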
The following are examples of minor changes that would not result in non-determinism errors when re-executing a History that already contains the Events:
- Changing the duration of a Timer, with the following exceptions:
  - In Java, Python, and Go, changing a Timer's duration from or to 0 is a non-deterministic behavior.
  - In .NET, changing a Timer's duration from or to -1 (which means "infinite") is a non-deterministic behavior.
- Changing the arguments to:
  - The Activity Options in a call to spawn an Activity Execution (local or nonlocal).
  - The Child Workflow Options in a call to spawn a Child Workflow Execution.
  - A call to Signal an External Workflow Execution.
- Adding a Signal Handler for a Signal Type that has not been sent to this Workflow Execution.
Intrinsic non-deterministic logic
Intrinsic non-determinism is when a Workflow Function Execution might emit a different sequence of Commands on re-execution, regardless of whether all the input parameters are the same.
For example, a Workflow Definition cannot have inline logic that branches (emits a different Command sequence) based on a local time setting or a random number.
In the representative pseudocode below, the local_clock() function returns the local time, rather than Temporal-defined time:
fn your_workflow() {
    if local_clock().is_before("12pm") {
        await workflow.sleep(duration_until("12pm"))
    } else {
        await your_afternoon_activity()
    }
}
Each Temporal SDK offers APIs that enable Workflow Definitions to have logic that gets and uses time, random numbers, and data from unreliable resources. When those APIs are used, the results are stored as part of the Event History, which means that a re-executed Workflow Function will issue the same sequence of Commands, even if there is branching involved.
In other words, all operations that do not purely mutate the Workflow Execution's state should occur through a Temporal SDK API.
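The idea of recording a non-deterministic result so that a re-execution reuses it can be sketched as follows. This is a toy model under stated assumptions, not an SDK API: `recorded` is a hypothetical helper and the `history` dictionary stands in for the Event History.

```python
import random


def recorded(history, key, produce, replaying):
    """Return a value that is produced once and read from history on replay."""
    if replaying:
        return history[key]       # reuse the value recorded earlier
    history[key] = produce()      # first execution: produce and record
    return history[key]


def your_workflow(history, replaying):
    # Branch on a recorded random number instead of calling random directly.
    roll = recorded(history, "roll", lambda: random.randint(1, 6), replaying)
    return "StartTimer" if roll <= 3 else "ScheduleActivityTask"


history = {}
first = your_workflow(history, replaying=False)
replayed = your_workflow(history, replaying=True)
```

Because the branch depends only on the recorded value, `first` and `replayed` are always the same Command, whichever number was drawn.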
Versioning Workflow code
The Temporal Platform requires that Workflow code (Workflow Definitions) be deterministic in nature. This requirement means that developers should consider how they plan to handle changes to Workflow code over time.
A versioning strategy is even more important if your Workflow Executions live long enough that a Worker must be able to execute multiple versions of the same Workflow Type.
Apart from the ability to create new Task Queues for Workflow Types with the same name, the Temporal Platform provides Workflow Patching APIs and Worker Build Id–based versioning features.
Patching
Patching APIs enable the creation of logical branching inside a Workflow Definition based on a developer-specified version identifier. This feature is useful for Workflow Definition logic that needs to be updated but still has running Workflow Executions that depend on it.
- How to patch Workflow code in Go
- How to patch Workflow code in Java
- How to patch Workflow code in Python
- How to patch Workflow code in PHP
- How to patch Workflow code in TypeScript
- How to patch Workflow code in .NET
You can also use Worker Versioning instead of Patching.
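The branching that Patching enables can be illustrated with a toy model. The `patched` helper below is hypothetical and is not an SDK signature; it assumes that a new execution records a marker for the patch, and that a replay follows whatever the recorded history contains.

```python
def patched(markers, patch_id, replaying):
    """Toy version check: True for new executions or histories with the marker."""
    if not replaying:
        markers.add(patch_id)     # record that this execution took the new path
        return True
    return patch_id in markers    # on replay, follow what was recorded


def your_workflow(markers, replaying):
    if patched(markers, "activity-before-timer", replaying):
        return ["ScheduleActivityTask", "StartTimer"]   # new code path
    return ["StartTimer", "ScheduleActivityTask"]       # original code path


# An execution started before the patch has no marker and replays the old path.
old_run = your_workflow(set(), replaying=True)

# An execution started after the patch records the marker and keeps the new path.
markers = set()
new_run = your_workflow(markers, replaying=False)
new_run_replayed = your_workflow(markers, replaying=True)
```

Both generations of executions replay the Command order they originally produced, so neither hits a non-determinism error in this model.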
Handling unreliable Worker Processes
You do not handle Worker Process failure or restarts in a Workflow Definition.
Workflow Function Executions are completely oblivious to the Worker Process in terms of failures or downtime. The Temporal Platform ensures that the state of a Workflow Execution is recovered and progress resumes if there is an outage of either Worker Processes or the Temporal Service itself. The only reason a Workflow Execution might fail is due to the code throwing an error or exception, not because of underlying infrastructure outages.
What is a Workflow Type?
A Workflow Type is a name that maps to a Workflow Definition.
- A single Workflow Type can be instantiated as multiple Workflow Executions.
- A Workflow Type is scoped by a Task Queue. It is acceptable to have the same Workflow Type name map to different Workflow Definitions if they are using completely different Workers.
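The scoping rule can be pictured as a lookup keyed by Task Queue and then by Workflow Type name. The sketch below is illustrative only; the Task Queue names, the `OrderWorkflow` type name, and the `resolve` helper are all hypothetical.

```python
def order_workflow_v1():
    return "v1"


def order_workflow_v2():
    return "v2"


# Each Worker keeps its own registry, keyed by Workflow Type name.
workers = {
    "orders-task-queue": {"OrderWorkflow": order_workflow_v1},
    "orders-v2-task-queue": {"OrderWorkflow": order_workflow_v2},
}


def resolve(task_queue, workflow_type):
    """Look up the Workflow Definition a Worker on this Task Queue would run."""
    return workers[task_queue][workflow_type]
```

The same Workflow Type name resolves to a different Workflow Definition depending on the Task Queue, which is why completely separate Workers can reuse a name.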
Workflow Type cardinality with Workflow Definitions and Workflow Executions
What is a Workflow Execution?
While the Workflow Definition is the code that defines the Workflow, the Workflow Execution is created by executing that code. A Temporal Workflow Execution is a durable, reliable, and scalable function execution. It is the main unit of execution of a Temporal Application.
- How to start a Workflow Execution using the temporal CLI
- How to start a Workflow Execution using the Go SDK
- How to start a Workflow Execution using the Java SDK
- How to start a Workflow Execution using the PHP SDK
- How to start a Workflow Execution using the Python SDK
- How to start a Workflow Execution using the TypeScript SDK
- How to start a Workflow Execution using the .NET SDK
Each Temporal Workflow Execution has exclusive access to its local state. It executes concurrently to all other Workflow Executions, and communicates with other Workflow Executions through Signals and the environment through Activities. While a single Workflow Execution has limits on size and throughput, a Temporal Application can consist of millions to billions of Workflow Executions.
Durability
Durability is the absence of an imposed time limit.
A Workflow Execution is durable because it executes a Temporal Workflow Definition (also called a Temporal Workflow Function), your application code, effectively once and to completion—whether your code executes for seconds or years.
Reliability
Reliability is responsiveness in the presence of failure.
A Workflow Execution is reliable, because it is fully recoverable after a failure. The Temporal Platform ensures the state of the Workflow Execution persists in the face of failures and outages and resumes execution from the latest state.
Scalability
Scalability is responsiveness in the presence of load.
A single Workflow Execution is limited in size and throughput but is scalable because it can Continue-As-New in response to load. A Temporal Application is scalable because the Temporal Platform is capable of supporting millions to billions of Workflow Executions executing concurrently, which is realized by the design and nature of the Temporal Service and Worker Processes.
Replays
A Replay is the method by which a Workflow Execution resumes making progress. During a Replay the Commands that are generated are checked against an existing Event History. Replays are necessary and often happen to give the effect that Workflow Executions are resumable, reliable, and durable.
For more information, see Deterministic constraints.
If a failure occurs, the Workflow Execution picks up where the last recorded event occurred in the Event History.
- How to use Replay APIs using the Go SDK
- How to use Replay APIs using the Java SDK
- How to use Replay APIs using the Python SDK
- How to use Replay APIs using the TypeScript SDK
- How to use Replay APIs using the .NET SDK
Commands and awaitables
A Workflow Execution does two things:
- Issue Commands.
- Wait on Awaitables (often called Futures).
Command generation and waiting
Commands are issued and Awaitables are provided by the use of Workflow APIs in the Workflow Definition.
Commands are generated whenever the Workflow Function is executed. The Worker Process supervises the Command generation and makes sure that it maps to the current Event History. (For more information, see Deterministic constraints.) The Worker Process batches the Commands and then suspends progress to send the Commands to the Temporal Service whenever the Workflow Function reaches a place where it can no longer progress without a result from an Awaitable.
A Workflow Execution may only ever block progress on an Awaitable that is provided through a Temporal SDK API. Awaitables are provided when using APIs for the following:
- Awaiting: Progress can block using explicit "Await" APIs.
- Requesting cancellation of another Workflow Execution: Progress can block on confirmation that the other Workflow Execution is cancelled.
- Sending a Signal: Progress can block on confirmation that the Signal was sent.
- Spawning a Child Workflow Execution: Progress can block on confirmation that the Child Workflow Execution started, and on the result of the Child Workflow Execution.
- Spawning an Activity Execution: Progress can block on the result of the Activity Execution.
- Starting a Timer: Progress can block until the Timer fires.
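The batching behaviour described above can be sketched with a Python generator standing in for the Workflow Function. This is a toy model, not how an SDK is implemented: the tuple protocol (`"command"` and `"await"`) and `run_until_blocked` are assumptions made for illustration.

```python
def your_workflow():
    """Toy Workflow Function: yields Commands, then blocks on an Awaitable."""
    yield ("command", "StartTimer")
    yield ("command", "ScheduleActivityTask")
    result = yield ("await", "activity_result")   # cannot progress without this
    yield ("command", f"CompleteWorkflowExecution({result})")


def run_until_blocked(workflow, resume_value=None):
    """Advance the generator, batching Commands until it awaits or finishes."""
    batch = []
    try:
        step = workflow.send(resume_value)
        while step[0] == "command":
            batch.append(step[1])
            step = next(workflow)
        return batch, step[1]          # blocked on this Awaitable
    except StopIteration:
        return batch, None             # the Workflow Function returned


workflow = your_workflow()
first_batch, waiting_on = run_until_blocked(workflow)
second_batch, done = run_until_blocked(workflow, resume_value="ok")
```

The first call collects two Commands and stops at the Awaitable. The second call resumes with the result and collects the final Command.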
Status
A Workflow Execution can be either Open or Closed.
Workflow Execution statuses
Open
An Open status means that the Workflow Execution is able to make progress.
- Running: The only Open status for a Workflow Execution. When the Workflow Execution is Running, it is either actively progressing or is waiting on something.
Closed
A Closed status means that the Workflow Execution cannot make further progress because of one of the following reasons:
- Cancelled: The Workflow Execution successfully handled a cancellation request.
- Completed: The Workflow Execution has completed successfully.
- Continued-As-New: The Workflow Execution Continued-As-New.
- Failed: The Workflow Execution returned an error and failed.
- Terminated: The Workflow Execution was terminated.
- Timed Out: The Workflow Execution reached a timeout limit.
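The Open and Closed grouping can be written down as a small enumeration. This is an illustrative sketch that uses the status names from the lists above; `WorkflowStatus` and `is_open` are hypothetical names and not an SDK type.

```python
from enum import Enum


class WorkflowStatus(Enum):
    RUNNING = "Running"
    CANCELLED = "Cancelled"
    COMPLETED = "Completed"
    CONTINUED_AS_NEW = "Continued-As-New"
    FAILED = "Failed"
    TERMINATED = "Terminated"
    TIMED_OUT = "Timed Out"

    @property
    def is_open(self):
        # Running is the only Open status; every other status is Closed.
        return self is WorkflowStatus.RUNNING


closed = [status.value for status in WorkflowStatus if not status.is_open]
```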
Workflow Execution Chain
A Workflow Execution Chain is a sequence of Workflow Executions that share the same Workflow Id. Each link in the Chain is often called a Workflow Run. Each Workflow Run in the sequence is connected to the next by an operation such as Continue-As-New, a Retry, or a Temporal Cron Job.
A Workflow Execution is uniquely identified by its Namespace, Workflow Id, and Run Id.
The Workflow Execution Timeout applies to a Workflow Execution Chain. The Workflow Run Timeout applies to a single Workflow Execution (Workflow Run).
Event loop
A Workflow Execution is made up of a sequence of Events called an Event History. Events are created by the Temporal Service in response to either Commands or actions requested by a Temporal Client (such as a request to spawn a Workflow Execution).
Time constraints
Is there a limit to how long Workflows can run?
No, there is no time constraint on how long a Workflow Execution can be Running.
However, Workflow Executions intended to run indefinitely should be written with some care. The Temporal Service stores the complete Event History for the entire lifecycle of a Workflow Execution. The Temporal Service logs a warning after 10Ki (10,240) Events and periodically logs additional warnings as new Events are added. If the Event History exceeds 50Ki (51,200) Events, the Workflow Execution is terminated.
To prevent runaway Workflow Executions, you can use the Workflow Execution Timeout, the Workflow Run Timeout, or both. A Workflow Execution Timeout can be used to limit the duration of a Workflow Execution Chain, and a Workflow Run Timeout can be used to limit the duration of an individual Workflow Execution (Run).
You can use the Continue-As-New feature to close the current Workflow Execution and create a new Workflow Execution in a single atomic operation. The Workflow Execution spawned from Continue-As-New has the same Workflow Id, a new Run Id, and a fresh Event History and is passed all the appropriate parameters. For example, it may be reasonable to use Continue-As-New once per day for a long-running Workflow Execution that is generating a large Event History.
Limits
There is no limit to the number of concurrent Workflow Executions, although each Workflow Execution must stay within its Event History limit.
As a precautionary measure, the Workflow Execution's Event History is limited to 51,200 Events or 50 MB and will warn you after 10,240 Events or 10 MB.
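The thresholds above can be expressed as a simple check. This is a minimal sketch: the numbers come from the text, but the `history_state` helper, the strict comparisons at each boundary, and the reading of 1 MB as 1024 × 1024 bytes are assumptions.

```python
WARN_EVENTS, MAX_EVENTS = 10_240, 51_200            # 10Ki and 50Ki Events
WARN_BYTES, MAX_BYTES = 10 * 1024**2, 50 * 1024**2  # assumes 1 MB = 1024**2 bytes


def history_state(event_count, size_bytes):
    """Classify an Event History as "ok", "warn", or "terminate"."""
    if event_count > MAX_EVENTS or size_bytes > MAX_BYTES:
        return "terminate"
    if event_count > WARN_EVENTS or size_bytes > WARN_BYTES:
        return "warn"
    return "ok"
```

A Workflow that tracks its own Event count could use a check like this to decide when to Continue-As-New, well before the hard limit is reached.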
There is also a limit to the number of certain types of incomplete operations.
Each in-progress Activity generates a metadata entry in the Workflow Execution's mutable state.
Too many entries in a single Workflow Execution's mutable state cause unstable persistence.
To protect the system, Temporal enforces a maximum number of incomplete Activities, Child Workflows, Signals, or Cancellation requests per Workflow Execution (by default, 2,000 for each type of operation).
Once the limit is reached for a type of operation, if the Workflow Execution attempts to start another operation of that type (by producing a ScheduleActivityTask, StartChildWorkflowExecution, SignalExternalWorkflowExecution, or RequestCancelExternalWorkflowExecution Command), it will be unable to do so (the Workflow Task Execution will fail and get retried).
These limits are set with the following dynamic configuration keys:
- NumPendingActivitiesLimit
- NumPendingChildExecutionsLimit
- NumPendingSignalsLimit
- NumPendingCancelRequestsLimit
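The per-type limit can be sketched as a check made before an operation is accepted. This is a toy model: the pairing of each Command with a configuration key is inferred from the names and from the order in which they are listed above, and `can_accept` is a hypothetical helper, not server code.

```python
DEFAULT_LIMIT = 2_000  # default per operation type, per the text above

# Assumed pairing of Commands with the dynamic configuration keys.
LIMIT_KEYS = {
    "ScheduleActivityTask": "NumPendingActivitiesLimit",
    "StartChildWorkflowExecution": "NumPendingChildExecutionsLimit",
    "SignalExternalWorkflowExecution": "NumPendingSignalsLimit",
    "RequestCancelExternalWorkflowExecution": "NumPendingCancelRequestsLimit",
}


def can_accept(command, pending_counts, limits=None):
    """Return False when one more operation of this type would exceed its limit."""
    key = LIMIT_KEYS[command]
    limit = (limits or {}).get(key, DEFAULT_LIMIT)
    return pending_counts.get(key, 0) < limit
```

For example, with 2,000 Activities already pending, `can_accept("ScheduleActivityTask", {"NumPendingActivitiesLimit": 2000})` is `False`, while starting a Child Workflow is still accepted because each type is counted separately.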
Workflow Execution Nexus Operation Limits
There is a limit to the maximum number of Nexus Operations in a Workflow before Continue-As-New is required. Each in-progress Nexus Operation generates a metadata entry in the Workflow Execution's mutable state. Too many entries in a single Workflow Execution's mutable state cause unstable persistence. To protect the system, Temporal enforces a maximum number of incomplete (and complete in Public Preview) Nexus Operation requests per Workflow Execution (by default, 30 Nexus Operations). Once the limit is reached, if the Workflow Execution attempts to start another Nexus Operation (by producing a ScheduleNexusOperation Command), it will be unable to do so (the Workflow Task Execution will fail and get retried).
This limit is set with the following dynamic configuration key:
- MaxConcurrentOperations
What is a Command?
A Command is a requested action issued by a Worker to the Temporal Service after a Workflow Task Execution completes.
The action that the Temporal Service takes is recorded in the Workflow Execution's Event History as an Event. The Workflow Execution can await some of the Events that result from some of the Commands.
Commands are generated by the use of Workflow APIs in your code. During a Workflow Task Execution there may be several Commands that are generated. The Commands are batched and sent to the Temporal Service as part of the Workflow Task Execution completion request, after the Workflow Task has progressed as far as it can with the Workflow function. There will always be WorkflowTaskStarted and WorkflowTaskCompleted Events in the Event History when there is a Workflow Task Execution completion request.
Commands are described in the Command reference and are defined in the Temporal gRPC API.
What is an Event?
Events are created by the Temporal Service in response to external occurrences and Commands generated by a Workflow Execution. Each Event corresponds to an enum that is defined in the Server API.
All Events are recorded in the Event History.
A list of all possible Events that could appear in a Workflow Execution Event History is provided in the Event reference.
Activity Events
Seven Activity-related Events are added to Event History at various points in an Activity Execution:
- After a Workflow Task Execution reaches a line of code that starts/executes an Activity, the Worker sends the Activity Type and arguments to the Temporal Service, and the Temporal Service adds an ActivityTaskScheduled Event to Event History.
- When ActivityTaskScheduled is added to History, the Temporal Service adds a corresponding Activity Task to the Task Queue.
- A Worker polling that Task Queue picks up the Activity Task and runs the Activity function or method.
- If the Activity function returns, the Worker reports completion to the Temporal Service, and the Temporal Service adds ActivityTaskStarted and ActivityTaskCompleted to Event History.
- If the Activity function throws a non-retryable Failure, the Temporal Service adds ActivityTaskStarted and ActivityTaskFailed to Event History.
- If the Activity function throws an error or retryable Failure, the Temporal Service schedules an Activity Task retry to be added to the Task Queue (unless you’ve reached the Maximum Attempts value of the Retry Policy, in which case the Temporal Service adds ActivityTaskStarted and ActivityTaskFailed to Event History).
- If the Activity’s Start-to-Close Timeout passes before the Activity function returns or throws, the Temporal Service schedules a retry.
- If the Activity’s Schedule-to-Close Timeout passes before Activity Execution is complete, or if Schedule-to-Start Timeout passes before a Worker gets the Activity Task, the Temporal Service writes ActivityTaskTimedOut to Event History.
- If the Activity is canceled, the Temporal Service writes ActivityTaskCancelRequested to Event History, and if the Activity accepts cancellation, the Temporal Service writes ActivityTaskCanceled.
While the Activity is running and retrying, ActivityTaskScheduled is the only Activity-related Event in History: ActivityTaskStarted is written along with a terminal Event like ActivityTaskCompleted or ActivityTaskFailed.
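The way Events accumulate across retries can be sketched as follows. This is a simplified model of the rules above that leaves out timeouts and cancellation. The `activity_events` helper and its outcome labels are hypothetical.

```python
def activity_events(attempt_results, maximum_attempts):
    """Return the Events written for one Activity Execution.

    attempt_results holds one entry per attempt: "ok", "retryable",
    or "non_retryable".
    """
    events = ["ActivityTaskScheduled"]
    for attempt, result in enumerate(attempt_results, start=1):
        if result == "ok":
            return events + ["ActivityTaskStarted", "ActivityTaskCompleted"]
        if result == "non_retryable" or attempt >= maximum_attempts:
            return events + ["ActivityTaskStarted", "ActivityTaskFailed"]
        # Retryable failure with attempts left: nothing new is written yet.
    return events  # still running or waiting to retry
```

A failed first attempt followed by a success yields the same three Events as an immediate success, because the intermediate retry leaves no Event of its own in this model.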
What is an Event History?
An Event History is an append-only log of Events for your application.
- Event History is durably persisted by the Temporal Service, enabling seamless recovery of your application state from crashes or failures.
- It also serves as an audit log for debugging.
Event History limits
The Temporal Service stores the complete Event History for the entire lifecycle of a Workflow Execution.
The Temporal Service logs a warning after 10Ki (10,240) Events and periodically logs additional warnings as new Events are added. If the Event History exceeds 50Ki (51,200) Events, the Workflow Execution is terminated.
What is Continue-As-New?
Continue-As-New is a mechanism by which the latest relevant state is passed to a new Workflow Execution, with a fresh Event History.
As a precautionary measure, the Workflow Execution's Event History is limited to 51,200 Events or 50 MB and will warn you after 10,240 Events or 10 MB.
To prevent a Workflow Execution Event History from exceeding this limit and failing, use Continue-As-New to start a new Workflow Execution with a fresh Event History.
All values passed to a Workflow Execution through parameters or returned through a result value are recorded into the Event History. The Temporal Service stores the full Event History of a Workflow Execution for the duration of a Namespace's retention period. A Workflow Execution that periodically executes many Activities has the potential of hitting the size limit.
A very large Event History can adversely affect the performance of a Workflow Execution. For example, in the case of a Workflow Worker failure, the full Event History must be pulled from the Temporal Service and given to another Worker via a Workflow Task. If the Event History is very large, it may take some time to load it.
The Continue-As-New feature enables developers to complete the current Workflow Execution and start a new one atomically.
The new Workflow Execution has the same Workflow Id, but a different Run Id, and has its own Event History.
In the case of Temporal Cron Jobs, Continue-As-New is actually used internally for the same effect.
- How to Continue-As-New using the Go SDK
- How to Continue-As-New using the Java SDK
- How to Continue-As-New using the PHP SDK
- How to Continue-As-New using the Python SDK
- How to Continue-As-New using the TypeScript SDK
- How to Continue-As-New using the .NET SDK
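The effect of Continue-As-New on identifiers and history can be sketched with a toy model. The `Run` class and `continue_as_new` function are hypothetical, the atomicity of the real operation is not modelled, and the `order-42` Workflow Id is only an example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Run:
    workflow_id: str
    state: dict
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    history: list = field(default_factory=list)


def continue_as_new(current):
    """Close the current Run and return the next Run in the chain."""
    current.history.append("WorkflowExecutionContinuedAsNew")
    # Same Workflow Id, new Run Id, fresh Event History, state carried over.
    return Run(workflow_id=current.workflow_id, state=dict(current.state))


first = Run("order-42", {"processed": 0})
first.history.extend(["ActivityTaskScheduled"] * 3)
first.state["processed"] = 3
second = continue_as_new(first)
```

Here `second` keeps the Workflow Id and the latest state, but starts with an empty history and a different Run Id.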
What is a Reset?
A Reset terminates a Workflow Execution and creates a new Workflow Execution with the same Workflow Type and Workflow ID. The Event History is copied from the original execution up to and including the reset point. The new execution continues from the reset point. Signals in the original history can be optionally copied to the new history, whether they appear after the reset point or not.
What is a Run Id?
A Run Id is a globally unique, platform-level identifier for a Workflow Execution.
The current Run Id is mutable and can change during a Workflow Retry. You shouldn't rely on storing the current Run Id, or using it for any logical choices, because a Workflow Retry changes the Run Id and can lead to non-determinism issues.
Temporal guarantees that only one Workflow Execution with a given Workflow Id can be in an Open state at any given time. But when a Workflow Execution reaches a Closed state, it is possible to have another Workflow Execution in an Open state with the same Workflow Id. For example, a Temporal Cron Job is a chain of Workflow Executions that all have the same Workflow Id. Each Workflow Execution within the chain is considered a Run.
A Run Id uniquely identifies a Workflow Execution even if it shares a Workflow Id with other Workflow Executions.
Which operations lead to non-determinism issues?
Operations such as ContinueAsNew, Retry, Cron, and Reset create a Workflow Execution Chain as identified by the first_execution_run_id.
Each operation creates a new Workflow Execution inside a chain run and saves its information as first_execution_run_id.
Thus, the Run Id is updated during each operation on a Workflow Execution.
- The first_execution_run_id is the Run Id of the first Workflow Execution in a Chain run.
- The original_execution_run_id is the Run Id when the WorkflowExecutionStarted Event occurs.
A Workflow Reset changes the first execution Run Id, but preserves the original execution Run Id.
For example, when a new Workflow Execution in the chain starts, it stores its Run Id in original_execution_run_id.
A reset doesn't change that field, but the current Run Id is updated.
Because of this behavior, you shouldn't rely on the current Run Id in your code to make logical choices.
What is a Workflow Id?
A Workflow Id is a customizable, application-level identifier for a Workflow Execution that is unique to an Open Workflow Execution within a Namespace.
A Workflow Id is meant to be a business-process identifier such as customer identifier or order identifier.
The Temporal Platform guarantees uniqueness of the Workflow Id within a Namespace based on the Workflow Id Reuse Policy.
A Workflow Id Reuse Policy can be used to manage whether a Workflow Id from a Closed Workflow can be re-used.
A Workflow Id Conflict Policy can be used to decide how to resolve a Workflow Id conflict with a Running Workflow.
A Workflow Execution can be uniquely identified across all Namespaces by its Namespace, Workflow Id, and Run Id.
What is a Workflow Id Reuse Policy?
A Workflow Id Reuse Policy determines whether a Workflow Execution is allowed to spawn with a particular Workflow Id, if that Workflow Id has been used with a previous, and now Closed, Workflow Execution.
It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution, regardless of the Workflow Id Reuse Policy.
See Workflow Id Conflict Policy for resolving a Workflow Id conflict.
The Workflow Id Reuse Policy can have one of the following values:
- Allow Duplicate: The Workflow Execution is allowed to exist regardless of the Closed status of a previous Workflow Execution with the same Workflow Id. This is the default policy, if one is not specified. Use this when it is OK to have a Workflow Execution with the same Workflow Id as a previous, but now Closed, Workflow Execution.
- Allow Duplicate Failed Only: The Workflow Execution is allowed to exist only if a previous Workflow Execution with the same Workflow Id does not have a Completed status. Use this policy when there is a need to re-execute a Failed, Timed Out, Terminated or Cancelled Workflow Execution and guarantee that the Completed Workflow Execution will not be re-executed.
- Reject Duplicate: The Workflow Execution cannot exist if a previous Workflow Execution has the same Workflow Id, regardless of the Closed status. Use this when there can only be one Workflow Execution per Workflow Id within a Namespace for the given retention period.
- Terminate if Running: Specifies that if a Workflow Execution with the same Workflow Id is already running, it should be terminated and a new Workflow Execution with the same Workflow Id should be started. This policy allows for only one Workflow Execution with a specific Workflow Id to be running at any given time.
The first three values (Allow Duplicate, Allow Duplicate Failed Only, and Reject Duplicate) of the Workflow Id Reuse Policy apply to Closed Workflow Executions that are retained within the Namespace. For example, given a default Retention Period, the Temporal Service can only check the Workflow Id of the spawning Workflow Execution based on the Workflow Id Reuse Policy against the Closed Workflow Executions for the last 30 days.
If you need to start a Workflow for a particular implementation only if it hasn't started yet, ensure that your Retention Period is long enough to check against. If this becomes unwieldy, consider using Workflow message passing instead of trying to start Workflows atomically.
The fourth value of the Workflow Id Reuse Policy, Terminate if Running, only applies to a Workflow Execution that is currently open within the Namespace, so the Retention Period is not a consideration for this policy.
If there is an attempt to spawn a Workflow Execution with a Workflow Id Reuse Policy that won't allow it, the Server will prevent the Workflow Execution from spawning.
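The first three policy values can be written as a small decision function. This is a sketch of the rules as stated above, not server code: `reuse_allowed` is a hypothetical helper, and Terminate if Running is left out because it concerns Open executions.

```python
def reuse_allowed(policy, previous_closed_status):
    """Decide whether a Workflow Id used by a Closed execution may be reused.

    previous_closed_status is None when no retained execution has this Id.
    """
    if previous_closed_status is None:
        return True
    if policy == "Allow Duplicate":
        return True
    if policy == "Allow Duplicate Failed Only":
        return previous_closed_status != "Completed"
    if policy == "Reject Duplicate":
        return False
    raise ValueError(f"unsupported policy: {policy}")
```

Passing `None` models the case where the earlier execution has aged out of the Retention Period, which is why even Reject Duplicate allows the start in that situation.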
What is a Workflow Id Conflict Policy?
A Workflow Id Conflict Policy determines how to resolve a conflict when spawning a new Workflow Execution with a particular Workflow Id used by an existing Open Workflow Execution. See Workflow Id Reuse Policy for managing the reuse of a Workflow Id of a Closed Workflow.
By default, this results in a Workflow execution already started error.
The default StartWorkflowOptions behavior in the Go SDK is to not return an error when a new Workflow Execution is attempted with the same Workflow Id as an Open Workflow Execution. Instead, it returns a WorkflowRun instance representing the current or last run of the Open Workflow Execution.
To return the Workflow execution already started error, set WorkflowExecutionErrorWhenAlreadyStarted to true.
The Workflow Id Conflict Policy can have one of the following values:
- Fail: Prevents the Workflow Execution from spawning and returns a Workflow execution already started error. This is the default policy, if one isn't specified.
- Use Existing: Prevents the Workflow Execution from spawning and returns a successful response with the Open Workflow Execution's Run Id.
- Terminate Existing: Terminates the Open Workflow Execution then spawns the new Workflow Execution with the same Workflow Id.
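The three values can be summarized as a small decision function. This is a minimal sketch of the behavior described above, not the Temporal API: ConflictPolicy, StartResult, WorkflowAlreadyStartedError, and resolve_conflict are hypothetical names, and Run Ids are stand-in UUIDs.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ConflictPolicy(Enum):
    FAIL = "Fail"
    USE_EXISTING = "Use Existing"
    TERMINATE_EXISTING = "Terminate Existing"


class WorkflowAlreadyStartedError(Exception):
    """Stands in for the "Workflow execution already started" error."""


@dataclass
class StartResult:
    run_id: str                              # Run Id returned to the caller
    started_new: bool                        # True if a new execution was spawned
    terminated_run_id: Optional[str] = None  # Run Id that was terminated, if any


def resolve_conflict(open_run_id: Optional[str],
                     policy: ConflictPolicy = ConflictPolicy.FAIL) -> StartResult:
    """Decide what a start request does when the Workflow Id may already be open."""
    if open_run_id is None:
        # No Open Workflow Execution uses this Workflow Id, so there is no conflict.
        return StartResult(run_id=str(uuid.uuid4()), started_new=True)
    if policy is ConflictPolicy.FAIL:
        raise WorkflowAlreadyStartedError("Workflow execution already started")
    if policy is ConflictPolicy.USE_EXISTING:
        # Nothing is spawned; the caller gets the open execution's Run Id.
        return StartResult(run_id=open_run_id, started_new=False)
    # TERMINATE_EXISTING: end the open run, then spawn a new one.
    return StartResult(run_id=str(uuid.uuid4()), started_new=True,
                       terminated_run_id=open_run_id)
```

Calling resolve_conflict("run-1", ConflictPolicy.USE_EXISTING) returns the existing Run Id without starting anything, while the default policy raises the error.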
What is a Timer?
Temporal SDKs offer Timer APIs so that Workflow Executions are deterministic in their handling of time values.
Timers in Temporal are persisted. Even if your Worker or Temporal Service is down when the time period completes, the call that is awaiting the Timer in your Workflow code resolves as soon as your Worker and Temporal Service become available again, and execution proceeds. Timers are reliable and efficient. Workers consume no additional resources while waiting for a Timer to fire, so a single Worker can await millions of Timers concurrently.
- How to set Timers in Go
- How to set Timers in Java
- How to set Timers in PHP
- How to set Timers in Python
- How to set Timers in TypeScript
- How to set Timers in .NET
The duration of a Timer is fixed, and your Workflow might specify a value as short as one second or as long as several years. Although it's possible to specify an extremely precise duration, such as 36 milliseconds or 15.072 minutes, your Workflows should not rely on sub-second accuracy for Timers. We recommend that you consider the duration as a minimum time, one which will be rounded up slightly due to the latency involved with scheduling and firing the Timer. For example, setting a Timer for 11.97 seconds is guaranteed to delay execution for at least that long, but will likely be closer to 12 seconds in practice.
What is a Memo?
A Memo is a non-indexed set of Workflow Execution metadata that developers supply at start time or in Workflow code and that is returned when you describe or list Workflow Executions.
The primary purpose of using a Memo is to enhance the organization and management of Workflow Executions. Add your own metadata, such as notes or descriptions, to a Workflow Execution, which lets you annotate and categorize Workflow Executions based on developer-defined criteria. This feature is particularly useful when dealing with numerous Workflow Executions because it facilitates the addition of context, reminders, or any other relevant information that aids in understanding or tracking the Workflow Execution.
Memos shouldn't store data that's critical to the execution of a Workflow, for some of the following reasons:
- Unlike Workflow inputs, Memos lack type safety
- Memos are subject to eventual consistency and may not be immediately available
- Excessive reliance on Memos hides mutable state from the Workflow Execution History
What is a Dynamic Handler?
Temporal supports Dynamic Workflows, Activities, Signals, and Queries.
Currently, the Temporal SDKs that support Dynamic Handlers are:
The Go SDK supports Dynamic Signals through the GetUnhandledSignalNames function.
These are unnamed handlers that are invoked if no other statically defined handler with the given name exists.
Dynamic Handlers provide flexibility to handle cases where the names of Workflows, Activities, Signals, or Queries aren't known until run time.
Dynamic Handlers should be used judiciously as a fallback mechanism rather than the primary approach. Overusing them can lead to maintainability and debugging issues down the line.
Instead, Workflows, Activities, Signals, and Queries should be defined statically whenever possible, with clear names that indicate their purpose. Use static definitions as the primary way of structuring your Workflows.
Reserve Dynamic Handlers for cases where the handler names are not known at compile time and need to be looked up dynamically at runtime. They are meant to handle edge cases and act as a catch-all, not as the main way of invoking logic.
What is a Side Effect?
Side Effects are included in the Go, Java, and PHP SDKs. They are not included in other SDKs. Local Activities fit the same use case and are slightly less resource intensive.
A Side Effect is a way to execute a short, non-deterministic code snippet, such as generating a UUID, that executes the provided function once and records its result into the Workflow Execution Event History.
A Side Effect does not re-execute upon replay, but instead returns the recorded result.
Do not ever have a Side Effect that could fail, because failure could result in the Side Effect function executing more than once. If there is any chance that the code provided to the Side Effect could fail, use an Activity.
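The record-then-replay behavior can be illustrated with a toy model. This is not how any SDK implements Side Effects; SideEffectRecorder and its history list are hypothetical stand-ins for the Workflow Execution Event History, used only to show that the function runs once and that replay returns the recorded value.

```python
import uuid
from typing import Any, Callable, List, Optional


class SideEffectRecorder:
    """Toy model: run a function once, record the result, reuse it on replay."""

    def __init__(self, history: Optional[List[Any]] = None) -> None:
        self.history: List[Any] = list(history or [])  # recorded results, in call order
        self._cursor = 0                               # index of the next side effect call

    def side_effect(self, fn: Callable[[], Any]) -> Any:
        if self._cursor < len(self.history):
            # Replay: return the recorded value without calling fn again.
            result = self.history[self._cursor]
        else:
            # First execution: call fn once and record its result.
            result = fn()
            self.history.append(result)
        self._cursor += 1
        return result


# First execution generates and records a UUID.
first = SideEffectRecorder()
generated = first.side_effect(lambda: str(uuid.uuid4()))

# Replaying against the recorded history yields the same UUID.
replay = SideEffectRecorder(history=first.history)
replayed = replay.side_effect(lambda: str(uuid.uuid4()))
```

The model also hints at why a failing Side Effect is a problem: if fn raises, nothing is recorded, so a later attempt would call fn again.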
What is a Schedule?
- Is Generally Available as of Nov 2023
- Introduced in Temporal Server version 1.17.0
- Available in the Temporal CLI
- Available in Temporal Cloud
- Available in Go SDK since v1.22.0
- Available in Java SDK since v1.20.0
- Available in Python SDK since v1.1.0
- Available in TypeScript SDK since v1.5.0
- Available in .NET SDK since v0.1.0
- Available in PHP SDK since v2.7.0
- Available in gRPC API
A Schedule contains instructions for starting a Workflow Execution at specific times. Schedules provide a more flexible and user-friendly approach than Temporal Cron Jobs.
A Schedule has an identity and is independent of a Workflow Execution. This differs from a Temporal Cron Job, which relies on a cron schedule as a property of the Workflow Execution.
For triggering a Workflow Execution at a specific one-time future point rather than on a recurring schedule, the Start Delay option should be used instead of a Schedule.
Action
The Action of a Schedule is where the Workflow Execution properties are established, such as Workflow Type, Task Queue, parameters, and timeouts.
Workflow Executions started by a Schedule have the following additional properties:
- The Action's timestamp is appended to the Workflow Id.
- The TemporalScheduledStartTime Search Attribute is added to the Workflow Execution. The value is the Action's timestamp.
- The TemporalScheduledById Search Attribute is added to the Workflow Execution. The value is the Schedule Id.
Spec
The Schedule Spec defines when the Action should be taken. Unless many Schedules have Actions scheduled at the same time, Actions should generally start within 1 second of the specified time. There are two kinds of Schedule Spec:
- A simple interval, like "every 30 minutes" (aligned to start at the Unix epoch, and optionally including a phase offset).
- A calendar-based expression, similar to the "cron expressions" supported by lots of software, including the older Temporal Cron feature.
These two kinds have multiple representations, depending on the interface or SDK you're using, but they all support the same features.
In the Temporal CLI, for example, an interval is specified as a string like 45m to mean every 45 minutes, or 6h/5h to mean every 6 hours but at the start of the fifth hour within each period.
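To make the epoch alignment and phase offset concrete, here is a rough sketch of how the next Action time for an interval spec could be computed. It assumes Action times have the form epoch + phase + k * interval; next_interval_action is a hypothetical helper, not a Temporal function.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_interval_action(now: datetime, interval: timedelta,
                         phase: timedelta = timedelta(0)) -> datetime:
    """Return the first time strictly after `now` of the form EPOCH + phase + k * interval."""
    elapsed = now - EPOCH - phase
    periods = elapsed // interval + 1  # whole intervals already elapsed, plus one
    return EPOCH + phase + periods * interval


# 6h/5h: every 6 hours, offset 5 hours into each period.
now = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)
next_time = next_interval_action(now, timedelta(hours=6), timedelta(hours=5))
```

Under this model, 6h/5h yields Actions at 05:00, 11:00, 17:00, and 23:00 UTC each day, so the call above returns 05:00 on the same day.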
In the Temporal CLI, a calendar expression can be specified as either a traditional cron string with five (or six or seven) positional fields, or as JSON with named fields:
{
  "year": "2022",
  "month": "Jan,Apr,Jul,Oct",
  "dayOfMonth": "1,15",
  "hour": "11-14"
}
The following calendar JSON fields are available:
year
month
dayOfMonth
dayOfWeek
hour
minute
second
comment
Each field can contain a comma-separated list of ranges (or the * wildcard), and each range can include a slash followed by a skip value.
The hour, minute, and second fields default to 0 while the others default to *, so you can describe many useful specs with only a few fields.
For month, names of months may be used instead of integers (case-insensitive, abbreviations permitted).
For dayOfWeek, day-of-week names may be used.
The comment field is optional and can be used to include a free-form description of the intent of the calendar spec, useful for complicated specs.
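As an illustration of the range syntax, the sketch below expands a numeric field expression into the set of values it matches. It is not Temporal's parser: expand_field is a hypothetical helper, month and day-of-week names are not handled, and the treatment of a single value followed by a skip (running to the field maximum) is an assumption.

```python
from typing import Set


def expand_field(expr: str, lo: int, hi: int) -> Set[int]:
    """Expand an expression such as "1,15", "11-14", or "*/15" into matching integers."""
    values: Set[int] = set()
    for part in expr.split(","):
        part = part.strip()
        body, _, skip = part.partition("/")
        step = int(skip) if skip else 1
        if body == "*":
            start, end = lo, hi
        elif "-" in body:
            first, _, last = body.partition("-")
            start, end = int(first), int(last)
        else:
            start = int(body)
            # Assumption: a single value with a skip runs to the field maximum.
            end = hi if skip else start
        if not (lo <= start <= end <= hi) or step < 1:
            raise ValueError(f"invalid range {part!r} for bounds {lo}-{hi}")
        values.update(range(start, end + 1, step))
    return values


hours = expand_field("11-14", 0, 23)   # the "hour" value from the JSON example
days = expand_field("1,15", 1, 31)     # the "dayOfMonth" value from the JSON example
quarter_hours = expand_field("*/15", 0, 59)
```

With the JSON example, the hour field matches 11 through 14 and the dayOfMonth field matches the 1st and 15th.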
No matter which form you supply, calendar and interval specs are converted to canonical representations. What you see when you "describe" or "list" a Schedule might not look exactly like what you entered, but it has the same meaning.
Other Spec features:
Multiple intervals/calendar expressions: A Spec can have combinations of multiple intervals and/or calendar expressions to define a specific Schedule.
Time bounds: Provide an absolute start or end time (or both) with a Spec to ensure that no actions are taken before the start time or after the end time.
Exclusions: A Spec can contain exclusions in the form of zero or more calendar expressions. This can be used to express scheduling like "each Monday at noon except for holidays." You'll have to provide your own set of exclusions and include it in each Schedule; there are no pre-defined sets. (This feature isn't currently exposed in the Temporal CLI or the Temporal Web UI.)
Jitter: If given, a random offset between zero and the maximum jitter is added to each Action time (but bounded by the time until the next scheduled Action).
Time zones: By default, calendar-based expressions are interpreted in UTC. Temporal recommends using UTC to avoid various surprising properties of time zones. If you don't want to use UTC, you can provide the name of a time zone. The time zone definition is loaded on the Temporal Server Worker Service from either disk or the fallback embedded in the binary.
For more operational control, embed the contents of the time zone database file in the Schedule Spec itself. (Note: this isn't currently exposed in the Temporal CLI or the web UI.)
Pause
A Schedule can be Paused. When a Schedule is Paused, the Spec has no effect. However, you can still force manual actions by using the temporal schedule trigger command.
To assist communication among developers and operators, a “notes” field can be updated on pause or resume to store an explanation for the current state.
Backfill
A Schedule can be Backfilled.
When a Schedule is Backfilled, all the Actions that would have been taken over a specified time period are taken now (in parallel if the AllowAll Overlap Policy is used; sequentially if BufferAll is used).
You might use this to fill in runs from a time period when the Schedule was paused due to an external condition that's now resolved, or a period before the Schedule was created.
Limit number of Actions
A Schedule can be limited to a certain number of scheduled Actions (that is, Actions taken on schedule rather than triggered immediately). After that, it acts as if it were paused.
Policies
A Schedule supports a set of Policies that enable customizing behavior.
Overlap Policy
The Overlap Policy controls what happens when it is time to start a Workflow Execution but a previously started Workflow Execution is still running. The following options are available:
- Skip: Default. Nothing happens; the Workflow Execution is not started.
- BufferOne: Starts the Workflow Execution as soon as the current one completes. The buffer is limited to one. If another Workflow Execution is supposed to start, but one is already in the buffer, only the one in the buffer eventually starts.
- BufferAll: Allows an unlimited number of Workflows to buffer. They are started sequentially.
- CancelOther: Cancels the running Workflow Execution, and then starts the new one after the old one completes cancellation.
- TerminateOther: Terminates the running Workflow Execution and starts the new one immediately.
- AllowAll: Starts any number of concurrent Workflow Executions. With this policy (and only this policy), more than one Workflow Execution, started by the Schedule, can run simultaneously.
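The options above can be read as a decision made each time an Action comes due. The following is a simplified model, not the scheduler's implementation: OverlapPolicy, on_action_due, and the step strings are hypothetical, and the buffer is reduced to a count of starts already waiting.

```python
from enum import Enum
from typing import List


class OverlapPolicy(Enum):
    SKIP = "Skip"
    BUFFER_ONE = "BufferOne"
    BUFFER_ALL = "BufferAll"
    CANCEL_OTHER = "CancelOther"
    TERMINATE_OTHER = "TerminateOther"
    ALLOW_ALL = "AllowAll"


def on_action_due(policy: OverlapPolicy, running: bool, buffered: int) -> List[str]:
    """Return the steps taken when an Action is due.

    `running` says whether a previously started execution is still running;
    `buffered` counts starts that are already waiting.
    """
    if not running or policy is OverlapPolicy.ALLOW_ALL:
        return ["start"]
    if policy is OverlapPolicy.SKIP:
        return []
    if policy is OverlapPolicy.BUFFER_ONE:
        # Only one start may wait; further due Actions are dropped.
        return ["buffer"] if buffered == 0 else []
    if policy is OverlapPolicy.BUFFER_ALL:
        return ["buffer"]
    if policy is OverlapPolicy.CANCEL_OTHER:
        return ["cancel running", "start after cancellation completes"]
    return ["terminate running", "start"]  # TERMINATE_OTHER
```

For example, with BufferOne and one start already waiting, a further due Action produces no steps, which matches the rule that only the buffered one eventually starts.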
Catchup Window
The Temporal Service might be down or unavailable at the time when a Schedule should take an Action. When it comes back up, the Catchup Window controls which missed Actions should be taken at that point. The default is one year, meaning Actions will be taken unless over one year late. If your Actions are more time-sensitive, you can set the Catchup Window to a smaller value (minimum ten seconds), accepting that an outage longer than the window could lead to missed Actions. (But you can always Backfill.)
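A rough way to picture the Catchup Window is as a filter over the Action times that were missed during an outage. The sketch below is an illustration only: actions_to_catch_up is a hypothetical helper, one year is approximated as 365 days, and rejecting windows under ten seconds is a choice made for this sketch rather than documented Server behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

DEFAULT_CATCHUP_WINDOW = timedelta(days=365)  # "one year", approximated as 365 days
MIN_CATCHUP_WINDOW = timedelta(seconds=10)


def actions_to_catch_up(missed: Iterable[datetime], now: datetime,
                        window: timedelta = DEFAULT_CATCHUP_WINDOW) -> List[datetime]:
    """Keep the missed Action times that are no more than `window` late at `now`."""
    if window < MIN_CATCHUP_WINDOW:
        raise ValueError("Catchup Window must be at least ten seconds")
    return [t for t in missed if now - t <= window]


utc = timezone.utc
missed = [datetime(2024, 1, 1, h, 0, tzinfo=utc) for h in (9, 10, 11)]
recovered_at = datetime(2024, 1, 1, 11, 30, tzinfo=utc)

with_default = actions_to_catch_up(missed, recovered_at)
with_one_hour = actions_to_catch_up(missed, recovered_at, timedelta(hours=1))
```

With the default window, all three missed Actions are taken after recovery. With a one-hour window, only the 11:00 Action is taken, and the 09:00 and 10:00 Actions would need a Backfill.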
Pause-on-failure
If this policy is set, a Workflow Execution started by a Schedule that ends with a failure or timeout (but not Cancellation or Termination) causes the Schedule to automatically pause.
Note that with the AllowAll Overlap Policy, this pause might not apply to the next Workflow Execution, because the next Workflow Execution might have started before the failed one finished.
It applies only to Workflow Executions that were scheduled to start after the failed one finished.
Last completion result
A Workflow started by a Schedule can obtain the completion result from the most recent successful run. (How you do this depends on the SDK you're using.)
For overlap policies that don't allow overlap, “the most recent successful run” is straightforward to define.
For the AllowAll policy, it refers to the run that completed most recently, at the time that the run in question is started.
Consider the following overlapping runs:
time -------------------------------------------->
A |----------------------|
B         |-------|
C                   |---------------|
D                             |--------------T
If D asks for the last completion result at time T, it gets the result of A. Not B, even though B started more recently, because A completed later. And not C, even though C completed after A, because the result for D is captured when D is started, not when it's queried.
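The rule can be checked with a small model of the four runs. The start and end times below are hypothetical values chosen only to preserve the ordering described above (B finishes before A, D starts after A finishes, and C finishes after D starts); Run and last_completion_result are illustrative names, not SDK types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    name: str
    start: int
    end: int      # time at which the run completed successfully
    result: str


def last_completion_result(runs: List[Run], started_at: int) -> Optional[str]:
    """Result of the run that completed most recently before `started_at`."""
    finished = [r for r in runs if r.end <= started_at]
    if not finished:
        return None
    return max(finished, key=lambda r: r.end).result


runs = [
    Run("A", start=2, end=25, result="result of A"),
    Run("B", start=10, end=18, result="result of B"),
    Run("C", start=20, end=36, result="result of C"),
]
d_start = 30  # D starts after A completes but before C completes

captured_for_d = last_completion_result(runs, d_start)
```

Because the value is fixed when D starts, it stays the result of A even if D reads it at time T, after C has completed.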
Failures and timeouts do not affect the last completion result.
When a Schedule triggers a Workflow that completes successfully and yields a result, the result from the initial Schedule execution can be accessed by the subsequent scheduled execution through LastCompletionResult.
Be aware that if, during the subsequent run, the Workflow employs the Continue-As-New feature, LastCompletionResult won't be accessible for this new Workflow iteration.
It is important to note that the status of the subsequent run is marked as Continued-As-New and not as Completed.
A scheduled Workflow Execution may complete with a result up to the maximum blob size (2 MiB by default). However, due to internal limitations, results that are within 1 KiB of this limit cannot be passed to the next execution. So, for example, a Workflow Execution that returns a result of size 2,096,640 bytes (which is above the 2 MiB - 1 KiB limit) will be allowed to complete successfully, but that value will not be available as a last completion result. This limitation may be lifted in the future.
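The arithmetic behind that example can be written out as follows. The helper names are hypothetical, and treating both limits as inclusive is an assumption; the text does not say how a result of exactly the limit size is handled.

```python
MIB = 1024 * 1024
KIB = 1024

max_blob_size = 2 * MIB                 # default maximum blob size: 2,097,152 bytes
carry_over_limit = max_blob_size - KIB  # 2 MiB - 1 KiB = 2,096,128 bytes


def can_complete(result_size: int) -> bool:
    """True if a result of this size fits within the maximum blob size."""
    return result_size <= max_blob_size


def available_as_last_completion_result(result_size: int) -> bool:
    """True if a result of this size can be passed to the next execution."""
    return result_size <= carry_over_limit


example_size = 2_096_640  # the size used in the text
```

The example size is 512 bytes over the carry-over limit but 512 bytes under the maximum blob size, so the run completes while its result is not passed on.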
Last failure
A Workflow started by a Schedule can obtain the details of the failure of the most recent run that ended at the time when the Workflow in question was started. Unlike last completion result, a successful run does reset the last failure.
Limitations
Internally, a Schedule is implemented as a Workflow. If you're using Advanced Visibility (Elasticsearch), these Workflow Executions are hidden from normal views. If you're using Standard Visibility, they are visible, though there's no need to interact with them directly.
What is a Temporal Cron Job?
We recommend using Schedules instead of Cron Jobs. Schedules were built to provide a better developer experience, including more configuration options and the ability to update or pause running Schedules.
A Temporal Cron Job is the series of Workflow Executions that occur when a Cron Schedule is provided in the call to spawn a Workflow Execution.
- How to set a Cron Schedule using the Go SDK
- How to set a Cron Schedule using the Java SDK
- How to set a Cron Schedule using the PHP SDK
- How to set a Cron Schedule using the Python SDK
- How to set a Cron Schedule using the TypeScript SDK
Temporal Cron Job timeline
A Temporal Cron Job is similar to a classic unix cron job. Just as a unix cron job accepts a command and a schedule on which to execute that command, a Cron Schedule can be provided with the call to spawn a Workflow Execution. If a Cron Schedule is provided, the Temporal Server will spawn an execution for the associated Workflow Type per the schedule.
Each Workflow Execution within the series is considered a Run.
- Each Run receives the same input parameters as the initial Run.
- Each Run inherits the same Workflow Options as the initial Run.
The Temporal Server spawns the first Workflow Execution in the chain of Runs immediately.
However, it calculates and applies a backoff (firstWorkflowTaskBackoff) so that the first Workflow Task of the Workflow Execution does not get placed into a Task Queue until the scheduled time.
After each Run Completes, Fails, or reaches the Workflow Run Timeout, the same thing happens: the next run will be created immediately with a new firstWorkflowTaskBackoff that is calculated based on the current Server time and the defined Cron Schedule.
The Temporal Server spawns the next Run only after the current Run has Completed, Failed, or has reached the Workflow Run Timeout.
This means that, if a Retry Policy has also been provided, and a Run Fails or reaches the Workflow Run Timeout, the Run will first be retried per the Retry Policy until the Run Completes or the Retry Policy has been exhausted.
If the next Run, per the Cron Schedule, is due to spawn while the current Run is still Open (including retries), the Server automatically starts the new Run after the current Run completes successfully.
The start time for this new Run and the Cron definitions are used to calculate the firstWorkflowTaskBackoff that is applied to the new Run.
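As a rough illustration of how such a backoff could be derived, the sketch below handles only a once-a-day schedule in UTC, such as 08:15 every day. It is not the Server's calculation: first_workflow_task_backoff is a hypothetical helper, and a general Cron Schedule would need a real cron parser.

```python
from datetime import datetime, time, timedelta, timezone


def first_workflow_task_backoff(now: datetime, at: time) -> timedelta:
    """Delay from `now` (UTC) until the next daily occurrence of `at` (UTC)."""
    candidate = datetime.combine(now.date(), at, tzinfo=timezone.utc)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate - now


daily_at = time(8, 15)  # daily at 08:15 UTC

# A Run spawned at 08:00 UTC waits 15 minutes for its first Workflow Task.
early = first_workflow_task_backoff(datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), daily_at)

# A Run created at 09:00 UTC, after the previous Run closed, waits until the next day.
late = first_workflow_task_backoff(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), daily_at)
```

The Run exists from the moment it is spawned, and only its first Workflow Task is held back by the computed delay.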
A Workflow Execution Timeout is used to limit how long a Workflow can be executing (have an Open status), including retries and any usage of Continue As New. The Cron Schedule runs until the Workflow Execution Timeout is reached or you terminate the Workflow.
Temporal Cron Job Run Failure with a Retry Policy
Cron Schedules
Cron Schedules are interpreted in UTC time by default.
The Cron Schedule is provided as a string and must follow one of two specifications:
Classic specification
This is what the "classic" specification looks like:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
For example, 15 8 * * * causes a Workflow Execution to spawn daily at 8:15 AM UTC.
Use the crontab guru site to test your cron expressions.
robfig predefined schedules and intervals
You can also pass any of the predefined schedules or intervals described in the robfig/cron documentation.
| Schedules | Description | Equivalent To |
| ---------------------- | ------------------------------------------ | ------------- |
| @yearly (or @annually) | Run once a year, midnight, Jan. 1st | 0 0 1 1 * |
| @monthly | Run once a month, midnight, first of month | 0 0 1 * * |
| @weekly | Run once a week, midnight between Sat/Sun | 0 0 * * 0 |
| @daily (or @midnight) | Run once a day, midnight | 0 0 * * * |
| @hourly | Run once an hour, beginning of hour | 0 * * * * |
For example, "@weekly" causes a Workflow Execution to spawn once a week at midnight between Saturday and Sunday.
Intervals just take a string that can be accepted by time.ParseDuration.
@every <duration>
Time zones
This feature only applies in Temporal 1.15 and up
You can change the time zone that a Cron Schedule is interpreted in by prefixing the specification with CRON_TZ=America/New_York (or your desired time zone from tz). CRON_TZ=America/New_York 15 8 * * * therefore spawns a Workflow Execution every day at 8:15 AM New York time, subject to caveats listed below.
Consider that using time zones in production introduces a surprising amount of complexity and failure modes! If at all possible, we recommend specifying Cron Schedules in UTC (the default).
If you need to use time zones, here are a few edge cases to keep in mind:
- Beware Daylight Saving Time: If a Temporal Cron Job is scheduled around the time when daylight saving time (DST) begins or ends (for example, 30 2 * * *), it might run zero, one, or two times in a day! The Cron library that we use does not do any special handling of DST transitions. Avoid schedules that include times that fall within DST transition periods.
  - For example, in the US, DST begins at 2 AM. When you "fall back," the clock goes 1:59 … 1:00 … 1:01 … 1:59 … 2:00 … 2:01 AM and any Cron jobs that fall in that 1 AM hour are fired again. The inverse happens when clocks "spring forward" for DST, and Cron jobs that fall in the 2 AM hour are skipped.
  - In other time zones like Chile and Iran, DST "spring forward" is at midnight. 11:59 PM is followed by 1 AM, which means 00:00:00 never happens.
- Self Hosting note: If you manage your own Temporal Service, you are responsible for ensuring that it has access to current tzdata files. The official Docker images are built with tzdata installed (provided by Alpine Linux), but ultimately you should be aware of how tzdata is deployed and updated in your infrastructure.
- Updating Temporal: If you use the official Docker images, note that an upgrade of the Temporal Service may include an update to the tzdata files, which may change the meaning of your Cron Schedule. You should be aware of upcoming changes to the definitions of the time zones you use, particularly around daylight saving time start/end dates.
- Absolute Time Fixed at Start: The absolute start time of the next Run is computed and stored in the database when the previous Run completes, and is not recomputed. This means that if you have a Cron Schedule that runs very infrequently, and the definition of the time zone changes between one Run and the next, the Run might happen at the wrong time. For example, CRON_TZ=America/Los_Angeles 0 12 11 11 * means "noon in Los Angeles on November 11" (normally not in DST). If at some point the government makes any changes (for example, move the end of DST one week later, or stay on permanent DST year-round), the meaning of that specification changes. In that first year, the Run happens at the wrong time, because it was computed using the older definition.
How to stop a Temporal Cron Job
A Temporal Cron Job does not stop spawning Runs until it has been Terminated or until the Workflow Execution Timeout is reached.
A Cancellation Request affects only the current Run.
Use the Workflow Id in any requests to Cancel or Terminate.
What is a Start Delay?
Start Delay determines the amount of time to wait before initiating a Workflow Execution.
This is useful if you want a Workflow to run once at some point in the future, in contrast to recurring Workflows started by Schedules.
If the Workflow receives a Signal-With-Start during the delay, it dispatches a Workflow Task and the remaining delay is bypassed. If the Workflow receives a Signal during the delay that is not a Signal-With-Start, the Signal does not interrupt the delay, and the Workflow continues to be delayed until the delay expires or a Signal-With-Start is received.
You can delay the dispatch of the initial Workflow Execution by setting this option in the Workflow Options field of the SDK of your choice.
What is a State Transition?
A State Transition is a unit of progress made by a Workflow Execution. Each State Transition is recorded in a persistence store.
Some operations, such as Activity Heartbeats, require only one or two State Transitions each. With an Activity Heartbeat, there are two: the Activity Heartbeat and a Timer.
Most operations require multiple State Transitions.
For example, a simple Workflow with two sequential Activity Tasks (and no retries) produces 11 State Transitions: two for Workflow start, four for each Activity, and one for Workflow completion.
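The count in that example can be written as a short calculation. Extending it to other numbers of Activities is an assumption made for illustration; it presumes every sequential Activity without retries costs the same four State Transitions, and state_transitions is a hypothetical helper.

```python
WORKFLOW_START = 2        # State Transitions for Workflow start
PER_ACTIVITY = 4          # State Transitions for each Activity (no retries)
WORKFLOW_COMPLETION = 1   # State Transition for Workflow completion


def state_transitions(activities: int) -> int:
    """Estimate State Transitions for a Workflow with sequential Activities and no retries."""
    return WORKFLOW_START + PER_ACTIVITY * activities + WORKFLOW_COMPLETION


two_activities = state_transitions(2)  # 2 + 4 * 2 + 1
```

For the two-Activity Workflow in the text, this gives 2 + 8 + 1 = 11.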