Day #11 with Cloud Workflows: sleeping in a workflow

Workflows are not necessarily instantaneous, and executions can span a long period of time. Some steps may launch asynchronous operations, which might take seconds or minutes to finish, but you are not notified when the process is over. So when you want to wait for something to finish, for example before polling again to check the status of the async operation, you can introduce a sleep operation in your workflows.

To introduce a sleep operation, add a step in the workflow with a call to the built-in sleep operation:

- someSleep:
    call: sys.sleep
    args:
        seconds: 10
- returnOutput:
    return: We waited for 10 seconds!

A sleep operation takes a seconds argument, where you can specify the number of seconds to wait.

By combining conditional jumps and sleep operations, you can easily poll a resource or API at a regular interval, to check whether an operation has completed.
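Here is a minimal sketch of such a polling loop. The endpoint URL and the state field of its JSON response are hypothetical placeholders (they are not part of any real API), so adapt them to the operation you are waiting on:

```yaml
# Sketch only: the URL and the "state" field are hypothetical placeholders.
- checkStatus:
    call: http.get
    args:
        url: https://example.com/api/jobs/1234   # hypothetical status endpoint
    result: statusResponse
- isItDone:
    switch:
        # Jump straight to the end once the job reports completion
        - condition: ${statusResponse.body.state == "DONE"}
          next: returnOutput
    # Otherwise, wait before polling again
    next: waitABit
- waitABit:
    call: sys.sleep
    args:
        seconds: 10
    next: checkStatus
- returnOutput:
    return: ${statusResponse.body}
```

The waitABit step jumps back to checkStatus, so the workflow keeps polling every 10 seconds until the condition in the switch is met.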

Day #10 with Cloud Workflows: accessing built-in environment variables

Google Cloud Workflows offers a few built-in environment variables that are accessible from your workflow executions.

There are currently 5 environment variables that are defined:

  • GOOGLE_CLOUD_PROJECT_NUMBER: The workflow project's number.

  • GOOGLE_CLOUD_PROJECT_ID: The workflow project's identifier.

  • GOOGLE_CLOUD_LOCATION: The workflow's location.

  • GOOGLE_CLOUD_WORKFLOW_ID: The workflow's identifier.

  • GOOGLE_CLOUD_WORKFLOW_REVISION_ID: The workflow's revision identifier.

Let’s see how to access them from our workflow definition:

- envVars:
    assign:
      - projectID: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
      - projectNum: ${sys.get_env("GOOGLE_CLOUD_PROJECT_NUMBER")}
      - projectLocation: ${sys.get_env("GOOGLE_CLOUD_LOCATION")}
      - workflowID: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_ID")}
      - workflowRev: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_REVISION_ID")}
- output:
    return: ${projectID + " " + projectNum + " " + projectLocation + " " + workflowID + " " + workflowRev}

We use the built-in sys.get_env() function to access those variables. We’ll revisit the various existing built-in functions in later episodes.

Then when you execute this workflow, you’ll get an output like this:

"workflows-days 783331365595 europe-west4 w10-builtin-env-vars 000001-3af"

There’s one variable I’d like to see added to this list: the current execution ID. It could be useful for identifying a particular execution when looking in the logs, to reason about a potential failure, or for auditing purposes.
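As a small illustration of what these variables can be used for, the sketch below combines three of them into a resource-style name for the current workflow. The step and variable names are made up for the example, and the name format is only an assumption modeled on the usual projects/.../locations/.../workflows/... pattern:

```yaml
# Sketch only: step and variable names are illustrative.
- buildResourceName:
    assign:
      - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
      - location: ${sys.get_env("GOOGLE_CLOUD_LOCATION")}
      - workflow: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_ID")}
      # Assignments run in order, so this one can reuse the variables above
      - resourceName: ${"projects/" + project + "/locations/" + location + "/workflows/" + workflow}
- output:
    return: ${resourceName}
```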

Day #9 with Cloud Workflows: deploying and executing workflows from the command-line

So far, in this series on Cloud Workflows, we’ve only used the Google Cloud Console UI to manage our workflow definitions and their executions. But it’s also possible to deploy new definitions and update existing ones from the command-line, using the Google Cloud SDK. Let’s see how to do that!

If you don’t already have a service account, you should create one following these instructions. I’m going to use the workflow-sa service account I created for the purpose of this demonstration.

Our workflow definition is a simple “hello world” like the one we created for day #1 of our exploration of Google Cloud Workflows:

- hello:
    return: Hello from gcloud!

To deploy this workflow definition, we’ll launch the following gcloud command, specifying the name of our workflow, passing the local source definition, and the service account:

$ gcloud beta workflows deploy w09-new-workflow-from-cli \
    --source=w09-hello-from-gcloud.yaml \
    --service-account=workflow-sa@PROJECT_ID.iam.gserviceaccount.com

You can also add labels with the --labels flag, and a description with the --description flag, just like in the Google Cloud Console UI. 

If you want to update the workflow definition, you invoke the same command, passing the new version of your definition file.

Time to create an execution of our workflow!

$ gcloud beta workflows run w09-new-workflow-from-cli

You will see an output similar to this:

Waiting for execution [d4a3f4d4-db45-48dc-9c02-d25a05b0e0ed] to complete...done.
argument: 'null'
endTime: '2020-12-16T11:32:25.663937037Z'
name: projects/783331365595/locations/us-central1/workflows/w09-new-workflow-from-cli/executions/d4a3f4d4-db45-48dc-9c02-d25a05b0e0ed
result: '"Hello from gcloud!"'
startTime: '2020-12-16T11:32:25.526194298Z'
state: SUCCEEDED
workflowRevisionId: 000001-47f

Our workflow being very simple, it executed and completed right away, which is why you see the result string (our Hello from gcloud! message), as well as the state SUCCEEDED. However, workflows often take longer to execute, consisting of many steps. If the workflow hasn’t yet completed, you’ll see its state as ACTIVE instead, or potentially FAILED if something went wrong.

When the workflow takes a long time to complete, you can check the status of the last execution from your shell session with:

$ gcloud beta workflows executions describe-last

If you want to know about the ongoing workflow executions:

$ gcloud beta workflows executions list your-workflow-name

It’ll give you a list of operation IDs for those ongoing executions. You can then inspect a particular one with:

$ gcloud beta workflows executions describe the-operation-id

There are other operations on executions, to wait for an execution to finish, or even cancel an ongoing, unfinished execution. 

You can learn more about workflow execution in the documentation. And in some upcoming episodes, we’ll also have a look at how to create workflow executions from client libraries, and from the Cloud Workflows REST API.

Day #8 with Cloud Workflows: calling an HTTP endpoint

Time to do something pretty handy: calling an HTTP endpoint from your Google Cloud Workflows definitions. Whether you’re calling GCP-specific APIs such as the ML APIs, REST APIs of other products like Cloud Firestore, your own services, or third-party, external APIs, this capability lets you plug your business processes into the external world!

Let’s see calling HTTP endpoints in action in the following video, before diving into the details below:

When you create a new workflow definition, a default snippet is provided as an example for your inspiration. We’ll take a look at it for this article. There are actually two HTTP endpoint calls, the latter depending on the former: the first step (getCurrentTime) calls a cloud function returning the day of the week, whereas the second step (readWikipedia) searches Wikipedia for articles about that day of the week.

- getCurrentTime:
    call: http.get
    args:
        url: https://us-central1-workflowsample.cloudfunctions.net/datetime
    result: CurrentDateTime

The getCurrentTime step contains a call attribute of type http.get, to make HTTP GET requests to an API endpoint. You can use either call: http.get or call: http.post. For other methods, you’ll have to use call: http.request, and add another key/value pair under args, with method: GET, POST, PATCH or DELETE. Under args, for now, we’ll just put the URL of our HTTP endpoint. The last key is result, which gives the name of a new variable that will contain the response of our HTTP request.
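To make the http.request variant more concrete, here is a minimal sketch of a PATCH call. The URL, the body content and the step names are hypothetical and only serve the illustration:

```yaml
# Sketch only: the endpoint and payload are hypothetical.
- updateOrder:
    call: http.request
    args:
        url: https://example.com/api/orders/42   # hypothetical endpoint
        method: PATCH
        body:
            status: shipped
    result: updateResponse
- returnStatus:
    # The response variable exposes the HTTP status code alongside the body
    return: ${updateResponse.code}
```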

Let’s call Wikipedia with our day of the week search query:

- readWikipedia:
    call: http.get
    args:
        url: https://en.wikipedia.org/w/api.php
        query:
            action: opensearch
            search: ${CurrentDateTime.body.dayOfTheWeek}
    result: WikiResult

The structure is the same, with call and args.url. However, this time we also have a query key, where you define the query parameters for the Wikipedia API. Also note how we pass data from the previous step’s invocation: CurrentDateTime.body.dayOfTheWeek. We retrieve the body of the response of the previous call, and from there, we get the dayOfTheWeek key in the resulting JSON document. We then store the response of that new API endpoint call in WikiResult.

- returnOutput:
    return: ${WikiResult.body[1]}

Then, the last step returns the result of our search. We retrieve the body of the response, which is an array: the first item is the search query, and the second item is the following array of document names, which is what our workflow execution will return:

  "Monday Night Football",
  "Monday Night Wars",
  "Monday Night Countdown",
  "Monday Morning (newsletter)",
  "Monday Night Golf",
  "Monday Mornings",
  "Monday (The X-Files)",
  "Monday's Child",

So our whole workflow was able to orchestrate two independent API endpoints, one after the other. Instead of having two APIs that are coupled via some message-passing mechanism, or worse, via explicit calls to one or the other, Cloud Workflows is here to organize those two calls. It’s the orchestration approach, instead of a choreography of services (see my previous article on orchestration vs choreography, and my colleague’s article on better service orchestration with Cloud Workflows).

To come back to the details of API endpoint calls, here’s their structure:

- STEP_NAME:
    call: {http.get|http.post|http.request}
    args:
        url: URL_VALUE
        [method: REQUEST_METHOD]
        [headers:
            KEY:VALUE ...]
        [body:
            KEY:VALUE ...]
        [query:
            KEY:VALUE ...]
        [timeout: VALUE_IN_SECONDS]
    [result: RESPONSE_VALUE]

In addition to the URL, the method and query, note that you can pass headers and a body. There is also a built-in mechanism for authentication which works with GCP APIs: the authentication is done transparently. You can also specify a timeout in seconds, if you want to fail fast and not wait forever for a response that never comes. But we’ll come back to error handling in some of our upcoming articles.
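As an illustration of those optional keys, the sketch below combines headers, query parameters and a timeout in a single call. The endpoint, the header and the query values are hypothetical:

```yaml
# Sketch only: the endpoint, header and query values are hypothetical.
- fetchReport:
    call: http.get
    args:
        url: https://example.com/api/reports   # hypothetical endpoint
        headers:
            Accept: application/json
        query:
            year: 2020
        # Give up if no response arrives within 30 seconds
        timeout: 30
    result: reportResponse
- returnReport:
    return: ${reportResponse.body}
```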

Day #7 with Cloud Workflows: Pass an input argument to your workflow

All the workflow definitions we’ve seen so far, in this series, were self-contained. They were not parameterized. But we often need our business processes to take arguments (the ID of an order, the details of the order, etc.), so that we can treat those input values and do something about them. That’s where workflow input parameters become useful!

Let’s start with a simple greeting message that we want to customize with a firstname and lastname. We’d like our workflow to look something like this:

- output:
    return: ${"Your name is " + person.firstname + " " + person.lastname}

In the example above, we have a person variable, on which we’re requesting the fields firstname and lastname. This is actually a dictionary. But how do we let Cloud Workflows know about this variable? We need to define it somehow. 

Workflow arguments are global to all the steps, so they need to be defined outside the scope of the steps themselves. Actually, workflows can be structured in sub-workflows: there’s a main workflow, and possibly additional sub-workflows which are like routines or internal function definitions. We’ll revisit the topic of sub-workflows in a later article. To declare our input parameter, we’ll do it at the level of the main workflow, but in a more explicit fashion, with the following notation:

main:
    params: [person]
    steps:
        - output:
            return: ${"Your name is " + person.firstname + " " + person.lastname}

We explicitly show the name of our main workflow, and we use the params instruction. Note that our single argument, person, is surrounded by square brackets. The main workflow can only take a single dictionary parameter. However, as we’ll see later, sub-workflows can take several input arguments, hence the square-bracket notation to specify a list of arguments.

How do we pass this input argument? In the execution screen, in the input pane on the left, we create a JSON object with firstname and lastname keys. This JSON object is the dictionary in the person variable of our workflow definition.
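To tie the input and the definition together, here is a small sketch with a made-up input shown as a comment. The sample name and the intermediate fullName variable are only there for illustration:

```yaml
# Hypothetical input entered in the execution's input pane:
#   {"firstname": "Ada", "lastname": "Lovelace"}
main:
    params: [person]
    steps:
        - buildFullName:
            assign:
              # Fields of the input dictionary are read with the dot notation
              - fullName: ${person.firstname + " " + person.lastname}
        - output:
            return: ${"Your name is " + fullName}
```

With the sample input above, the execution should return the string Your name is Ada Lovelace.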

In this video, you'll see input arguments in action:

© 2012 Guillaume Laforge | The views and opinions expressed here are mine and don't reflect the ones from my employer.