Schedule a workflow execution

There are different ways to launch the execution of a workflow. In previous articles, we mentioned that you can use the gcloud command-line tool to create an execution, use the various client libraries to invoke Workflows, or call the REST API. A workflow itself can also invoke other workflows!
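

As an illustration of that last point, here is a minimal sketch of a step that launches another workflow through the Workflow Executions connector. The child-workflow name, the us-central1 region, and the argument passed are hypothetical placeholders to adapt to your own setup:


- launch_child_workflow:
    call: googleapis.workflowexecutions.v1.projects.locations.workflows.executions.create
    args:
        parent: ${"projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/locations/us-central1/workflows/child-workflow"}
        body:
            argument: '{"who": "world"}'
    result: child_execution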


But today, I’d like to tell you how to schedule the execution of a workflow. For that purpose, we’ll take advantage of Cloud Scheduler. The documentation actually covers this topic in detail, so be sure to grab all the info there. However, I’ll go quickly through the steps, and tell you about a nice new feature in the Cloud Console that eases the scheduling of workflows!


First, you need to have both Workflows and Cloud Scheduler enabled:


gcloud services enable cloudscheduler.googleapis.com workflows.googleapis.com


Cloud Scheduler will need a service account with the workflows.invoker role, to be allowed to call Workflows:


gcloud iam service-accounts create workflows-caller-sa
gcloud projects add-iam-policy-binding MY_PROJECT_ID \
  --member serviceAccount:workflows-caller-sa@MY_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/workflows.invoker


Now it’s time to create the cron job:


gcloud scheduler jobs create http every_5_minute_schedule \
    --schedule="*/5 * * * *" \
    --uri="https://workflowexecutions.googleapis.com/v1/projects/MY_PROJECT_ID/locations/REGION_NAME/workflows/WORKFLOW_NAME/executions" \
    --message-body="{\"argument\": \"DOUBLE_ESCAPED_JSON_STRING\"}" \
    --time-zone="America/New_York" \
    --oauth-service-account-email="workflows-caller-sa@MY_PROJECT_ID.iam.gserviceaccount.com"


Here, you can see that the scheduler job will run every 5 minutes (using the cron notation), and that it will call the Workflows REST API to create a new execution. You can also pass an argument for the workflow input.
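

On the workflow side, that argument becomes the input parameter of the main workflow. Here's a minimal sketch, assuming the scheduler passes a JSON argument such as {"customer": "ACME"} (double-escaped inside the message body); the customer key is purely a hypothetical example:


main:
    params: [input]
    steps:
    - log_trigger:
        call: sys.log
        args:
            text: ${"Scheduled execution for customer " + input.customer}
            severity: INFO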


The cool new feature I was eager to mention today is the direct integration of scheduling as part of the Workflows creation flow, in the Cloud Console.


Now, when you create a new workflow, you can select a trigger:



Click the “ADD NEW TRIGGER” button, and select “Scheduler”. A side panel shows up on the right, where you can specify the schedule to create, directly integrated, instead of having to head over to the Cloud Scheduler product section:

And there, you can specify the various details of the schedule! It’s great to see both products so nicely integrated, easing the flow of creating a scheduled workflow.

Using the Secret Manager connector for Workflows to call an authenticated service

Workflows allows you to call APIs, whether hosted on Google Cloud or anywhere else in the wild. A few days ago, for example, we saw how to use the SendGrid API to send emails from a workflow. However, in that article, I had the API key hard-coded into my workflow, which is a bad practice. Instead, we can store secrets in Secret Manager. Workflows has a specific connector for Secret Manager, and a useful method to access secrets.


In this article, we’ll learn two things:

  • How to access secrets stored in Secret Manager with the Workflows connector

  • How to call an API that requires basic authentication


Let's access the secrets I need to make my basic auth call to the API:


- get_secret_user:
    call: googleapis.secretmanager.v1.projects.secrets.versions.accessString
    args:
      secret_id: basicAuthUser
    result: secret_user


- get_secret_password:
    call: googleapis.secretmanager.v1.projects.secrets.versions.accessString
    args:
      secret_id: basicAuthPassword
    result: secret_password


The user login and password are now stored in variables that I can reuse in my workflow. Next, I create the Base64-encoded user:password string required for the basic authorization header:


- assign_user_password:
    assign:
    - encodedUserPassword: ${base64.encode(text.encode(secret_user + ":" + secret_password))}


Equipped with my encoded user:password string, I can now call my API (here, a cloud function) by adding an authorization header with basic authentication (and return the output of the function):


- call_function:
    call: http.get
    args:
        url: https://europe-west1-workflows-days.cloudfunctions.net/basicAuthFn
        headers:
            Authorization: ${"Basic " + encodedUserPassword}
    result: fn_output
- return_result:
    return: ${fn_output.body}


Workflows has built-in OAuth2 and OIDC support for authenticating to Google-hosted APIs, functions, and Cloud Run services, but it’s also useful to know how to invoke other authenticated services, like those requiring basic auth or other bearer tokens.
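

For reference, here's a minimal sketch of that built-in support in action, calling a private Cloud Run service with an OIDC identity token (the service URL is a hypothetical placeholder):


- call_private_service:
    call: http.get
    args:
        url: https://my-private-service-abc123-ew.a.run.app/
        auth:
            type: OIDC
    result: service_response
- return_response:
    return: ${service_response.body}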

Load and use JSON data in your workflow from GCS

Following up on the article about writing and reading JSON files in cloud storage buckets, we saw that we could access the data of the JSON file and use it in our workflow. Let’s have a look at a concrete use of this.


Today, we’ll take advantage of this mechanism to avoid hard-coding the URLs of the APIs we call from our workflow. That way, the workflow becomes more portable across environments.


Let’s regroup the logic for reading and loading the JSON data in a reusable subworkflow:


read_env_from_gcs:
    params: [bucket, object]
    steps:
    - read_from_gcs:
        call: http.get
        args:
            url: ${"https://storage.googleapis.com/download/storage/v1/b/" + bucket + "/o/" + object}
            auth:
                type: OAuth2
            query:
                alt: media
        result: env_file_json_content
    - return_content:
        return: ${env_file_json_content.body}


You call this subworkflow with two parameters: the bucket name, and the object or file name that you want to load. 


Now let’s use it from the main workflow. We need a first step that calls the subworkflow to load a specific file from a specific bucket. The call below stores the returned JSON content in the env_details variable.


main:
    params: [input]
    steps:
    - load_env_details:
        call: read_env_from_gcs
        args:
            bucket: workflow_environment_info
            object: env-info.json
        result: env_details


Imagine the JSON file contains a JSON object with a SERVICE_URL key pointing at the URL of a service. You can then call that service with the ${env_details.SERVICE_URL} expression, as shown below.


    - call_service:
        call: http.get
        args:
            url: ${env_details.SERVICE_URL}
        result: service_result
    - return_result:
        return: ${service_result.body}


This is great for avoiding hardcoded values in your workflow definitions. However, for true environment-specific deployments, this is not yet ideal, as you would have to point to a different file in the bucket, or use a different bucket, and that information is currently hardcoded in the definition when you call the subworkflow. But if you follow some naming conventions for project and bucket names that map to environments (e.g. PROD_bucket vs DEV_bucket, or PROD-env-info.json vs DEV-env-info.json), this can work, as sketched below.
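

Here's a minimal sketch of such a convention, deriving the bucket name from the current project ID. It assumes each environment lives in its own project and that buckets follow a hypothetical PROJECT_ID-env-info naming scheme; adapt the suffix to your own convention:


    - load_env_details:
        call: read_env_from_gcs
        args:
            bucket: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "-env-info"}
            object: env-info.json
        result: env_details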


Let’s wait for the support of environment variables in Workflows!


Sending an email with SendGrid from Workflows

For notification purposes, especially in an asynchronous way, email is a great solution. I wanted to add an email notification step in Google Cloud Workflows. Since GCP doesn’t have an email service, I looked at the various email services available in the cloud: SendGrid, Mailgun, Mailjet, and I even ran a quick Twitter poll to see what folks in the wild are using. I experimented with SendGrid: the sign-up process was pretty straightforward, and I was able to get started quickly, creating an API key and sending my first email with a cURL command.


Now comes the part where I needed to call that API from my workflow definition. And that’s actually pretty straightforward as well. Let’s see that in action:


- retrieve_api_key:
    assign:
        - SENDGRID_API_KEY: "MY_API_KEY"
- send_email:
    call: http.post
    args:
        url: https://api.sendgrid.com/v3/mail/send
        headers:
            Content-Type: "application/json"
            Authorization: ${"Bearer " + SENDGRID_API_KEY}
        body:
            personalizations:
                - to:
                    - email: to@example.com
            from:
                email: from@example.com
            subject: Sending an email from Workflows
            content:
                - type: text/plain
                  value: Here's the body of my email
    result: email_result
- return_result:
    return: ${email_result.body}


In the retrieve_api_key step, I simply hard-coded the SendGrid API key. However, you can of course store that secret in Secret Manager, and then fetch the secret key thanks to the Workflows Secret Manager connector (that’s probably worth a dedicated article!).
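

As a teaser, here's a minimal sketch of what that retrieve_api_key step could look like with the Secret Manager connector, assuming the key is stored in a secret named sendgridApiKey (a hypothetical name):


- retrieve_api_key:
    call: googleapis.secretmanager.v1.projects.secrets.versions.accessString
    args:
        secret_id: sendgridApiKey
    result: SENDGRID_API_KEY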


Then, in the send_email step, I prepare my HTTP POST request to the SendGrid API endpoint. I specify the content type and, of course, the authorization using the SendGrid API key. Next, I prepare the body of that request, describing my email: a from field with a registered email user that I defined in SendGrid, a to field corresponding to the recipient, an email subject, and a body (just plain text, here). And that’s pretty much it! I just translated the JSON body from the cURL example in SendGrid’s documentation into YAML (using a handy JSON to YAML conversion utility).


Reading in and writing a JSON file to a storage bucket from a workflow

Workflows provides several connectors for interacting with various Google Cloud APIs and services. In the past, I’ve used, for example, the Document AI connector to parse documents like expense receipts, or the Secret Manager connector to store and access secrets like passwords. Another useful connector I was interested in using today was the Google Cloud Storage connector, to store and read files in storage buckets.


Those connectors are auto-generated from their API discovery descriptors, but there are currently some limitations that prevent, for example, downloading the content of a file. So instead of using the connector, I looked at the JSON API for Cloud Storage to see what it offered (the insert and get methods).


What I wanted to do was store a JSON document, and read it back. I haven’t tried with other media types yet, like pictures or other binary files. Anyhow, here’s how to write a JSON file into a cloud storage bucket:


main:
    params: [input]
    steps:
    - assignment:
        assign:
            - bucket: YOUR_BUCKET_NAME_HERE
    - write_to_gcs:
        call: http.post
        args:
            url: ${"https://storage.googleapis.com/upload/storage/v1/b/" + bucket + "/o"}
            auth:
                type: OAuth2
            query:
                name: THE_FILE_NAME_HERE
            body:
                name: Guillaume
                age: 99


In the file, I’m storing a JSON document that contains a couple of keys, defined in the body of that call. By default, a JSON media type is assumed here, so the body defined in YAML at the bottom is actually written as JSON in the resulting file: {"name": "Guillaume", "age": 99}. Oh, and of course, don’t forget to change the names of the bucket and the object in the example above.


And now, here’s how you can read the content of the file from the bucket:


main:
    params: [input]
    steps:
    - assignment:
        assign:
            - bucket: YOUR_BUCKET_NAME_HERE
            - name: THE_FILE_NAME_HERE
    - read_from_gcs:
        call: http.get
        args:
            url: ${"https://storage.googleapis.com/download/storage/v1/b/" + bucket + "/o/" + name}
            auth:
                type: OAuth2
            query:
                alt: media
        result: data_json_content
    - return_content:
        return: ${data_json_content.body}


This time, we change the GCS URL from “upload” to “download”, and we use the alt=media query parameter to instruct the GCS JSON API that we want to retrieve the content of the file (not just its metadata). In the end, we return the body of that call, which contains the file’s content.






 