Serverless: Panacea, or not?

At DevDay Belgium, a few months ago, I had the pleasure of giving a keynote on the theme of "serverless". Let me share this talk with you today!

The Serverless Panacea… Or Not?

The term “serverless” has become a trendy buzzword: if you don’t have the checkbox ticked, you’re not cool anymore. Really?

Spoiler alert: There may be servers involved in serverless solutions. It’s not just about function-as-a-service. And it’s actually more complicated than it may seem!

But first, let’s come back to the basics: what is serverless exactly, where does it come from, what are its characteristics? Then, beyond the definition, we’ll discuss the challenges and the risks associated with serverless architectures. Finally, going further, we’ll think about where serverless is heading in the near future.

You can find the slides online here, and you can find the video below. Further down, I'll detail each slide of my keynote:


Let's dive in!


Today, I’d like to tell you about Serverless.

As a Developer Advocate for Google Cloud, those are the products and the topic I’m focusing on.

Serverless is a big buzzword of the day, but is it a real panacea or not?




Like Obelix, I fell into the magic serverless potion a long time ago…

I started playing with Google App Engine Java in 2009, even before it was officially announced by Google at the Google I/O conference.

The Google team reached out to me to work together in stealth mode, to ensure that alternative JVM languages would run fine on their upcoming Java flavor of App Engine (I’m the co-founder of the Groovy language).

I couldn’t imagine then that I’d be starting to work for Google 7 years later. And that I would focus on those serverless solutions!

It was still called Platform-as-a-Service, as the term “serverless” hadn’t been invented yet (although the two concepts are pretty similar).

And I’ve been a big fan and big user of App Engine ever since.




After this brief personal story with Obelix and Google, let’s actually start with a little bit of background and history.

This is my version of the story, so don’t take it too seriously.




At the beginning, humans created the server. 

A machine on which you could run various programs and apps.

Well, we also created the internet, of course, otherwise we couldn’t connect our web apps to our users.

If you have a few users, a single server may suffice.

But you know how it goes, with the spread of the web and internet, we now have billions of users, and millions of servers.

Things kinda got complicated, and we introduced lots of hard concepts around distributed microservices and replicated databases.

We even coined theorems, like the CAP theorem, for Consistency, Availability, and Partition tolerance. But you can only pick two.




Humans invented the cloud, in order to avoid dealing with the physical world.

But there are still databases, servers, or virtual machines to manage.

However, you don’t have to get your hands dirty with Ethernet cables, changing failing hard drives, or upgrading to the latest CPU and RAM.

Usually, it’s the cloud provider that has folks that wake up in the middle of the night to upgrade those things.

You can sleep a little bit better at night, even if your boss may still call you at 3am because your app is misbehaving.




To focus on the code, and to avoid the complexity of managing servers, provisioning clusters, configuring networking, fine-tuning databases, even in the cloud… humans came up with the concept of serverless! 

Here is a picture of the latest Google datacenter for serverless! Look, no servers!

Well, I’m kidding, of course, there are always servers around!




Even if the word serverless wasn’t there yet, it all started with Platform-as-a-Service, with App Engine and Heroku.

The promise was “give us your code or your app, and we’ll run it for you”.

The hardware management aspect was already there. Scaling was also handled by the platform.

The pricing as well was proportional to the usage of the resources.




You also have BaaS — Backend as a Service

It’s pretty similar to PaaS actually.

It comes with batteries-included. You focus on the frontend, and all the backend is provided for you.

Parse and Firebase are two good examples. Facebook kinda abandoned Parse into open-source land.

But Firebase is still around and integrates more and more with the various Google Cloud Platform services.

So you can have hosting of static assets, a datastore to save your information, some kind of runtime environment to run your business logic.

And tons of other services, for authentication, mobile crash analysis, performance testing, analytics, and more.




PaaS, then BaaS, and also FaaS: Functions-as-a-service.

With a FaaS solution, your unit of work, of deployment, becomes a small, granular function.

This concept was popularized by AWS Lambda.

And often, even still today, people tend to confuse FaaS with Serverless.

But FaaS is really just one facet of the Serverless ecosystem. Like PaaS or BaaS.




Another interesting facet of serverless is the Container-as-a-Service approach, with containerized workloads.

Instead of deploying apps or functions, you’re deploying containers.

Put anything you want inside a container. 

That’s the approach that Google took with its Cloud Run container service.

You don’t have the complexity of Kubernetes, but you can run your container easily in the cloud, in a fully managed environment.




Right, so I described a bit of the history that led us to serverless, its various facets, and some of the serverless products that are available today. But let’s take some time to give a proper definition of what serverless is.




For me, Serverless is the easiest way to get an idea to production in a minimal amount of time.

As a developer, you work on some code, and then you deploy it. That’s it! Really!




The term serverless was coined around 2010 by someone called Ken Elkabany, who created the PiCloud computing platform.

Compared to Heroku and App Engine, which came earlier and focused on running web stacks in their cloud datacenters, PiCloud was more generic and supported different kinds of workloads, not just serving web requests.

The catchy term came from the fact that they were actually selling a service, rather than selling or renting servers, machines, VMs, to their customers.




There are 2 ways to think about Serverless: there’s the Operational model, and the Programming model.


Operational model:

  • Fully managed, 

  • Automatic scaling, 

  • Pay as you go


Programming model

  • Service based, 

  • Event driven, 

  • Stateless




There’s no provisioning of clusters, servers, instances, VMs or anything.

It’s all handled by the platform, for you. Just give your code, your app, your function.

It’s a fully managed environment. Security patches are applied automatically. 

Remember Spectre and Meltdown? They were mitigated transparently and rapidly for customers by Google Cloud. No wake-up call in the night.

Your apps will scale automatically, from 0 to 1 instance, from 1 to n instances, and from n down to 1, as well as back to zero.

Tracking the CPU load, memory usage, and number of incoming requests, with some magic formula, serverless platforms are able to scale your services up and down.

Without you having to worry about it. The cloud provider is handling that for you.




In terms of pricing, it’s a Pay-as-you-go cost model.

It goes hand in hand with automatic scaling.

If there’s no traffic, you pay zero.

If there’s twice as much traffic as usual, you pay proportionately.

And if the load goes back to zero, the instances serving your app are decommissioned, and again you pay zero.




Now onto the programming model.

More and more, we’re transitioning from building big monoliths to orchestrating smaller services, or microservices.

That has its challenges, but with smaller services, your teams can develop them more independently, scale them differently, or even deploy them with different life cycles.




Since you have more loosely coupled services, they tend to react to incoming events from your system or from the cloud: for example, a notification of a new file in cloud storage, a new line in a reactive datastore like Cloud Firestore, or a message in a message bus like Pub/Sub.

Your services usually communicate asynchronously, to stay properly decoupled.

But the more asynchronous you are, the harder things are to operate and monitor: when business logic spans several services, you have to figure out the current status of that workflow across those services.




Another important consequence of services scaling up, down, and back to zero is that there’s no guarantee that you’re going to hit the same server all the time.

So you can’t be certain that some data that would be cached is still there.

You have to program defensively to ensure that any fresh instance of your app is able to cope with any incoming request.

State is pretty much an enemy of scaling. So the more stateless you can be, the better it is.




  • Compute, 

  • Data Analytics

  • ML & AI

  • Database & Storage

  • Smart assistants & chat

  • DevOps

  • Messaging


We’ve been speaking about serverless compute, but serverless is not just about compute.

You could consider that anything that is fully managed, offers a pay-as-you-go cost model, and is delivered as a service in the cloud is also serverless, since you don’t have to worry about the infrastructure and the scaling.

There are great examples of this in Google Cloud, for example BigQuery, which is a fully-managed, serverless data warehouse and analytics platform. You pay proportionally to the amount of data your queries scan, not for the storage, not for running servers, etc.

But let’s get back to serverless compute.




Serverless sounds pretty cool, right?

But there are also challenges, compared to running a good old monolith on your on-premises server.

We’ve already given some hints of some of the challenges.

In particular, I’d like to spend some time to tell you about four key aspects:

The lock-in factor

The infamous cold starts

Cost controls

And the mess of spaghetti microservices




PaaS or BaaS often come with batteries-included.

They have built-in APIs or databases, which are super convenient for developers.

As a developer, you are much more productive, because you don’t have to wire things up or configure third-party services. The choice is already made for you.

But I’m seeing those batteries more and more being externalized, as their own standalone products. 

Google Cloud has externalized things like its NoSQL database, its Pub/Sub message hub, its scheduler, and its task handling capabilities. Before, those services were part of the Platform-as-a-Service.




However great built-in batteries are, they are often proprietary and specific to that platform.

You end up being locked into the platform you’re building upon.

It can be a choice, as long as you are aware of it.

It’s a trade-off between portability and time-to-market.

You might still be tied to those products, but at least, you can still move the business logic around, if those services are externalized. 

And you can create a level of indirection to be able, some day, potentially, to move away from those service dependencies if needed.




A common issue you hear about in serverless-land is the infamous “Cold Start”.

Since you can scale to zero, it means there’s currently no server, instance, or clone, to serve an incoming request.

So what happens? The cloud provider has to reinitialize, re-instantiate, re-hydrate some kind of server, VM, or container.

Additionally, the underlying language runtime has to start up as well, initializing its internal data structures.

Not only that, but your app also needs to get started too.

So you’d better minimize the time your app needs to be ready to serve its first request, since that’s the part you have control over.




There are workarounds, like pinging your service at regular intervals to keep it warm, but it’s a bit of a hack, or even an anti-pattern.

Depending on the pricing, that might mean you’re potentially paying for something that’s sitting idle.

Some platforms provide some knobs that you can play with like “min instances” or “provisioned instances”, usually at a lower price.

For instance, on Google Cloud Functions or Cloud Run, you can specify a minimum number of instances that are kept warm and ready to serve, and that are billed at a lower rate.




I mention a minimum number of instances, but what about the notion of maximum number of instances?

It’s actually an important idea. 

With a platform that auto-scales transparently and can spin up as many instances as needed to serve increased traffic, your costs can increase just as much!

So in order to bound your budget to a known quantity, rather than burning your money with all your hot instances, you can cap the number of instances that will serve your content. The service may be a bit degraded when you reach that limit, as latency will likely increase, but at least, your budget doesn’t go through the roof!

That’s why Google Cloud Platform introduced the notion of capping the number of instances running your functions, apps or containers in its serverless products: to give you more visibility and control over costs.




The last challenge I’d like to mention is spaghetti services.

It’s so easy to write many functions and services on a serverless platform.

One service does one thing and does it well, right?

But after a while, you end up with a spaghetti of microservices. A big mess.

It becomes very complicated to see what invokes what. 

It becomes hard, even with monitoring and observability tools, to figure out what happened when one microservice somehow starts misbehaving and completely ruins your clockwork architecture.

And you know: monoliths aren’t that bad, actually.

Don’t start right away with writing the smallest unit of work possible. 

Pay attention to how you split the big monolith into microservices.

Otherwise, you’ll end up with that big plate of spaghetti. 

There are good articles on when and how to split a monolith, but there’s no simple rule-of-thumb answer.




So what does the future hold for serverless?

I believe that the highlights will be about:

  • Openness

  • Containers

  • Glue

  • Edge

  • Machine Learning




Let’s start with open, and openness. That’s open like in open source!

We want to avoid lock-in. We want portability.

For instance, the platforms rely on open source software for sure, but the platforms themselves can be open source too.

If you look at Google’s Cloud Run, it’s actually based on Knative, the open source serverless platform built on Kubernetes.

So you’re not locked into Google Cloud when you’re using Cloud Run. You can move your workload, your app, to a Knative-compatible platform from another cloud provider, or even on-premises, on your own infrastructure.

I worked on the Java Cloud Functions runtime, and it is also available as open source. So you can deploy your functions in Google Cloud, but you can also run them elsewhere, in a hybrid cloud scenario, or even just locally on your machine, for a better developer experience and a tighter development feedback loop.

Also, the way you build your services from source can be made more open too.

For instance, Heroku and Google Cloud partnered on Cloud Native Buildpacks, which help you transform your application source code into images that can run on any cloud.

Really, it’s all about portability and avoiding lock-in, by making things as open as possible.




As I’m mentioning Cloud Native Buildpacks, and the fact that they build portable containers from your app’s source code, notice that we’re speaking of containers.

Why containers, you may ask. 

With things like platform- or function-as-a-service, you are pushing apps onto the platform runtime. But you may be limited in the language runtimes, libraries, or binaries you can run or bundle there. If you’re using an esoteric language, or need some special software installed, perhaps you won’t be able to run your app there.

Instead, if you could put everything you need in a box, and if the cloud could just run that box for you, then you can do pretty much anything.

That’s why we’re using containers more and more. And that’s also why Google Cloud released Cloud Run, to run your containers, but serverlessly, with any runtime, library or language that you want, without limitations.

So I’m seeing more containers in the future.




You remember my plate of spaghetti?

To orchestrate your services, to observe and monitor them, to track that they are communicating properly, asynchronously, you’ll need more tools to ensure that it all runs fine in the cloud. That’s why I’m seeing more tools like Google Cloud Tasks, Cloud Scheduler, and Cloud Workflows, and in the Azure and AWS worlds, you have things like Logic Apps or Step Functions.

You also have various messaging buses, like Google Cloud Pub/Sub, Amazon SQS, Azure Service Bus.

And in the Kubernetes world, we’ve seen service meshes emerge as a key architectural pattern.

A monolith is much simpler to develop & operate, but as you move to a microservice architecture, those glue services will be needed more and more.

So I see more glue in the future!




Recently, Cloudflare released a product called Cloudflare Workers.

It uses the V8 JavaScript engine and its isolates concept to run your functions in a sandboxed manner.

There are two very interesting aspects to me in this new product.

First of all, that’s the idea of having your serverless functions run at the edge of the network. 

Not deep in a handful of big data centers. Instead, those functions are as close to the users as possible.

So the latency is really minimal.

Secondly, to further reduce latency, there’s a great innovation that almost completely eliminates cold starts!

Cloudflare actually starts warming up your function as soon as the SSL handshake is initiated when you invoke the function via HTTPS, whereas normally the whole handshake has to complete, and the call has to be routed to your function, before it really starts.

So that’s a really great optimization! And we’ll probably see more stuff moving to the edge of the cloud infrastructure.




Lastly, looking even further in the future, I’m curious to see how machine learning will play a role in the serverless offering of cloud providers.

In particular, you still have to specify a VM or instance size, its memory or CPU. Some would say it’s not very serverless, since servers are supposed to be abstracted away.

In Google Cloud, for example, we have what we call a “clone scheduler” that is responsible for creating a new instance of your function or app, depending on various factors, like CPU usage, memory usage, number of incoming queries, etc. 

There’s some magical calculation that figures out how and when to spin up a new instance.




Google recently automated its datacenter cooling thanks to Machine Learning, reducing the energy used for cooling by 40% and improving Power Usage Effectiveness!

I can imagine a future where Machine Learning is used to further upsize or downsize the underlying machines running your serverless code, and provision the right amount of resources, to reduce latency, CPU usage, etc.

So let’s see what the future holds for Serverless!





Day #15 with Cloud Workflows: built-in cloud logging function

In the two previous episodes, we saw how to create and call subworkflows, and we applied this technique to making a reusable routine for logging with Cloud Logging. However, there’s already a built-in function for that purpose! So let’s have a look at this integration.


To call the built-in logging function, just create a new step, and make a call to the sys.log function:


- logString:
    call: sys.log
    args:
        text: Hello Cloud Logging!
        severity: INFO


This function takes a mandatory parameter: text. And an optional one: severity.


The text parameter accepts all types of supported values, so it’s not only string, but all kinds of numbers, as well as arrays and dictionaries. Their string representation will be used as text.


The optional severity parameter is an enum that can take the values: DEFAULT, DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, ALERT, EMERGENCY, with DEFAULT being… the default value if you don’t specify a severity!
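In other words, you can simply omit the severity argument, and the entry will be logged with the DEFAULT severity (the step name logDefault is just an example):


- logDefault:
    call: sys.log
    args:
        text: This entry uses the DEFAULT severity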


Here’s another example with a dictionary as parameter, which will be output as text in the logs, and a severity of WARNING:


- createDict:
    assign:
        - person:
            name: Guillaume
            kids: 2
- logDict:
    call: sys.log
    args:
        text: ${person}
        severity: WARNING


Looking at the results in the Cloud Logging console, you will see both messages appear:



Don’t hesitate to have a look at the reference documentation to find out more about the available built-in functions.
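As a quick teaser, here’s a sketch combining sys.log with another built-in, sys.get_env, to log the current project ID (assuming, as in the next episodes of this series, that the GOOGLE_CLOUD_PROJECT_ID environment variable is available):


- logProject:
    call: sys.log
    args:
        text: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        severity: INFO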


Day #14 with Cloud Workflows: Subworkflows

Workflows are made of sequences of steps and branches. Sometimes, a particular sequence of steps is repeated, and it’s a good idea to avoid error-prone repetition in your workflow definition (in particular if you change something in one place and forget to change it in another). You can modularize your definition by creating subworkflows, a bit like subroutines or functions in programming languages. For example, yesterday, we had a look at how to log to Cloud Logging: if you want to log in several places in your workflow, you can extract that routine into a subworkflow.


Let’s see that in action in the video below, and you can read all the explanations afterwards:



First things first, let’s step back and look at the structure of workflow definitions. You write a series of steps, directly in the main YAML file. You can move back and forth between steps thanks to jumps, but it wouldn’t be convenient to use jumps to emulate subroutines (remember the good old days of BASIC and its gotos?). Instead, Cloud Workflows lets you put steps under a “main” routine, and subroutines under their own names.


So far we had just a sequence of steps:


- stepOne:
    ...
- stepTwo:
    ...
- stepThree:
    ...


Those steps are implicitly under a main routine. And here’s how to show this main routine explicitly, by having that main block, and steps underneath:


main:
    steps:
        - stepOne:
            ...
        - stepTwo:
            ...
        - stepThree:
            ...


To create a subworkflow, we follow the same structure, but under a different name than main, and we can also declare parameters, like so:


subWorkflow:
    params: [param1, param2, param3: "default value"]
    steps:
        - stepOne:
            ...
        - stepTwo:
            ...
        - stepThree:
            ...


Notice that you can pass several parameters, and that a parameter can have a default value, used when that parameter is not provided at the call site.


Then in your main flow, you can call that subworkflow with a call instruction. Let’s take a look at a concrete example, that simply concatenates two strings:


main:
    steps:
        - greet:
            call: greet
            args:
                greeting: "Hello"
                name: "Guillaume"
            result: concatenation
        - returning:
            return: ${concatenation}

greet:
    params: [greeting, name: "World"]
    steps:
        - append:
            return: ${greeting + ", " + name + "!"}


In the call instruction, we pass the greeting and name arguments, and the result variable will contain the output of the subworkflow call. In the subworkflow, we define our parameters, and we have a single step that just returns an expression: the desired greeting message concatenation.
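And since the name parameter of the greet subworkflow declares a default value, the call site can also omit that argument; this variant of the calling step would yield "Hi, World!":


- greetDefault:
    call: greet
    args:
        greeting: "Hi"
    result: concatenation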


One last example, but perhaps more useful than concatenating strings! Let’s turn yesterday’s Cloud Logging integration into a reusable subworkflow. That way, you’ll be able to call the log subworkflow as many times as needed in your main workflow definition, without repeating yourself:


main:
  steps:
    - first_log_msg:
        call: logMessage
        args:
          msg: "First message"
    - second_log_msg:
        call: logMessage
        args:
          msg: "Second message"
   
logMessage:
  params: [msg]
  steps:
    - log:
        call: http.post
        args:
            url: https://logging.googleapis.com/v2/entries:write
            auth:
                type: OAuth2
            body:
                entries:
                    - logName: ${"projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/logs/workflow_logger"}
                      resource:
                        type: "audited_resource"
                        labels: {}
                      textPayload: ${msg}


And voila! We called our logMessage subworkflow twice in our main workflow, just passing the text message to log into Cloud Logging.

Day #13 with Cloud Workflows: Logging with Cloud Logging

Time to come back to our series on Cloud Workflows. Sometimes, for debugging purposes or for auditing, it is useful to be able to log some information via Cloud Logging. As we saw last month, you can call HTTP endpoints from your workflow. We can actually use Cloud Logging’s REST API to log such messages! Let’s see that in action.


- log:
    call: http.post
    args:
        url: https://logging.googleapis.com/v2/entries:write
        auth:
            type: OAuth2
        body:
            entries:
                - logName: ${"projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/logs/workflow_logger"}
                  resource:
                    type: "audited_resource"
                    labels: {}
                  textPayload: Hello World from Cloud Workflows!


We call the https://logging.googleapis.com/v2/entries:write API endpoint to write new logging entries. We authenticate via OAuth2, as long as the service account used for the workflow execution is allowed to use the logging API. Then we pass a JSON structure as the body of the call, indicating the name of the logger to use, which resource it applies to, and also the textPayload containing our text message. You could also use a ${} expression to log more complex values.
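For instance, here’s a sketch of the same call where the textPayload is a ${} expression, concatenating a variable (userName, assigned just before, purely for illustration) into the logged message:


- prepare:
    assign:
        - userName: "Guillaume"
- logExpression:
    call: http.post
    args:
        url: https://logging.googleapis.com/v2/entries:write
        auth:
            type: OAuth2
        body:
            entries:
                - logName: ${"projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/logs/workflow_logger"}
                  resource:
                    type: "audited_resource"
                    labels: {}
                  textPayload: ${"Hello " + userName + ", from Cloud Workflows!"}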


Once this workflow definition is done and deployed, you can execute it, and you should see in the logs your message appear:



Voila! You can log messages to Cloud Logging!


Let's recap in this video:



In the next episode, we’ll take advantage of subworkflows, to create a reusable set of steps that you will be able to call several times throughout your workflow definition, without repeating yourself, by turning this logging example into a subworkflow.


Day #12 with Cloud Workflows: loops and iterations

In previous episodes of this Cloud Workflows series, we’ve learned about variable assignment, data structures like arrays, jumps and switch conditions to move between steps, and expressions to do some computations, including potentially some built-in functions. 


With all these previous learnings, we are now equipped with all the tools to create loops and iterations, like iterating over the elements of an array, perhaps to call an API several times with different arguments. So let’s see how to create such an iteration!




First of all, let’s prepare some variable assignments:


- define:
    assign:
        - array: ['Google', 'Cloud', 'Workflows']
        - result: ""
        - i: 0


  • The array variable will hold the values we’ll be iterating over.

  • The result variable contains a string to which we’ll append each value from the array.

  • And the i variable is an index, to know our position in the array.


Next, like in a for loop of programming languages, we need to prepare a condition for the loop to finish. We’ll do that in a dedicated step:


- checkCondition:
    switch:
        - condition: ${i < len(array)}
          next: iterate
    next: returnResult


We define a switch, with a condition expression that compares the current index position with the length of the array, using the built-in len() function. If the condition is true, we’ll go to an iterate step. If it’s false, we’ll go to the ending step (called returnResult here).


Let’s tackle the iteration body itself. Here, it’s quite simple, as we’re just assigning new values to the variables: we append the i-th element of the array to the result variable, and we increment the index by one. Then we go back to the checkCondition step.


- iterate:
    assign:
        - result: ${result + array[i] + " "}
        - i: ${i+1}
    next: checkCondition


Note that if we were doing something more convoluted, for example calling an HTTP endpoint with an element of the array as argument, we would need two steps: one for the actual HTTP endpoint call, and one for incrementing the index value. However, in the example above, we’re only assigning variables, so we did the whole body of the iteration in this single assignment step.
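As a sketch, such a two-step iteration body could look like this (the endpoint URL is just a hypothetical example):


- callEndpoint:
    call: http.get
    args:
        url: ${"https://example.com/api?word=" + array[i]}
    result: apiResponse
- increment:
    assign:
        - i: ${i+1}
    next: checkCondition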


When going through the checkCondition step, if the condition is not met (i.e. we’ve reached the end of the array), then we’re redirected to the returnResult step:


- returnResult:
    return: ${result}


This final step simply returns the value of the result variable.
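Putting all the steps together, the complete workflow definition looks like this, and executing it returns "Google Cloud Workflows " (with a trailing space, since we append one after each element):


- define:
    assign:
        - array: ['Google', 'Cloud', 'Workflows']
        - result: ""
        - i: 0
- checkCondition:
    switch:
        - condition: ${i < len(array)}
          next: iterate
    next: returnResult
- iterate:
    assign:
        - result: ${result + array[i] + " "}
        - i: ${i+1}
    next: checkCondition
- returnResult:
    return: ${result}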


 
© 2012 Guillaume Laforge | The views and opinions expressed here are mine and don't reflect the ones from my employer.