Viewing my Groovy source files in Stackdriver's debug view

While working on a demo for one of my talks at Devoxx, I encountered a bug in my Groovy code (a Gaelyk app using Glide). I had deployed a new version of my App Engine app, changing some code to persist data in the Datastore. After those changes, I saw a trace in the logs:

Looks like there's an error in receiveTweet.groovy on line 11. And there's a link! Since I hadn't linked the source code to the application, I was surprised to see this link. But I knew that Stackdriver is able to pick up sources in different ways (from uploaded local files, from a Google Cloud source repository, from GitHub or Bitbucket, or with a "source capture").

And actually, clicking that link brought me to the debug view, offering me the different ways to link to or upload the source code. Conveniently, the source capture approach provided a command, using the gcloud CLI, to link the sources with traces:
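I won't reproduce the screenshot here, but the command was roughly along these lines (quoted from memory, so treat the exact flags as an assumption; the debug view shows you the precise command to run):

# Approximate, from memory: upload a capture of the local sources so that
# Stackdriver can display them next to the stacktraces
gcloud beta debug source upload --project=<your-project-id> .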

I then launched that command in my terminal, and afterwards I was able to see the trace along with my source code in the web console:

On the left side, I can see my Groovy source files, highlighting the offending script with the bug. In the middle column: at the bottom, the stacktrace, and at the top, the source code, with the line where the exception occurred highlighted in blue. 

On the right, there's also the live debugger view! I haven't played with it yet, but it's pretty powerful, as you can live debug a production app! Let's keep that for another post!

However, now with your Groovy hat on, you'll notice two things:

The Groovy source code is not nicely colored! Syntax coloring is available for languages like Java, Go, Python, JavaScript, but alas, not (yet?) for Groovy!

The other funny thing was the red message on the right:

Although it says only files with .java extension are supported, it was nice to see that it was still showing my Groovy source code!


It's pretty neat to be able to associate the code and the logs directly in the web interface, to quickly spot where problems are coming from. Of course, you can go back to your IDE or text editor to investigate (and ultimately that's what you'll be doing), but it's pretty handy to visualize the origin of a problem at a glance and start figuring out what may be going wrong.

Also, as I said, I haven't tried the live production debugger yet, but being able to introspect a running app is quite a killer feature in my book. It's not always easy to figure out some problems locally, as your emulator is not the actual running infrastructure, and your tests mock things out but are "not like the real thing", so the ability to dive deeper into the running system is pretty compelling!

IP filtering access to your VMs on Google Cloud

How do you filter access to your VMs on Google Cloud Platform? During a discussion with a customer, I was asked exactly that: only certain IP addresses, or a range of IP addresses, should have access to a particular VM. Let's see that in action!

Let's assume you already have an account on Google Cloud Platform, but if you don't, don't miss the $300 credits for a free trial! I created a new project, then navigated to the Compute Engine section to create a new VM instance. I used all the default parameters, except that I checked the checkbox for "Allow HTTP traffic", at the bottom of the following screenshot:

For the purpose of this demo, I allowed the traffic first, and then updated the firewall rule; but the best approach (since you don't want everybody to access this VM) is to not allow HTTP traffic at all, and to add the right rule afterwards. I wanted to first check that the traffic was flowing through normally, and then update the rule to verify that the traffic was, indeed, filtered.

My VM isn't doing anything useful at this point, so I should at least run some web app on it! With my Groovy hat on, I decided to write a quick Groovy script with the Ratpack framework. Let's see how to set up our VM to serve a simple hello world!

Once your VM instance is up, you'll see a little SSH link next to your instance in the list of running VMs. Click on it, and you'll be able to SSH into your running system. So what's the recipe for running a little hello world in Ratpack? I installed OpenJDK 8, SDKMan (to install Groovy; SDKMan itself needs unzip), and Groovy, with the following steps:

sudo su -
apt-get update
apt-get install openjdk-8-jdk
apt-get install unzip
curl -s "https://get.sdkman.io" | bash
source "/root/.sdkman/bin/sdkman-init.sh"
sdk install groovy
mkdir ratpack
cd ratpack
vim hello.groovy

Then I created the hello.groovy Ratpack server with the following code:

@Grab('io.ratpack:ratpack-groovy:1.4.0')  // added so the script runs standalone; the version is an assumption (current at the time)
import static ratpack.groovy.Groovy.ratpack
ratpack {
    serverConfig { port 80 }
    handlers {
        get {
            render "Hello World!"
        }
    }
}
And then, I was ready to fire it with:

groovy hello

If you go back to your Google Cloud console, in the list of running instances, you've certainly noticed the column showing the "External IP" address of your server. Now you just need to point your browser at it. So head over to http://<your-external-ip> (whichever IP you got), and you should see the infamous "Hello World!" message!
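If you'd rather check from the terminal, a simple curl does the trick too (substituting your instance's external IP):

# Sanity check from any machine (replace with your instance's external IP)
curl http://<your-external-ip>/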

So far so good, but what we really want is to prevent access to this machine from anywhere, except a particular IP or range of IP addresses. Let's see how to do that next.

Let's go to the "Networking > Firewall rules":

We're going to update the first rule: "default-allow-http". Instead of allowing traffic from any source with the 0.0.0.0/0 IP range, we're going to use our own custom range. Let's say my own external IP address is 1.2.3.4 (yours will differ): I'll restrict the range to just this IP by entering 1.2.3.4/32 as the "Source IP range". Let's save the firewall rule, and let the platform apply that change to our deployment. Once the change has taken place, you'll still be able to access your server at its external IP, because your own IP address is basically whitelisted. But try from any other address (a co-worker's machine, etc.), and nobody else will be able to access the server besides you.
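If you prefer the command line over the web console, the same change can be made with the gcloud CLI. A minimal sketch, assuming the default rule name default-allow-http and the example address 1.2.3.4:

# Restrict the HTTP rule to a single source address (a /32 CIDR range)
gcloud compute firewall-rules update default-allow-http \
    --source-ranges 1.2.3.4/32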


And now, for the bonus points! I started playing with this last week, and made the mistake of leaving my VM instance running, underutilized. This afternoon, as I resumed working on this article, I looked at the list of running instances, and what did I see? The console telling me that my VM instance is underutilized, and that I could save money by using a smaller VM instead! Looks like Google Cloud doesn't want me to waste my money! Sweet!

gcloud informative update message

I was playing with the new IntelliJ IDEA plugin for Google App Engine yesterday. The plugin depends on the gcloud SDK to do its work. And I started exploring gcloud a little bit more. 

I was experiencing an odd bug which prevented me from running my application locally with App Engine's local dev server. It was a bug present in an old version of gcloud and its App Engine component, so I had to update the SDK and its App Engine Java component to fix it. No big deal, but what I wanted to highlight here was a little detail about that upgrade process.

I love when SDKs give me information about what needs updating and what's new in the new versions I'm using!

I've been using SDKMan for dealing with various SDK installations, like those for Groovy, Grails, Gradle, etc., and I've always liked that it tells me which SDK updates are available and what's new in SDKMan itself. I'm glad to see that gcloud behaves the same, and gives informative details about what's new. So let's see that in action.

First of all, while debugging my problem with a colleague, he asked me which versions of the SDK and the App Engine component I had. So I ran the following command:

$ gcloud version
Google Cloud SDK 119.0.0
alpha 2016.01.12
app-engine-java 1.9.38
app-engine-python 1.9.38
beta 2016.01.12
bq 2.0.24
bq-nix 2.0.24
core 2016.07.21
core-nix 2016.06.06
gsutil 4.19
gsutil-nix 4.19

At the time of this writing, the latest version of gcloud was actually 127.0.0, but I had 119.0.0. And for the app-engine-java component, I had version 1.9.38 although 1.9.42 was available. So it was time to update!

$ gcloud components update

Your current Cloud SDK version is: 119.0.0
You will be upgraded to version: 127.0.0
│            These components will be updated.             │
│               Name              │  Version   │    Size   │
│ BigQuery Command Line Tool      │     2.0.24 │   < 1 MiB │
│ Cloud SDK Core Libraries        │ 2016.09.20 │   4.9 MiB │
│ Cloud Storage Command Line Tool │       4.21 │   2.8 MiB │
│ gcloud app Java Extensions      │     1.9.42 │ 135.6 MiB │
│ gcloud app Python Extensions    │     1.9.40 │   7.2 MiB │
The following release notes are new in this upgrade.
Please read carefully for information about new features, breaking changes,
and bugs fixed.  The latest full release notes can be viewed at:
127.0.0 (2016/09/21)
  Google BigQuery
      ▪ New load/query option in BigQuery client to support schema update
        within a load/query job.
      ▪ New query option in BigQuery client to specify query parameters in
        Standard SQL.
  Google Cloud Dataproc
      ▪ gcloud dataproc clusters create flag
        --preemptible-worker-boot-disk-size can be used to specify future
        preemptible VM boot disk size.
  Google Container Engine
      ▪ Update kubectl to version 1.3.7.
  Google Cloud ML
      ▪ New gcloud beta ml predict command to do online prediction.
      ▪ New gcloud beta ml jobs submit prediction command to submit batch
        prediction job.
  Google Cloud SQL
      ▪ New arguments to beta sql instances create/patch commands for Cloud
        SQL Second Generation instances:
        ◆ --storage-size Sets storage size in GB.
        ◆ --maintenance-release-channel Sets production or preview channel
          for maintenance window.
        ◆ --maintenance-window-day Sets day of week for maintenance window.
        ◆ --maintenance-window-hour Sets hour of day for maintenance window.
        ◆ --maintenance-window-any (patch only) Clears maintenance window

I snipped the output to show just the details of the changes in the latest version of gcloud, but it actually showed me the whole changelog from my old version up to the latest... and as I hadn't updated in a while, there were lots of improvements and fixes! It's really nice to see what has changed, and sometimes you can discover some gems you weren't even aware of!

So if you're working on some kind of SDK, with auto-update capabilities, be sure to provide a built-in changelog facility to help your users know what's new and improved!
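For fun, here's a (very) minimal sketch of that idea in Groovy, assuming a hypothetical SDK that publishes a CHANGELOG.md with one "## <version> (<date>)" heading per release, newest first; the URL, format, and version scheme are all made up for the example:

// Hypothetical sketch: on update, print the release notes the user missed,
// in the spirit of what gcloud and SDKMan do
def installedVersion = '119.0.0'
def changelog = new URL('https://example.com/my-sdk/CHANGELOG.md').text

// true if version v is strictly newer than w (naive dotted-number comparison)
def newerThan = { String v, String w ->
    def (a, b) = [v, w].collect { it.tokenize('.')*.toInteger() }
    [a, b].transpose().findResult { x, y -> x != y ? x > y : null } ?: false
}

// entries come newest first, so take them until we reach the installed version
changelog.split(/(?m)^## /).drop(1)
         .takeWhile { newerThan(it.tokenize(' ')[0], installedVersion) }
         .each { entry -> println "## $entry" }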

JavaOne 2016 sessions

Next week is that time of the year when tons of Java developers gather and meet in San Francisco for JavaOne. It'll be my 10th edition or so, time flies!

This year, I'll participate in a couple of sessions:

  • Java and the Commoditization of Machine Intelligence [CON2291]
    It's a panel discussion with representatives from IBM, Microsoft and Google to talk about Machine Learning APIs. I'll be covering the ML APIs from Google Cloud Platform: Vision, Speech, Natural Language.
  • A Groovy Journey in Open Source [CON5932]
    In this session, I'll cover the history of the Apache Groovy project, and talk about the latest developments and new features.
Google colleagues will also be present to speak about:

  • gRPC 101 for Java Developers [CON5750] by Ray Tsang
  • Managing and Deploying Java-Based Applications and Services at Scale [CON5730] by Ray Tsang
  • Hacking Hiring [BOF1459] by Elliotte Harold
  • The Ultimate Build Tools Face-off [CON2270] with Dmitry Churbanau and Baruch Sadogursky
  • RIA Technologies and Frameworks Panel [CON4675] with Kevin Nilson
There are quite a few interesting Groovy ecosystem related talks on the agenda:

  • Improving Your Groovy Kung-Fu [CON1293] by Dierk König
  • Groovy and Java 8: Making Java Better [CON3277] by Ken Kousen
  • Spock: Test Well and Prosper [CON3273] by Ken Kousen
  • Writing Groovy AST Transformations: Getting Practical in an Hour [CON1238] by Baruch Sadogursky
  • Juggling Multiple Java Platforms and Jigsaw with Gradle [CON4832] by Cédric Champeau
  • Maven Versus Gradle: Ready...Steady...Go! [CON2951] by Mert Caliskan & Murat Yener
  • Meet the Gradle Team [BOF6209] with Sterling Greene & Cédric Champeau
  • Faster Java EE Builds with Gradle [CON4921] by Ryan Cuprak
  • Lightweight Developer Provisioning with Gradle [BOF5154] by Mario-Leander Reimer
  • Making the Most of Your Gradle Build [CON6468] by Andrés Almiray
  • Gradle Support in NetBeans: A State of the Union [CON6253] with Sven Reimers & Martin Klähn
  • A Practical RxJava Example with Ratpack [CON4044] by Laurent Doguin
Lots of interesting content! I'm really looking forward to meeting you there, in the hallways, to chat about Google Cloud Platform and Apache Groovy!

Natural language API and JavaScript promises to bind them all

A bit of web scraping with Jsoup and REST API calls with groovy-wsclient helped me build my latest demo with Glide / Gaelyk on App Engine, but now, it's time to look a bit deeper into the analysis of the White House speeches:

I wanted to get a feel for how positive and negative sentences flow together in speeches. Looking at the rhetoric of those texts, you often find a generally neutral introduction, then the problem posed with some negative connotations, and then a climax trying to untangle the problems with positive solutions. Other speeches might be structured totally differently, of course, but I was curious to see how this played out on the corpus of speeches and remarks published by the White House press office.

The Cloud Natural Language API

For that purpose, I used the Cloud Natural Language API:
  • Split the text into sentences, thanks to the text annotation capability. The API can of course split sentences even further, word by word, to figure out verbs, nouns, and all the components of sentences (POS: Part-of-Speech tagging).
  • Determine the sentiment of sentences, with a polarity (from negative to positive) and a magnitude (the intensity of the sentiment expressed).
  • Extract entities, i.e. find people, organizations or company names, place locations, etc.
Text annotation is important for better understanding a text, for example to create more accurate language translations. Sentiment analysis can help brands track how much their customers appreciate their products. And entity extraction can help figure out the topics of an article, who's mentioned, and where the action takes place, which is useful for further contextual searches, like finding all the articles about Obama or all the speeches about Europe. Those various services are widely applicable for deriving more metadata and a better understanding of a given piece of text.
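To make that a bit more concrete, here's a minimal Groovy sketch of a sentiment call against the v1beta1 REST endpoint, using just the standard library (in my demo, the calls actually go through my Glide/Gaelyk backend services; the API key is a placeholder, and the response carries the polarity/magnitude fields you'll see used below):

import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Minimal sketch: analyze the sentiment of one piece of text with the
// Cloud Natural Language API (v1beta1); YOUR_API_KEY is a placeholder
def url = 'https://language.googleapis.com/v1beta1/documents:analyzeSentiment?key=YOUR_API_KEY'

def connection = new URL(url).openConnection()
connection.doOutput = true
connection.setRequestProperty('Content-Type', 'application/json')
connection.outputStream.withWriter { writer ->
    writer << JsonOutput.toJson([document: [type   : 'PLAIN_TEXT',
                                            content: 'What a wonderful day for an announcement!']])
}

def result = new JsonSlurper().parse(connection.inputStream)
// v1beta1 returns a polarity in [-1, 1] and a non-negative magnitude
println "polarity: ${result.documentSentiment.polarity}, magnitude: ${result.documentSentiment.magnitude}"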

Asynchronously calling the service and gathering results

Let's look back at my experiment. When I scrape the speeches, I actually get a list of paragraphs (initially enclosed in <p> tags, basically). But I want to analyze the text sentence by sentence, so I need the text annotation capability to split all those paragraphs into sentences that I can analyze individually.

Currently, the sentiment analysis works on one piece of text at a time, so you have to make one call per sentence! Hopefully an option will come along to send several pieces of text in a batch, or to get the sentiment per sentence for a big chunk of text. But for now, it means I have to make p calls for my p paragraphs, and then n calls for all the sentences. Those p + n calls might be expensive in terms of network traffic, but on the other hand, I can make the sentence coloring appear progressively and asynchronously, using JavaScript promises and the Fetch API, since I'm making those calls from the client side. It also seems possible to batch requests with the Google API Client, but I haven't tried that yet.

First of all, to simplify the code a bit, I've created a helper function that calls my backend services (which in turn call the NL API); it wraps the usage of the Fetch API and the promise handling to gather the JSON response:
        var callService = function (url, key, value) {
            var query = new URLSearchParams();
            query.append(key, value);
            return fetch(url, {
                method: 'POST',
                body: query
            }).then(function (resp) {
                return resp.json();
            });
        };
I use the URLSearchParams object to pass my query parameter. The handy json() method on the response gives me the data structure resulting from the call. I'm going to reuse that callService function in the following snippets:
            callService('/content', 'url', e.value).then(function (paragraphs) {
                paragraphs.forEach(function (para, paraIdx) {
                    z('#output').append('<p id="para' + paraIdx + '">' + para + '</p>');
                    callService('/sentences', 'content', para).then(function (data) {
                        // the /sentences backend returns the NL API annotation response,
                        // whose "sentences" hold the text of each individual sentence
                        var sentences = data.sentences.map(function (sentence) {
                            return sentence.text.content;
                        });
                        return Promise.all(sentences.map(function (sentence) {
                            return callService('/sentence', 'content', sentence).then(function (sentenceSentiment) {
                                var polarity = sentenceSentiment.documentSentiment.polarity;
                                var magnitude = sentenceSentiment.documentSentiment.magnitude;
                                return {
                                    sentence: sentence,
                                    polarity: polarity,
                                    magnitude: magnitude
                                };
                            });
                        }));
                    }).then(function (allSentiments) {
                        var coloredSentences = allSentiments.map(function (sentiment) {
                            var hsl = 'hsl(' +
                                Math.floor((sentiment.polarity + 1) * 60) + ', ' +
                                Math.min(Math.floor(sentiment.magnitude * 100), 100) + '%, ' +
                                '90%) !important';
                            return '<span style="background-color: ' + hsl + '">' + sentiment.sentence + '</span>';
                        });
                        z('#para' + paraIdx).html(coloredSentences.join(' '));
                    });
                });
            });
The first call fetches the paragraphs from the web scraping service. I display each paragraph right away, uncolored, with an id, so that I can later update it with its sentences colored according to their sentiment.

Now, for each paragraph, I call the sentences service, which calls the NL API to split the paragraph into individual sentences. With all the sentences in hand, I use the Promise.all(iterable) method, which returns a promise that resolves once all the per-sentence sentiment analysis promises have resolved. This helps me keep track of the order of sentences, as the analysis can return results in an unpredictable order.

I also keep track of the paragraph index so that, once all the promises for a paragraph's sentences have resolved, I can update that paragraph, joining all its colored sentences together.
