Analyzing Gradle, Grails, and Apache Groovy source code hosted on GitHub with BigQuery


A few months ago, I wrote an article about what you can learn from millions of lines of Apache Groovy source code hosted on GitHub, thanks to Google BigQuery. We answered a few questions like:

  • How many Groovy files are there on GitHub?
  • What are the most popular Groovy file names?
  • How many lines of Groovy source code are there?
  • What's the size distribution of source files?
  • What are the most frequently imported packages?
  • What are the most used Groovy APIs?
  • What are the most used AST transformations?
  • Do people use import aliases much?
  • Did developers adopt traits?

At G3 Summit this week, I gave a presentation on this source code analysis, and I decided to expand it a little by adding queries about Grails and Gradle.

For Gradle, here are the questions that I answered:

  • How many Gradle build files are there?
  • How many Maven build files are there?
  • Which versions of Gradle are being used?
  • How many of those Gradle files are settings files?
  • What are the most frequent build file names?
  • What are the most frequent Gradle plugins?
  • What are the most frequent “compile” and “test” dependencies?

And for Grails, here's what I covered:

  • What are the most used SQL databases?
  • What are the most frequent controller names?
  • Which repositories have the most controllers?
  • What is the distribution of the number of controllers?

I'll come back to those new queries in subsequent articles! But in the meantime, let me show you the slides I presented, and the results of those queries.


 

 
© 2012 Guillaume Laforge | The views and opinions expressed here are mine and don't reflect the ones from my employer.