Analyzing Gradle, Grails, and Apache Groovy source code hosted on GitHub with BigQuery
Posted on 29 November, 2016
A few months ago, I wrote an article about what you can learn from millions of lines of Apache Groovy source code hosted on GitHub, thanks to Google BigQuery. We answered a few questions like:
- How many Groovy files are there on GitHub?
- What are the most popular Groovy file names?
- How many lines of Groovy source code are there?
- What's the size distribution of source files?
- What are the most frequently imported packages?
- What are the most used Groovy APIs?
- What are the most used AST transformations?
- Do people use import aliases much?
- Did developers adopt traits?
At G3 Summit this week, I gave a presentation on this source code analysis, and I decided to expand it a little by adding queries about Grails and Gradle.
For Gradle, here are the questions that I answered:
- How many Gradle build files are there?
- How many Maven build files are there?
- Which versions of Gradle are being used?
- How many of those Gradle files are settings files?
- What are the most frequent build file names?
- What are the most frequent Gradle plugins?
- What are the most frequent “compile” and “test” dependencies?
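To give a flavor of the counting behind the first few Gradle questions, here is a minimal local sketch in Python. It is not the BigQuery query used for the talk: the function names and the sample paths are made up for illustration, and it assumes that any file ending in `.gradle` is a Gradle file, that `settings.gradle` is a settings file, and that `pom.xml` marks a Maven build.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, Optional


def classify_build_file(path: str) -> Optional[str]:
    """Return a coarse build-file category for a repository path, or None.

    Assumption (not from the original analysis): files are classified
    purely by file name, ignoring their content.
    """
    name = PurePosixPath(path).name
    if name == "settings.gradle":
        return "gradle-settings"
    if name.endswith(".gradle"):
        return "gradle-build"
    if name == "pom.xml":
        return "maven"
    return None


def count_build_files(paths: Iterable[str]) -> Counter:
    """Count how many paths fall into each build-file category."""
    counts: Counter = Counter()
    for path in paths:
        kind = classify_build_file(path)
        if kind is not None:
            counts[kind] += 1
    return counts


# Hypothetical sample of file paths, standing in for a real file listing.
sample_paths = [
    "app/build.gradle",
    "settings.gradle",
    "pom.xml",
    "lib/pom.xml",
    "src/main/groovy/Foo.groovy",
    "gradle/deps.gradle",
]

build_counts = count_build_files(sample_paths)
```

On a real dataset the same grouping would be expressed as a query over the file paths rather than a Python loop, but the classification rule is the part that shapes the answers.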
And for Grails, here's what I covered:
- What are the most used SQL databases?
- What are the most frequent controller names?
- What are the repositories with the biggest number of controllers?
- What is the distribution of number of controllers?
I'll come back to those new queries in subsequent articles! But in the meantime, let me show you the slides I presented, and the results of those queries.