As noted, should be (2014).
There is also GitHut 2.0: https://madnight.github.io/githut/#/pull_requests/2024/1
This updates through 2024.
Interesting to see the number of JS pushes go down significantly, though it turns out that's just because many more projects are using TypeScript:
https://i.imgur.com/AJBE9so.png
The library space converged to TS far faster than the rest of the JS world.
I think correlating "pushes per repository" with certain languages is interesting. The top languages by "pushes per repository" are C++, TeX, Rust, C, and CSS. I guess it's no surprise that many would also consider those the most guess-and-check or hard-to-get-right-upfront-without-tooling languages.
It's unclear whether that's the takeaway here. Pushes per repository can just as well indicate a project that's old, active, popular, and so on.
Really? I don't think Rust is like that, because it has such strong compile-time checking. More likely it's because Rust 1.0 hadn't even been released in 2014, so by definition every Rust project was extremely new and active.
Yes, maybe the causation assumption here is inaccurate.
The connectors are interesting, but I wish there was a way to sort by a column and have the rows be actually linear.
Also, worth noting that it looks like this data only covers 2012-2014?
Would love to see an update to 2025.
I really, really want this updated too and saw it in my bookmarks. Figured the historic data was interesting, and that someone might want to give this another go.
Would be fun to weight each language by average number of stars, but normalize by repository count.
Data analysis without adjusting groups by popularity is a bit lame.
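For what it's worth, "average stars, normalized by repository count" boils down to total stars divided by the number of repos, per language. A minimal sketch in Python, assuming you already have (language, stars) pairs from somewhere; the function name and the sample numbers are made up for illustration:

```python
from collections import defaultdict


def average_stars_by_language(repos):
    """Return {language: mean stars per repository}.

    `repos` is an iterable of (language, stars) pairs, one per repository.
    """
    total_stars = defaultdict(int)
    repo_count = defaultdict(int)
    for language, stars in repos:
        total_stars[language] += stars
        repo_count[language] += 1
    # Dividing by repo count keeps a language with many small repos
    # from outranking one with a few heavily starred repos.
    return {lang: total_stars[lang] / repo_count[lang] for lang in total_stars}


# Hypothetical sample data, not real GitHub numbers.
sample = [
    ("Rust", 120),
    ("Rust", 30),
    ("JavaScript", 10),
    ("JavaScript", 0),
    ("JavaScript", 5),
]
averages = average_stars_by_language(sample)
```

A mean is easily skewed by a handful of huge repos, so a median might be the fairer number to plot.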
Absolutely stunning and ingenious visualization, but disappointing data. In 2014 there were 2.2 million repos, while in 2025 there are closer to 500 million. The repo was last updated seven years ago, so I assume that this project has been abandoned.
A cursory glance at the source code[1] reveals that it's using GitHub Archive data. Looking through the gharchive data[2], it seems like it was last updated in 2024. So there's 10 years of publicly accessible new data.
Is there any reason we (by "we" I mean "random members of the community" as opposed to the developer of the project) can't re-build GitHut with the new data, seeing as it's open source? It's only processing the repo metadata, so it shouldn't even be that much data and should be well under the free 1TB limit in BigQuery, so cost shouldn't be a concern. (The processed data from 2014 stored in the repo[3] is only 71MB in size, though I assume the 2024 data will be larger.)
I'm not experienced enough to know whether creating an updated version of this would take an afternoon or several weeks.
[1]: https://github.com/littleark/githut/
[2]: https://console.cloud.google.com/bigquery?project=githubarch...
[3]: https://github.com/littleark/githut/blob/master/server/data/...
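To give a feel for the size of the job, here's a rough sketch of the core aggregation in Python. It is not the project's actual pipeline. It assumes newline-delimited JSON events with a `type` field and a `repo.name` field (the shape of the GH Archive dumps as I understand it for 2015 and later), plus a hypothetical `repo_language` lookup that you would have to build separately, since I'm not assuming push events carry a language:

```python
import json
from collections import Counter, defaultdict


def pushes_per_repository(json_lines, repo_language):
    """Return {language: pushes / active repositories}.

    json_lines: iterable of JSON strings, one event per line.
    repo_language: hypothetical mapping of "owner/name" -> language,
        built from some other source (an assumption of this sketch).
    """
    pushes = Counter()
    active_repos = defaultdict(set)
    for line in json_lines:
        event = json.loads(line)
        if event.get("type") != "PushEvent":
            continue  # only pushes count toward this metric
        name = (event.get("repo") or {}).get("name")
        language = repo_language.get(name)
        if language is None:
            continue  # skip repos with no known language
        pushes[language] += 1
        active_repos[language].add(name)
    return {lang: pushes[lang] / len(active_repos[lang]) for lang in pushes}
```

Only repos with at least one push count as "active" here, which is one reason a language full of brand-new projects can end up with a high ratio.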
Apparently someone worked on it, but (IMO) the visualization is a lot less nice compared to the original: https://madnight.github.io/githut/#/pull_requests/2024/1
Why are Nim, Odin, Zig, Mojo not included (and probably many others)?
Probably because this was made in 2014 :D