I've gotten to a point where my workflow YAML files are mostly `mise` tool calls (because it handles versioning of all tooling and has cache support) and webhooks, and still it is a pain. Also their concurrency and matrix strategies are just not working well, and sometimes you end up having to use a REST API endpoint to force cancel a job because their normal cancel functionality simply does not take.
There was a time I wanted our GH actions to be more capable, but now I just want them to do as little as possible. I've got a Cloudflare worker receiving the GitHub webhooks firehose, storing metadata about each push and each run so I don't have to pass variables between workflows (which somehow is a horrible experience), and any long-running task that should run in parallel (like evaluations) happens on a Hetzner machine instead.
I'm very open to hearing about nice alternatives that integrate well with GitHub but are more fun to configure.
If you wanted a better version of GitHub Actions/CI (the orchestrator, the job definition interface, or the programming of scripts to execute inside those jobs), it would presumably need to be more opinionated and have more constraints?
Who here has been thinking about this problem? Have you come up with any interesting ideas? What's the state of the art in this space?
GHA was designed in ~2018. What would it look like if you designed it today, with all we know now?
In general, I've never really experienced the issues mentioned, but I also use Gitea with Actions rather than GitHub. I also avoid using any complex logic within an Action.
For the script getting run, there's one other thing. I build my containers locally, test the scripts thoroughly, and those scripts and container are what are then used in the build and deploy via Action. As the entire environment is the same, I haven't encountered many issues at all.
Standard msft absurdity. 8 years later there is still no local gh action runner to test your script before you commit, push, and churn through logs, and without some 3rd party hack, no way to ssh in and debug. It doesn't matter how simple the build command you write is, because the workflow itself is totally foreign technology to most, and no one wants to be a gh action dev.
Like most of the glaring nonsense that costs people time when using msft, this is financially beneficial to msft in that each failed run counts against paid minutes. It's a racket from disgusting sleaze scum who literally hold meetings dedicated to increasing user pain because otherwise the bottom line will slip fractionally and no one in redmond has a single clue how to make money without ripping off the userbase.
Would a tool like act help here? (https://github.com/nektos/act) I suppose orchestration that is hiding things from different processor architectures could also very well run differently online than offline, but still.
That's correct and it's linux-only (as of the last time I looked), you can run it on macOS but you can't run macOS runners (which is where I need the most help debugging normally, for building iOS apps).
The best CI platforms let you "Rebuild with SSH" or something similar: instead of the cycle of "change > commit > push > wait > see results" (when you're testing CI-specific stuff, not iterating on Makefiles or whatever, assuming most of it is scripts you can run both locally and in CI), you get a URL to connect to while the job is running, so you can manually confirm it works and then just copy-paste whatever you did back into your local sources.
I use that a lot with SourceHut: after a build fails, you have 10 minutes to SSH into the machine and debug from there. Also they have a very nice "edit manifest and run" feature that makes it easy to quickly test a change while debugging.
Are there other platforms allowing that? Genuinely interested.
> after a build fails, you have 10 minutes to SSH into the machine and debug from there.
Ah, that's like 90% of the way there; they'd just need to create the SSH endpoint at the beginning rather than the end, so you could for example watch memory usage and stuff while the "real" job is running on the same instance.
But great to hear they let you have access to the runner at all, that fact alone makes it a lot better than most CI services out there, creds to SourceHut.
No. Act is for running actions locally. What was mentioned is a way to insert an SSH step at some well-chosen point of a workflow so you can login at the runner and understand what is wrong and why it's not working. I have written one such thing, it relies on cloudflare free tunnels. https://github.com/efrecon/sshd-cloudflared. There are other solutions around to achieve more or less the same goal.
I designed a fairly complex test matrix with a lot of logic offloaded to the control mechanisms Gitlab offers. You create job templates or base jobs that control the overall logic and extend them for each particular use case. I had varying degrees of success, and it's not a job for a dev's side quest; I think you need someone dedicated to explore, build, and debug these pipelines, but for a CI tool it's very good.
Because you can extend and override jobs, you can create seams so that each piece of the pipeline is isolated and testable. This way there is very little that can go wrong in production that's the CI fault. And I think that's only possible because of the way that Gitlab models their jobs and stages.
For the last decade I've been doing my CI/CD as simple .NET console apps that run wherever. I don't see why we switch to these wildly different technologies when the tools we are already using can do the job.
Being able to run your entire "pipeline" locally with breakpoints is much more productive than whatever the hell goes on in GH Actions these days.
I can do that with github actions too? For tests, I can either run them locally (with a debugger if I want), or in github actions. Smaller checks go in a pre-commit config that github action also runs.
Setting up my github actions (or gitlab) checks in a way that can easily run locally can be a bit of extra work, but it's not difficult.
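For reference, the pre-commit half of that is tiny in a workflow; a rough sketch (action versions and the Python setup are just examples):

```yaml
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - run: pip install pre-commit
      # exactly the same command that runs locally on every commit
      - run: pre-commit run --all-files
```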
Of all the valid complaints about Github Actions or CI in general, this seems to be an odd one. No details about what was tried or not tried, but hard to see a `- run: go install cuelang.org/go/cmd/cue@latest` step not working?
> For the love of all that is holy, don’t let GitHub Actions
> manage your logic. Keep your scripts under your own damn
> control and just make the Actions call them!
I mean, your problem was not `build.rs` here, and Makefiles did not solve it; wasn't your logic already in `build.rs`, which was called by Cargo via GitHub Actions?
The problem was the environment setup? You couldn't get CUE on Linux ARM and I am assuming when you moved to Makefiles you removed the need for CUE or something? So really the solution was something like Nix or Mise to install the tooling, so you have the same tooling/version locally & on CI?
"GitHub actions bad" is a valid take - you should reduce your use to a minimum.
"My build failed because of GitHub actions couldn't install a dependency of my build" is a skill issue. Don't use GitHub actions to install a program your build depends on.
Before that, most people would avoid Jenkins and probably never try Buildbot (because devs typically don't want to spend any time learning tools). Devs would require "devops" to do the CI stuff. Again, mostly because they couldn't be arsed to make it themselves, but also because it required setting up a machine (do you self-host, do you use a VPS?).
Then came tools like Travis or CircleCI, which made it more accessible. "Just write some kind of script and we run it on our machines". Many devs started using that.
And then came GitHub Actions, which were a lot better than Travis and CircleCI: faster, more machines, and free (for open source projects at least). I was happy to move everything there.
But as soon as something becomes more accessible, you get people who had never done it before. They can't say "it enables me to do it, so it's better than me relying on a devops team before" or "well it's better than my experience with Travis". They will just complain because it's not perfect.
And for the OP's defense, I do agree that not being able to SSH into a machine after the build fails is very frustrating.
I think it's possible to both think GitHub Actions is an incredible piece of technology (and an incredible de facto public resource), while also thinking it has significant architectural and experiential flaws. The latter can be fixed; the former is difficult for competitors to replicate.
(In general, I think a lot of criticisms of GitHub Actions don't consider the fully loaded cost of an alternative -- there are lots of great alternative CI/CD services out there, but very few of them will give you the OS/architecture matrix and resource caps that GitHub Actions gives every single OSS project for free.)
As soon as I need more than two tries to get some workflow working, I set up a tmate session and debug things using a proper remote shell. It doesn't solve all the pain points, but it makes things a lot better.
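For anyone who hasn't seen the pattern, it's usually a single extra step appended to the failing job, using the community tmate action (the version pin and timeout are examples), and it only fires when an earlier step failed:

```yaml
      # appended to an existing job's steps
      - name: Debug over SSH on failure
        if: ${{ failure() }}
        uses: mxschmitt/action-tmate@v3
        timeout-minutes: 30   # don't keep the runner alive forever
```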
> For the love of all that is holy, don’t let GitHub Actions
> manage your logic. Keep your scripts under your own damn
> control and just make the Actions call them!
The pain is real. I think everyone that's ever used GitHub actions has come to this conclusion. An ideal action has 2 steps: (1) check out the code, (2) invoke a sane script that you can test locally.
Honestly, I wonder if a better workflow definition would just have a single input: a single command to run. Remove the temptation to actually put logic in the actions workflow.
> I think everyone that's ever used GitHub actions has come to this conclusion
This is not even specific to GitHub Actions. The logic goes into the scripts, and the CI handles CI specific stuff (checkout, setup tooling, artifacts, cache...). No matter which CI you use, you're in for a bad time if you don't do this.
I assume you're using the currently recommended docker-in-docker method. The legacy Gitlab way is horrible and it makes it basically impossible to run pipelines locally.
GitHub introduced all their fancy GHA apps or callables or whatever they're called for specific tasks, and the community went wild. Then people built test and build workflows entirely in GHA instead of independent of it. And added tons of complexity to the point they have a whole build and test application written in GHA YAML.
This is basically how most other CI systems work. GitLab CI, Jenkins, Buildbot, Cirrus CI, etc. are all other systems I've used and they work this way.
I find GitHub Actions abhorrent in a way that I never found a CI/CD system before...
It seems more of a cultural issue that -- I'm pretty sure -- predates Microsoft's acquisition of GitHub. I assume crappy proprietary yaml can be blamed on use of Ruby. And there seems to be an odd and pervasive "80% is good enough" feel to pretty much everything in GitHub, which is definitely cultural, and I'm pretty sure, also predates Microsoft's acquisition.
GHA is based on Azure Pipelines. This is evident in how bad its security stance is, since Azure Pipelines was designed to be used in a more closed/controlled environment.
> I think everyone that's ever used GitHub actions has come to this conclusion.
I agree that it should be a reasonable conclusion, but unfortunately I can tell you that not all developers (including seniors) naturally arrive at it, no.
GHA’s componentized architecture is appealing, but it’s astonishing to me that there’s still seemingly no way to actually debug workflows, run them locally, or rapidly iterate on them in any way. Alas.
I think this is a specific example of a generalized mistake, one that various bits of our infrastructure and architecture all but beg us to make, over and over, and which must be resisted. The principle to defend: your development feedback loop must be as tight as possible.
Granted, if you are working on "Windows 12", you won't be building, installing, testing, and deploying that locally. I understand and acknowledge that "as tight as possible" will still sometimes push you into remote services or heavyweight processes that can't be pushed towards you locally. This is an ideal to strive for, but not one that can always be accomplished.
However, I see people surrender the ability to work locally much sooner than they should, and implement massively heavyweight processes without any thought for whether you could have gotten 90% of the result of that process with a bit more thought and kept it local and fast.
And even once you pass the event horizon where the system as a whole can't be feasibly built/tested/whatever on anything but a CI system, I see them surrendering the ability to at least run the part of the thing you're working on locally.
I know it's a bit more work, building sufficient mocks and stubs for expensive remote services that you can feasibly run things locally, but the payoff for putting a bit of work into having it run locally for testing and development purposes is just huge, really huge, the sort of huge you should not be ignoring.
"Locally" here does not mean "on your local machine" per se, though that is a pretty good case, but more like, in an environment that you have sole access to, where you're not constantly fighting with latency, and where you have full control. Where if you're debugging even a complex orchestration between internal microservices, you have enough power to crank them all up to "don't ever timeout" and attach debuggers to all of them simultaneously, if you want to. Where you can afford to log every message in the system, interrupt any process, run any test, and change any component in the system in any manner necessary for debugging or development without having to coordinate with anyone. The more only the CI system can do by basically mailing it a PR, and the harder it is to convince it to do just the thing you need right now rather than the other 45 minutes of testing it's going to run before running the 10 second test you actually need, the worse your development speed is going to be.
Fortunately, and I don't even know exactly what the ratio of sarcasm to seriousness is here (but I'm definitely non-zero serious), this is probably going to fix itself in the next decade or so... because while paying humans to sit there and wait for CI and get sidetracked and distracted is just Humans Doing Work and after all what else are we paying them for, all of this stuff is going to be murder on AI-centric workflows, which need tight testing cycles to work at their best. Can't afford to have AI waiting for 30 minutes to find out that its PR is syntactically invalid, and can't afford for the invalid syntax to come back with bad error messages that leave it baffled as to what the actual problem is. If we won't do it for the humans, we'll do it for the AIs. This is definitely not something AI fixes by itself: even though agents are way more patient than us and much less prone to distraction in the meantime, since from their "lived experience" they don't experience the time taken for things to build and test, the waiting makes it much worse and more obvious that this is a real problem and not just humans being whiny and refusing to tough it out.
To some extent I do agree that it sounds like "my build was failing on a platform I had never tested, and I was pissed because I had to debug it".
But it is true that GitHub Actions don't make it easy to debug (e.g. you can't just SSH into the machine after the build fails). Not sure if it justifies hating with Passion, though.
So the article is about the frustrating experience of fixing GitHub Actions when something goes wrong, especially when a workflow only fails on one platform, potentially due to how GitHub runner is set up (inconsistently across platforms).
Took me a while to figure that out. While I appreciate the occasional banter in blog articles, this one diverges into rant a bit too much and could have made its point much clearer, with, for example, meaningful section headers.
Until I read this blog I was under the impression that everyone wrote Python/other scripts and used GitHub Actions to just call the scripts!
This way we can test it on local machine before deployment.
Also, as other commenters have said, bash is not a good option. Use Python or some other language and write reusable scripts, if not for this then on the off chance that it'll be migrated to some other CI/CD platform.
I wouldn't say that, but I would say there's no "should" here; it's often much more hassle than people expect and everyone has to decide for themselves whether the number of users is worth it.
Prefacing this with the fact that act is great: it still has many shortcomings, however. Too often I've run into roadblocks, and when looking up the issue for each, it seems they are hard to address. Simpler workflows work fine with it, but more complex workflows will be much harder.
Don't put your logic in proprietary tooling. I have started writing all logic into mise tasks since I already manage the tool dependencies with mise. I tend to write them in a way where it can easily take advantage of GHA features such as concurrency, matrixes, etc. But beyond that, it is all running within mise tasks.
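A sketch of what that ends up looking like in a workflow (the mise action and the task name are examples; the assumption is a `test` task defined in the repo's mise config):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # installs mise plus the tool versions pinned in the repo's mise config
      - uses: jdx/mise-action@v2
      # the same task you run locally with `mise run test`
      - run: mise run test
```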
act is often mentioned as a drop-in replacement but I never got it to replicate GitHub actions environment. I didn't try it for this particular case, though.
The issue is, OP is trying to use a matrix strategy when, with cross-compiling, they could avoid it. I have done it for https://github.com/anttiharju/compare-changes (which has nontrivial CI pipelines, but they could be a lot simpler for OP's needs)
The main issue is Rust. Writing catchy headlines about hating something may feel good, but a lot of people could avoid these pains if:
- zig cc gets support for the new linker flag that Rust requires https://codeberg.org/ziglang/zig/pulls/30628
- rust-lang/libc gets to 1.0, which removes the iconv issues on macOS https://github.com/rust-lang/libc/issues/3248
I think this post accurately isolates the single main issue with GitHub Actions, i.e. the lack of a tight feedback loop. Pushing and waiting for completion on what's often a very simple failure mode is frustrating.
Others have pointed out that there are architectural steps you can take to minimize this pain, like keeping all CI operations isolated within scripts that can be run locally (and treating GitHub Actions features purely as progressive enhancements, e.g. only using `GITHUB_STEP_SUMMARY` if actually present).
Another thing that works pretty well to address the feedback loop pain is `workflow_dispatch` + `gh workflow run`: you still need to go through a push cycle, but `gh workflow run` lets you stay in development flow until you actually need to go look at the logs.
(One frustrating limitation with that is that `gh workflow run` doesn't actually spit out the URL of the workflow run it triggers. GitHub claims this is because it's an async dispatch, but I don't see how there can possibly be no context for GitHub to provide here, given that they clearly obtain it later in the web UI.)
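A rough sketch of that setup (the workflow file name and script path are placeholders; the only GitHub-specific pieces are the trigger and the optional summary):

```yaml
# .github/workflows/ci.yml
on:
  workflow_dispatch:        # allows triggering from the CLI
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # all real logic lives in a script you can also run on your laptop
      - run: ./ci/run.sh
      # progressive enhancement: only write a summary when the feature exists
      - if: always()
        run: |
          if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
            echo "CI finished for ${GITHUB_SHA:-unknown}" >> "$GITHUB_STEP_SUMMARY"
          fi
```

After pushing, `gh workflow run ci.yml --ref my-branch` kicks the run off without leaving the terminal, and `gh run list --workflow=ci.yml --limit 1` (or `gh run watch`) is the usual workaround for the missing URL.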
> i.e. the lack of a tight feedback loop.
Lefthook helps a lot https://anttiharju.dev/a/1#pre-commit-hooks-are-useful
The thing is that people are not willing to invest in it due to bad experiences with various git hooks, but there are ways to make it excellent.
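For anyone curious, a minimal sketch of a Lefthook config (the command names and make targets here are made up; the point is that the hooks run the same commands CI does):

```yaml
# lefthook.yml
pre-commit:
  parallel: true
  commands:
    lint:
      run: make lint
    test:
      run: make test
```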
I've standardized on getting github actions to create/pull a docker image and run build/test inside that. So if something goes wrong I have a decent live debug environment that's very similar to what github actions is running. For what it's worth.
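Roughly what that pattern looks like as a workflow (the image name and Dockerfile path are arbitrary examples):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # build (or pull) the exact image you would use locally
      - run: docker build -t ci-image -f Dockerfile.ci .
      # everything interesting happens inside the container, so the same two
      # commands reproduce the CI environment on a laptop
      - run: docker run --rm -v "$PWD:/src" -w /src ci-image make test
```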
I do the same with Nix as it works for macOS builds as well
It has the massive benefit of solving the lock-in problem. Your workflow is generally very short so it is easy to move to an alternative CI if (for example) Github were to jack up their prices for self hosted runners...
That said, when using it in this way I personally love Github actions
I was doing something similar when moving from Earthly, but I have since moved to Nix to manage the environment. It is a much better developer experience, and faster! I would check out an environment manager like Nix/Mise etc. so you can have the same tools locally and on CI.
Yeah, images seem to work very well as an abstraction layer for most CI/CD users. It's kind of unfortunate that they don't (can't) fully generalize across Windows and macOS runners as well, though, since in practice that's where a lot of people start to get snagged by needing to do things in GitHub Actions versus using GitHub Actions as an execution layer.
We need SSH access to the failed instances so we can poke around and iterate from any step in the workflow.
Production runs should be immutable, but we should be able to get in to diagnose, edit, and retry. It'd lead to faster diagnosis and resolution.
The logs and everything should be there for us.
And speaking of the logs situation, the GHA logs are really buggy sometimes. They don't load about half of the time I need them to.
I wrote something recently with webrtc to get terminal on failure: https://blog.gripdev.xyz/2026/01/10/actions-terminal-on-fail...
Are there solutions to this like https://github.com/marketplace/actions/ssh-to-github-action-... ?
1. Don't use bash, use a scripting language that is more CI friendly. I strongly prefer pwsh.
2. Don't have logic in your workflows. Workflows should be dumb and simple (KISS) and they should just call your scripts (see the sketch after this list).
3. Having standalone scripts will allow you to develop/modify and test locally without having to get caught in a loop of hell.
4. Design your entire CI pipeline for easier debugging: put that print statement in, echo out the version of whatever. You don't need it _now_, but your future self will thank you when you do need it.
5. Consider using third party runners that have better debugging capabilities
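A sketch of points 2-3 in practice (the script path is a placeholder; `shell: pwsh` works on the hosted runners):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # the workflow knows nothing; the script does everything and also runs locally
      - run: ./scripts/test.ps1
        shell: pwsh
```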
I would disagree with 1. If you need anything more than shell, that starts to become a smell to me. The build/testing process etc. should be simple enough to not need anything more.
That's literally point #2, but I had the same reaction as you when I first read point #1 :)
I agree with #2; I meant more that if you are calling out to something that is not a task runner (Make, Taskfile, Just, etc.) or a shell script, that's a bit of a smell to me. E.g. I have seen people call out to Python scripts and it concerns me.
Huh? Who cares if the script is .sh, .bash, Makefile, Justfile, .py, .js or even .php? If it works it works, as long as you can run it locally, it'll be good enough, and sometimes it's an even better idea to keep it in the same language the rest of the project is. It all depends and what language a script is made in shouldn't be considered a "smell".
> Huh? Who cares if the script is .sh, .bash, Makefile, Justfile, .py, .js or even .php?
Me. Typically I have found it to be a sign of over-engineering, with no benefits over just using a shell script/task runner, as all it should be is plumbing simple enough that a task runner can handle it.
> If it works it works, as long as you can run it locally, it'll be good enough,
Maybe when it is your own personal project, "if it works it works" is fine. But when you come to a corporate environment, there start to be issues of readability, maintainability, proprietary tooling, additional dependencies, etc. that I have found crop up when people over-engineer and use programming languages (like Python).
E.g.
> never_inline wrote:
> Build a CLI in python or whatever which does the same thing as CI, every CI stage should just call its subcommands.
However,
> and sometimes it's an even better idea to keep it in the same language the rest of the project is
I'll agree. Depending on the project's language etc., other options might make sense. But personally, so far every time I have come across something not using a task runner it has just been the wrong decision.
> But personally so far everytime I have come across something not using a task runner it has just been the wrong decision.
Yeah, tends to happen a lot when you hold strong opinions with strong conviction :) Not that it's wrong or anything, but it's highly subjective in the end.
Typically I see larger issues being created from "under-engineering" and just rushing with the first idea people can think of when they implement things, rather than "over-engineering" causing similarly sized future issues. But then I also know everyone's history is vastly different; my views are surely shaped more by the specific issues I've witnessed (and sometimes contributed to :| ) than anything else.
> Yeah, tends to happen a lot when you hold strong opinions with strong conviction :) Not that it's wrong or anything, but it's highly subjective in the end.
Strong opinions, loosely held :)
> Typically I see larger issues being created from "under-engineering" and just rushing with the first idea people can think of when they implement things, rather than "over-engineering"
Funnily enough, running with the first idea is, I think, what creates a lot of the "over-engineering" I am seeing: not stopping to consider other, simpler solutions, or even whether the problem needs/is worth solving in the first place.
> Yeah, tends to happen a lot when you hold strong opinions with strong conviction :) Not that it's wrong or anything, but it's highly subjective in the end.
I quickly asked Claude to convert one of my open source repos from Make/Nix/Shell -> Python/Nix to see how it would look. It is actually one of the better Python-as-a-task-runner setups I have seen.
* https://github.com/DeveloperC286/clean_git_history/pull/431
While the Python version is not as bad as I have seen previously, I am still struggling to see why you'd want it over Make/Shell.
It introduces more dependencies (Python, which I solved via Nix, but others haven't solved that problem), and the Python script has its own dependencies (such as Click for the CLI).
It is less maintainable as it is more code, roughly 3x the size of the Makefile.
To me the Python code is more verbose and not as simple as the Makefile's targets, so it is less readable as well.
Using shell becomes deeply miserable as soon as you encounter its kryptonite, the space character. Especially but not limited to filenames.
I mean, at some point you are bash calling some other language anyway.
I'm a huge fan of "train as you fight", whatever build tools you have locally should be what's used in CI.
If your CI can do things that you can't do locally: that is a problem.
> If your CI can do things that you can't do locally: that is a problem.
Probably most of the time when this is an actual problem, it's building across many platforms. I'm running Linux x86_64 locally, but some of my deliverables are for macOS and Windows and ARM, and while I could cross-compile for all of them on Linux (macOS was a bitch to get working though), it always felt better to compile on the hardware I'm targeting.
Sometimes there are Windows/macOS-specific failures, and if I couldn't just ssh in and correct/investigate, and instead had to "change > commit > push" in an endless loop, it's possible I quite literally would lose my mind.
I literally had to do this push > commit > test loop yesterday because apparently building universal Python wheels on macOS is a pain in the ass. And I don't have a Mac, so if I want to somewhat reliably reproduce how the runner might behave, I have to either test it on GH Actions or rent one from something like Scaleway, mainly because I don't currently know how else to do it. It's so, so frustrating, and if anyone has ideas on making my life a bit better, that would be nice lol.
> If your CI can do things that you can't do locally: that is a problem.
IME this is where all the issues lie. Our CI pipeline can push to a remote container registry, but we can't do this locally. CI uses wildly different caching strategies from local builds, so the two diverge. Breaking up builds into different steps means that you need to "stash" the output of stages somewhere. If all your CI does is `make test && make deploy` then sure, but when you grow beyond that (my current project takes 45 minutes with a _warm_ cache) you need to diverge, and that's where the problems start.
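For what it's worth, the usual way to "stash" stage output between jobs on GHA is the artifact actions; a minimal sketch (job names, paths, and make targets are placeholders):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build
      - uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist/
      - run: make test
```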
> If your CI can do things that you can't do locally: that is a problem.
Completely agree.
> I'm a huge fan of "train as you fight", whatever build tools you have locally should be what's used in CI.
That is what I am doing, having my GitHub Actions just call the Make targets I am using locally.
> I mean, at some point you are bash calling some other language anyway.
Yes, shell scripts and/or task runners (Make, Just, Task, etc.) are really just plumbing around calling other tools. Which is why it feels like a smell to me when you need something more.
I don't agree with (1), but agree with (2). I recommend just putting a Makefile in the repo and giving it CI targets, which you can then easily call from CI via a simple `make ci-test` or similar. And don't make the Makefiles overcomplicated.
Of course, if you use something else as a task runner, that works as well.
For certain things, makefiles are great options. For others, though, they are a nightmare. From a security perspective, especially if you are trying to reach SLSA level 2+, you want all the build execution to be isolated and executed in a trusted, attestable and disposable environment, following predefined steps. Having makefiles (or scripts) with logical steps within them makes it much, much harder to have properly attested outputs.
Using makefiles mixes execution contexts between the CI pipeline and the code within the repository (which ends up containing the logic for the build), instead of using centrally stored external workflows that contain all the business logic for the build steps (e.g., compiler options, docker build steps etc.).
For example, how can you attest in the CI that your code is tested if the workflow only contains "make test"? You need to double check at runtime what the makefile did, but the makefile might have been modified by that time, so you need to build a chain of trust etc. Instead, in a standardized workflow, you just need to establish the ground truth (e.g., tools are installed and are at this path), and the execution cannot be modified by in-repo resources.
Makefile or scripts/do_thing, either way this is correct. CI workflows should only do one thing per step. That one thing should be a command. What that command does is up to you in the Makefile or scripts. This keeps workflows/actions readable and mostly reusable.
>I don't agree with (1)
Neither do most people, probably, but it's kinda neat how their suggested fix for GitHub Actions' ploy to maintain vendor lock-in is to swap it for a language invented by that very same vendor.
makefile commands are the way
I was once hired to manage a build farm. All of the build jobs were huge pipelines of Jenkins plugins that did various things in various orders. It was a freaking nightmare. Never again. Since then, every CI setup I’ve touched is a wrapper around “make build” or similar, with all the smarts living in Git next to the code it was building. I’ll die on this hill.
Build a CLI in python or whatever which does the same thing as CI, every CI stage should just call its subcommands.
Just use a task runner(Make, Just, Taskfile) this is what they were designed for.
In many enterprise environments, deployment logic would be quite large for bash.
Personally, I have never found the Python as a task runners to be less code, more readable or maintainable.
I typically use make for this and feel like I’m constantly clawing back scripts written in workflows that are hard to debug if they’re even runnable locally.
This isn’t only a problem with GitHub Actions though. I’ve run into it with every CI runner I’ve come across.
How do you handle persistent state in your actions?
For my actions, the part that takes the longest to run is installing all the dependencies from scratch. I'd like to speed that up but I could never figure it out. All the options I could find for caching deps sounded so complicated.
> How do you handle persistent state in your actions?
You shouldn't. Besides caching that is.
> All the options I could find for caching deps sounded so complicated.
In reality, it's fairly simple, as long as you leverage content-hashing. First, take your lock file, compute the sha256sum. Then check if the cache has an artifact with that hash as the ID. If it's found, download and extract, those are your dependencies. If not, you run the installation of the dependencies, then archive the results, with the ID set to the hash.
There really isn't more to it. I'm sure there are helpers/sub-actions/whatever Microsoft calls them, for doing all of this in 1-3 lines or something.
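As a concrete sketch of that flow with the stock cache action (npm and package-lock.json here are just an example ecosystem; adapt the path and lock file to yours):

```yaml
      - id: deps
        uses: actions/cache@v4
        with:
          path: node_modules
          key: deps-${{ hashFiles('package-lock.json') }}
      # install only on a cache miss; the cache action archives node_modules
      # under that key at the end of the job
      - if: steps.deps.outputs.cache-hit != 'true'
        run: npm install
```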
The tricky bit for me was figuring out which cache to use, and how to use and test it locally. Do you use the proprietary github actions stuff? If the installation process inside the actions runner is different from what we use in the developer machines, now we have two sets of scripts and it's harder to test and debug...
> Do you use the proprietary github actions stuff?
If I can avoid it, no. Almost everything I can control is outside of the Microsoft ecosystem. But as a freelancer, I have to deal a bunch with GitHub and Microsoft anyways, so in many of those cases, yes.
Many times, I end up using https://github.com/actions/cache for the clients who already use Actions, and none of that runs in the local machines at all.
Typically I use a single Makefile/Justfile that sometimes has most of the logic inside it for running tests and whatnot, and sometimes shells out to "proper" scripts.
But that's disconnected from the required "setup", so Make/Just doesn't actually download dependencies, that's outside of the responsibilities of whatever runs the test.
And also, with a lot of languages, it doesn't matter if you run an extra "npm install" over an already existing node_modules/; it'll figure out what's missing/already there. So you could in theory still have "make test" do absolutely everything locally, including installing dependencies (if you so wish), and still do the whole "hash > find cache > extract > continue" thing before running "make test" in CI, and it'll skip the dependencies part if they're there already.
Depends on the build toolchain but usually you'd hash the dependency file and that hash is your cache key for a folder in which you keep your dependencies. You can also make a Docker image containing all your dependencies but usually downloading and spinning that up will take as long as installing the dependencies.
For caching you use GitHubs own cache action.
You don't.
For things like installing deps, you can use GitHub Actions' caching, or several third-party runners have their own caching capabilities that are more mature than what GHA offers.
If you are able to use the large runners, custom images are a recent addition to what Github offers.
https://docs.github.com/en/actions/how-tos/manage-runners/la...
> Don't use bash
What? Bash is the best scripting language available for CI flows.
1. Just no. Unless you are some sort of Windows shop.
If you're building for Windows, then bash is "just no", so it's either cmd/.bat or pwsh/.ps1. <shrugs>
That’s the only reason for sure.
I mean, if you're a Windows shop you really should be using powershell.
Step 0. Stop using CI services that purposefully waste your time, and use CI services that have "Rebuild with SSH" or similar. From previous discussions (https://news.ycombinator.com/item?id=46592643), seems like Semaphore CI still offers that.
It's not GitHub Actions' fault but the horrors people create in it, all under the pretense that automation is simply about wrapping a GitHub Action around something. Learn to create a script in Python or similar and put all the logic there so you can execute it locally and port it to the next CI system when a new CTO arrives.
I think in this case they hate the fact that they cannot easily SSH into the failing VM and debug from there. Like "I have to edit my workflow, push it, wait for it to run and fail, and repeat".
This is giving "Debian systemd units call their old init.d scripts" energy but I kind of like it
systemd units that are small, simple, and call into a single script are usually fantastic. There's no reason for these scripts to be part of another init system; but making as much of your code completely agnostic to the env it runs in sounds good regardless. I think that's the feeling you're feeling.
No, it is github's fault. They encourage the horrors because they lead to vendor lock in. This is the source of most of Microsoft's real profit margins.
This is probably why they invented a whole programming language and then neglected to build any debugging tools for it.
I like Github Actions and it is better than what I used before (Travis) and I think it solves an important problem. For OSS projects it's a super valuable free resource.
For me what worked wonders was adopting Nix. Make sure you have a reproducible dev environment and wrap your commands in `nix-shell --run`, or even better `nix develop --command`, or better yet make most of your CI tasks derivations that run with `nix build` or `nix flake check`.
Not only does this make it super easy to work with Github Actions, also with your colleagues or other contributors.
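For illustration, a workflow in that style can stay tiny, assuming a flake-based project and the cachix/install-nix-action community action (the version pin here is just an example):

    jobs:
      check:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # puts Nix on the runner; pin however you prefer
          - uses: cachix/install-nix-action@v27
          # exactly the same command you'd run locally
          - run: nix develop --command make test
          # or, if your checks are derivations:
          - run: nix flake check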
The way I deal with all these terrible CI platforms (there is no good one, merely lesser evils) is to do my entire CI process in a container and the CI tool just pulls and runs that. You can trivially run this locally when needed.
Of course, the platforms would rather have you not do that since it nullifies their vendor lock-in.
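In GitHub Actions terms the whole workflow can then be about one step; something like this sketch (the image name and script path are made up):

    jobs:
      ci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # all real logic lives in the image and in ./ci/run.sh, both runnable locally
          - run: docker run --rm -v "$PWD:/src" -w /src ghcr.io/example/ci-image:latest ./ci/run.sh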
I really like the SourceHut CI, because:
1. When the build fails, you can SSH into the machine and debug it from there.
2. You can super easily edit & run the manifest without having to push to a branch at all. That makes it super easy to even try a minimum reproducible example on the remote machine.
Other than that, self-hosting (with GitHub or preferably Forgejo) makes it easy to debug on the machine, but then you have to self-host.
Self-hosted runners with GitHub are a whole world of pain because the runner literally just runs commands on the host and does not handle provisioning/cleanup, meaning you need to make sure your `docker run` commands don't leave any leftover state that can mess up future/concurrent builds. It doesn't even handle concurrency by itself, so you have to run multiple instances of the runner.
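One way to limit that blast radius, as a sketch rather than a full answer (it assumes the host is dedicated to these builds):

    # name containers per run and clean up even when a previous run crashed
    - run: |
        docker run --rm --name "build-${GITHUB_RUN_ID}" \
          -v "$PWD:/src" -w /src my-build-image:latest make test
    - if: always()
      run: docker container prune -f   # aggressive; assumes nothing else shares this host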
That was included in my "but then you have to self-host" :-).
As I said, I really like the SourceHut CI.
That's what I always did for our GitLab CI pipeline: just deploy dedicated images for different purposes. We had general Terraform images for Terraform code, which made it easy to standardize versions etc. Then we made specific ones for projects with a lot of dependencies so we could run the deployment pipeline in seconds instead of minutes. But now you need to maintain the Docker images too. All about trade-offs.
The one issue with that is there isn’t a good way to containerise macOS builds.
I mean this has been an issue building for iOS forever. The MacOS lock-in sucks really really badly.
It's kind of their whole thing, when you think about it. They didn't get to where they are by playing nice with others. If you're supporting anything in the Apple ecosystem, the fix is in.
Your newsletter. I needs it.
I know GitHub Actions won the war, but I think Bitbucket Pipelines are much nicer to work with. They just seem simpler and less fragile.
But almost every company uses GitHub, and changing to Bitbucket isn't usually viable.
The main downside of bitbucket pipelines is bitbucket. And the only significant feature I recall over GitHub Actions is that Pipelines support cron jobs.
Reading the rest of the thread, it looks like Actions also has cron jobs.
fyi i maintain a repo that accidentally tracks github actions cron reliability (https://www.swyx.io/github-scraping) - it just runs a small script every hour.
i just checked, and in 2025 there were at least 2 outages a month, every month: https://x.com/swyx/status/2011463717683118449?s=20 . not quite 3 nines.
Of course avoid Actions if you expect tasks to complete promptly or ever. Have we forgotten safe_sleep.sh? I don't think it was unique.
I've gotten to a point where my workflow YAML files are mostly `mise` tool calls (because it handles versioning of all tooling and has cache support) and webhooks, and still it is a pain. Also, their concurrency and matrix strategies just don't work well, and sometimes you end up having to use a REST API endpoint to force-cancel a job because the normal cancel functionality simply doesn't take.
There was a time I wanted our GH Actions to be more capable, but now I just want them to do as little as possible. I've got a Cloudflare Worker receiving the GitHub webhooks firehose and storing metadata about each push and each run so I don't have to pass variables between workflows (which is somehow a horrible experience), and any long-running task that should run in parallel (like evaluations) happens on a Hetzner machine instead.
I'm very open to hearing about nice alternatives that integrate well with GitHub but are more fun to configure.
If you wanted a better version of GitHub Actions/CI (the orchestrator, the job definition interface, or the programming of scripts to execute inside those jobs), it would presumably need to be more opinionated and have more constraints?
Who here has been thinking about this problem? Have you come up with any interesting ideas? What's the state of the art in this space?
GHA was designed in ~2018. What would it look like if you designed it today, with all we know now?
In general, I've never really experienced the issues mentioned, but I also use Gitea with Actions rather than GitHub. I also avoid using any complex logic within an Action.
For the script getting run, there's one other thing. I build my containers locally, test the scripts thoroughly, and those scripts and container are what are then used in the build and deploy via Action. As the entire environment is the same, I haven't encountered many issues at all.
That sounds sane and also completely different from GH Actions.
Guys,
GitHub Actions is a totally broken piece of s!! I know all about those broken loops because I've had to deal with them an incredible number of times.
I very often mention OneDev in my comments, and you know what? Robin solved this issue 3 years ago: https://docs.onedev.io/tutorials/cicd/diagnose-with-web-term...
You can pause your action, connect through a web terminal, and debug/fix things live until it works. Then, you just patch your action easily.
And that’s just one of the many features that make OneDev superior to pretty much every other CI/CD product out there.
Standard msft absurdity. 8 years later there is still no local gh action runner to test your script before you commit, push, and churn through logs, and without some 3rd party hack, no way to ssh in and debug. It doesn't matter how simple the build command you write is, because the workflow itself is totally foreign technology to most, and no one wants to be a gh action dev.
Like most of the glaring nonsense that costs people time when using msft, this is financially beneficial to msft in that each failed run counts against paid minutes. It's a racket from disgusting sleaze scum who literally hold meetings dedicated to increasing user pain because otherwise the bottom line will slip fractionally and no one in redmond has a single clue how to make money without ripping off the userbase.
Would a tool like act help here? (https://github.com/nektos/act) I suppose orchestration that is hiding things from different processor architectures could also very well run differently online than offline, but still.
I haven’t looked into act for some time but I remember it NOT being a direct stand in locally. Like it covered 80% of use cases.
Maybe that has changed.
That's correct, and it's Linux-only (as of the last time I looked): you can run it on macOS, but you can't run macOS runners (which is where I normally need the most help debugging, for building iOS apps).
It still isn't a 100% drop-in replacement.
I think really what would help is a way to SSH into the machine after it fails. SourceHut allows that, and I find it great.
I actually built the last thing last weekend weirdly enough.
gg watch action
Finds the most recent or currently running action for the branch you have checked out. Among other things.
https://github.com/frankwiles/gg
Oh this is excellent. This is everything I wanted the `gh` cli to be, thanks.
edit: Just a quick note, the `gg` and `gg tui` commands for me don't show any repos at all, the current context stuff all works perfectly though.
Ah sorry need to make the docs more clear. You need to run ‘gg data refresh’ to populate the local DB first.
Ah, magnificent! Thanks!
Is any of this unique to GitHub Actions that does not happen on other cloud CI platforms?
The best CI platforms let you "Rebuild with SSH" or something similar, and instead of having the cycle of "change > commit > push > wait > see results" (when you're testing CI specific stuff, not iterating on Makefiles or whatever, assuming most of it is scripts you can run both locally and in CI), you get a URL to connect to while the job is running, so you can effectively ensure manually it works, then just copy-paste whatever you did to your local sources.
I use that a lot with SourceHut: after a build fails, you have 10 minutes to SSH into the machine and debug from there. Also they have a very nice "edit manifest and run" feature that makes it easy to quickly test a change while debugging.
Are there other platforms allowing that? Genuinely interested.
> after a build fails, you have 10 minutes to SSH into the machine and debug from there.
Ah, that's like 90% of the way there; they'd just need an option so the SSH endpoint is created at the beginning rather than the end, so you could, for example, watch memory usage and stuff while the "real" job is running in the same instance.
But great to hear they let you have access to the runner at all, only that fact makes it a lot better than most CI services out there, creds to SourceHut.
> just need to enable so the SSH endpoint is created at the beginning
Maybe it is, I've never tried :-). I don't see a reason why not, probably it is.
There are a couple of GitHub actions that let you do this.
do you mean https://github.com/nektos/act or there is something else ?
No. Act is for running actions locally. What was mentioned is a way to insert an SSH step at some well-chosen point of a workflow so you can log in to the runner and understand what is wrong and why it's not working. I have written one such thing; it relies on Cloudflare free tunnels. https://github.com/efrecon/sshd-cloudflared. There are other solutions around to achieve more or less the same goal.
I designed a fairly complex test matrix with a lot of the logic offloaded to the control mechanisms GitLab offers. You create job templates or base jobs that control the overall logic and extend them for each particular use case. I had varying degrees of success, and it's not a job for a dev's side quest; I think you need someone dedicated to explore, build, and debug these pipelines. But for a CI tool it's very good.
Because you can extend and override jobs, you can create seams so that each piece of the pipeline is isolated and testable. This way there is very little that can go wrong in production that's the CI's fault. And I think that's only possible because of the way GitLab models its jobs and stages.
For the last decade I've been doing my CI/CD as simple .NET console apps that run wherever. I don't see why we switch to these wildly different technologies when the tools we are already using can do the job.
Being able to run your entire "pipeline" locally with breakpoints is much more productive than whatever the hell goes on in GH Actions these days.
I can do that with github actions too? For tests, I can either run them locally (with a debugger if I want), or in github actions. Smaller checks go in a pre-commit config that github action also runs.
Setting up my github actions (or gitlab) checks in a way that can easily run locally can be a bit of extra work, but it's not difficult.
Of all the valid complaints about Github Actions or CI in general, this seems to be an odd one. No details about what was tried or not tried, but hard to see a `-run: go install cuelang.org/go/cmd/cue@latest` step not working?
The problem was the environment setup? You couldn't get CUE on Linux ARM and I am assuming when you moved to Makefiles you removed the need for CUE or something? So really the solution was something like Nix or Mise to install the tooling, so you have the same tooling/version locally & on CI?
Exactly.
"GitHub actions bad" is a valid take - you should reduce your use to a minimum.
"My build failed because of GitHub actions couldn't install a dependency of my build" is a skill issue. Don't use GitHub actions to install a program your build depends on.
Who doesn’t? I use it with Mise to have a very simple, locally tested way of running tasks.
I avoid actions for these exact reasons unless I can run the exact same build on another host.
And that’s where there’s a Mac Studio that sits sadly in the corner, waiting for a new check in so it has something to do.
The love for GitHub Actions dissipated fast; it wasn't that long ago we had to read about how amazing GitHub Actions were. What changed?
I think it made CI management more accessible.
Before that, most people would avoid Jenkins and probably never try Buildbot (because devs typically don't want to spend any time learning tools). Devs would require "devops" to do the CI stuff. Again, mostly because they couldn't be arsed to make it themselves, but also because it required setting up a machine (do you self-host, do you use a VPS?).
Then came tools like Travis or CircleCI, which made it more accessible. "Just write some kind of script and we run it on our machines". Many devs started using that.
And then came GitHub Actions, which were a lot better than Travis and CircleCI: faster, more machines, and free (for open source projects at least). I was happy to move everything there.
But as soon as something becomes more accessible, you get people who had never done it before. They can't say "it enables me to do it, so it's better than me relying on a devops team before" or "well it's better than my experience with Travis". They will just complain because it's not perfect.
And in the OP's defense, I do agree that not being able to SSH into a machine after the build fails is very frustrating.
I think it's possible to both think GitHub Actions is an incredible piece of technology (and an incredible de facto public resource), while also thinking it has significant architectural and experiential flaws. The latter can be fixed; the former is difficult for competitors to replicate.
(In general, I think a lot of criticisms of GitHub Actions don't consider the fully loaded cost of an alternative -- there are lots of great alternative CI/CD services out there, but very few of them will give you the OS/architecture matrix and resource caps that GitHub Actions gives every single OSS project for free.)
We used it.
And we realized that the bs sales/marketing material was just as bs as always.
The typical cycle of all technologies:
1) New technology comes out, people get excited
2) People start recognising the drawbacks of the technology
3) Someone else makes an improved version that claims to fix all of the issues. GOTO 1
A lot of the pain of GitHub Actions gets much better using tools like action-tmate: https://github.com/mxschmitt/action-tmate
As soon as I need more than two tries to get some workflow working, I set up a tmate session and debug things using a proper remote shell. It doesn't solve all the pain points, but it makes things a lot better.
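For reference, that is roughly one extra step using the action mentioned above (the failure() condition keeps it from pausing successful runs):

    # opens a tmate session and pauses the job so you can SSH in and poke around
    - uses: mxschmitt/action-tmate@v3
      if: ${{ failure() }}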
Tmate is not available anymore, and will be fully decommissioned[0]. Use upterm[1] and action-upterm[2] instead.
Honestly, this should be built into GitHub Actions.
[0] https://github.com/tmate-io/tmate/issues/322
[1] https://upterm.dev/
[2] https://github.com/marketplace/actions/debug-with-ssh
Honestly, I wonder if a better workflow definition would just have a single input: a single command to run. Remove the temptation to actually put logic in the actions workflow.
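Something like this minimal sketch is about as much as a workflow would then contain (paths are hypothetical):

    name: ci
    on: [push, pull_request]
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # the single input: one command, the same one you run locally
          - run: ./ci/test.sh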
> I think everyone that's ever used GitHub actions has come to this conclusion
This is not even specific to GitHub Actions. The logic goes into the scripts, and the CI handles CI specific stuff (checkout, setup tooling, artifacts, cache...). No matter which CI you use, you're in for a bad time if you don't do this.
This is how we did things with Jenkins and gitlab runners before, idk why folks would do it differently for GHA.
If you can't run the same scripts locally (minus external hosted service/API) then how do you debug them w/o running the whole pipeline?
I assume you're using the currently recommended docker-in-docker method. The legacy Gitlab way is horrible and it makes it basically impossible to run pipelines locally.
Containers all the way down
GitHub introduced all their fancy GHA apps or callables or whatever they're called for specific tasks, and the community went wild. Then people built test and build workflows entirely in GHA instead of independent of it. And added tons of complexity to the point they have a whole build and test application written in GHA YAML.
This is basically how most other CI systems work. GitLab CI, Jenkins, Buildbot, Cirrus CI, etc. are all other systems I've used and they work this way.
I find GitHub Actions abhorrent in a way that I never found a CI/CD system before...
as usual for Microslop products: it's designed for maximum lock-in
everything includes some crappy proprietary YAML rather than using standard tooling
so instead of being a collection of easily composable and testable bits it's a mess that only works on their platform
It seems more of a cultural issue that -- I'm pretty sure -- predates Microsoft's acquisition of GitHub. I assume crappy proprietary yaml can be blamed on use of Ruby. And there seems to be an odd and pervasive "80% is good enough" feel to pretty much everything in GitHub, which is definitely cultural, and I'm pretty sure, also predates Microsoft's acquisition.
GHA is based on Azure Pipelines. This is evident in how bad its security stance is, since Azure Pipelines was designed to be used in a more closed/controlled environment.
> I find GitHub Actions abhorrent in a way that I never found a CI/CD system before...
That's just the good old Microsoft effect; they have a reverse Midas touch when it comes to actually delivering good UX.
> I think everyone that's ever used GitHub actions has come to this conclusion.
I agree that it should be reasonable, but unfortunately I can tell you that not all developers (including seniors) naturally arrive at that conclusion.
I thought that's how actions are supposed to work. Python is king. Just use the Actions script to feed your variables.
Skill issue bro: https://github.com/nektos/act
skill issue bro https://github.com/nektos/act
GHA’s componentized architecture is appealing, but it’s astonishing to me that there’s still seemingly no way to actually debug workflows, run them locally, or rapidly iterate on them in any way. Alas.
I think this is a specific example of a generalized mistake, one that various bits of our infrastructure and architecture all but beg us to make, over and over, and which must be resisted. The principle being violated: your development feedback loop must be as tight as possible.
Granted, if you are working on "Windows 12", you won't be building, installing, testing, and deploying that locally. I understand and acknowledge that "as tight as possible" will still sometimes push you into remote services or heavyweight processes that can't be pushed towards you locally. This is an ideal to strive for, but not one that can always be accomplished.
However, I see people surrender the ability to work locally much sooner than they should, and implement massively heavyweight processes without any thought for whether you could have gotten 90% of the result of that process with a bit more thought and kept it local and fast.
And even once you pass the event horizon where the system as a whole can't be feasibly built/tested/whatever on anything but a CI system, I see them surrendering the ability to at least run the part of the thing you're working on locally.
I know it's a bit more work, building sufficient mocks and stubs for expensive remote services that you can feasibly run things locally, but the payoff for putting a bit of work into having it run locally for testing and development purposes is just huge, really huge, the sort of huge you should not be ignoring.
"Locally" here does not mean "on your local machine" per se, though that is a pretty good case, but more like, in an environment that you have sole access to, where you're not constantly fighting with latency, and where you have full control. Where if you're debugging even a complex orchestration between internal microservices, you have enough power to crank them all up to "don't ever timeout" and attach debuggers to all of them simultaneously, if you want to. Where you can afford to log every message in the system, interrupt any process, run any test, and change any component in the system in any manner necessary for debugging or development without having to coordinate with anyone. The more only the CI system can do by basically mailing it a PR, and the harder it is to convince it to do just the thing you need right now rather than the other 45 minutes of testing it's going to run before running the 10 second test you actually need, the worse your development speed is going to be.
Fortunately, and I don't even know the exact ratio of sarcasm to seriousness here (but I'm definitely non-zero serious), this is probably going to fix itself in the next decade or so... because while paying humans to sit there waiting for CI, getting sidetracked and distracted, is just Humans Doing Work (after all, what else are we paying them for?), all of this stuff is going to be murder on AI-centric workflows, which need tight testing cycles to work at their best. You can't afford to have an AI waiting 30 minutes to find out that its PR is syntactically invalid, and you can't afford for the invalid syntax to come back with bad error messages that leave it baffled as to what the actual problem is. If we won't do it for the humans, we'll do it for the AIs. This isn't something AI fixes by itself: even though AIs are far more patient than us and much less prone to distraction, since from their "lived experience" they don't experience the time taken to build and test, slow feedback makes it much worse and more obvious that this is a real problem and not just humans being whiny and refusing to tough it through.
Yes exactly! I am using Nix/Make and I can prompt Claude to use Nix/Make, it uses these local feedback loops and corrects itself sometimes etc.
Skill issue.
To some extent I do agree that it sounds like "my build was failing on a platform I had never tested, and I was pissed because I had to debug it".
But it is true that GitHub Actions doesn't make it easy to debug (e.g. you can't just SSH into the machine after the build fails). Not sure it justifies hating with a passion, though.
Care to elaborate?
Totally agree.
Agreed
So the article is about the frustrating experience of fixing GitHub Actions when something goes wrong, especially when a workflow only fails on one platform, potentially due to how the GitHub runner is set up (inconsistently across platforms).
It took me a while to figure that out. While I appreciate occasional banter in blog articles, this one diverges into rant a bit too much and could have made its point much clearer with, for example, meaningful section headers.
I was so lonely in this opinion for so long, and it is great to see it becoming mainstream. GHA is terrible.
Until I read this blog I was under the impression that everyone wrote Python/other scripts and used GitHub Actions to just call them!
This way we can test on a local machine before deployment.
Also, as other commenters have said, bash is not a good option - use Python or some other language and write reusable scripts. If not for this, then for the off chance that it'll be migrated to some other CI/CD platform.
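In workflow terms that usually ends up as little more than wiring variables into a script call; a sketch where the script name and variables are made up:

    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    # all real logic lives in the script, which also runs locally
    - run: python scripts/release.py
      env:
        RELEASE_TAG: ${{ github.ref_name }}
        GIT_SHA: ${{ github.sha }}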
> Even though my user base can be counted on a fingers of one-arm-less and second-arm-hook-equipped pirate, it’s still a thing “One Should Do”.
No. It's cargo cult science.
I wouldn't say that, but I would say there's no "should" here; it's often much more hassle than people expect and everyone has to decide for themselves whether the number of users is worth it.
I agree. Why are you building for platforms you don't even use?
just came here to say same, they are the absolute worst
>Now of course, in some Perfect World, GitHub could have a local runner with all the bells and whistles.
Not by GitHub, but isn't act supposed to be that?
https://github.com/nektos/act
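For anyone who hasn't tried it, basic usage looks roughly like this (it needs Docker available locally):

    # list the jobs act can see in .github/workflows/
    act -l
    # simulate a push event and run all matching jobs
    act push
    # run a single job by its id
    act -j build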
Prefacing this with the fact that act is great: however, it has many shortcomings. Too often I've run into roadblocks, and when looking up the issues, they turn out to be hard to address. Simpler workflows work fine with it, but more complex workflows will be much harder.
Don't put your logic in proprietary tooling. I have started writing all logic as mise tasks, since I already manage the tool dependencies with mise. I write them in a way that can still take advantage of GHA features such as concurrency, matrices, etc., but beyond that, everything runs within mise tasks.
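On the workflow side that can stay pretty thin, e.g. with the jdx/mise-action community action (the version pin and task name are illustrative):

    - uses: actions/checkout@v4
    # installs mise plus the tools pinned in mise.toml
    - uses: jdx/mise-action@v2
    # same command you run locally
    - run: mise run test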
act is often mentioned as a drop-in replacement, but I never got it to replicate the GitHub Actions environment. I didn't try it for this particular case, though.