Which goes on to prove that bottleneck isn't in writing the code. It is in reading and understanding the code.
We all had that one "productive" engineer in our teams who would write huge PRs that would have large swaths of refactoring whether warranted or not and that was way before anyone even could imagine in their wildest dreams that neural networks could generate that huge amounts of code.
The net effect of such a "productive" engineer always was that instead of increasing the team velocity, team would come to a crawling pace because either his PR had to be reviewed in detail eating up all the time and/or if you just did cursory LGTM then they blew up in production meanwhile forcing everyone back to the drawing board but project architecture would have shifted so rapidly due to his "productivity" that no one had a clear picture of the codebase such as what's where except that one "super smart talented productive loyal to the company goals" guy.
Sounds a like a tactical tornado, made me think of this paragraph:
“Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.”
- John Ousterhout, A Philosophy of Software Design
I have seen precisely zero consequences for these people because they usually leave after not too long and go somewhere else, sometimes for higher pay. The slower folks end up getting the worse code and no raises in exchange for comradery.
But also I have no idea how that situation arises unless the slower folks are just auto-approving PRs. You kind of did that to yourself if you let the new person get away with it.
> You kind of did that to yourself if you let the new person get away with it.
In theory, sure, but in practice, to echo the others, you often don't have a choice, because of power dynamics/politics.
Its easy to say "its management's fault", but the principle is the same - these guys are spammers and quacks (and deserve nothing less than to be confined to the level of hell reserved for spammers), they just have to spam long enough and something will get through (volume over quality). And after their "success" i.e. fraud, they can ditch the company and move onto the next. I've seen multiple "seniors" like this, not actually very good at the work, but great at pushing half-baked slop.
You start by rejecting those PRs, saying "write more maintainable code, not quick hacks".
Management starts pressuring the original developer "why is it not merged yet, I thought you had it working".
That developer hits back with "well, it failed code review, they want me to refactor it".
Management goes back to the reviewer, "why did you fail this? It meets coding standards right? Pipeline is green".
Reviewer says "Well, yes it technically meets coding standards but it's full of hacks and is not future proof, it will bite us."
Management says "If we coded for tomorrow we'd never get anything done. Don't be so awkward". And then code gets merged.
Then you learn to just let these people go wild. If it hurts in the future you have a nice little "I told you so". But in my experience, management doesn't actually care if it hurts us in the future, it's not their problem. They just say "Well give me bigger estimates if you need to refactor". Fair enough, it's not a big deal but it is a pointless slog of picking up the pieces.
The other way it comes about is when the original developer just isn't really that good of a developer. So you end up in such an endless feedback loop trying to get the code in a good state that you piss everyone off and it's just easier to merge it.
Some hills just aren't worth dying on. And these guys can be exploited for your own advantage if you want to get code merged quickly ;)
In objective terms, these tactical tornadoes are among the most valuable headcount at a big company to the extent that they can rapidly patch production issues and restore service, by any means possible.
The problem is allowing this kind of frantic tactical development even in "peace time".
They'll often exploit a power dynamic. If you're less senior than them, in some organizations it can be very difficult to stand up to that behavior. I experienced that at my previous employer as a relatively senior engineer, even. Top down organizations look unfavorably on feedback that runs upwards. Your pushback will be seen as standing in the way of progress.
For their bug bounty program, the company can just charge 5-10$ per submission to guarantee everything gets thoroughly reviewed by a human, and it completely eliminates bot slop DDoS submissions overnight. If your bug and PR was actually good, then you get 10+1000$, and If it wasn't good, then you need to do better due diligence next time, and the skilled human feedback you received on why it wasn't good, was a valuable lesson for your SW career, and it only cost you the price of a Starbucks latte. This way everyone wins.
I said it before and I'll say it again, for opportunities open to the entire world on the internet, adding monetary friction is THE ONLY WAY to filter out serious people from bad actors doing spray-and-pray hoping they make some money, or get that job, through weaponizing AI bots. You can't rely on honor systems and a high trust society on the anonymous open internet, you need to gatekeep to save yourself and your sanity and make sure the honest serious people you want to engage with don't end up drowning in the noise of the scammers.
We can't shut ourselves down just because we refuse to apply solutions to AI slop DDoS.
This is a great strategy idea, I like it. I'm not good at thinking out the curse of the monkey's paw, so I'm curious if folks can think of any downsides.
I said it before and I'll say it again, for opportunities open to the entire world on the internet, adding monetary friction is the only way to filter out serious people from bad actors doing spray-and-pray hoping they make some money or get that job through weaponizing AI bots and sucking all the air in the room.
So many problems can be solved that way, including customer support. Instead of having to post a sob story on Twitter and HN when the AI at BigCo bans my account for no reason, why not charge me $100 for access to human support that is empowered to triage and escalate genuine issues? Then, issue a refund if the problem is on their end.
But seriously, I guarantee you the opposite is more common- the incompetent devs which can't manage shipping anything, keep trying to do "surgical and small edits" after 1 week of thinking about them and then have them blow up in prod for someone else to fix quickly because if it's up to them, it'll take 2-3 sprints
10 years ago I was a lot closer to what y'all talking about. After having more and more colleagues I can no longer agree and suspect this is mostly the opinion of incompetents which try to discredit regular devs.
Another thing they always lack is the ability to see when a large change is necessary because that's just what is necessary to achieve the feature in a stable manner. Sorry to say this, but starting of this discussion while trying to discredit large change sets in the age of ai is incredibly inept.
When you wrote your software well, large changes are possible and increase stability when you actually need to add a fundamental change of behavior. Which can come from a miniscule requirement.
But to close off on the topic of this article: they made the right call. In the open source context you cannot have this kind of incentive anymore with openclaw continuously shitting out one PR after another
> Which goes on to prove that bottleneck isn't in writing the code. It is in reading and understanding the code.
No, it's that quite literally the PR submissions are SPAMmed. Whoever made them is acting like a SPAMmer, sending out lots of garbage and hoping it sticks. IE, they're not doing their part in reviewing what the AI found and generated.
"[...] bottleneck isn't in writing the code. It is in reading and understanding the code". 100% agreed! Furthermore, the more code is generated by AI, the fewer people will actually understand it!
Generally, software engineers already have little to no understanding of the code that's actually being executed. We're so used to high- and higher-level abstractions like C, Go, Python, and JavaScript that we forget that we're already working with mostly-deterministic symbolism in a process that more closely resembles invoking magic spells than writing machine code. One more level of abstraction is not the end of software engineering.
This argument comes up a lot. The point is that with unreviewed AI nobody understood the code at any time (including the AI). This is completely different to a C compiler wherein the writers and maintainers deeply understand the code. This means that even though I don't understand it, I can use it with some confidence.
Your point about AI being another abstraction similar to the "mostly deterministic" C compiler also comes up often but there are many arguments against it. If you think the determinism of a compiler and an AI are similar then I'm not sure whether you know anything about how either of them work or have even compared examples of what they produce.
To the extent that's true, it's already a problem plaguing the profession.
I wouldn't advocate for using different tools, but everyone should be able to reason about the machine instructions underlying their code. Both in the immediate sense of the assembly a simple function turns into, and the tricks language runtimes use to enable their neat features.
The attitude that things are magic is poison. There is a difference between feeling confident something is comprehensible and not yet needing to go learn it, vs resigning to a position of powerlessness.
I agree in principle, but every time I run a debugger on modern C++ it makes it clear that, rather than being a simple and cutesy transformation, "compiler optimization" is actually black magic.
I was (almost) just that guy for one PR. Removed something like 20% or more of the codebase by leveraging the libraries and external tools we already had in use better, but it meant almost every single thing we were doing had to use the library function instead of the one we wrote. But assuming you have good regression tests and linters, so you know the code works and it's not terrible, the review should be more about overall high level quality instead of poring over every character to check correctness. It was still a pain to review, though
You’re not an example of what we’re taking about here. Congratulations!
A better example would be if you’d changed the behavior of the library as you did this work, and the library changes introduced hard-to-detect bugs across the application.
Yes exactly. the GP isn't what we are talking about it and huge PR isn't what we are talking about either.
PR can be huge that's OK. For example, codebases that moved from Python 2 to Python 3 would have had huge PRs but the cognitive load was well understood.
As per the other person's comment, yeah basically I could have broken it up but it would've been an arbitrary demarcation. I just deleted our functions and fixed everything that yelled. Admittedly that could've been one and then leveraging the libraries better could've been another, but they would've been 2 PRs that changed almost every line. So done as one to mitigate review time.
Arbitrary demarcations can still be valuable! Just because something is arbitrary doesn't mean that it's not helpful. Working in chunks will let you take more time to review each callsite individually, and increase your confidence in the changes
In the future, I would definitely encourage you to explore a more iterative solution—fix the first 50 occurrences first, or maybe all the occurrences of a handful of functions. For example, if you have utility functions A, B, C and D, maybe fix functions A and B first, and then C and D second.
Ultimately, at the end of the day it's going to depend on how much code you're touching. If you're only touching 100 library calls, then it's probably easy to do them in one PR. But if you're updating 1000 library calls, you'll need to take a more iterative approach. Building those skills now will serve you well in the future when working on bigger codebases and harder refactors.
It depends a bit. But it would now mean that there are multiple ways of doing the same. Call your internal function or call the library directly. You need to put up some linting around it that people only use your function or the library function.
Otherwise you may get that you have your function, you think everywhere is using it, you make it fix a bug. And poof, you introduced bugs at the other call sites.
I don't understand why one wouldn't just auto reject big PRs and tell them to make smaller ones. Sounds like it's a communication and social problem, not a technological one.
Even with AI, just tell it to make smaller self contained PRs. I do this with Claude or GPT models and they do just fine.
Power dynamics. Usually the person making the giant PRs is the one with all the sway. An earlier-career engineer is unlikely to push back against that level of influence.
It can be a company wide policy rather than trying to target a single individual even if the outcome is that they are targeted. This is something that should be addressed to them through a manager etc or if not, it's time to leave while they ruin the product over time.
PRs are all about power dynamics and (un)spoken deals…
If you rubberstamp some people‘s PRs all the time, you can then get them to greenlight your unpleasant PRs via pm instantly.
The other way round, retaliation: I once added some serious review notes to the PR of a very senior engineer because it was a dangerous topic. He would then spend the next months nitpicking every single PR I created. Had to post my PR in slack whenever he was not online to get them merged. After that I never seriously reviewed his PRs again. Too much of a headache.
If you don't ever have a massive PR from a dynamite session, then you cannot ever be better than "average and plodding". So the question is, what's the context of the massive PR and how should it be handled?
* Mature product making money, intermediate engineer just refactored everything so it's "better"? Shut the fuck up, kindly please, you will have to demonstrate that you understand why things are this way and why it's better before we even have this conversation.
* Greenfield dev, trusted engineer getting from 0 -> 1 on something big? Maybe it shouldn't be held up in committee for 2 weeks. Maybe most objections will be superficial stylistic concerns.
Obviously there are many other contexts and these are 2 extremes in a multi-dimensional space. But if the process is "we litigate every line", then that's just not an innovative place to be. Yes, most PRs should be small, targeted, easy to review and tied to a ticket but if you're innovating? By definition it's a little different.
> you will have to demonstrate that you understand why things are this way and why it's better before we even have this conversation
I can fling that back to you: very often the team hates the conclusion I arrive it which "It worked during your initial crunch and then everyone is just afraid to change it, which means your test coverage is far from good -- why is it not enriched?"
I am not trying to be an arse on purpose but the inertia and cargo-culting and tribe-defending practices I've seen during my contracting years (10-11) made me almost physically sick. Programmers are a fiercely territorial bunch and it's often to the detriment of the organization.
Of course the reverse cases exist: where the domain is difficult and ugly hacks had to be done so the project works and makes money. Absolutely. I love receiving this knowledge and integrating it; makes for interesting engineering discussions.
> Greenfield dev, trusted engineer getting from 0 -> 1 on something big? Maybe it shouldn't be held up in committee for 2 weeks. Maybe most objections will be superficial stylistic concerns.
Yep, full agree. And often times these stylistic concerns are not even that; they are often "I suffered here at the beginning, this green-horn should suffer as well!" which is honestly pathetic and it also happens quite a lot.
Without AI, both writing and reading code are bottlenecks.
How many times have you reviewed your old code and been appalled at the terrible quality? You personally created slop; it's no different from GenAI output except that a human had to spend precious time crafting it. You likely were indeed bottlenecked by your ability to churn out code that you just had to get to work, for one reason or another.
The real issue is in the asymmetry when one party can use automation to create more code than another party can possibly manually verify.
Exactly! They should have set [your agentic AI toolkit could be here!] loose on these issues and 100x'd their output, all while actually shipping fixes to these issues instead of closing them. These Luddites are going to be left in the dust as AI is here to stay!
The reality is somewhere in the middle. Features are shipping 2x to 5x faster at a lot of organizations, with solid code still being produced and reviewed.
Anyone trying to suggest that AI hasn't sped up quality code production is just insisting on keeping their head in the sand, IMO.
They're just working at companies with mature products where people are in meetings all day -- they say so! Startups very much want to crank shit out faster.
I don't understand this. If that project is not offering a bug bounty, why are they getting so many PRs? What possible incentive is there to spend real money on tokens just to push junk PRs? Are the PRs spamming a product or something?
Why does every programming job application ask for your GitHub profile? The industry used open source contributions as a proxy for candidate quality, and this is Goodhart's law in action.
Closing the program is totally reasonable. However, there is another option: Make submitters pay a nominal fee that is returned in the case that a real bug is found.
Asking people to pay to submit bugs would start a firestorm of internet drama about asking people to do free work for the company and pay for the privilege. It doesn’t matter if the program actually paid out.
If they got even one report closed incorrectly we would never hear the end of it.
Unfortunately this isn't all black-and-white. There are some bug bounty where the company is very eager not to pay any bounty, aggressively marking vulnerabilities as out-of-scope or working-as-intended.
In those case you already lose time, but in the future you would also lose money.
Unfortunately you don't know how a company will react before submitting, especially if it's a small one.
There are many cryptocurrencies that allow anyone to move money quickly, cheaply, and on the same day in less than a minute and requires zero bank accounts.
And which are trivial to convert back and forth between real money and cryptocurrency? And hold their value with sufficient stability that you can convert USD into the currency, make a transaction, wait a few weeks, make a transaction the other direction and then convert back into USD, with roughly no loss in value?
Price it right. At the right price, it pays for everything you are talking about. At an even higher price, it is basically closing the program.
I'm not trying to suggest they _need_ to implement it. Like I said, closing it is reasonable. Completely aside from any other considerations, one could just decide that they don't feel like dealing with it. But there are other options.
It sounds like the bug bounty requires the user to extend the simulator, to cover the type of bug they found. Maybe the they could require a full run of the simulator test suite before submission? This serves as a nice check (that they didn’t break the simulator), and maybe it could also produce some proof-of-work artifact as a side-effect… (is this possible? I don’t know security).
The problem with that approach is that it will also deter genuine submissions, probably moreso than a "no bounty" system.
For those who encounter bugs as part of their employment, they'd now need to convince their employer to fork over money up front. For most employers, getting them to spend even insignificant money is like pulling teeth.
But even for the self-employed or hobbyists, gambling real money on "are they going to be a jerk about my exploit report". No offense towards Turso, but the bulk of software firms are TERRIBLE about handling reports like that. Many already have unstated policies of screwing people out of deserved bug bounties at every step.
To submit such reports today already requires you to accept that your work is statistically, just going to be a bunch of free labour that you gave away for the betterment of the product's users. Adding a cash fee just further deters submissions, especially once people haven't gotten their money back a few times. (Consider how many "AI detection tools" are themselves incredibly unreliable machine learning or sometimes even LLM systems)
How so? These bot systems work on volume – there's no regard for how much reviewer time they gobble up. The idea is to make producing reports basically free, so getting 1 in 1000 positives is still a success if you have no regard for externalities.
If they have to pay for reviewer time for each of 1000 reports, then the scheme stops being viable.
The majority of the exploits I can think of are fixed by setting the correct price. Other suggestions in this thread of denominating in bitcoin fix the other exploitation: chargebacks.
If you can think of something that isn't solved by one of those two mechanisms, I'd be interested in hearing them enumerated.
Honestly I think this is a great idea. My only suggestion is instead of being very nominal, it should be "reasonable" (so $10 and not $1).
It's even possible to directly link this to maintainers/employees - if you can review 10 such AI/real things per hour (likely more if it's AI slop that's easy to detect), you're generating another revenue stream. Now, I have no idea if these guys are based in SF Bay or a 3rd world country with low COL but as an "add on", $100 an hour isn't too shabby (and can be on the "low end" if one's good at spotting AI crap.)
Side note, isn't it possible to have some way to verify if the "vulns" are actual vulns or not? ...Heck why not throw an LLM at it, powered by a single $10 submission fee?
Sounds like a startup idea to me! Admittedly, the friction and the fact that you have to pay would prevent a lot of legitimate people from participation which sucks.
AI is really throwing a wrench in the economics of software development, isn’t it?
it seems we all will slowly learn to live within new contexts; i really appreciate their openness about it and it gives me insights to munch on
thanks to you all also to ring in with dev-style annecdotes (i'm stilllearning everyday, and hope to continue for a long time): those big-prs and tactical tornadoes stories are helping keep the crafts and thinking afloat, somehow.
Possibly stupid question (this is outside my wheelhouse): is there any way a final full run of the simulator test cases (presumably required to make sure the submitted simulator changes don’t break the thing) could act as a proof-of-work?
> It is possible to set up automated systems to gatekeep this, but with a non-negligible dollar value attached to it, the incentive is just too great for the AIs to just keep arguing, reopening the same PR, etc.
I wonder what Hacktoberfest would look like now if they were still giving out t-shirts to everyone. Probably not enough cotton in the world.
It can't be on individual maintainers to stop this, imo its on Github (and Gitlab) to stop these sort of accounts from even getting to the point of submitting PRs. Its essentially spam.
Look at the user who created the first PR they reference https://github.com/Samuelsills. This is not an account that should be allowed to do anything close to opening a PR against a well known repo.
I could have swear it said "0 contributions in the last year" when I opened the profile before, and it showed no activity in the contribution graph, but now I check again and it shows a bunch of stuff... Guess some request failed last time or something, my bad.
An interesting "conundrum" (at least from my outsider perspective): how many of those bot requests are from agents that utilize Turso on their backends?
Has anyone used Turso in production? It's an SQLite compatible rewrite in Rust but with added features like multiple writer support and being open to external contributions which SQLite is not.
I was thinking of using it for my full stack Rust apps just so everything works with cargo and I don't have to bring in SQLite separately.
It's alright as a dropin sqlite replacement. I ran into a bunch of problems with libsql on windows a year or two ago when I tried it but I'd assume it's fixed now. They also offer turso db as service with a very generous free plan which was my main reason to try it.
Being a verifiable human identity (not as-in age verification or whatever) but as in having a known, public, reputation online will go a long way in this new slop-first world.
There is hardly a bright line between real and fake. An influencer is just a person who rents out their identity. Can you imagine getting a real PR from a human engineer you trust, but the description says "This pull request was sponsored by Skeezy Software Inc."?
We sorely need a way to reliably detect AI slop, but unfortunately it doesn't seem possible and it's just getting harder and harder.
Last month I tried my hand at finding a way to tell whether an OSS project is slop or not, based on the amount of "human attention" it received vs the amount of code it contains. The idea is that a 100k LOC project which received 3 days' worth of attention from a human is most certainly slop.
The approach doesn't work very well, though¹, mostly because it's hard to gauge the amount of attention that was given. If I see one commit with +3000 LOC, I can assume it's AI-generated, but maybe you're just the type of dev that commits infrequently.
Maybe we need some sort of "proof of human attention" for digital artifacts, that guarantees that a human spent X time working on it.
I suspect that it will be impossible, soon. People will just train LLMs to "act human," and pass the various turing tests we throw at them.
I stay pretty busy[0], and have been accused of "gaming" my GH repos.
That's not the case. I'm retired, experienced, and working on software all day, every day. I just don't get paid for it.
I also don't especially care, whether or not anyone thinks I'm a bot. I eat my own dogfood. Most of my work is on modules that I use in my own projects.
There's no reason to care that a human spent time on it.
Humans are bad at writing code. Garbage PRs and slop have been a problem in open source and bug bounty programs since long before AI came on the scene.
We need better AI so that there's no need to solicit external bug fixes, and better AI so other contributions can be evaluated for usefulness and quality.
What do you care if a human ever looked at it at all? It implies that humans are adding value to the process. It's possible for a human to add value. The right human can add tremendous value. But I'll take a completely autonomous AI over 99% of the human software engineers and 99% of the people contributing PRs and bugfixes.
It was hard to keep up with slop before. It's a lot harder now. AI will help weed through the garbage.
we automated finding bugs. then we automated submitting bugs. now we're automating rejecting submissions. at no point did anyone automate fixing the bugs.
Fixing bugs has been automated by claude code and other tools for a while. We've merged a bunch of bug-fix PRs by claude, some of which were found by other bots.
Bots are using real tokens for this. So, ultimate honeypot idea: post heavily commented skeleton code in a github repo, promise a generous money reward for closing issues and never pay anyone. See the bots swarm and burn their tokens to write code for you.
AI can find useful exploits but the highly publicized ones are among a sea of false positives and the successes I've read were found by people who were already experts. I can 100% see a public bug bounty program being inundated with garbage even if there are diamonds in the rough.
The weird thing is it can't be that economically feasible to burn a ton of tokens in the hopes that you might get a bounty.. seems like a great way to set money on fire.
i was just thinking about that. i don't think anyone is systematically making money with that. it's probably just people who set up openclawd or whatever and told it to solve bounties.
>the author just injected garbage bytes manually into the database header, and then argued that this corrupted the database
>Steps to reproduce: Modified cli/main.rs to include a Vec with limited capacity. Forced a volatile write beyond the allocated bounds using std::ptr::write_volatile.
>author claims to have found a critical vulnerability that allows for the execution of arbitrary SQL statements. Imagine that? A SQL database that allows the execution of SQL statements. How can we ever recover from this.
I wonder why are they even doing this. Do any of these PRs ever win any money? It feels like they are burning down a forest thinking they'll find gold if they do it, without any evidence that there will be any gold after the forest is burnt down.
Oh look it's more of exactly what AI skeptics said would happen: low effort bullshit generated at scale making life hell for people actually trying to make things. That's wild.
Edit: it is genuinely wild, I don't know of another product category that selects so perfectly for the WORST type of person to be it's enthusiast. Just every single person I see hyped about AI is fucking insufferable on at least one and usually multiple axis.
Web3 is the closest analogue in recent memory, but if you go back further to the pre-enlightenment era (and some pockets of more recent history, particularly in isolated rural/colonial regions) you can see similar behaviors. It's mad religious fervor coupled with poor education. They see what their beliefs tell them they should see, and lack the mental rigor to analyze the actual data. Not their fault! It's our fault for letting them into the profession. Other disciplines are much better at keeping these folks outside the gates.
I think people would be more interested in listening to "AI skeptics" if they offered realistic solutions to the problems they predict. Pandora's box has been opened, let's deal with the consequences now instead of trying to shut the box which cannot be shut.
> I think people would be more interested in listening to "AI skeptics" if they offered realistic solutions to the problems they predict.
AI is the fucking problem. Yes, it has (some) uses. It is not nearly the number advertised. And more and more the median use case seems to be, again, overloading people actually trying to do work with an avalanche of bullshit.
The solution is exactly what the linked article says: shut it down. The AI people have ruined another good thing that was both beneficial to the project, and to a number of individuals.
> forget about the shutting it down and think of something actually realistic.
Why is it not realistic? Small teams do excellent work. Keep your team small and trusted. Only accept contributions from your team, and people outside your team who are personally vouched for by someone on your team. It's like climbing mountains or sailing or any other type of inherently risky activity--you don't go out with people you don't trust. It's eminently possible, you just don't like the idea of it.
Right, so the Github "open contributions" model where anyone can open an issue or a PR or otherwise waste a maintainer's time is broken. Fundamentally insecure under this type of attack. Now that the exploit is being used widely, and costing us immensely, we need to put a lid on it. If the only way to guarantee an AI bot (or its meatspace sock puppet) doesn't waste your time is to move to a "look but don't touch" model, then that's what we need to do. I think this would be a reasonable default:
Public repos are read only except for contributors who have been given specific permission, and those permissions are granular e.g. in order of increasing damage potential:
This response is incredibly annoying and insufferable. It's only "impossible" at this point because people continually ignored skeptics and anyone warning about exactly these outcomes.
Now that doom is here, it's too late to do anything about it. Just accept the doom!
My only objection is calling it doom. It isn't and calling it that gives this stupid shit used by low-effort people far, far too much credit. It's just slop.
But it does suck, you know? Part of what makes OSS so great is that anyone could contribute. If someone uses a thing, and finds something broken or a way to make it better, they could do that and then push it back up to the project and ideally have it merged so everyone can use it. That's what makes it awesome. The project benefits, the maintainer benefits, the coder benefits, the users benefit.
Now we have to stop that because lazy people can't stop shitting it up with generated PRs and trying to get money for not fucking doing anything.
The critics didn't do themselves any favors. Part think the Terminator has something useful to say on the subject, part invent contrived scenarios like self-driving cars having to resolve trolley problems. Reality turned out to be much more boring.
But yes, what you said but unironically. Like it or not it's here, it's not going away, so all the remaining options have to assume that.
> The critics didn't do themselves any favors. Part think the Terminator has something useful to say on the subject, part invent contrived scenarios like self-driving cars having to resolve trolley problems. Reality turned out to be much more boring.
Isn't there some alternative approach? I.e when someone submit ai slop they get a strike. Three strikes and you are suspended from submitting to the bug bounty for x months/years?
*Edit - I get it. It seems like the authentication is a challenge.
How about "It costs $1000 to submit a bug bounty for approval", and raise the reward to $2000 (or $5000 if it's in the cards, since that will have a deterrant impact on non-AI responses).
I think that's entirely sensible. Doesn't even have to be that expensive, just expensive enough to deter people who go "oooh, free money", and expensive enough to compensate for having to review slop far enough to realize it's slop.
you still need to spend effort reviewing the code to figure out when you can give a strike. Thrice for an actual ban. This would still waste precious maintainer time.
They mentioned they had identified alternatives but it would be costly to implement them. One can imagine that ban evading by generating a new user account would be easy for an LLM agent. It's going to be a long, long game if whack-a-mole.
This probably gets solved outside of the level of an individual project. No small team can handle this without building a whole product just to handle the bug bounty.
Which goes on to prove that bottleneck isn't in writing the code. It is in reading and understanding the code.
We all had that one "productive" engineer in our teams who would write huge PRs that would have large swaths of refactoring whether warranted or not and that was way before anyone even could imagine in their wildest dreams that neural networks could generate that huge amounts of code.
The net effect of such a "productive" engineer always was that instead of increasing the team velocity, team would come to a crawling pace because either his PR had to be reviewed in detail eating up all the time and/or if you just did cursory LGTM then they blew up in production meanwhile forcing everyone back to the drawing board but project architecture would have shifted so rapidly due to his "productivity" that no one had a clear picture of the codebase such as what's where except that one "super smart talented productive loyal to the company goals" guy.
Sounds a like a tactical tornado, made me think of this paragraph:
“Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.” - John Ousterhout, A Philosophy of Software Design
I have seen precisely zero consequences for these people because they usually leave after not too long and go somewhere else, sometimes for higher pay. The slower folks end up getting the worse code and no raises in exchange for comradery.
But also I have no idea how that situation arises unless the slower folks are just auto-approving PRs. You kind of did that to yourself if you let the new person get away with it.
> You kind of did that to yourself if you let the new person get away with it.
In theory, sure, but in practice, to echo the others, you often don't have a choice, because of power dynamics/politics.
Its easy to say "its management's fault", but the principle is the same - these guys are spammers and quacks (and deserve nothing less than to be confined to the level of hell reserved for spammers), they just have to spam long enough and something will get through (volume over quality). And after their "success" i.e. fraud, they can ditch the company and move onto the next. I've seen multiple "seniors" like this, not actually very good at the work, but great at pushing half-baked slop.
I can tell you how that situation comes about.
You start by rejecting those PRs, saying "write more maintainable code, not quick hacks".
Management starts pressuring the original developer "why is it not merged yet, I thought you had it working".
That developer hits back with "well, it failed code review, they want me to refactor it".
Management goes back to the reviewer, "why did you fail this? It meets coding standards right? Pipeline is green".
Reviewer says "Well, yes it technically meets coding standards but it's full of hacks and is not future proof, it will bite us."
Management says "If we coded for tomorrow we'd never get anything done. Don't be so awkward". And then code gets merged.
Then you learn to just let these people go wild. If it hurts in the future you have a nice little "I told you so". But in my experience, management doesn't actually care if it hurts us in the future, it's not their problem. They just say "Well give me bigger estimates if you need to refactor". Fair enough, it's not a big deal but it is a pointless slog of picking up the pieces.
The other way it comes about is when the original developer just isn't really that good of a developer. So you end up in such an endless feedback loop trying to get the code in a good state that you piss everyone off and it's just easier to merge it.
Some hills just aren't worth dying on. And these guys can be exploited for your own advantage if you want to get code merged quickly ;)
In objective terms, these tactical tornadoes are among the most valuable headcount at a big company to the extent that they can rapidly patch production issues and restore service, by any means possible.
The problem is allowing this kind of frantic tactical development even in "peace time".
They'll often exploit a power dynamic. If you're less senior than them, in some organizations it can be very difficult to stand up to that behavior. I experienced that at my previous employer as a relatively senior engineer, even. Top down organizations look unfavorably on feedback that runs upwards. Your pushback will be seen as standing in the way of progress.
AI can be the ultimate tactical tornado.
But it really doesn't have to be like this.
For their bug bounty program, the company can just charge 5-10$ per submission to guarantee everything gets thoroughly reviewed by a human, and it completely eliminates bot slop DDoS submissions overnight. If your bug and PR was actually good, then you get 10+1000$, and If it wasn't good, then you need to do better due diligence next time, and the skilled human feedback you received on why it wasn't good, was a valuable lesson for your SW career, and it only cost you the price of a Starbucks latte. This way everyone wins.
I said it before and I'll say it again, for opportunities open to the entire world on the internet, adding monetary friction is THE ONLY WAY to filter out serious people from bad actors doing spray-and-pray hoping they make some money, or get that job, through weaponizing AI bots. You can't rely on honor systems and a high trust society on the anonymous open internet, you need to gatekeep to save yourself and your sanity and make sure the honest serious people you want to engage with don't end up drowning in the noise of the scammers.
We can't shut ourselves down just because we refuse to apply solutions to AI slop DDoS.
This is a great strategy idea, I like it. I'm not good at thinking out the curse of the monkey's paw, so I'm curious if folks can think of any downsides.
The bots spam even when there's no bug bounty program. The emails start out with "I received $500 for a similar reported on another site"
I said it before and I'll say it again, for opportunities open to the entire world on the internet, adding monetary friction is the only way to filter out serious people from bad actors doing spray-and-pray hoping they make some money or get that job through weaponizing AI bots and sucking all the air in the room.
So many problems can be solved that way, including customer support. Instead of having to post a sob story on Twitter and HN when the AI at BigCo bans my account for no reason, why not charge me $100 for access to human support that is empowered to triage and escalate genuine issues? Then, issue a refund if the problem is on their end.
I don't understand why this isn't a thing.
This is profound and beautiful description. Thank you for sharing. Totally can relate to that. Been there, seen that.
Do people like that exist?
Totally.
But seriously, I guarantee you the opposite is more common- the incompetent devs which can't manage shipping anything, keep trying to do "surgical and small edits" after 1 week of thinking about them and then have them blow up in prod for someone else to fix quickly because if it's up to them, it'll take 2-3 sprints
10 years ago I was a lot closer to what y'all talking about. After having more and more colleagues I can no longer agree and suspect this is mostly the opinion of incompetents which try to discredit regular devs.
Another thing they always lack is the ability to see when a large change is necessary because that's just what is necessary to achieve the feature in a stable manner. Sorry to say this, but starting of this discussion while trying to discredit large change sets in the age of ai is incredibly inept.
When you wrote your software well, large changes are possible and increase stability when you actually need to add a fundamental change of behavior. Which can come from a miniscule requirement.
But to close off on the topic of this article: they made the right call. In the open source context you cannot have this kind of incentive anymore with openclaw continuously shitting out one PR after another
too real
> Which goes on to prove that bottleneck isn't in writing the code. It is in reading and understanding the code.
No, it's that quite literally the PR submissions are SPAMmed. Whoever made them is acting like a SPAMmer, sending out lots of garbage and hoping it sticks. IE, they're not doing their part in reviewing what the AI found and generated.
"[...] bottleneck isn't in writing the code. It is in reading and understanding the code". 100% agreed! Furthermore, the more code is generated by AI, the fewer people will actually understand it!
Generally, software engineers already have little to no understanding of the code that's actually being executed. We're so used to high- and higher-level abstractions like C, Go, Python, and JavaScript that we forget that we're already working with mostly-deterministic symbolism in a process that more closely resembles invoking magic spells than writing machine code. One more level of abstraction is not the end of software engineering.
This argument comes up a lot. The point is that with unreviewed AI nobody understood the code at any time (including the AI). This is completely different to a C compiler wherein the writers and maintainers deeply understand the code. This means that even though I don't understand it, I can use it with some confidence.
Your point about AI being another abstraction similar to the "mostly deterministic" C compiler also comes up often but there are many arguments against it. If you think the determinism of a compiler and an AI are similar then I'm not sure whether you know anything about how either of them work or have even compared examples of what they produce.
To the extent that's true, it's already a problem plaguing the profession.
I wouldn't advocate for using different tools, but everyone should be able to reason about the machine instructions underlying their code. Both in the immediate sense of the assembly a simple function turns into, and the tricks language runtimes use to enable their neat features.
The attitude that things are magic is poison. There is a difference between feeling confident something is comprehensible and not yet needing to go learn it, vs resigning to a position of powerlessness.
I agree in principle, but every time I run a debugger on modern C++ it makes it clear that, rather than being a simple and cutesy transformation, "compiler optimization" is actually black magic.
I was (almost) just that guy for one PR. Removed something like 20% or more of the codebase by leveraging the libraries and external tools we already had in use better, but it meant almost every single thing we were doing had to use the library function instead of the one we wrote. But assuming you have good regression tests and linters, so you know the code works and it's not terrible, the review should be more about overall high level quality instead of poring over every character to check correctness. It was still a pain to review, though
You’re not an example of what we’re taking about here. Congratulations!
A better example would be if you’d changed the behavior of the library as you did this work, and the library changes introduced hard-to-detect bugs across the application.
Yes exactly. the GP isn't what we are talking about it and huge PR isn't what we are talking about either.
PR can be huge that's OK. For example, codebases that moved from Python 2 to Python 3 would have had huge PRs but the cognitive load was well understood.
Admirable effort. But why did you have to do it in one PR?
> almost every single thing we were doing had to use the library function instead of the one we wrote
Why not proceed file-by-file or directory by directory?
As per the other person's comment, yeah basically I could have broken it up but it would've been an arbitrary demarcation. I just deleted our functions and fixed everything that yelled. Admittedly that could've been one and then leveraging the libraries better could've been another, but they would've been 2 PRs that changed almost every line. So done as one to mitigate review time.
Arbitrary demarcations can still be valuable! Just because something is arbitrary doesn't mean that it's not helpful. Working in chunks will let you take more time to review each callsite individually, and increase your confidence in the changes
In the future, I would definitely encourage you to explore a more iterative solution—fix the first 50 occurrences first, or maybe all the occurrences of a handful of functions. For example, if you have utility functions A, B, C and D, maybe fix functions A and B first, and then C and D second.
Ultimately, at the end of the day it's going to depend on how much code you're touching. If you're only touching 100 library calls, then it's probably easy to do them in one PR. But if you're updating 1000 library calls, you'll need to take a more iterative approach. Building those skills now will serve you well in the future when working on bigger codebases and harder refactors.
Why not leave your functions but have them invoke the libraries instead?
It depends a bit. But it would now mean that there are multiple ways of doing the same. Call your internal function or call the library directly. You need to put up some linting around it that people only use your function or the library function.
Otherwise you may get that you have your function, you think everywhere is using it, you make it fix a bug. And poof, you introduced bugs at the other call sites.
I don't understand why one wouldn't just auto reject big PRs and tell them to make smaller ones. Sounds like it's a communication and social problem, not a technological one.
Even with AI, just tell it to make smaller self contained PRs. I do this with Claude or GPT models and they do just fine.
Power dynamics. Usually the person making the giant PRs is the one with all the sway. An earlier-career engineer is unlikely to push back against that level of influence.
It can be a company wide policy rather than trying to target a single individual even if the outcome is that they are targeted. This is something that should be addressed to them through a manager etc or if not, it's time to leave while they ruin the product over time.
This "you should leave" thing is a very boring and tired take and it should be said regularly that almost no engineer can afford it nowadays.
Beautiful theory, but only that.
PRs are all about power dynamics and (un)spoken deals…
If you rubberstamp some people‘s PRs all the time, you can then get them to greenlight your unpleasant PRs via pm instantly.
The other way round, retaliation: I once added some serious review notes to the PR of a very senior engineer because it was a dangerous topic. He would then spend the next months nitpicking every single PR I created. Had to post my PR in slack whenever he was not online to get them merged. After that I never seriously reviewed his PRs again. Too much of a headache.
Because sometimes in order to add or change a feature, huge changes need to be made all at once
> Even with AI, just tell it to make smaller self contained PRs.
Do you want one big PR or 100 small ones? You can't escape the sheer volume of code it's going to produce.
I want a set of smaller ones if it's practical.
100 small ones for sure. There could be a way to auto reject new PRs from an author if they have X open ones unreviewed.
Context is everything for massive PRs.
If you don't ever have a massive PR from a dynamite session, then you cannot ever be better than "average and plodding". So the question is, what's the context of the massive PR and how should it be handled?
* Mature product making money, intermediate engineer just refactored everything so it's "better"? Shut the fuck up, kindly please, you will have to demonstrate that you understand why things are this way and why it's better before we even have this conversation.
* Greenfield dev, trusted engineer getting from 0 -> 1 on something big? Maybe it shouldn't be held up in committee for 2 weeks. Maybe most objections will be superficial stylistic concerns.
Obviously there are many other contexts and these are 2 extremes in a multi-dimensional space. But if the process is "we litigate every line", then that's just not an innovative place to be. Yes, most PRs should be small, targeted, easy to review and tied to a ticket but if you're innovating? By definition it's a little different.
> you will have to demonstrate that you understand why things are this way and why it's better before we even have this conversation
I can fling that back to you: very often the team hates the conclusion I arrive it which "It worked during your initial crunch and then everyone is just afraid to change it, which means your test coverage is far from good -- why is it not enriched?"
I am not trying to be an arse on purpose but the inertia and cargo-culting and tribe-defending practices I've seen during my contracting years (10-11) made me almost physically sick. Programmers are a fiercely territorial bunch and it's often to the detriment of the organization.
Of course the reverse cases exist: where the domain is difficult and ugly hacks had to be done so the project works and makes money. Absolutely. I love receiving this knowledge and integrating it; makes for interesting engineering discussions.
> Greenfield dev, trusted engineer getting from 0 -> 1 on something big? Maybe it shouldn't be held up in committee for 2 weeks. Maybe most objections will be superficial stylistic concerns.
Yep, full agree. And often times these stylistic concerns are not even that; they are often "I suffered here at the beginning, this green-horn should suffer as well!" which is honestly pathetic and it also happens quite a lot.
> If you don't ever have a massive PR from a dynamite session, then you cannot ever be better than "average and plodding".
That's just cope to avoid learning how to turn a big change into a well organized patch series.
Without AI, both writing and reading code are bottlenecks.
How many times have you reviewed your old code and been appalled at the terrible quality? You personally created slop; it's no different from GenAI output except that a human had to spend precious time crafting it. You likely were indeed bottlenecked by your ability to churn out code that you just had to get to work, for one reason or another.
The real issue is in the asymmetry when one party can use automation to create more code than another party can possibly manually verify.
That guy is now running twenty agents in parallel and really scaling up his wonderful impact.
Maybe "Hurricane Hacker" who produces "tactical tornadoes" via agents?
With AI everybody gets to be “that guy” now.
Which goes on to prove that bottleneck isn't in writing the code. It is in reading and understanding the code.
So all we have to do is write code without reading or understanding it! Larry Wall was right all along!
Exactly! They should have set [your agentic AI toolkit could be here!] loose on these issues and 100x'd their output, all while actually shipping fixes to these issues instead of closing them. These Luddites are going to be left in the dust as AI is here to stay!
The reality is somewhere in the middle. Features are shipping 2x to 5x faster at a lot of organizations, with solid code still being produced and reviewed.
Anyone trying to suggest that AI hasn't sped up quality code production is just insisting on keeping their head in the sand, IMO.
They're just working at companies with mature products where people are in meetings all day -- they say so! Startups very much want to crank shit out faster.
Good time to mention this fantastic repo acting as a bot honeypot:
https://github.com/UnsafeLabs/Bounty-Hunters
The corresponding leaderboard:
https://clankers-leaderboard.pages.dev
I don't understand this. If that project is not offering a bug bounty, why are they getting so many PRs? What possible incentive is there to spend real money on tokens just to push junk PRs? Are the PRs spamming a product or something?
Why does every programming job application ask for your GitHub profile? The industry used open source contributions as a proxy for candidate quality, and this is Goodhart's law in action.
My mind absolutely doesn't bend that way but I'd suppose clout and popularity?
Maybe once the account has enough stars and reputation, the human behind it will use it to try to get an actual paying job.
Almost every time someone on HN asks how to increase their chances of employment, the response is to contribute to other people's Git* projects.
That's a great project!
It's likely to get blacklisted by AI bots, soon enough, though.
There's an AI bot blacklist? How do I get all my projects onto that?
Closing the program is totally reasonable. However, there is another option: Make submitters pay a nominal fee that is returned in the case that a real bug is found.
> Make submitters pay
Asking people to pay to submit bugs would start a firestorm of internet drama about asking people to do free work for the company and pay for the privilege. It doesn’t matter if the program actually paid out.
If they got even one report closed incorrectly we would never hear the end of it.
Unfortunately this isn't all black-and-white. There are some bug bounty where the company is very eager not to pay any bounty, aggressively marking vulnerabilities as out-of-scope or working-as-intended.
In those case you already lose time, but in the future you would also lose money.
Unfortunately you don't know how a company will react before submitting, especially if it's a small one.
It already doesn't stand on face value. These people are spending money to open PRs via their token costs
No, not them, but those who provide subsidies to AI labs, which is why such people spend almost nothing
Moving money is not free, and managing payments/etc can be a huuge headache. Sometimes it’s easy, but sometimes it’s not.
This is one of the cases where crypto works well.
There are many cryptocurrencies that allow anyone to move money quickly, cheaply, and on the same day in less than a minute and requires zero bank accounts.
At this point there isn't an excuse.
And which are trivial to convert back and forth between real money and cryptocurrency? And hold their value with sufficient stability that you can convert USD into the currency, make a transaction, wait a few weeks, make a transaction the other direction and then convert back into USD, with roughly no loss in value?
For this use case, that's a virtuous proof-of-work requirement.
That would add administrative overhead, and even higher incentive for submitters to endlessly argue they're right.
Price it right. At the right price, it pays for everything you are talking about. At an even higher price, it is basically closing the program.
I'm not trying to suggest they _need_ to implement it. Like I said, closing it is reasonable. Completely aside from any other considerations, one could just decide that they don't feel like dealing with it. But there are other options.
It sounds like the bug bounty requires the user to extend the simulator, to cover the type of bug they found. Maybe the they could require a full run of the simulator test suite before submission? This serves as a nice check (that they didn’t break the simulator), and maybe it could also produce some proof-of-work artifact as a side-effect… (is this possible? I don’t know security).
The problem with that approach is that it will also deter genuine submissions, probably moreso than a "no bounty" system.
For those who encounter bugs as part of their employment, they'd now need to convince their employer to fork over money up front. For most employers, getting them to spend even insignificant money is like pulling teeth.
But even for the self-employed or hobbyists, gambling real money on "are they going to be a jerk about my exploit report". No offense towards Turso, but the bulk of software firms are TERRIBLE about handling reports like that. Many already have unstated policies of screwing people out of deserved bug bounties at every step.
To submit such reports today already requires you to accept that your work is statistically, just going to be a bunch of free labour that you gave away for the betterment of the product's users. Adding a cash fee just further deters submissions, especially once people haven't gotten their money back a few times. (Consider how many "AI detection tools" are themselves incredibly unreliable machine learning or sometimes even LLM systems)
Easily exploitable without much stretch of a thought.
I'd say closing a program which doesn't work anymore is a better idea.
How so? These bot systems work on volume – there's no regard for how much reviewer time they gobble up. The idea is to make producing reports basically free, so getting 1 in 1000 positives is still a success if you have no regard for externalities.
If they have to pay for reviewer time for each of 1000 reports, then the scheme stops being viable.
The majority of the exploits I can think of are fixed by setting the correct price. Other suggestions in this thread of denominating in bitcoin fix the other exploitation: chargebacks.
If you can think of something that isn't solved by one of those two mechanisms, I'd be interested in hearing them enumerated.
Honestly I think this is a great idea. My only suggestion is instead of being very nominal, it should be "reasonable" (so $10 and not $1).
It's even possible to directly link this to maintainers/employees - if you can review 10 such AI/real things per hour (likely more if it's AI slop that's easy to detect), you're generating another revenue stream. Now, I have no idea if these guys are based in SF Bay or a 3rd world country with low COL but as an "add on", $100 an hour isn't too shabby (and can be on the "low end" if one's good at spotting AI crap.)
Side note, isn't it possible to have some way to verify if the "vulns" are actual vulns or not? ...Heck why not throw an LLM at it, powered by a single $10 submission fee?
Sounds like a startup idea to me! Admittedly, the friction and the fact that you have to pay would prevent a lot of legitimate people from participation which sucks.
AI is really throwing a wrench in the economics of software development, isn’t it?
cool idea
it seems we all will slowly learn to live within new contexts; i really appreciate their openness about it and it gives me insights to munch on thanks to you all also to ring in with dev-style annecdotes (i'm stilllearning everyday, and hope to continue for a long time): those big-prs and tactical tornadoes stories are helping keep the crafts and thinking afloat, somehow.
Possibly stupid question (this is outside my wheelhouse): is there any way a final full run of the simulator test cases (presumably required to make sure the submitted simulator changes don’t break the thing) could act as a proof-of-work?
Can't they just beat them at their own game and deploy their own AI bots to pre-screen the PRs?
from the article:
> It is possible to set up automated systems to gatekeep this, but with a non-negligible dollar value attached to it, the incentive is just too great for the AIs to just keep arguing, reopening the same PR, etc.
Or, make sure their program doesn't corrupt data so easily, so they don't have to pay out $1000 for every issue others find for them...
I wonder what Hacktoberfest would look like now if they were still giving out t-shirts to everyone. Probably not enough cotton in the world.
It can't be on individual maintainers to stop this, imo its on Github (and Gitlab) to stop these sort of accounts from even getting to the point of submitting PRs. Its essentially spam.
Look at the user who created the first PR they reference https://github.com/Samuelsills. This is not an account that should be allowed to do anything close to opening a PR against a well known repo.
An account with zero activity doing nothing shouldn't be allowed to continue doing nothing? Did you share the wrong account here maybe?
Its the account that posted this PR https://github.com/tursodatabase/turso/pull/6257
I could have swear it said "0 contributions in the last year" when I opened the profile before, and it showed no activity in the contribution graph, but now I check again and it shows a bunch of stuff... Guess some request failed last time or something, my bad.
An interesting "conundrum" (at least from my outsider perspective): how many of those bot requests are from agents that utilize Turso on their backends?
Has anyone used Turso in production? It's an SQLite compatible rewrite in Rust but with added features like multiple writer support and being open to external contributions which SQLite is not.
I was thinking of using it for my full stack Rust apps just so everything works with cargo and I don't have to bring in SQLite separately.
It's alright as a dropin sqlite replacement. I ran into a bunch of problems with libsql on windows a year or two ago when I tried it but I'd assume it's fixed now. They also offer turso db as service with a very generous free plan which was my main reason to try it.
Being a verifiable human identity (not as-in age verification or whatever) but as in having a known, public, reputation online will go a long way in this new slop-first world.
There is hardly a bright line between real and fake. An influencer is just a person who rents out their identity. Can you imagine getting a real PR from a human engineer you trust, but the description says "This pull request was sponsored by Skeezy Software Inc."?
We sorely need a way to reliably detect AI slop, but unfortunately it doesn't seem possible and it's just getting harder and harder.
Last month I tried my hand at finding a way to tell whether an OSS project is slop or not, based on the amount of "human attention" it received vs the amount of code it contains. The idea is that a 100k LOC project which received 3 days' worth of attention from a human is most certainly slop.
The approach doesn't work very well, though¹, mostly because it's hard to gauge the amount of attention that was given. If I see one commit with +3000 LOC, I can assume it's AI-generated, but maybe you're just the type of dev that commits infrequently.
Maybe we need some sort of "proof of human attention" for digital artifacts, that guarantees that a human spent X time working on it.
¹ I wrote about it here https://pscanf.com/s/352/
I suspect that it will be impossible, soon. People will just train LLMs to "act human," and pass the various turing tests we throw at them.
I stay pretty busy[0], and have been accused of "gaming" my GH repos.
That's not the case. I'm retired, experienced, and working on software all day, every day. I just don't get paid for it.
I also don't especially care, whether or not anyone thinks I'm a bot. I eat my own dogfood. Most of my work is on modules that I use in my own projects.
[0] https://github.com/ChrisMarshallNY#github-stuff
There's no reason to care that a human spent time on it.
Humans are bad at writing code. Garbage PRs and slop have been a problem in open source and bug bounty programs since long before AI came on the scene.
We need better AI so that there's no need to solicit external bug fixes, and better AI so other contributions can be evaluated for usefulness and quality.
What do you care if a human ever looked at it at all? It implies that humans are adding value to the process. It's possible for a human to add value. The right human can add tremendous value. But I'll take a completely autonomous AI over 99% of the human software engineers and 99% of the people contributing PRs and bugfixes.
It was hard to keep up with slop before. It's a lot harder now. AI will help weed through the garbage.
Reasonable logic but I bet you get downvoted
we automated finding bugs. then we automated submitting bugs. now we're automating rejecting submissions. at no point did anyone automate fixing the bugs.
Be careful for what you wish.
Someone automated rewriting Bun in Rust, allegedly fixing the bugs.
Fixing bugs has been automated by claude code and other tools for a while. We've merged a bunch of bug-fix PRs by claude, some of which were found by other bots.
Bots are using real tokens for this. So, ultimate honeypot idea: post heavily commented skeleton code in a github repo, promise a generous money reward for closing issues and never pay anyone. See the bots swarm and burn their tokens to write code for you.
:D https://github.com/UnsafeLabs/Bounty-Hunters
It's a bit odd that this comes today after so many other projects reverse this finding.
AI can find useful exploits but the highly publicized ones are among a sea of false positives and the successes I've read were found by people who were already experts. I can 100% see a public bug bounty program being inundated with garbage even if there are diamonds in the rough.
The weird thing is it can't be that economically feasible to burn a ton of tokens in the hopes that you might get a bounty.. seems like a great way to set money on fire.
i was just thinking about that. i don't think anyone is systematically making money with that. it's probably just people who set up openclawd or whatever and told it to solve bounties.
I'm sorry but I find the slop PR's hilarious.
>the author just injected garbage bytes manually into the database header, and then argued that this corrupted the database
>Steps to reproduce: Modified cli/main.rs to include a Vec with limited capacity. Forced a volatile write beyond the allocated bounds using std::ptr::write_volatile.
>author claims to have found a critical vulnerability that allows for the execution of arbitrary SQL statements. Imagine that? A SQL database that allows the execution of SQL statements. How can we ever recover from this.
I wonder why are they even doing this. Do any of these PRs ever win any money? It feels like they are burning down a forest thinking they'll find gold if they do it, without any evidence that there will be any gold after the forest is burnt down.
Definitely feels like we're heading towards an eternal september (or already arrived).
...large swaths of approaches on online engagement just becoming non-viable
Oh look it's more of exactly what AI skeptics said would happen: low effort bullshit generated at scale making life hell for people actually trying to make things. That's wild.
Edit: it is genuinely wild, I don't know of another product category that selects so perfectly for the WORST type of person to be it's enthusiast. Just every single person I see hyped about AI is fucking insufferable on at least one and usually multiple axis.
To be fair, you're not making a compelling case for your team either.
Sure if you close your eyes and ignore the evidence all around you...
Web3 is the closest analogue in recent memory, but if you go back further to the pre-enlightenment era (and some pockets of more recent history, particularly in isolated rural/colonial regions) you can see similar behaviors. It's mad religious fervor coupled with poor education. They see what their beliefs tell them they should see, and lack the mental rigor to analyze the actual data. Not their fault! It's our fault for letting them into the profession. Other disciplines are much better at keeping these folks outside the gates.
I think people would be more interested in listening to "AI skeptics" if they offered realistic solutions to the problems they predict. Pandora's box has been opened, let's deal with the consequences now instead of trying to shut the box which cannot be shut.
> I think people would be more interested in listening to "AI skeptics" if they offered realistic solutions to the problems they predict.
AI is the fucking problem. Yes, it has (some) uses. It is not nearly the number advertised. And more and more the median use case seems to be, again, overloading people actually trying to do work with an avalanche of bullshit.
The solution is exactly what the linked article says: shut it down. The AI people have ruined another good thing that was both beneficial to the project, and to a number of individuals.
> The solution is exactly what the linked article says: shut it down.
At this point it's impossible, so I concur with the parent: forget about the shutting it down and think of something actually realistic.
> forget about the shutting it down and think of something actually realistic.
Why is it not realistic? Small teams do excellent work. Keep your team small and trusted. Only accept contributions from your team, and people outside your team who are personally vouched for by someone on your team. It's like climbing mountains or sailing or any other type of inherently risky activity--you don't go out with people you don't trust. It's eminently possible, you just don't like the idea of it.
> It's eminently possible, you just don't like the idea of it.
Sounds like you can't accept AI is here to stay
That's not shutting anything down, that's just being selective with what you accept, and everyone did that already to some extent.
Even pre-AI it was obvious that contributions have to be vetted for a bunch of reasons.
Right, so the Github "open contributions" model where anyone can open an issue or a PR or otherwise waste a maintainer's time is broken. Fundamentally insecure under this type of attack. Now that the exploit is being used widely, and costing us immensely, we need to put a lid on it. If the only way to guarantee an AI bot (or its meatspace sock puppet) doesn't waste your time is to move to a "look but don't touch" model, then that's what we need to do. I think this would be a reasonable default:
Public repos are read only except for contributors who have been given specific permission, and those permissions are granular e.g. in order of increasing damage potential:
- comment on issue
- create issue
- comment on PR
- create PR
- run CI against PR
- etc.
In other words, shut it down.
I think I saw this on here yesterday: https://github.com/mitchellh/vouch
Not great for privacy or ad-hoc contributions, but I don't see a way out of the muck without some kind of trust net.
This response is incredibly annoying and insufferable. It's only "impossible" at this point because people continually ignored skeptics and anyone warning about exactly these outcomes.
Now that doom is here, it's too late to do anything about it. Just accept the doom!
My only objection is calling it doom. It isn't and calling it that gives this stupid shit used by low-effort people far, far too much credit. It's just slop.
But it does suck, you know? Part of what makes OSS so great is that anyone could contribute. If someone uses a thing, and finds something broken or a way to make it better, they could do that and then push it back up to the project and ideally have it merged so everyone can use it. That's what makes it awesome. The project benefits, the maintainer benefits, the coder benefits, the users benefit.
Now we have to stop that because lazy people can't stop shitting it up with generated PRs and trying to get money for not fucking doing anything.
The critics didn't do themselves any favors. Part think the Terminator has something useful to say on the subject, part invent contrived scenarios like self-driving cars having to resolve trolley problems. Reality turned out to be much more boring.
But yes, what you said but unironically. Like it or not it's here, it's not going away, so all the remaining options have to assume that.
> The critics didn't do themselves any favors. Part think the Terminator has something useful to say on the subject, part invent contrived scenarios like self-driving cars having to resolve trolley problems. Reality turned out to be much more boring.
You do very well in battles against straw men.
I guarantee you AI will not end the world
> Now that doom is here, it's too late to do anything about it. Just accept the doom!
What doom? This is a mildly annoying problem that will likely be self correcting long term.
What an unreasonably maximalist opinion.
for every person that's hyping AI there are another 10 just using it to get stuff done without talking about it incessantly
Yeah, those 10 people are getting really useful stuff done, like submitting nonsense bugs to Turso in an attempt to get bug bounties.
Isn't there some alternative approach? I.e when someone submit ai slop they get a strike. Three strikes and you are suspended from submitting to the bug bounty for x months/years?
*Edit - I get it. It seems like the authentication is a challenge.
https://en.wikipedia.org/wiki/Sybil_attack
New identities are cheap.
I think that's the problem, or at least a problem, and a growing one.
How about "It costs $1000 to submit a bug bounty for approval", and raise the reward to $2000 (or $5000 if it's in the cards, since that will have a deterrant impact on non-AI responses).
Denominated in BTC to avoid chargebacks etc.
I think that's entirely sensible. Doesn't even have to be that expensive, just expensive enough to deter people who go "oooh, free money", and expensive enough to compensate for having to review slop far enough to realize it's slop.
Wouldn't be surprised if a dollar per entry already made a whole lot of difference.
you still need to spend effort reviewing the code to figure out when you can give a strike. Thrice for an actual ban. This would still waste precious maintainer time.
They mentioned they had identified alternatives but it would be costly to implement them. One can imagine that ban evading by generating a new user account would be easy for an LLM agent. It's going to be a long, long game if whack-a-mole.
Such a person can just make a new account and go back at it
This probably gets solved outside of the level of an individual project. No small team can handle this without building a whole product just to handle the bug bounty.