I think AI rescue consulting is going to become a significant mode of high-value consulting, similar to specialists who come in to deal with a security breach or do data recovery.
Purely AI-written systems will scale to a point of complexity that no human can ever understand; the defect close rate will taper off, the token burn per defect will scale up, and eventually AI changes will cause, on average, more defects than they close, leaving the whole system unstable. It will become a special kind of process to clean-room out such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.
Somewhere in the future, the new software engineering will be primarily about principles for avoiding this in the first place, but it will take us 20 years to learn them, just like the original software engineering took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).
> Purely AI-written systems will scale to a point of complexity that no human can ever understand; the defect close rate will taper off, the token burn per defect will scale up, and eventually AI changes will cause, on average, more defects than they close, leaving the whole system unstable.
Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
A non-technical friend of mine just won some hospital contracts after vibecoding an inventory management solution for them w/ Claude. They gave him access to the IT dept's servers, and he called me, extremely lost on how to deploy (he can't connect Claude to them) and also frustrated because the app has some sort of interesting data/state issues.
What concerns me about this is that as these stories multiply and circulate, people will just completely stop buying software/SaaS from startups, because 90% or more of it will be this same thing. It will completely kill the market.
Or you end up with a certification process, which will of course introduce its own problems, but startups doing things the right way, and not just "moving fast and breaking things", can thrive.
This hospital will learn some hard lessons. I hope their backup strategy is good. I'm surprised they can field software from an entity that isn't SOC 2 certified and HIPAA compliant.
This might not pan out as the glorious victory of human craft as you’re imagining it to be.
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
I have already experienced Claude 4.7 handling pretty complex refactors without issues. Scale and correctness aren't even 1% of the issue they were last year. You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
Frankly this is what everyone is counting on whether they know it or not. The question though is not “will the models get good enough?”. The question is does the repo even contain enough accurate information content to determine what the system is even supposed to be doing.
> Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first...
It's really nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.
Producing code is becoming kind of like farming.
We didn't create the DNA we rely on to produce food and lumber; we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine, honorable, and valuable function for society, but I have no interest in being a farmer. I build things; I don't plant seeds, pray to the gods, and hope they grow into something I want.
Prayers are for weather. Pretty much all farmed plant, animal, and fungus species have been selectively bred or genetically modified. Farmers know what's going to grow.
I'm pretty sure he's talking about companies and people outsourcing their decision making and thinking to AI and not really about using AI itself.
I don't think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe whatever it tells you, then you have AI psychosis. You see this a lot with finance people and VCs on Twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about a topic instead of doing even a little bit of thinking themselves.
These things are dog shit when it comes to ideas, thinking, or providing advice, because they are pattern matchers: they are just going to give you the pattern they see. Most people notice this if they just try to talk to one about an idea. It often just spits out the most generic dog shit.
They are, however, pretty useful for certain tasks where pattern matching is actually beneficial, like writing code, but again, you just can't let them do the thinking and decision making.
Correct. I use AI a ton and I'm having more fun every day than I ever did before thanks to it (on average, highs are higher, lows are lower). Your characterization is all very accurate. Thank you.
I think it's quite a different experience going all Jackson Pollock with AI in your own studio on your own terms, compared to the sorry state of affairs of having hundreds of Pollocks throwing paint around wildly within a corp to meet a paint quota.
The way I put this to myself is that AI gives “correct correct answers and incorrect correct answers”.
They almost always generate logically correct text, but sometimes that text has a set of incorrect implicit assumptions and decisions that may not be valid for the use case.
Generating a correct correct solution requires proper definition of the problem, which is arguably more challenging than creating the solution.
It’s simpler than that - it’s a guessing machine that has superior access to a whole load of information and capacity to process at a speed at which we humans cannot compete.
Does it make it better than us? No because ultimately the thing itself doesn’t ‘know’ right from wrong.
Several people I know have already gone through phases like this. When someone's doing it alone, there's a moderating factor: their friends and family start calling them out on their behavior or the weird things they say.
I can't imagine how bad it would be if your employer started pushing this from the leadership level. You'd be pressured to get on board or fear getting fired. Nobody would be trying to moderate your thinking except the coworkers who disagree with it, and those people are going to leave or be fired. If you want to keep your job, you have to play along.
> if you just prompt the AI and believe whatever it tells you, then you have AI psychosis
This is the right definition. LLM outputs have undefined truth value. They’re mechanized Frankfurtian bullshitters. Which can be valuable! If you have the tools or taste to filter the things that happen to be true from the rest of the dross.
However! We need a nicer word for it. Suggesting someone has “AI psychosis” feels a bit too impolitic.
Maybe we reclaim “toked out” from our misspent youths?
e.g. “This piece feels a little toked out. Let’s verify a few of Claude’s claims”
I wouldn’t say they have an undefined truth value. Their source of truth is their training data. The problem is that human text is not tightly coupled to the capital T truth.
He uses AI himself, so I agree he doesn't see AI use as black/white.
Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.
Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it.
Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed.
Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind.
Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around.
For drafting, brainstorming, or casual questions, ease off and match the task.
---
Beware, though, that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not foolproof, but they help (at the start of the conversation, at least).
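For anyone wiring instructions like these into the API rather than the chat UI, here's a minimal sketch of packaging them as a system prompt, where they carry more weight than an ordinary user turn. The model id and the instruction wording here are illustrative assumptions, not the exact prompt quoted above; the payload shape follows the Anthropic-style Messages API, where system instructions go in a top-level `system` field rather than in the `messages` list.

```python
# Sketch: packaging anti-sycophancy guard instructions as a system prompt.
# Model id and instruction text are illustrative assumptions.

ANTI_SYCOPHANCY = "\n".join([
    "Treat my claims as hypotheses, not decisions.",
    "Before agreeing with a proposed change, state the strongest case against it.",
    "Mark confidence explicitly: guessing / fairly sure / well-established.",
])

def build_request(user_message: str) -> dict:
    """Assemble a chat request with the guard instructions in the system slot."""
    return {
        "model": "claude-opus-4",  # hypothetical model id
        "max_tokens": 1024,
        "system": ANTI_SYCOPHANCY,  # system prompt, separate from user turns
        "messages": [{"role": "user", "content": user_message}],
    }

req = build_request("Should we rewrite the service in Rust?")
```

Note that this doesn't solve the decay the commenter describes: as the conversation grows, the system prompt is a shrinking fraction of the context, so re-asserting the instructions periodically is still on you.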
While you have to think about things objectively no matter what, using AI as suggested in that article has proven very useful when I start researching topics like physics.
> companies and people outsourcing their decision making and thinking to AI
It's so interesting how easy it is to steer LLMs, based on context, into arriving at whatever conclusion you engineer out of them. They really are like improv actors, and the first rule of improv is "yes, and".
So part of the psychosis is when these people unknowingly steer their LLM into their own conclusions and biases, and then they get magnified and solidified. It's gonna end in disaster.
It’s almost as if we haven’t learned anything from Clever Hans the horse, Ouija boards, "facilitated communication", or the countless examples of the folly of surrounding yourself with yes-men. The point about improv is spot on.
I didn’t think just offloading your thinking to AI was AI psychosis.
To me AI psychosis is the handful of friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover, the one guy who won’t speak to his family directly but has them talk to ChatGPT first and then has ChatGPT generate his response, or the two who are confident that they have discovered that physics and mathematics are incorrect and have discovered the truth of reality through their conversations with the models.
But language is a shared technology so maybe the term is being used for less egregious behavior than I was using it for.
> friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover
I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
The fact that they were hurt by that sudden loss is totally healthy. It's just part of moving on. The real problem was getting into an unhealthy relationship with a fictitious partner under the control of an abusive company willing to exploit their loneliness in exchange for money.
Hopefully they now know better, but people (especially desperate ones) make poor choices all the time to get what's missing in their lives or to distract themselves from it.
> I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn the loss of that?
Ah, I forgot about the AI relationship companies. No, this guy was using browser-based ChatGPT for coding and ended up in love with the model. No relationship was sold at all.
This post calls out how you can't argue with these people, because they say "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't!"
The top reply is from someone doing exactly that, arguing "but the agents are so fast!"
Yeah: If the tools aren't good enough and fast enough to fix the bugs before release, what makes anyone think they'll be able to so easily catch up afterwards?
Maybe they're assuming that doubling the code-base/features is more beneficial versus the damage from doubling the number of bugs... Well, at least for this quarter's news to investors...
> It's game theory. Someone will do it, and you'll be forced to do it, too.
You'll be forced to do it, or lose. The unstated assumptions are that, first, it will work, and second, that you can't afford to lose. But let's just assume those for the sake of argument.
> It can't be that bad
That does not follow at all. It can in fact be that bad. That was what made the game theory of MAD different from the game theory of most other things.
> The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
Oof. Potential "bad" outcomes of "game theory" should be calibrated to include all the bloody wars and genocides throughout recorded history.
Why did the Foi-ites kill every man, woman and child of the conquered Bar-ite city? Because if they didn't, then they'd be at a disadvantage if the Bar-ites didn't reciprocate in the cities they conquered...
Yes, I was never so happy to work in Germany. People used to joke about the proverbial fax machine still being a thing, but I've never been so glad to work in a culture where this mania doesn't exist. Reading HN is like entering Alice's Wonderland of token maxxers and AI psychotics. I genuinely don't know a single person here who is forced to work like this.
It is absolutely going to be a competitive advantage if it isn't already. When your competitors' products suck because they are using LLMs to write them, and yours work because you aren't, customers notice.
The AI psychosis is not the anti-opinion to the use of AI.
I use AI coding tools every day, but AI tools have no concept of the future.
We've relied on engineers' selfish thinking ("If this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") to build stable systems.
And on the general laziness of looking for a perfect library on CPAN so I don't have to do the work myself (often taking longer to not find a library than it would to write it by hand).
I've written thousands of lines of code with AI tools that ended up in prod, and mostly it feels natural, because since 2017 I've been telling people to write code instead of typing it all on my own, and setting up pitfalls to catch bad code in testing.
But one thing it doesn't do is "write less code"[1].
Bug reports also go down when people lose faith that they will be fixed, because reporting them is often a substantial time commitment. You see it happen pretty regularly as trust in a group/company collapses.
Add to this the real possibility that a significant part of the reports that get filed might be AI-generated or AI-rewritten, with a high likelihood of being misreported because of that, or of having incorrect parts... So it's an attack on multiple sides.
And we haven't even gotten into potential adversarial tactics. If you have no morals, what's better than using agents to flood your competitor with fake bug reports?
Just let AI filter out the fake reports! Then let AI work on the real ones. See, there's really no problem "more AI" can't solve (as long as you're willing to ignore all of the underlying ones). "Pay us to create the problems you'll have to pay us to fix for you" is one hell of a business model. It basically prints money.
I agree, and I'd like to point out that this problem isn't unique to AI driven projects. I think much, if not all, of what Mitchell has been observing can readily happen without AI in the mix.
I'm starting to long for the age after AI. When the generative euphoria has settled and all outputs are formally verified based on exquisite architectures and standards.
They are expressing the idea that AI is so effective that it will make human work redundant, necessitating a decoupling of resource allocation from work performed.
Because of the concerns you cite, I think working out the basic economic systems and incentives for paying people is a much more pressing concern than building magnificent machinery that we don't even own. There has been no effort on their end to demonstrate good faith nor to uphold their end of the social contract, which is why it's in our hands to demand the fundamentals to lead a life of dignity.
Most CEOs in my feed are convinced that AI makes people the equivalent of entire departments. AI should make your life easier, but instead it’s the opposite for a lot of people in the work force, which makes me really sad.
"Just use autoresearch and it will fix your app's memory leaks in an hour" is what I was nonchalantly told by someone who has never written a line of code ever.
I guess what I relate to the most is how dismissive people get about real software engineering work.
I may have skill issues, but I have yet to reach the level of autonomous engineering people tend to expect out of AI these days.
This is a critical communications issue that is becoming, I believe, the defining characteristic of "This Age": nobody knows how to discuss disagreement, and because it cannot even be discussed, communication ends, followed by blind obedience, forced bullying, retreat, and abandonment. This is going to be a hell of a ride, because nobody can really discuss the situation in a rational tone.
I don't doubt there are companies totally misusing coding agents and LLMs in production. There are also real companies with real revenue and solid architecture using LLMs to deliver products. There are also companies with real revenue and rapidly accumulating tech debt.
Eventually the companies that can't cope with undisciplined engineering will succumb to unacceptable reliability and be outcompeted, just like in the "move fast and break things" era.
Mitchellh is on to something. Some of the AI products I've seen seem like psychotic, hallucinatory fever dreams, using terms and concepts that have no meaning. Funding? $50,000,000 pre-seed.
at least at my BigCo, AI is being used for everything - writing slop, writing tests, code reviews, etc.
it would make sense to use AI for writing code, but human code review. or, human code, but AI test cases... or whatever combination of cross-checking, trust-but-verify, human in the loop, etc. people prefer.
i think once it gets used for everything, people have lost the plot, it's the inmates running the asylum.
I was rewatching Rich Hickey's "Simple Made Easy" talk (as one does) and there was a great line about full test coverage.
"What's true about all bugs in production? (pause for dramatic effect) They all passed the tests!" (well, he said typechecker but I think the point stands)
Deprecating immature workflows (LLM agents, in this case) is much simpler and faster than building them from scratch. Many companies get this rushed assessment right. It's the case where being wrong is much more costly than being right.
Sounds pretty accurate. A bunch of comments on this thread sound like AI is some kind of new doomsday cult. The most annoying thing I find personally is that all engineering principles are getting crushed by non-techies. Management counting token usage, forcing agent use, reducing headcount in the name of productivity gains. Devs building bridges, but nobody knows what the bridge is, what standards it was built to, how it works, or how to maintain it. VCs counting extra money, claiming chasing the holy profit is the future. The abundance of engineering apathy is disturbing.
I shut down AI Agent fanatics on the regular. But chop one head off there and two take its place. And I say that as someone working with Claude and Codex daily. While they are both incredibly good at clearly described and defined atomic tasks, application scope makes them lose their minds and the slop ensues.
First DEI, then COVID, then Ukraine, then AI. The US always needs its three-to-five-year mass psychosis, and then it moves on to the next shiny object. Many people and corporations get rich in each cycle.
AI exacerbates the problem since vulnerable tech people develop individual AI psychosis and participate in the mass psychosis.
Companies have figured out that no other population group is as gullible as tech people (they were instrumental in pushing all of the above four issues), so they exploit it again and again.
I think you're mixing up "psychosis" with fads, trends, or excuses to do layoffs.
A feature of psychosis is being unable to distinguish between external ideas and internal ones. For example, if a brown-nosing Yes-Man machine keeps reflecting your own leading questions back at you as if they were confirmed independent wisdom.
In contrast, I'm pretty sure COVID and the invasion of Ukraine are actual external phenomena that affect businesses and economies.
The lists of who's, what's, why's, and when's always change but when the decades pass it's never one narrow type of people or the "not me's" which are gullible - it's just human nature + regional timing. The targeted groups are the only ones who are really easy to break out.
I have a ton of respect for Mitchell - I didn't really know who he was until Ghostty but his writings and viewpoints on AI seem really grounded and make the most sense to me. Including this one.
Many people on this forum are suffering under this same psychosis.
Mitchell aches because his career has been solving broadly scoped problems by building a collection of thoughtful primitives for others to extend. LLMs seem to do the opposite but at great speed, and it hurts to watch.
Reading more, it seems part of his point is “if you’re making these primitives, it’s up to adopters to deploy, so mean-time-to-recovery isn’t that relevant.” Which is valid I guess.
But equally, like, do people need Terraform if they can just tell codex “put it live”, and does that hurt to see?
Codex is freakin' hot-to-trot to churn out test coverage for every single thing it implements, and some of it is very esoteric and highly prescriptive (regexes for days). BUT... after a while, it dawned on me that LLM-driven test coverage is less about proving "code correctness" (you're better off writing those tests yourself alongside them) and more about just trying to ensure that whatever gets bolted on stays bolted on. For better or worse, obviously, since if you bolt on trash, trash you shall have.
Wholeheartedly agree, but in fairness, I trust the tests of the best AI models more than those of the average human developer. There are a lot of people around who combine high diligence with complete intellectual laziness, producing tons of useless tests.
Actually no, cancel that. I realise now that I trust AIs more than the average developer, period. At this point they do produce better code than most people I've dealt with.
Anyone who's taken VC funding has no choice. More money has been spent on AI commercialization than the atomic bomb, the US interstate build-out, the ISS and the Apollo program combined. Failure is going to be catastrophic and therefore, one tied to this ship cannot accept a world in which it fails.
Or anyone who even wants VC funding. 90+% of investors only want to invest in AI companies.
If you're not doing AI there's an incredibly limited pool of people who will give you $$$ ... and you're competing with EVERY OTHER NON-AI COMPANY for their attention.
I'm going through a mixed experience regarding this, personally.
Management is really pushing AI. It's obnoxious, and their idea of how it fits into my team's job specifically is completely, hilariously detached from reality. On the off chance someone says something reasonable, unless it fits the mold, it's immediately discarded. The mold being "spec-driven development". We're not even a product team, for crying out loud. I straight up started skipping these meetings for the sake of my sanity. It's mindwash, and it's genuinely dizzying. The other reason I stopped attending is that it ironically makes me more disinterested in AI, which I consider to be against my personal interests in the long run.
On the flipside, I love using Claude (in moderation). It keeps pulling off several very nice things, some of which Mitchell touched on in this post (the last one):
- I write scripts and automation from time to time; Claude fleshes them out way better with way more safety features, feature flags, and logging than I'd otherwise have capacity to spend time on
- Claude catches missed refactors and preexisting defects, and does a generally solid pass checking for defects as a whole
- Claude routinely helps with doing things I'd basically never be able to justify spending time on. Yesterday, I one-shotted an entire utility application with a GUI to boot, and it worked first try; I was beyond impressed.
- Claude helped me and a colleague do some partisan cross-team investigation in secret. We're migrating <thing> and we were evaluating <differences>. There were a lot of them. Management was in limbo, unsure what to do, flip-flopping between bad options. In a desperate moment, I figured, hey, we kinda have a thing now for investigating an inhuman amount of stuff in detail, so I put together a care package for my colleague with all our code, a bunch of context, a capture of all the input data for the past week, and all the logs generated. My colleague put his team's side of the story next to it and, with the help of Claude, did some extremely nice cross-functional investigation. Over the course of a few weeks, he was able to confirm like a dozen showstopper bugs, many of which would have been absolutely fiendish if not impossible to fix (or even catch) if we had gone live without knowing about them. One even culminated in a whole-ass re-architecting of the solution. We essentially tore down a silo wall with Claude's help in doing this.
So ultimately, it really is a mixed bag, with some really deep low points and some really nice highlights. I also just generally find it weird that a technical tool [category] is being pushed down people's throats with technical reasoning, but by management. One would think this goes bottom-up, or is at least a lot more exploratory. The frenzy is real.
Assuming he’s right, I don’t see how that constitutes “psychosis”, as opposed to being yet another of a billion examples of companies jumping on a bandwagon / cargo cult and then learning they took it too far.
And also, he might not be right. But the good news is, we’ll all get to find out together!
This doesn’t constitute AI psychosis. His argument is that we need to retain understanding of the systems we use, but there’s no compelling argument as to why that is the case. (I get that people are going to be offended by that statement, but agents are already better than the average software engineer. I don’t see why we need to fight this, except for economic insecurity caused by mass layoffs.)
It all just feels like horse drawn carriage operators trying to convince automobile drivers to stop driving.
If you want to draw that line of argument, it's more like horse riders being convinced to give up their horses in favour of trains: you're travelling faster, you don't have to navigate yourself or think about every boulder on the way; but there are destinations you can't reach, overcrowded trains slow down the journey, ticket prices are hefty, and instead of enjoying the freedom, you're reduced to a passive passenger.
Very funny, this. Did we need forward-deployed engineers to convince people that they absolutely needed to use trains in order to "not be left behind"? Or other hype? Or was it sort of obvious, not needing to be explained so much, unlike a bad joke called LLMs?
Actually, absolutely! Initially, people were really afraid of trains, fearing they wouldn’t be able to breathe at those speeds. It took a lot of convincing to establish trust in the technology.
I am sure you will feel that this is missing the point of your analogy, but we would not have gotten very far with automobiles if we didn't know how they worked.
You are breaking the analogy, because automobiles are machines for transportation, and understanding them is important to make them move. LLMs are machines for understanding, and, well, if they do the understanding, you don't need to.
The thing we're worried about not understanding here is the software the LLMs write, not the LLMs themselves.
The direct analogy to automobiles would be for each automobile to be a one-off design filled with bad and bizarre decisions, excessively redundant parts, insane routing of wires, lines, ducts, etc., generally poor serviceability, and so on. IMO the big question going forward is whether the consistent availability of LLMs can render these kinds of post-delivery issues moot (they will reliably [catch and] fix problems in the software they wrote before any real damage is caused), or whether human reliance on LLMs and abdication of understanding will just make software worse, because LLMs' ability to fix their own mistakes, and the consequences thereof, generally breaks down in the same contexts/complexities where they made those mistakes in the first place.
My own observations are that moderately complex software written in the mode of "vibe coding" or "agentic engineering" tends to regress to barely-functional dogshit as features are piled on, and that once this state is reached, the teams behind it are unable to, or perhaps simply uninterested in, unfuck[ing] it. I have stopped using software that has gone down this path, not because I have some philosophical objection to it, but because it has become _literally unusable_. But you will certainly not catch me claiming to know what the future holds.
We're definitely in the mess around phase of AI adoption.
I don't think it's super clear what we'll find out.
We've all built the moat of our careers out of our expertise.
It is also very possible that expertise will be rendered significantly less valuable as the models improve.
Nobody ever cared what the code looked like. They only ever cared if it solved their problem and it was bug free. Maybe everything falls apart, or maybe AI agents ship code that's good enough.
Given the state of the industry, we're clearly going to find out one way or the other, hah!
"it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't!"
Hmm, I agree with the point OP is making, but I'm not so sure this is the best supporting argument.
The bottleneck is finding the bugs, and if he'd criticized people saying AI will be the panacea for that, I'd be with him; but people saying agents are fast and good at fixing human-found bugs is nothing I'd object to.
Agents are fixing bugs so quickly and at a scale humans can't do already.
The tweet is criticizing over-reliance on the "agents will fix it anyway".
The fact that we can fix things faster now doesn't mean that we should throw away caution and prevention. The specific point of his tweet is that we're seeing a lot of people starting to skip proper release engineering.
Agents are quick to fix bugs, yes, but it doesn't mean that users will tolerate software that gets completely broken after each new feature is introduced and takes a certain number of days to heal each time.
You got downvoted for speaking the truth. HN has a strong anti-AI contingent. They won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this codebase”. We’re not there yet, but soon we will be. Then what?
More likely people thought GP was missing the point; "MTTR-optimized YOLO deployment" only succeeds against recoverable errors that are detected quickly, with acceptable periods of downtime. You could have a bug silently corrupting data for months, and that data may only be used by one critical process that runs once every quarter. So you could introduce a timebomb that can't be gracefully recovered from (depending on the nature of the data corruption).
So the point is not that agents cannot find bugs (they certainly can), it's whether you can shirk reviewing for bugs if MTTR is fast enough. There are circumstances where YOLO is appropriate, but they aren't the production environment of a mature application.
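A minimal sketch of the timebomb scenario described above, with all names and numbers invented for illustration: every write "succeeds", so nothing alerts, and the corruption only becomes visible when the quarterly reader finally runs.

```python
# Hypothetical sketch (names and figures made up) of a bug that silently
# corrupts data for months before any process depends on it.

ledger = []

def record_daily_total(cents):
    # Bug: divides by 100 twice, so $100.00 is stored as 1.0 "dollars".
    # Each write succeeds, so no alert fires -- classic silent corruption.
    ledger.append(cents / 100 / 100)

def quarterly_reconciliation():
    # The only reader of this data runs once a quarter; by the time the
    # corruption is visible, ~90 days of bad writes are already committed.
    return sum(ledger)

for _ in range(90):
    record_daily_total(10_000)  # $100.00 per day

print(quarterly_reconciliation())  # expected 9000.0, actually prints 90.0
```

Fast MTTR doesn't help here: the fix is quick, but the 90 days of corrupted writes may not be gracefully recoverable.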
I don't think I missed the point, that is why I said I agree with the general point (and with what you said in your comment).
What I wanted to say is that the particular people that think "its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" are not the best argument for it.
But I won't die on this hill; maybe I'm just reading the sentence differently than others.
I think there is an implication in context that the people being discussed aren't being reasonable (that the claim is employed as a rationalization), but I agree with your take. I should've said, "the downvotes were more likely because GP was perceived as missing the point". (I didn't downvote your comment fwiw.)
> won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this
But this is just holding the Slop Companies to the standard they declared for themselves! Just recently, the CEO of OpenAI babbled some nonsense on Twitter about how he hands tasks over to Codex, which, according to him, finishes them flawlessly while he plays with his kid outside.
> but soon we will be.
Ah yes, in the 3-6 months, right? This time next year Rodney, we'll be millionaires!
I think AI rescue consulting is going to become a significant mode of high-value consulting, similar to specialists who come in to try to deal with a security breach or do data recovery.
Purely AI-written systems will scale to a point of complexity that no human can ever understand; the defect close rate will taper off, the token burn per defect will climb, and eventually AI changes will cause more defects on average than they close, leaving the whole system unstable. It will take a special kind of process to clean-room out such a mess and rebuild it fresh (probably still with AI) after distilling out the core design principles needed to avoid catastrophic breakdown.
Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first place, but it will take us 20 years to learn them, just as the original software engineering took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).
> Purely AI-written systems will scale to a point of complexity that no human can ever understand; the defect close rate will taper off, the token burn per defect will climb, and eventually AI changes will cause more defects on average than they close, leaving the whole system unstable.
Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
A non-technical friend of mine has just won some hospital contracts after vibecoding an inventory management solution for them with Claude. They gave him access to the IT dept's servers, and he called me, extremely lost on how to deploy (he can't connect Claude to them) and also frustrated because the app has some sort of interesting data/state issues.
What concerns me about this is that as these stories multiply and circulate, people will just completely stop buying software/SaaS from startups, because 90% or more of it will be this same thing. It will completely kill the market.
Or you end up with a certification process, which will of course introduce its own problems, but startups doing things the right way, and not just "moving fast and breaking things", can thrive.
This hospital will learn some hard lessons. I hope their backup strategy is good. I'm surprised they can field software from an entity that isn't SOC2 & HIPAA certified.
jfc lmao
This might not pan out as the glorious victory of human craft as you’re imagining it to be.
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
I have already seen Claude 4.7 handle pretty complex refactors without issues. Scale and correctness aren't even 1% of the issue they were last year. You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
Reminds me of the quote in the original Westworld movie:
“These are highly complicated pieces of equipment… almost as complicated as living organisms.
In some cases, they’ve been designed by other computers.
We don’t know exactly how they work.”
Now how did that work out ;-)
However Michael Crichton imagined it would.
I guess that “well” wouldn’t have sold many books.
[delayed]
As the models keep improving, wouldn’t you be able to task a newer AI to “clean up this mess”?
Frankly, this is what everyone is counting on, whether they know it or not. The question, though, is not "will the models get good enough?". The question is: does the repo even contain enough accurate information to determine what the system is even supposed to be doing?
Ai runs `rm -rf`
Beyond the Singularity, we reach the Nullarity.
How could anyone answer that with any level of certainty?
> Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first...
It's really nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.
AI janitors
Not janitors. Hazmat cleanup crews.
It's kind of like producing code is becoming more like farming.
We didn't create the DNA we rely on to produce food and lumber; we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine, honorable, and valuable function for society, but I have no interest in being a farmer. I build things; I don't plant seeds and pray to the gods and hope they grow into something I want.
Prayers are for weather. Pretty much all farmed plant, animal, and fungus species have been selectively bred or genetically modified. Farmers know what's going to grow.
I'm pretty sure he's talking about companies and people outsourcing their decision making and thinking to AI and not really about using AI itself.
I don't think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe whatever it tells you, then you have AI psychosis. You see this a lot with finance people and VCs on Twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about a topic instead of doing even a little bit of thinking themselves.
These things are dog shit when it comes to ideas, thinking, or providing advice, because they are pattern matchers: they are just going to give you the pattern they see. Most people notice this if they just try to talk to one about an idea. It often just spits out the most generic dog shit.
This is, however, pretty useful for certain tasks where pattern matching is actually beneficial, like writing code, but again you just can't let it do the thinking and decision making.
Correct. I use AI a ton and I'm having more fun every day than I ever did before thanks to it (on average, highs are higher, lows are lower). Your characterization is all very accurate. Thank you.
Here's some other topics I've written on it:
- https://mitchellh.com/writing/my-ai-adoption-journey
- https://mitchellh.com/writing/building-block-economy
- https://mitchellh.com/writing/simdutf-no-libcxx (complex change thanks to AI, shows how I approach it rationally)
I think it's quite a different experience going all Jackson Pollock with AI in your own studio on your own terms, compared to the sorry state of affairs of having hundreds of Pollocks throwing paint around wildly within a corp to meet a paint quota.
I very much like this metaphor.
[delayed]
The way I put this to myself is that AI gives “correct correct answers and incorrect correct answers”.
They almost always generate logically correct text, but sometimes that text has a set of incorrect implicit assumptions and decisions that may not be valid for the use case.
Generating a correct correct solution requires proper definition of the problem, which is arguably more challenging than creating the solution.
It’s simpler than that: it’s a guessing machine with superior access to a whole load of information and the capacity to process it at a speed with which we humans cannot compete.
Does it make it better than us? No because ultimately the thing itself doesn’t ‘know’ right from wrong.
Several people I know have already gone through phases like this. When you're doing it alone there is a moderating factor when their friends and family start calling them out on their behavior or weird things they say.
I can't imagine how bad it would be if your employer started doing this from the leadership. You'd be pressured to get on board or fear getting fired. Nobody would be trying to moderate your thinking except your coworkers who disagree with it, but those people are going to leave or be fired. If you want to keep your job, you have to play along.
I wonder how different this is from having companies let Fortune or Inc magazine do their thinking for them.
Or random consultants.
Is "AI said it was a good idea" and worse than "we were following industry trends"?
> if you just prompt the AI and believe what it tell you then you have AI psychosis
This is the right definition. LLM outputs have undefined truth value. They’re mechanized Frankfurtian Bullshiters. Which can be valuable! If you have the tools or taste to filter the things that happen to be true from the rest of the dross.
However! We need a nicer word for it. Suggesting someone has “AI psychosis” feels a bit too impolitic.
Maybe we reclaim “toked out” from our misspent youths?
e.g. “This piece feels a little toked out. Let’s verify a few of Claude’s claims”
I wouldn’t say they have an undefined truth value. Their source of truth is their training data. The problem is that human text is not tightly coupled to the capital T truth.
He uses AI himself, so I agree he doesn't see AI use as black/white.
Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.
share the prompt!
https://claude.ai/settings/general (Instructions for Claude)
---
Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it. Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed. Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind. Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around. For drafting, brainstorming, or casual questions, ease off and match the task.
---
Beware though that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not fool-proof, but they help (at the start of the conversation, at least).
For a start, invert - ask about the exact opposite in a separate session.
I digress; this article actually has helped identify useful knowledge gaps around topics I have researched. https://drensin.medium.com/elephants-goldfish-and-the-new-go...
While you have to think about things objectively no matter what, when I start researching topics like physics, using AI as suggested in that article has proven very useful.
> companies and people outsourcing their decision making and thinking to AI
It's so interesting how easy it is to steer the LLM's based on context to arriving at whatever conclusion you engineer out of it. They really are like improv actors, and the first rule of improv is "yes, and".
So part of the psychosis is when these people unknowingly steer their LLM into their own conclusions and biases, and then they get magnified and solidified. It's gonna end in disaster.
It’s almost as if we haven’t learned anything from Hans the horse, Ouija boards, "facilitated communication", or the countless examples of the folly of surrounding yourself with yes men. The point about improv is spot on.
I didn’t think just offloading your thinking to AI was AI psychosis.
To me AI psychosis is the handful of friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover, the one guy who won’t speak to his family directly but has them talk to ChatGPT first and then has ChatGPT generate his response, or the two who are confident that they have discovered that physics and mathematics are incorrect and have discovered the truth of reality through their conversations with the models.
But language is a shared technology so maybe the term is being used for less egregious behavior than I was using it for.
> friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover
I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
The fact that they were hurt by that sudden loss is totally healthy. It's just part of moving on. The real problem was getting into an unhealthy relationship with a fictitious partner under the control of an abusive company willing to exploit their loneliness in exchange for money.
Hopefully they now know better, but people (especially desperate ones) make poor choices all the time to get what's missing in their lives or to distract themselves from it.
> I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
Ah, I forgot about the ai relationship companies. No this guy was using the browser based ChatGPT for coding and ended up in love with the model. No relationship was sold at all.
If you feel this way, you might like my new CLI tool, Burn, Baby, Burn (those tokens) (https://github.com/dtnewman/burn-baby-burn/tree/main).
Show HN here: https://news.ycombinator.com/item?id=48151287
This post calls out how you can't argue with these people because they say "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!"
The top reply is from someone doing exactly that, arguing "but the agents are so fast!"
Yeah: If the tools aren't good enough and fast enough to fix the bugs before release, what makes anyone think they'll be able to so easily catch up afterwards?
Maybe they're assuming that doubling the code-base/features is worth more than the damage from doubling the number of bugs... Well, at least for this quarter's news to investors...
I was talking with a friend in the early days of the AI boom. I argued that over-reliance on AI will create all kinds of catastrophes.
The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
I mean, yes, logic is useful, but ignorance of risks? Assuming that moving blazingly fast and pulverizing things will result in good eventually?
This AI thing is not progressing well. I don't like this.
An interesting ethical framework, your friend has.
"Interesting" is a very brave and British way to put it, but yeah.
Let's say I'm the polar opposite of them, and I'm on the same page as you.
> It's game theory. Someone will do it, and you'll be forced to do it, too.
You'll be forced to do it, or lose. The unstated assumptions are that, first, it will work, and second, that you can't afford to lose. But let's just assume those for the sake of argument.
> It can't be that bad
That does not follow at all. It can in fact be that bad. That was what made the game theory of MAD different from the game theory of most other things.
> The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
Oof. Potential "bad" outcomes of "game theory" should be calibrated to include all the bloody wars and genocides throughout recorded history.
Why did the Foi-ites kill every man, woman and child of the conquered Bar-ite city? Because if they didn't, then they'd be at a disadvantage if the Bar-ites didn't reciprocate in the cities they conquered...
Yeah, I know. I had counter arguments more targeted towards his thinking style, but he preferred to think straight like a machine, in a bad way.
The problem was not him, but the sheer number of people who think like him. They may word it in a more benign form, but the idea is the same.
So obsessed with being the first mover and winning the battle, never thinking whether they should, or what would happen with that scenario.
Missing the whole forest and beyond for a single branch of a single tree.
reliance, not resilience
Yep, you're right. I'm a bit tired and my fingers had a mind of their own.
Thanks. :)
My very large employer has always been glacially slow on modernization and tech adoption. It may now, oddly enough, become a competitive advantage.
Literally the plot of Battlestar Galactica! Life imitates art indeed...
Or Mr Krabs' fear of robot overloads keeping technology at bay in the Krusty Krab!
who is the Starbuck of AI?
plot twist: it's Starbuck
yes, I was never so happy to work in Germany. People used to joke about the proverbial fax machine still being a thing but I've never been so glad to work in a culture where this mania doesn't exist. Reading HN is like entering Alice's Wonderland of token maxxers and AI psychotics. Genuinely don't know a single person here who is forced to work like this.
Ah so it's like 2000 again. Germany will go even farther behind it seems
Germany is standing at the abyss. America is one step ahead.
If the people that walk before you go into the abyss, staying behind isn't wrong.
Spoiler: it's not
It is absolutely going to be a competitive advantage if it isn't already. When your competitors' products suck because they are using LLMs to write them, and yours work because you aren't, customers notice.
The AI psychosis is not the anti-opinion to the use of AI.
I use AI coding tools every day, but AI tools have no concept of the future.
The selfish thinking an engineer has ("if this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") is what we've relied on to build stable systems.
The general laziness of looking for a perfect library on CPAN so that I don't have to do this work (often taking longer to not find a library than writing it by hand).
I've written thousands of lines of code with AI tools that ended up in prod, and mostly it feels natural, because since 2017 I've been telling people to write code instead of typing it all on my own, and setting up pitfalls to catch bad code in testing.
But one thing it doesn't do is "write less code"[1].
[1] - https://xcancel.com/t3rmin4t0r/status/2019277780517781522/
https://xcancel.com/mitchellh/status/2055380239711457578
https://hachyderm.io/@mitchellh/116580433508108130
<https://twiiit.com/mitchellh/status/2055380239711457578> – will redirect to a currently-working Nitter instance.
Seems broken. It just throws up an anime cat girl for me.
> anime cat girl
seems like it's working ideally to me!
Bug reports also go down when people lose faith that they will be fixed, because reporting them is often a substantial time commitment. You see it happen pretty regularly as trust in a group/company collapses.
Add to this the real possibility that a significant part of the reports that get filed might be AI-generated or AI-rewritten, with a high chance of being misreported because of that, or having incorrect parts... So it's an attack on multiple sides.
And we don't even get into potential adversarial tactics. If you have no morals, what is better than using agents to flood your competitor with fake bug reports?
Just let AI filter out the fake reports! Then let AI work on the real ones. See, there's really no problem "more AI" can't solve (as long as you're willing to ignore all of the underlying ones). "Pay us to create the problems you'll have to pay us to fix for you" is one hell of a business model. It basically prints money.
Just let AI report the bugs. Problem solved!
I agree, and I'd like to point out that this problem isn't unique to AI driven projects. I think much, if not all, of what Mitchell has been observing can readily happen without AI in the mix.
I'm starting to long for the age after AI. When the generative euphoria has settled and all outputs are formally verified based on exquisite architectures and standards.
> When [...] all outputs are formally verified based on exquisite architectures and standards
and we all live in a green utopia of flying cars and peace upon the world.
if all the resources spent on useless wars were poured into working towards this goal, we would already be there by now
Sure, but we should probably plan for what’s actually going to happen
Can't come fast enough
They are being developed, but it takes over a decade for this to happen normally
Will never happen, for the exact reason that we’ve almost never done that for human output either.
it is required now, or all civilization collapses.
Civilization collapses unless people stop being short-sighted and greedy, trying to cut corners whenever possible?
I know which outcome I'd put my money on.
You're going to have to expand on this one.
They are expressing the idea that AI is so effective that it will make human work redundant, necessitating a decoupling of resource allocation from the performance of work.
I don’t agree, but that’s the thinking
Another argument for less human-like AI then, I guess.
That’s literally just software though.
There was not a renaissance to move back to Assembly when Java sucked. Instead more Java developers were created.
Well a 2008 and a 2000 level financial crash is required for this. It is always during euphoric levels of delusion such events then occur.
...and it also needs more so-called AI companies present in the wreckage in this crash.
AI psychosis is undeniably real.
The entire stock market is undergoing AI psychosis.
This is the new normal. AI will continue to reduce the need for human workers until a Universal Basic Income is established.
At the end of the day robots can do the vast vast majority of jobs better and faster. If not now, very soon.
I only worry our economic systems won’t keep up
Because of the concerns you cite, I think working out the basic economic systems and incentives for paying people is a much more pressing concern than building magnificent machinery that we don't even own. There has been no effort on their end to demonstrate good faith nor to uphold their end of the social contract, which is why it's in our hands to demand the fundamentals to lead a life of dignity.
The exact same thing was meant to happen when the desktop computer became prevalent. Then the internet. Look at us now.
You’re forgetting the energy part of the equation.
Humans could already have a 4-hour work week without productivity loss.
But I only see mass layoffs, and those who are working are working longer and harder than before.
Most CEOs in my feed are convinced that AI makes people the equivalent of entire departments. AI should make your life easier, but instead it’s the opposite for a lot of people in the work force, which makes me really sad.
I think that’s called "hopium". Or wishful thinking, in less trendy language.
"Just use autoresearch and it will fix your app's memory leaks in an hour" is what I was nonchalantly told by someone who has never written a line of code ever.
I guess what I relate to the most is how dismissive people get about real software engineering work.
I may have skill issues, but I have yet to reach the level of autonomous engineering people tend to expect out of AI these days.
This is a critical communications issue that is becoming, I believe, the defining characteristic of "This Age": nobody knows how to discuss disagreement, and because it cannot even be discussed, communication ends, followed by blind obedience, forced bullying, retreat, and abandonment. This is going to be a hell of a ride, because nobody can really discuss the situation in a rational tone.
I don't doubt there are companies totally misusing coding agents and LLMs in production. There are also real companies with real revenue and solid architecture using LLMs to deliver products. There are also companies with real revenue and rapidly accumulating tech debt.
Eventually the companies that can't cope with undisciplined engineering will succumb to unacceptable reliability and be outcompeted, just like in the "move fast and break things" era.
Mitchellh is on to something. Some of the AI products I've seen seem like psychotic, hallucinatory fever dreams, using terms and concepts that have no meaning. Funding? $50,000,000 pre-seed.
The only way many people learn that the stove is hot is by burning their hands on it.
Let them.
Most labs are shilling “AI worker” dreams to these very companies
Why do you all still submit twitter.com links when that domain does not even work?
Saying the _quiet_ part out loud.
"no no, it has full test coverage"
at least at my BigCo, AI is being used for everything - writing slop, writing tests, code reviews, etc.
it would make sense to use AI for writing code, but human code review. or, human code, but AI test cases... or whatever combination of cross-checking, trust-but-verify, human in the loop, etc. people prefer.
i think once it gets used for everything, people have lost the plot, it's the inmates running the asylum.
I was rewatching Rich Hickey's "Simple Made Easy" talk (as one does) and there was a great line about full test coverage.
"What's true about all bugs in production? (pause for dramatic effect) They all passed the tests!" (well, he said typechecker but I think the point stands)
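A tiny hypothetical illustration of that line: every statement in `clamp` below is executed by the tests (100% line coverage), and every test passes, yet the upper-bound bug ships anyway because the assertion on that branch is too weak.

```python
def clamp(value, lo, hi):
    """Clamp value into [lo, hi] -- with a bug on the upper branch."""
    if value < lo:
        return lo
    if value > hi:
        return lo  # bug: should return hi, but the line still counts as covered
    return value

# These asserts execute every line of clamp (100% line coverage) and all
# pass -- the upper-branch check just never pins down the correct value.
# All bugs in production passed the tests.
assert clamp(5, 0, 10) == 5      # middle branch
assert clamp(-3, 0, 10) == 0     # lower branch
assert clamp(42, 0, 10) <= 10    # upper branch: passes, yet returns 0!
```

Coverage measures which lines ran, not whether the assertions were strong enough to notice a wrong answer.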
Deprecating immature workflows (LLM agents in this case) is much simpler and faster than building them from scratch. Many companies get this rushed assessment right. It's the case where being wrong is much more costly than being right.
I'm not convinced. There's a ton of cost to adopting a radically different workflow.
This is... Not what psychosis means? Being wrong is not psychosis
being wrong and insisting on being wrong is
Sounds pretty accurate. A bunch of comments on this thread make AI sound like some kind of new doomsday cult. The most annoying thing, personally, is that all engineering principles are getting crushed by non-techies: management counting token usage, forcing agent use, reducing headcount in the name of productivity gains. Devs building bridges, but nobody knows what the bridge is, what standards it was built to, how it works, or how to maintain it. VCs counting extra money, claiming chasing the holy profit is the future. The abundance of engineering apathy is disturbing.
I shut down AI Agent fanatics on the regular. But chop one head off there and two take its place. And I say that as someone working with Claude and Codex daily. While they are both incredibly good at clearly described and defined atomic tasks, application scope makes them lose their minds and the slop ensues.
First DEI, then COVID, then Ukraine, then AI. The US always needs its three to five years mass psychosis and then moves to the next shiny object. Many people and corporations get rich in each cycle.
AI exacerbates the problem since vulnerable tech people develop individual AI psychosis and participate in the mass psychosis.
Companies have figured out that no other population group is as gullible as tech people (they were instrumental in pushing all of the above four issues), so they exploit it again and again.
I think you're mixing up "psychosis" with fads, trends, or excuses to do layoffs.
A feature of psychosis is being unable to distinguish between external ideas and internal ones. For example, if a brown-nosing Yes-Man machine keeps reflecting your own leading questions back at you as if they were confirmed independent wisdom.
In contrast, I'm pretty sure COVID and the invasion of Ukraine are actual external phenomena that affect businesses and economies.
The lists of who's, what's, why's, and when's always change but when the decades pass it's never one narrow type of people or the "not me's" which are gullible - it's just human nature + regional timing. The targeted groups are the only ones who are really easy to break out.
I have a ton of respect for Mitchell - I didn't really know who he was until Ghostty but his writings and viewpoints on AI seem really grounded and make the most sense to me. Including this one.
Many people on this forum are suffering under this same psychosis.
I'm guessing you've never heard of Hashicorp (Terraform, Vault) then? Mitchell == Hashicorp.
Welcome to the club, Mitchell! Pizza's to the right.
In all seriousness...well, yeah. AI is a monkey's paw, and that's how monkey paws work. So many movies and books warned us!
Mitchell aches because his career has been solving broadly scoped problems by building a collection of thoughtful primitives for others to extend. LLMs seem to do the opposite but at great speed, and it hurts to watch.
Reading more, it seems part of his point is “if you’re making these primitives, it’s up to adopters to deploy, so mean-time-to-recovery isn’t that relevant.” Which is valid I guess.
But equally, like, do people need Terraform if they can just tell codex “put it live”, and does that hurt to see?
Either this or we humans are out of the picture soon.
Occam's razor would suggest the former.
When war psychosis is not enough....
> "no no, it has full test coverage"
i don't have enough fingers (and toes) to count how many times i've demonstrated that "100% coverage" is almost universally bullshit.
Codex is freakin' hot-to-trot to churn out test coverage for every single thing it implements, and some of it is very esoteric and highly prescriptive (regexes for days). BUT... after a while, it dawned on me that LLM-driven test coverage is less about proving "code correctness" (you're better off writing those tests yourself alongside them), and more about trying to ensure that whatever gets bolted on stays bolted on. For better or worse, obviously, since if you bolt on trash, trash you shall have.
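That "stays bolted on" idea is essentially a characterization (or "pinning") test. A minimal sketch, where the `slugify` function and its expected outputs are invented for illustration, not taken from the thread:

```python
import re

def slugify(title: str) -> str:
    # The code under test, warts and all.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_pins_current_behavior():
    # These expected values were captured from the current implementation,
    # not derived from a spec. They guard against regressions - a later
    # change that alters the output fails loudly - but they say nothing
    # about whether the pinned output was ever "correct".
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  --weird -- input  ") == "weird-input"
```

The distinction matters: a correctness test encodes intent, a pinning test only encodes the status quo.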
Wholeheartedly agree, but in fairness, I trust the tests of the best AI models more than those of the average human developer. There are a lot of people around who combine high diligence with complete intellectual laziness, producing tons of useless tests.
Actually no, cancel that. I realise now that I trust AIs more than the average developer, period. At this point they do produce better code than most people I've dealt with.
I am really looking for more reasoned approaches to AI.
I am very close to using it as a pair programmer, but with me actually coding. I am just so tired of fixing its mistakes.
Isn't going to happen without the regulation hammer being thrown down.
Probably from the EU because they seem to be the sane ones of this generation.
Anyone who's taken VC funding has no choice. More money has been spent on AI commercialization than the atomic bomb, the US interstate build-out, the ISS and the Apollo program combined. Failure is going to be catastrophic and therefore, one tied to this ship cannot accept a world in which it fails.
Or anyone who even wants VC funding. 90+% of investors only want to invest in AI companies.
If you're not doing AI there's an incredibly limited pool of people who will give you $$$ ... and you're competing with EVERY OTHER NON-AI COMPANY for their attention.
On the bright side, my guillotine & rope startup is going to make a killing (no pun intended).
I'm going through a mixed experience regarding this, personally.
Management is really pushing AI. It's obnoxious, and their idea of how it fits into my team's job specifically is completely, hilariously detached from reality. On the off chance someone says something reasonable, unless it fits the mold, it's immediately discarded. The mold being "spec driven development". We're not even a product team, for crying out loud. I straight up started skipping these meetings for the sake of my sanity. It's mindwash, and it's genuinely dizzying. The other reason I stopped attending is that it ironically makes me more disinterested in AI, which I consider to be against my personal interests in the long run.
On the flipside, I love using Claude (in moderation). It keeps pulling off several very nice things, some of which Mitchell touched on in this post (the last one):
- I write scripts and automation from time to time; Claude fleshes them out way better with way more safety features, feature flags, and logging than I'd otherwise have capacity to spend time on
- Claude catches missed refactors and preexisting defects, and does a generally solid pass checking for defects as a whole
- Claude routinely helps with doing things I'd basically never be able to justify spending time on. Yesterday, I one-shotted an entire utility application with a GUI to boot, and it worked first try; I was beyond impressed.
- Claude helped me and a colleague do some cross-team investigation in secret. We're migrating <thing> and we were evaluating <differences>. There were a lot of them. Management was in limbo, unsure what to do, flip-flopping between bad options. In a desperate moment, I figured, hey, we kinda have a thing now for investigating an inhuman amount of stuff in detail - so I put together a care package for my colleague with all our code, a bunch of context, a capture of all the input data for the past week, and all the logs generated. My colleague put his team's side of the story next to it and, with the help of Claude, did some extremely nice cross-functional investigation. Over the course of a few weeks, he was able to confirm like a dozen showstopper bugs, many of which would have been absolutely fiendish if not impossible to fix (or even catch) if we had gone live without knowing about them. One even culminated in a whole-ass re-architecture of the solution. We essentially tore down a silo wall with Claude's help in doing this.
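The first bullet above, about scripts getting fleshed out with safety features, feature flags, and logging, tends to look something like this in practice. A minimal sketch, where the pruning task, the flag names, and the `find_stale_artifacts` helper are all invented for illustration:

```python
import argparse
import logging

def find_stale_artifacts():
    # Placeholder: a real version would scan a build directory for old files.
    return ["build/old1.o", "build/old2.o"]

def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Prune old build artifacts (hypothetical task)")
    parser.add_argument("--dry-run", action="store_true",
                        help="log what would be deleted without deleting")
    parser.add_argument("--max-delete", type=int, default=100,
                        help="safety limit: abort if more files match")
    parser.add_argument("-v", "--verbose", action="store_true")
    args = parser.parse_args(argv)

    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO,
                        format="%(levelname)s %(message)s")
    log = logging.getLogger("prune")

    candidates = find_stale_artifacts()
    if len(candidates) > args.max_delete:
        # Guard rail: refuse to proceed if the match count looks suspicious.
        log.error("refusing to delete %d files (limit %d)",
                  len(candidates), args.max_delete)
        return 1
    for path in candidates:
        if args.dry_run:
            log.info("would delete %s", path)
        else:
            log.info("deleting %s", path)
            # os.remove(path) would go here
    return 0
```

None of this is hard to write by hand; the point is that an assistant will happily produce the dry-run flag, the safety limit, and the logging that a human often skips for a throwaway script.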
So ultimately, it really is a mixed bag, with some really deep low points and some really nice highlights. I also just generally find it weird that a technical tool [category] is being pushed down people's throats with technical reasoning, but by management. One would think this goes bottom up, or is at least a lot more exploratory. The frenzy is real.
Assuming he’s right, I don’t see how that constitutes “psychosis”, as opposed to this being yet another of a billion examples of companies jumping on a bandwagon / cargo cult, and then learning they took it too far.
And also, he might not be right. But the good news is, we’ll all get to find out together!
I do not believe 'AI psychosis' is an actual thing.
This doesn’t constitute AI psychosis. His argument is that we need to retain understanding of the systems we use, but there’s no compelling argument as to why that is the case. (I get that people are going to be offended by that statement, but agents are already better than the average software engineer. I don’t see why we need to fight this, except for economic insecurity caused by mass layoffs.)
It all just feels like horse drawn carriage operators trying to convince automobile drivers to stop driving.
If you want to draw that line of argument - it's more like horse riders being convinced to give up their horses in favour of trains: You're travelling faster, don't have to navigate yourself, or think about every boulder on the way; but there are destinations you can't go, overcrowded trains slowing down the journey, hefty ticket prices, and instead of enjoying the freedom, you're degraded to a passive passenger.
Very funny, this. Did we need forward deployed engineers to convince people that they absolutely need to use the trains in order to "not be left behind"? Or otherwise hype? Or was it sort of obvious and did not need to be explained so much - unlike a bad joke called LLMs?
Actually- absolutely! Initially, people were really afraid of trains, fearing they wouldn’t be able to breathe at those speeds. It took a lot of convincing to establish trust in the technology.
Ever heard of subsidising? :’)
I am sure you will feel that this is missing the point of your analogy, but we would not have gotten very far with automobiles if we didn't know how they worked.
You are breaking the analogy because automobiles are machines for transportation, and understanding them is important to make them move. LLMs are machines to understand, and well, if they do the understanding you don't need to.
The thing we're worried about not understanding here is the software the LLMs write, not the LLMs themselves.
The direct analogy to automobiles would be for each automobile to be a one-off design filled with bad and bizarre decisions, excessively redundant parts, insane routing of wires, lines, ducts, etc., generally poor serviceability, and so on. IMO the big question going forward is whether the consistent availability of LLMs can render these kinds of post-delivery issues moot (they will reliably [catch and] fix problems in the software they wrote before any real damage is caused), or whether human reliance on LLMs and abdication of understanding will just make software worse, because LLMs' ability to fix their own mistakes, and the consequences thereof, generally breaks down in the same contexts/complexities where they made those mistakes in the first place.
My own observations are that moderately complex software written in the mode of "vibe coding" or "agentic engineering" tends to regress to barely-functional dogshit as features are piled on, and that once this state is reached, the teams behind it are unable to, or perhaps simply uninterested in, unfuck[ing] it. I have stopped using software that has gone down this path, not because I have some philosophical objection to it, but because it has become _literally unusable_. But you will certainly not catch me claiming to know what the future holds.
agreed completely
We're definitely in the mess around phase of AI adoption.
I don't think it's super clear what we'll find out.
We've all built the moat of our careers out of our expertise.
It is also very possible that expertise will be rendered significantly less valuable as the models improve.
Nobody ever cared what the code looked like. They only ever cared whether it solved their problem and was bug-free. Maybe everything falls apart, or maybe AI agents ship code that's good enough.
Given the state of the industry, we're clearly going to find out one way or the other, hah!
"it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't!"
Hmm, I agree with the point OP is making, but I'm not so sure this is the best supporting argument. The bottleneck is finding the bugs; if he'd criticized people claiming AI will be the panacea for that, I'd be with him. But people saying agents are fast and good at fixing human-found bugs is nothing I'd object to.
Agents are fixing bugs so quickly and at a scale humans can't do already.
> Agents are fixing bugs so quickly and at a scale humans can't do already.
The metric is how many defects are introduced per defect fixed. Being fast is bad if this ratio is above one.
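That ratio can be made concrete with a toy calculation; the numbers below are made up purely for illustration, not measured from anything:

```python
def open_defects(initial: int, fixes: int, introduced_per_fix: float) -> float:
    """Each fix closes 1 defect but introduces `introduced_per_fix` new ones."""
    return initial + fixes * (introduced_per_fix - 1.0)

# Ratio below 1: more fixes means fewer open bugs, however fast they land.
assert open_defects(100, 50, 0.5) == 75.0   # 100 - 50 + 25

# Ratio above 1: the faster you "fix", the worse it gets.
assert open_defects(100, 50, 1.2) == 110.0  # 100 - 50 + 60
```

Speed only multiplies whichever direction the ratio already points.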
The tweet is criticizing over-reliance on the "agents will fix it anyway".
The fact that we can fix things faster now doesn't mean that we should throw away caution and prevention. The specific point of his tweet is that we're seeing a lot of people starting to skip proper release engineering.
Agents are quick to fix bugs, yes, but it doesn't mean that users will tolerate software that gets completely broken after each new feature is introduced and takes a certain number of days to heal each time.
You got downvoted for speaking the truth. HN has a strong anti-AI contingent. They won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this codebase”. We’re not there yet, but soon we will be. Then what?
More likely people thought GP was missing the point; "MTTR-optimized YOLO deployment" only succeeds against errors that are recoverable, detected quickly, and cause an acceptable period of downtime. You could have a bug silently corrupting data for months, and that data may only be used by one critical process that runs once every quarter. So you could introduce a timebomb that can't be gracefully recovered from (depending on the nature of the data corruption).
So the point is not that agents cannot find bugs (they certainly can), it's whether you can shirk reviewing for bugs if MTTR is fast enough. There are circumstances where YOLO is appropriate, but they aren't the production environment of a mature application.
I don't think I missed the point, that is why I said I agree with the general point (and with what you said in your comment).
What I wanted to say is that the particular people that think "its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" are not the best argument for it.
But I won't die on this hill; maybe I'm just reading the sentence differently than others.
I think there is an implication in context that the people being discussed aren't being reasonable (that the claim is employed as a rationalization), but I agree with your take. I should've said, "the downvotes were more likely because GP was perceived as missing the point". (I didn't downvote your comment fwiw.)
> won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this
But this is just holding the Slop Companies to the standard they declared themselves! Just recently, the CEO of OpenAI babbled some nonsense on Twitter about how he hands tasks over to Codex, which, according to him, finishes them flawlessly while he plays with his kid outside.
> but soon we will be.
Ah yes, in 3-6 months, right? This time next year, Rodney, we'll be millionaires!