Most likely you are creating boilerplate at 20x/50x, as opposed to genuinely new concepts, mechanisms, etc.
To be fair, most web/mobile frameworks expect you to do that.
Ideally, codebases would grow by adding data (e.g. a json describing endpoints, UIs, etc), not repetitive code.
> Ideally, codebases would grow by adding data (e.g. a json describing endpoints, UIs, etc), not repetitive code.
The problem with this configuration-based approach is that the code that actually executes now has to change its behavior arbitrarily in response to new configuration, so both the code and the configuration format end up extremely abstract and incomprehensible. In the real world, someone figures out that things get way easier if you just put a few programming-language concepts into the configuration format, and now you're back where you started, but with a much worse programming language (shoehorned into a configuration format) than the one you were using before.
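To make that failure mode concrete, here is a hypothetical sketch (every field name invented for illustration) of a config entry partway through the slide from data into an ad-hoc language:

```ts
// Began life as plain, declarative data...
const endpointConfig = {
  path: "/orders/:id",
  method: "GET",
  auth: true,
  // ...then someone needed a conditional, so strings quietly became a
  // mini-language with no parser, no debugger, and no type checker:
  visibleIf: "user.role == 'admin' || (order.total > 100 && !order.archived)",
  // ...and eventually loops and function calls follow:
  rowTemplate: "forEach(order.items, item => render(item, locale))",
};
```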
Boilerplate may be cumbersome, but it effectively gives you a very large number of places to "hook into" the framework to make it do what you need. AI makes boilerplate much less painful to write.
There are always middle grounds to be explored. The way I see it, 80% of a "codebase" would be data and 20%, code.
Both worlds can be cleanly composed. For instance, for backend development, it's common to define an array (data) of middleware (code).
At a smaller scale, this is already a reality in the Clojure ecosystem: most SQL is data (the HoneySQL library), and most HTML is data (the Hiccup library).
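A rough TypeScript sketch of that middle ground (names invented for illustration): the pipeline itself is plain data you can inspect, reorder, or generate, while each element stays ordinary, breakpoint-able code:

```ts
import type { IncomingMessage, ServerResponse } from "node:http";

type Next = () => void;
type Middleware = (req: IncomingMessage, res: ServerResponse, next: Next) => void;

// Code: ordinary functions you can step through in a debugger.
const logRequests: Middleware = (req, _res, next) => {
  console.log(req.method, req.url);
  next();
};

const requireAuth: Middleware = (req, res, next) => {
  if (!req.headers.authorization) {
    res.statusCode = 401;
    res.end("Unauthorized");
    return;
  }
  next();
};

// Data: the composition is just an array.
const pipeline: Middleware[] = [logRequests, requireAuth];
```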
Working on a system like this (mostly configured with complex yaml, but extended with a little DSL/rule engine to handle more complex situations) a long while ago, I introduced a bug that cost the company quite a bit of money by using `True` instead of `true`—something that would have been readily caught in a proper language with real tooling.
That would be caught by any schema validation system at runtime, e.g. Zod in TypeScript, Malli in Clojure, and so on.
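For example, a minimal Zod sketch (schema and field names invented) of how that class of bug gets caught at the config boundary:

```ts
import { z } from "zod";

const RuleConfig = z.object({
  enabled: z.boolean(), // must be a real boolean
  threshold: z.number().positive(),
});

// A JSON config where someone wrote the string "True" instead of true:
const raw = JSON.parse('{"enabled": "True", "threshold": 5}');

const result = RuleConfig.safeParse(raw);
if (!result.success) {
  // Fails loudly here, instead of silently misbehaving in production.
  console.error(result.error.issues);
}
```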
>> Ideally, codebases would grow by adding data (e.g. a json describing endpoints, UIs, etc), not repetitive code.
Be very careful with this approach. There are many ways it can go completely wrong. I've seen a codebase like this, and it was a disaster to debug, because you can't set breakpoints in data.
It may not look compact or elegant, but I'd rather see debuggable and comprehensible boilerplate, even if it's repetitive, than a mess.
There are three types of work in software:
- Commodity work, such as CRUD, integrations, infra plumbing, standard patterns.
- Local novelty, i.e. a new feature for your product (but not new to the world).
- Frontier novelty, as in, genuinely new algorithms/research-like work.
The overwhelming majority of software development is in the first two categories. Even people who think they are doing new and groundbreaking stuff are almost certainly doing variations of things that have been done in other contexts.
Precisely what I was going to say. As domain specificity increases, LLM output quality rapidly decreases.
Yes, there is A LOT of boilerplate that is sped up by AI. Every time I interface with a new service or API, I don't have to carefully read the documentation and write the code by hand (or copypaste examples); I can literally have the AI do the first draft, grok it, test it, and iterate. Oftentimes the AI misses the latest developments, and I have to research things myself, fix the code, and explain the new capabilities; then the AI can be used again. But in the end it's still about 20x faster.
Most of the post reads like "boilerplate created at 20x/50x" to me, too, frankly.
To be clear you believe that you do a year's worth of work in one week? Every week? So halfway through this year you will have done 25 years of work?
He was just not doing his job 95% of the time.
50x more code? Absolutely plausible. 50x more ideas implemented, or 50x better ideas? Doubtful. Generating code doesn't generate value.
OP has got 200 years of experience under their belt now.
That's a FAANG level resume right there
In a sense, yes. If you compare how long it would take me to do it manually about 15 years ago.
Can you say anything about where you were five years ago? Or how your ability to write and ship code from 2015 to 2020 changed? It’s hard to believe all these 50x AI productivity claims, especially without an idea of what that multiplier applies to and what the right side of the equation even means. Is it lines written, dollar amounts of impact, pull requests merged?
Yes, you can follow my code from 5 and 10 years ago here:
https://github.com/Qbix/Platform-History-v1
https://github.com/Qbix/Platform-History-v2
And you can see the latest code here:
https://github.com/Qbix
Documentation can be created a lot faster, including for normies:
https://community.qbix.com/t/membership-plans-and-discounts/...
My favorite part of AI is red-teaming and finding bugs. Just copypaste diffs and ask it for regressions. Press it over and over until it can't find any.
Here is a speedrun from a few days ago:
https://www.youtube.com/watch?v=Yg6UFyIPYNY
I so wish just one of these posts with these insane ~20-50x claims would include a narrated video of someone using it, or show us a repository with the code, or anything tangible.
I get 10-20x done but that’s because most of what I need to get done isn’t particularly difficult.
E.g. I have an AWS agent that does all my devops for me. It isn't doing anything insane, but when I need to investigate something or make a Terraform change, I send it off and go do something else. I do similar things 20-100 times a day now. Go write this script, do this report, update this documentation, etc.
I think if you are a high-level SWE, it takes a lot of effort to get to "better than doing it myself". If you have a more generalist role, AI is easily a 10x productivity booster if you are knowledgeable about using it.
Yes... I've asked for the same - show us the goods with a Destroy All Software style screencast; otherwise the default position is that this entire HN post is just more AI generated hallucination.
Nobody's taken me up on this offer yet. [0]
[0] https://news.ycombinator.com/item?id=46325469
Do you have a legacy code-base in mind?
I'd happily demonstrate this kind of workflow on my day job if not for company trade-secrets.
That's as legacy as it gets, 20+ year old code base with several "strata" of different technologies and approaches.
Claude Opus handily navigates around it, and produces working bug fixes with minimal guidance.
I'm not going to claim it's 20x or 50x yet, there's still navigation and babysitting involved, but it's definitely capable of working on complex problems, I just don't trust it to run it in YOLO mode.
The key thing is, you need domain knowledge. You need to know where to correct it, and which direction to point it in.
It's not magic, and it will have bad ideas. The key is picking out the good ideas from the bad.
Hi eterm, this is very relevant to me as I'm building a self-hosted open-source tool for legacy code comprehension (AI/ML final project).
You mentioned "navigation and babysitting", could you share what that looks like in practice? Do you have to spend time reconstructing context or correcting Claude's misunderstandings? Do you still need to interrupt colleagues for some tacit knowledge, or has that changed?
I don't know. There's lots of options. At the extreme ends it would be interesting to see these agents work on something like boost, or metamath/set.mm, to choose deliberately obtuse candidates. Perhaps a web browser.
Sure. Here is my latest 4-hour speedrun: https://www.youtube.com/watch?v=Yg6UFyIPYNY
I appreciate the video. I see that most of the time it's the dialogue with the AI that is in focus; the code or the actual product rarely shows up by comparison (or maybe it does? I can't read the text in the video due to the quality).
It's hard to tell where this 20-50x increase is.
Haven't you really tried an agentic coding tool like Claude Code, Codex, AMP...? By "really try" I mean investing some time with it, like 10-20 hours minimum, to see what they are good at and how to use them well. They are tools and have a corresponding learning curve.
I passed through phases, from copy/pasting from the ChatGPT web UI to autocomplete tools, but the real feeling of "shit, this is going to really change how I code" came with Claude Code.
Maybe a good difference between "senior" at ten years and senior at twenty years is whether you think amount of code produced is a positive metric or a negative one.
As I get older (more experienced) I really write less code, but more of the code I write is correct. Am I doing something wrong?
No, that sounds right. First you don't know how to write something, then you know one way, then you know multiple ways, then later you know multiple ways and have the instinct to pick the best one first.
No, you are doing it exactly right. Code is a liability not an asset, hence, technical debt.
In other words, in a single year you did the amount of work that would take 20-50 years of a regular software engineer to complete?
So, it would take roughly two months to complete a project at the scale of SQLite or TeX.
Makes me wonder what Terry A. Davis would have to say about AI if he was alive today.
I imagine he would have a few choice words of the usual sort, but would he recognise that seemingly anyone can write their own OS like he did?
Personally I'm sceptical because few people can even fit a high-level overview of such a project in their heads, much less the entire codebase.
Yep, wait until he's done replacing COBOL completely in 5 years
I'm curious how you paste diffs into the AI. And wouldn't a coding assistant in the IDE be a much more convenient solution?
If you have the changes as a GitHub PR you can add .diff to the end of the URL to get a single page of all the changes
Citation: https://stackoverflow.com/a/6188624
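A tiny TypeScript sketch of that trick (owner, repo, and PR number are placeholders):

```ts
// Appending .diff to a GitHub PR URL returns the unified diff as plain text,
// ready to paste into an LLM chat or pipe into a review script.
const res = await fetch("https://github.com/some-owner/some-repo/pull/123.diff");
const diff = await res.text();
console.log(diff);
```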
Not convenient for me. I just do git diff | mate and copypaste.
This is similar to the approach touted by Steve Yegge in this interview https://www.youtube.com/watch?v=zuJyJP517Uw
I really appreciate that he is up-front about "Yes. Vibe coding has lots of dangerous problems that you must learn to control if you are to go whole-hog like this."
Has anyone read his Vibe Coding book? The Amazon reviews make it sound like it's heavy on inspiration but light on techniques.
If you can do this and you do not build your own company, you are wasting your time in some way, IMHO.
It has lots of potential. Let me, though, have my own... doubts about it. Thanks for sharing anyway.
I already built 2 companies and am building a third.
https://linkedin.com/in/magarshak
Well, then if that has worked, you have not been wasting your time.
The real benefit is due to discipline. These models have removed a certain amount of friction you had. However, you’re now more of an editor than a writer. Keep that in mind and don’t lose the writing muscle.
Sad that this was flagged. I am a noob but nevertheless a perfectionist. I am not afraid of using AI, because I proofread everything. So it would have been of interest to me how this exactly works. A video maybe.
Out of interest, how did you arrive at the 20-50x speedup numbers?
Cherrypicking the most tedious parts, like boilerplate to get up and running, or porting code to other adapters (making SQLite and Postgres adapters, for instance)
This was done in about 3 hours for instance: https://github.com/Qbix/Platform/tree/refactor/DbQuery/platf...
You can see the speed for yourself. Here is my first speedrun livestreamed: https://www.youtube.com/watch?v=Yg6UFyIPYNY
That code has a lot of smell IMHO :/
What's the rationale behind writing PHP as if it was JS? Unless I am mistaken it's like someone just did a transliteration from JS to PHP without even converting the JSDoc to PHPDoc.
And are there any tests for the code?
I actually based it on an existing PHP adapter for MySQL. Together with the AI, I went over all the features that needed refactoring, and what would work in Postgres and Sqlite the same way, etc. There were a ton of nuances.
So your actual output has not increased by 20-50x, just some parts of it? What's your speedup like over an entire project, not just cherrypicked parts?
I think those parts were the big bottleneck, so technically, yes it has.
Why not merged to main? What is the definition of done being applied here?
The code is stable and partially tested. Needs a lot more testing before committing it to main. This is mainly because the primary ORM adapter for MySQL has been rewritten, and a lot of apps use it.
I think in 2026 the automation will reach testing, closing the loop. At that point, having no humans in the loop will make software development extremely fast.
vibes, obviously
Suppose you keep this up for 15 years; how will you grapple with all the cruft you have generated?
Or just periodically throw it all away and start from scratch?
What if something becomes successful (has users) so that you can't just throw it away?
> Suppose you keep this up for 15 years; how will you grapple with all the cruft you have generated?
I can't speak for the OP but the worst software developer I ever worked with was myself from 1 year ago. Provided what "cruft" I'm generating meets my current code standards, it's unlikely to be any worse than anything else past me has done.
But it's produced "20-50x" faster.
Why wait 15 years? If it's becoming too big of a ball of mud, tell the AI that, and have it ultrathink a plan on what to refactor together and better follow DRY principles of clean code, or whatever you adhere to. Tell it hey, I noticed that we're reimplementing the same thing a couple of times, let's make it a class that can be reused instead. It'll happily generate cruft. The human in the loop should call that out so it doesn't get out of control. Maybe that's what we'll be paying the human to do still in 15 years.
There's a small but seemingly tireless brigade of "you're not actually moving faster, you're just fooling yourself" pundits on this site that feel compelled to chime in every time someone mentions that they get any benefit from AI coding tools. I'm just not going to engage with them anymore.
That said... I jumped to a few random moments in your video and had an "oh my god" reaction because you really were not kidding when you said that you were pasting code.
I'm pretty much begging you to install and use Cursor. Whatever boost you're getting from your current workflow, you will see triple through use of their Agent/Plan/Debug modes, especially when using Opus 4.5. I promise you: it's a before electricity vs after electricity scenario. I'm actually excited for you.
A lot of folks will tell you to use Claude Code. I personally find that it doesn't make sense for the sorts of projects I work on; I would 100% start with Cursor either way.
Can you actually provide any proof, even top-line stats from GitHub or other software forges that show the productivity boost you’re claiming?
It’s not up to the skeptics to prove this tech doesn’t work; it’s up to the proponents to show it does, and with an effect size comparable to the one linking cigarettes to lung cancer.
There are a tremendous number of LLM productivity stans on HN, but the plural of anecdote is not data.
Certainly these tools are useful, but the extent to which they are useful today is not nearly as open and shut as you and others would claim. I’d say that these tools make me 5% more productive on a code base I know well.
I’m totally open to opposing evidence that isn’t just anecdote
I think it’s pretty obvious that if the OP automates this manual part of their workflow, it will improve their iteration speed. The thread root is just saying stop copying and pasting and use the built-in tooling to communicate with the LLM APIs.
They aren’t responding to the thread root's extended comment, just the first part about the tone and rhetoric of AI proponents. Your comment isn’t really a response to anything in theirs.
Isn't Claude Code the same as Cursor's agent mode? I really don't get why anyone would want to lock themselves into one LLM creator with the former versus having all the LLMs in the latter. How do you stop yourself from bursting through the quota with Opus? That's my biggest worry, and it keeps me from using it over Sonnet in Cursor.
Honestly, it depends on what you mean by "the same as". Both are (in my case, at least) running Opus 4.5 instances. After that, it's like using a CNC or a shop full of hand tools. They are both great, and people who know one often know both. The process is wildly different, however.
Not busting my quota is simply not my top priority. I'm on their $200/month plan and I have it locked to a $1000/month overage limit, though the most I've ever gone through using it every day, all day is about $700. That probably sounds like a lot if you're optimizing for a $20/month token budget, but it's budgeted for. That $10-12k/year is excellent value for the silly amount of functionality that I've been able to create.
Sonnet is a really good LLM, and you can build great things with it. However, if you're using this for serious work, IMO you probably want to use the most productive tools available.
Opus 4.1 was, to be real, punishingly expensive. It made me sweat. Thank goodness that Opus 4.5 is somehow both much better and much cheaper.
What do you see as the difference between Claude Code and Cursor's agent mode? You said Claude Code doesn't make sense for your type of projects, so I'm curious why that is.
Edit: I see you answered this in another response, thanks.
While I am vaguely aware that CC has started to move past its CLI-first roots, I still think of it as a process that you do in a terminal window vs something you do in an IDE like VSCode or Cursor.
I don't have any interest in yucking anyone's yum, but for me, working in an IDE is vastly more productive than trying to remember dozens of vim and tmux shortcuts.
Claude Code has Cursor and VSCode extensions that replace their chat sidebars, which is how many people use it today rather than just the CLI. What I'm trying to understand is how they're different, if at all; it seems like what I'm learning is that they're basically converging in functionality and commoditizing for now, and it comes down to personal preference. Personally, I still use Cursor because it offers many models beyond just Claude, but I guess some people trust Anthropic enough not to want any other models that may arise in the future.
Yep, that's exactly what I meant when I said that CC is moving past its CLI-first roots.
I haven't personally tried the CC extension because like you, I concluded that it sounds like a single-company Cursor with way fewer points of integration into the IDE.
I hate bikeshedding and rarely do I switch tooling unless something is demonstrably better; preferably 10x better. For me, the Cursor IDE experience is easily 10x better than copying and pasting from ChatGPT, which is why I created this thread in the first place.
"it doesn't make sense" is an odd statement to make for choosing Claude Code vs Cursor.
Would you be willing to go into more detail about that claim?
BTW: it's a statement, not a claim.
The framing of your question as though I might possibly be hallucinating my own situation might be correlated to your lack of reply.
Happy to!
CC seems best suited to situations where one or both of the following are true:
- presence of CI infrastructure
- the ability for the agent to run/test outputs from the run loop
If you're primarily working on embedded hardware, human-in-the-loop is not optional. In real terms, I am the CI infrastructure.
Also, working on hardware means that I am often discussing the circuit with the LLM in a much more collaborative way than what most folks seem to do with CC requirements. There are MCP servers for KiCAD but they don't seem to do more than integrate with BOM management. The LLMs understand EE better than many engineers do, but they can only understand my circuit (and the decisions embedded in it) as well as I can explain/screencap it to them.
The SDK and tooling for the MCUs also just makes an IDE with extensions a much more ergonomic fit than trying to do everything through CLI command switches.
As others pointed out, you may very well have a more productive setup this way, but providing numbers is rage-bait because we don't know where your baseline comes from.
Do you ship 20x more PRs? Did you solve 20x more bugs? Did you add 20x more features? Did you provide 20x more ARR, more value, etc.?
You should be using an agent instead of the in-browser chat; it can likely improve the efficiency and ease of your workflow by another order of magnitude. Try Claude Code.
https://news.ycombinator.com/item?id=46510369
But you aren't writing code. You are getting a machine to do it.
He's not even generating his own electricity
That could be said about compiling higher-level languages instead of rolling your own assembly and garbage collector. It's just working on a higher level. You're a lot more productive with, say, PHP than you are writing assembly.
I architect it and go through many iterations. The machine makes mistakes; when I test, I have to come back and work through the issues. I often correct the machine about stuff it doesn't know, or missed due to its training.
And ultimately I'm responsible for the code quality; I'm still in the loop all the time. But rather than writing everything by hand, following documentation, and making mistakes, I have the machine do the code generation and edits for a lot of the code. There are still mistakes that need to be corrected until everything works, but the loop is a lot faster.
For example, I was able to port our MySQL adapter to PostGres AND Sqlite, something that I had been putting off for years, in about 3-5 hours total, including testing and bugfixes and massive refactoring. And it's still not in the main branch because there is more testing I want to have done before it's merged: https://github.com/Qbix/Platform/tree/refactor/DbQuery/platf...
Here is my first speedrun: https://www.youtube.com/watch?v=Yg6UFyIPYNY
> That could be said about compiling higher-level languages
You write the program as source code.
Prompting an LLM to cobble together lines from other people's work is not writing a program.
Does a director make a movie? Serious question. Can we say that Steven Spielberg made Jurassic Park? Can we say that George Lucas made Star Wars? Directors rarely act in their own movies, write their own scripts, operate the cameras, operate the lights, operate the mics, edit the final cuts, write the scores, play the scores, create the VFX, do the film printing or the marketing. They prompt Biological Thought Models to do those things and cobble the results together. Really nothing a director traditionally does is actually physically making a film.
And yet, I don't see a problem with saying directors made their movies. Sure, it was the work of a lot of talented individuals contributing collectively to produce the final product, and most of those individuals probably contributed more physical "creation" to the film than the director did. But the director is a film maker. So I wouldn't be so confident asserting that someone who coordinates and architects an application by way of various automation tools isn't still a programmer or "writing software"
"You write the program as source code."
His language is LLM prompts. If he can check them into git and get reasonably consistent results when he runs the prompts multiple times, just like we expect from our JavaScript or C or assembly or machine code, I don't see the problem.
I knew a guy who could patch a running program by flipping switches on the front panel of a computer. He didn't argue my C language output 'is not writing a program'...
> His language is LLM prompts. If he can check them into git and get reasonably consistent results when he runs the prompts multiple times
You're joking, right? There's nothing "reasonably consistent" about LLMs. You can input the same prompt with the same context, and get wildly different results every time. This is from a single prompt. The idea that you can get anything close to consistent results across a sequence of prompts is delusional.
You can try prompt "hacks" like STRONGLY EMPHASIZING correct behaviour (or threaten to murder kittens like in the old days), but the tool will eventually disregard an instruction, and then "apologize" profusely for it.
Comparing this to what a compiler does is absurd.[1]
Sometimes it feels like users of these tools are in entirely separate universes given the wildly different perspectives we have.
[1]: Spare me the examples of obscure compiler inconsistencies. These are leagues apart in every possible way.
Not to be too rude, but this looks like a bunch of 2010-style procedural PHP string concatenation spaghetti. Data mappers and ORMs are pretty much a solved problem. Why write your own as part of your platform? If you’re writing this much boilerplate to reimplement solved problems, it’s no wonder you’re able to get a lot of help from an LLM.
Also, AI is great, but it hasn't made me any faster at solving Leetcode problems.
Anyone experiencing this problem as well?
Are you saying AI for commercial day job or is this something for personal project workflows?
Even with scripting languages the discussion is always about what people want to program in, not what users wish the language was. AI seems the same way, it's always about how much stuff is being generated, never about how much happier actual users are with whatever the output is.
We badly need to set some sort of standards on what constitutes a product for the purposes of "productivity." It's well-defined economically, revenue earned by a firm per labor hour. Are you earning 50x the revenue per hour of work or have you just pushed 50x lines of code into a repo with an automated build pipeline? Do you have 50x the users? Any users at all?
I try to think of what this would look like at my company and I can't even really conceive of what this dream scenario is supposed to be. We have maybe 6-10 legitimately revenue-earning products with a meaningful user base and it took about a decade to get there. There is no reasonable world in which you say we could have done that in 10 weeks instead, which would be roughly 1/50th the time. It takes at least that long typically just to get a contract completed once a prospective customer decides they even want to make a purchase. Writing code faster won't speed that process up. Can we push features 50x faster? No, we can't, because they come as a response to feature requests from users, which means we need to wait to have users who make such requests, and you can't just compress 10 years of that happening into 10 weeks. That's to say nothing of the fact that what we work on now is a response to market and ecosystem conditions now, not conditions as they were 10 years ago. If we'd pushed what we were doing now to having done it then instead, we'd have just been working on the wrong things.
Think about what it would mean to produce cars 50x faster than using current processes. What good would that even do? The current processes already produce all the cars the world needs. Making them 50x faster wouldn't give you a larger customer base. You'd just be making things no one needs and then throwing them away. The only sensible version of this is doing the same thing at roughly the same speed but at 1/50th the cost. I don't doubt that faster code generation can cut cost but not to 1/50th. Too much of the cost in creating and running a company has nothing at all to do with output.
Show us the financial statements from the company you started in 2019 and your company today. I would be absolutely thrilled to see somebody concretely show they earn the same revenue for 1/50th the cost, or 50x the revenue for the same cost. The fact that you push 50x the number of commits or lines of code to GitHub means nothing to me.
Faster doesn't mean more money, it just means less time. The bottleneck in revenue is very rarely purely software related.
Thanks for sharing your methodology. And ignore the rude people that it might attract please.
I might start using a second LLM to review the diffs. Something like Gemini 3 Fast. Sounds good.
But I don't want to give up on a fancy IDE to use browser tabs.
So I think I will ask the second LLM to review the `git diff`.
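A minimal sketch of that second-opinion pass, assuming the Anthropic TypeScript SDK and an ANTHROPIC_API_KEY in the environment (the model name is a placeholder, and the same pattern works with any provider's client, Gemini included):

```ts
import Anthropic from "@anthropic-ai/sdk";
import { execSync } from "node:child_process";

// Grab the working-tree diff, like `git diff | mate` but automated.
const diff = execSync("git diff", { encoding: "utf8" });

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const msg = await client.messages.create({
  model: "claude-sonnet-4-5", // placeholder; pick your preferred reviewer model
  max_tokens: 2000,
  messages: [
    {
      role: "user",
      content: `Review this diff for bugs and regressions:\n\n${diff}`,
    },
  ],
});

// Print the model's review text.
for (const block of msg.content) {
  if (block.type === "text") console.log(block.text);
}
```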
Here is what it looks like, if I were to livestream it for 4 hours: https://www.youtube.com/watch?v=Yg6UFyIPYNY
With proper management of windows and screen real estate, as well as minimizing or even eliminating mouse usage, I can hypothesize a 5000x speedup due to your greater ability to orchestrate and coordinate agents at scale.
thanks. I'll take a look later
> One AI that acts as a “builder”.
Not to be a dick, but only one? I won't brag about how many I do have running, but it's more than one.
So, you effectively have 20-50 developers of your caliber doing work for you for free? You must be running your own company now.
The reality is probably simpler: you've automated a good deal of busywork that you would never have done otherwise.
I don't know. You may as well say that after reading Uncle Bob's Clean Code and adding 50 layers of indirection, you are now writing at "enterprise scale." Perhaps you even hired an Agile SCUM consultant, and now look at your velocity (at least they're measuring something)!
Use my abstract factory factories and inversion of control containers. With Haskell your entire solution is just a 20-line mapreduce in a monad transformer stack over IO. In J, it's 20 characters.
I don't see how AI differs. Rather, the last study of significance found that devs were gaslighting themselves into believing they were more productive, when the data actually bore the opposite conclusion [0].
[0] https://news.ycombinator.com/item?id=44522772
actual lol
Wait, how did this appear 3 hours ago, and also get flagged? I posted this many days ago! Something is wrong with HN timestamps.
Lots of good stuff in /newest gets missed. So, HN has an algo that selects some posts for a second chance. Looks like your post was selected for resurrection.
It has lots of potential
Great testimony. It would be great to hear about what was produced.
hope you are getting paid 50x more than 5 years ago
hope you are getting 50x more value
:) good luck you will need it :)
Anyone else just thoroughly sick of this AI bullshit? I feel like after this post OP probably went and fapped to their portrait. Lets be real here. If AI is making you "50x faster" you're probably not working on hard problems.
Surely we've reached peak hype now and it will start to get better? Surely...