> The other change is simpler: I'm doing the design work myself, by hand, before any code gets written. Not a vague doc. Concrete interfaces, message types, ownership rules.
That's the hard part of coding. If you have an architecture, writing the code is dead simple. But if you aren't writing the code, you aren't going to notice when you've architected an API that allows nulls but your database doesn't. Or it does allow them, but some other small issue you never accounted for turns up.
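A minimal sketch of the kind of mismatch that only surfaces when you're actually writing or reading the code (all names here are hypothetical, not from the article):

```ts
// Hypothetical example of the API/schema mismatch described above.
// The API contract admits null...
interface UpdateUserRequest {
  nickname: string | null; // null is supposed to mean "clear the nickname"
}

// ...but the schema was created NOT NULL:
//   CREATE TABLE users (id serial PRIMARY KEY, nickname text NOT NULL);
// Each layer typechecks on its own; the contradiction only shows up when
// a null actually reaches the UPDATE at runtime.
async function updateUser(
  db: { query(sql: string, params: unknown[]): Promise<void> },
  id: number,
  req: UpdateUserRequest,
): Promise<void> {
  // Throws at runtime whenever req.nickname is null.
  await db.query("UPDATE users SET nickname = $1 WHERE id = $2", [req.nickname, id]);
}
```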
I do not know how you can write this article and not realize the problem is the AI. Not that you let it architect, but that you weren't paying attention to every single thing it does. It's a glorified code generator. You need to be checking everything it does.
The hard part of software engineering was never writing code. Junior devs know how to write code. The hard part is everything else.
Title says
> back to writing code by hand
But what they are doing is
> doing the __design work__ myself, by hand, before any code gets written.
So... Claude still is generating the code I guess?
And seriously, I can't understand how they thought their vibe-coded project worked fine, and even bought a domain for it, without ever looking at the source code it generated, FOR 7 MONTHS??
So you're not actually writing code by hand? I'm very confused by the difference between the title and the conclusion here.
Can't you just ask the AI to break up large files into smaller ones and also explain how the code works so you can understand it, instead of starting over from scratch?
That was actually the first thing I tried. It did a good job at explaining the code base mess and the architecture. Then I ran 3-4 refactor attempts. Each one broke things in ways that were harder to debug than the original mess. The god object had so many implicit dependencies that pulling one thread unraveled something else. And each attempt burned through my daily Claude usage limit before the refactor was stable.
And I'm sure the rewrite is going to teach me a whole different set of lessons...
I'm currently working on the discovery phase of a larger refactor and have pretty quickly realized that AI can actually often be pretty useless even if you've encoded the rules in an unambiguous, programmatic way.
For example, consider a lint rule that bans Kysely queries on certain tables from existing outside of a specific folder. You'd write a rule like this in an effort to pull reads and writes on a certain domain into one place, hoping you can just hand the lint violations to your AI agent and it would split your queries into service calls as needed.
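For the curious, a minimal sketch of such a rule, assuming an ESLint 9 custom rule (the table names and folder path here are illustrative, not from any real codebase):

```ts
// Hypothetical ESLint rule: Kysely queries on guarded tables may only live
// under src/services/. Table names and the path are made up for illustration.
import type { Rule } from "eslint";

const GUARDED_TABLES = new Set(["orders", "payments"]);
const ALLOWED_PATH = /[\\/]src[\\/]services[\\/]/;
const KYSELY_ENTRYPOINTS = new Set(["selectFrom", "insertInto", "updateTable", "deleteFrom"]);

const rule: Rule.RuleModule = {
  meta: {
    type: "problem",
    messages: { banned: "Query on '{{table}}' must live in src/services/." },
    schema: [],
  },
  create(context) {
    // Files inside the allowed folder are exempt.
    if (ALLOWED_PATH.test(context.filename)) return {};
    return {
      CallExpression(node) {
        // Matches db.selectFrom("orders"), db.insertInto("payments"), etc.
        const callee = node.callee;
        if (
          callee.type !== "MemberExpression" ||
          callee.property.type !== "Identifier" ||
          !KYSELY_ENTRYPOINTS.has(callee.property.name)
        ) {
          return;
        }
        const arg = node.arguments[0];
        if (arg && arg.type === "Literal" && typeof arg.value === "string" && GUARDED_TABLES.has(arg.value)) {
          context.report({ node, messageId: "banned", data: { table: arg.value } });
        }
      },
    };
  },
};

export default rule;
```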
And at first, it will appear to have Just Worked™. You are feeling the AGI. Right up until you start to review the output carefully, because there are now little discrepancies in the rewritten queries: not distinguishing between calls to the primary vs. the replica, missing the point of a certain LIMIT or ORDER BY clause, failing to appropriately rewrite a condition or SELECT, etc. You run a few more reviewer agent passes over it, but realize your efforts are entirely in vain, because even if the reviewer agent fixes 10 or 20 or 30 of the issues, you can still never fully trust the output.
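A concrete (and entirely hypothetical) illustration of the primary/replica and LIMIT failure modes, assuming the common pattern of separate Kysely instances for primary and replica:

```ts
// Illustrative only: dbReplica, ordersService, and the schema are made up.
import { Kysely } from "kysely";

interface DB {
  orders: { id: string; user_id: string; status: string; created_at: Date };
}

declare const dbReplica: Kysely<DB>; // read-replica instance (app convention)

// Before: the call site encodes two easy-to-lose facts: this read is
// routed to the replica, and LIMIT 1 keeps it cheap.
async function latestOrder(userId: string) {
  return dbReplica
    .selectFrom("orders")
    .select(["id", "status"])
    .where("user_id", "=", userId)
    .orderBy("created_at", "desc")
    .limit(1)
    .executeTakeFirst();
}

// After a mechanical move into a service call, both facts can silently vanish:
declare const ordersService: { getOrdersForUser(userId: string): Promise<{ id: string; status: string }[]> };

async function latestOrderRewritten(userId: string) {
  const all = await ordersService.getOrdersForUser(userId); // hits primary, fetches every row
  return all[0]; // still typechecks, still "works" in review
}
```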
As someone with experience doing this kind of thing before AI, I went back to doing it the old way: using a codemod to rewrite the code automatically using a series of rules. AI can write the codemod, AI can help me evaluate the results, but having it apply all of the few hundred changes itself left me unable to trust the output. And I suspect that will continue to be true for some time.
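A sketch of what that codemod approach can look like with jscodeshift (the table and service names are placeholders; a real transform would carry more rules):

```ts
// Hypothetical jscodeshift transform: rewrite db.selectFrom("orders") call
// sites into a placeholder service call. Anything the rules don't match is
// left untouched, so it surfaces in the diff for human review -- which is
// the point of the approach.
import type { API, FileInfo } from "jscodeshift";

export default function transform(file: FileInfo, api: API): string {
  const j = api.jscodeshift;
  const root = j(file.source);

  root
    .find(j.CallExpression)
    .filter((path) => {
      const { callee, arguments: args } = path.node;
      return (
        callee.type === "MemberExpression" &&
        callee.property.type === "Identifier" &&
        callee.property.name === "selectFrom" &&
        args.length === 1 &&
        args[0].type === "StringLiteral" && // "Literal" under the default parser
        args[0].value === "orders"
      );
    })
    .forEach((path) => {
      // Deterministic, rule-by-rule: swap the query entry point for a
      // service call that a human then wires up and reviews.
      j(path).replaceWith(j.callExpression(j.identifier("findOrders"), []));
    });

  return root.toSource();
}
```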
This industry needs a "verification layer" that, as far as I know, it does not have yet. Some part of me hopes that someone will reply to this comment with a counterexample, because I could sorely use one.
A rewrite following a new architecture plan could get finished pretty quickly, treating the original as a prototype.
When people talk about codebases being "incomprehensible", it's not always hyperbole. Sometimes the architecture literally cannot be broken up or understood.
I find that really hard to believe. It's not like curing cancer.
No, but it can be a Rube Goldberg machine of insanity.
This reads too much like it was LLM generated. I can't say for sure if it was but I have an allergic reaction to the short snappy know-it-all LLM writing style.
So what you really mean is you're going to write better and more detailed skills files, so you can get an architecture that you've thought through rather than something random?
Partly, but the order matters. The CLAUDE.md constraints only work if you designed the architecture first. They're just how you communicate it to the AI. The mistake I made wasn't writing bad skills files, it was not designing anything at all and expecting the AI to make coherent structural decisions across 30 sessions.
The rewrite is me sitting down with a blank doc and drawing the boxes before any code exists. Then the CLAUDE.md enforces what I already decided. Whether that actually holds up as the project grows, I genuinely don't know yet.
Are you really saving any time using AI at all, then? If you have to write the architecture for it, write all the rules you want it to follow, check everything it's written, and then reprompt it because it's not how you want it?
Yes. I do all of this and I'd estimate 50-100% coding time savings. A lot of that comes from better multitasking over single-workstream throughput, which I suppose might compromise the gains depending on what you're doing. For me it amplifies the speedup by allowing some of my "coding time" to be spent on non-coding tasks too.
But even if coding time is reduced by half, is that worth the downsides? Coding has never really been a major percentage of my time.
> I'm rewriting k10s in Rust. Not because Rust is better, but because it's the language I can steer. I've written enough of it to feel when something's wrong before I can articulate why. That instinct is the one thing vibe-coding can't replace. The AI hands you plausible-looking code. You need a nose for when it's garbage.
Isn't Golang relatively easier to read than Rust? I was under the impression that Rust is a more complex language syntactically.
> The other change is simpler: I'm doing the design work myself, by hand, before any code gets written. Not a vague doc. Concrete interfaces, message types, ownership rules. The architecture decisions that the AI kept making wrong are now made in writing before the first prompt.
This post is good for grasping the difference between "vibe-coding" and using the AI to help with design and architectural choices made by a competent programmer (I am not saying you are not one). Lately I feel that Opus 4.7 involves the user a lot more, even when given a prompt to one-shot a particular piece of software.
Go reads fine whether the architecture is good or bad, and I couldn't tell the difference until I was in trouble. Rust is harder to read but harder to misuse. The borrow checker would have caught that data race at compile time. I've also just written more Rust. That familiarity matters separately.
+1 on Opus 4.7 involving the user a lot more. Right now I'm trying to get to a state where I can codify my design + decision preferences as agent personas and push myself out of the dev loop.
Gotcha, that implies you are going to read the code that the AI produces anyways.
> Go reads fine whether the architecture is good or bad
Were you reading the Golang code all along and got fooled, or did you review it after it failed? Sorry, I admit I didn't read the whole article.
He was NOT reading the code: "For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote."
Right, thank you. Personally I think reading all the code that the AI produces is impossible and kind of defeats the purpose of using it. The key is to devise a structured way to interact with it (skills and similar) and use extensive testing along the way to verify the work at all steps.
AI writes what you ask it to write; you need to talk to it about architecture. You should have an architecture doc so the AI can shape the code based on it, and you can get the AI to make the architecture doc too. If using Claude, you can use the software architecture mode for this.
When he mentions "I push commits at work for as long as my tokens last", I can understand that. Managing tokens has become an important skill.
You don't need to go back to coding by hand if you know how to do it already. There is a middle ground.
If you understand good software architecture, architect it. Create a markdown document just as you would if you had a team of engineers working with you and would hand off to them. Be specific.
Let the AI do the implementation of your architecture.
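For instance, the handoff doc can pin down interfaces and message types concretely before any implementation exists. A hypothetical sketch (none of these names come from the article):

```ts
// Hypothetical excerpt of a design handoff, expressed as the concrete
// interfaces and message types the implementer (human or AI) must honor.

/** Events flow one way: watcher -> store -> view. No back-channels. */
export type PodEvent =
  | { kind: "added"; name: string; namespace: string }
  | { kind: "deleted"; name: string; namespace: string };

/** The watcher owns the cluster connection; it never touches UI state. */
export interface ClusterWatcher {
  /** Returns an unsubscribe function. */
  subscribe(onEvent: (e: PodEvent) => void): () => void;
}

/** The store is the single owner of pod state; views only read snapshots. */
export interface PodStore {
  apply(e: PodEvent): void;
  snapshot(): ReadonlyArray<{ name: string; namespace: string }>;
}
```

The point is less the shapes themselves than that the ownership rules are written down before the first prompt.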
Strict SDD (spec-driven development) might help to constrain and harness the process.
"Writing code by hand" is redundant. You don't write code with AI; AI doesn't write, it generates.
Outright clickbait lie. As he states himself, he's doing the design work by hand, and will likely still use AI to write the code.
Not sure if it's just me, but this post feels AI-written?
Feels a bit too long-winded to be AI-generated.
Does ‘writing code by hand’ mean you’re not going to use compilers to generate assembly?
Now I do feel lucky that I started learning coding about four years before the LLM revolution, but these things are really just natural language compilers, aren't they? We're just in that period - the 1980s, the greybeards tell me - when companies charged thousands of dollars per compiler instance, right? And now, I myself have never paid for a compiler.
This whole investor bubble will blow up in the face of the rentier-finance capitalists and I’ll be laughing my head off while it happens.
> I learned over these 7 months
7 months ago was early November. Coding assistants were getting very good back then, but they were still significantly poorer at making good architectural decisions in my experience. They tended to just force features into the existing code base without much thought or care.
Today I've noticed assistants tend to spot architectural smells while working and will ask you whether they should try to address them, but even then they're probably never going to suggest a full refactor of the codebase (which is generally the correct heuristic).
My guess is that if you built this today with AI, you wouldn't run into so many of these problems. That's not to say you should build blind, but the first thing that stood out to me was that you started building 7 months ago, when coding assistants were only just becoming decent, and undirected they would still generally generate total slop.
have another drink and drive yourself home
This doesn't make much sense; the article itself is AI-written.
It would have been easy to run a few AI agents to review the code and find these issues as well, and to architect it clearly.