I use Cursor as my base editor + Cline as my main agentic tool. I have not tried Windsurf so alas I can't comment here but the Cursor + Cline combo works brilliantly for me:
* Cursor's Cmd-K edit-inline feature (with Claude 3.7 as my base model there) works brilliantly for "I just need this one line/method fixed/improved"
* Cursor's tab-complete (née Supermaven) is great and better than any other I've used.
* Cline w/ Gemini 2.5 is absolutely the best I've tried when it comes to full agentic workflow. I throw a paragraph describing an idea at it and it comes up with a totally workable, working plan & implementation
Fundamentally, and this may be my issue to get over and not actually real, I like that Cline is a bring-your-own-API-key system and an open source project, because their incentives are to generate the best prompt, max out the context, and get the best results (because everyone working on it wants it to work well). Cursor's incentive is to get you the best results....within their budget (of $0.05 per request for the max models and within your monthly spend/usage allotment for the others). That means they're going to try to trim context or drop things or do other clever/fancy cost-saving techniques for Cursor, Inc. That's at odds with getting the best results, even if it only provides minor friction.
Just use codex and machtiani (mct). Both are open source. Machtiani was open sourced today. Mct can find context in a haystack, and it's efficient with tokens. Its embeddings are locally generated because of its hybrid indexing and localization strategy. No file chunking. No internet, if you want to be hardcore. Use any inference provider, even local. The demo video shows solving an issue in the VS Code codebase (133,000 commits and over 8,000 files) with only Qwen 2.5 Coder 7B. But you can use anything you want, like Claude 3.7. I never max out context in my prompts - not even close.
Say I'm chatting in a git project directory `undici`. I can show you a few ways I work with codex.
1. Follow up with Codex.
`mct "fix bad response on h2 server" --model anthropic/claude-3.7-sonnet:thinking`
Machtiani will stream the answer, then also apply git patches suggested in the convo automatically.
Then I could follow up with codex.
`codex "See unstaged git changes. Run tests to make sure it works and fix and problems with the changes if necessary."
2. Codex and MCT together
`codex "$(mct 'fix bad response on h2 server' --model deepseek/deepseek-r1 --mode answer-only)"`
In this case codex will dutifully implement the changes suggested by mct, saving tokens and time.
The key for the second example is `--mode answer-only`. Without this flag, mct will itself try to apply patches. But in this case codex will do it, since mct withholds the patches when given the aforementioned flag.
3. Refer codex to the chat.
Say you did this
`mct "fix bad response on h2 server" --model gpt-4o-mini --mode chat`
Here, I used `--mode chat`, which tells mct to stream the answer and save the chat convo, but not to apply git changes (different from `--mode answer-only`).
You'll see mct print out something like
`Response saved to .machtiani/chat/fix_bad_server_response.md`
Now you can just tell codex.
`codex "See .machtiani/chat/fix_bad_server_resonse.md, and do this or that...."`
*Conclusion*
The example concepts should cover day-to-day use cases. There are other exciting workflows, but I should really post a video on that. You can do anything with the Unix philosophy!
I skipped using aider, but I heard good things. I needed to work with large, complex repos, not vibe codebases. And agents always require top-notch models that are expensive and can't run well locally. So when Codex came out, I skipped straight to that.
But mct leverages weak models well, doing things not otherwise possible. And it does even better with stronger models: it rewards stronger models, but doesn't punish smaller ones.
So basically, you can save money and do more using mct + codex. But I hear aider is a terminal tool too, so maybe try mct + aider?
Cursor does something with truncating context to save costs on their end; you don't get the same with Cline because you're paying for each transaction - so depending on complexity I find Cline works significantly better.
I still use cursor chat with agent mode though, but I've always been indecisive. Like the others said though, it's nice to see how cline behaves to assist with creating your own agentic workflows.
> Cursor does something with truncating context to save costs on their end
I have seen this mentioned but is there actually a source to back it up? I've tried Cline every now and then. While it's great, I don't find it better than Cursor (nor worse in any clear way).
Totally anecdotal of course so take this with a grain of salt, but I've seen and experienced this when Cursor chats start to get very long (eg the context starts to really fill up). It suddenly starts "forgetting" things you talked about earlier or producing code that's at odds with code it already produced. I think it's partly why they suggest but don't enforce starting a new chat when things start to really grow.
It's actually very easy to see for yourself. When the agent "looks" at a file it will say the number of lines it looks at; almost always it's the top 0-250 or 0-500, but it might depend on the model selected and whether MAX mode is utilized.
Zed. They've upped their game in the AI integration and so far it's the best one I've seen (outside of work). Cursor and VSCode+Copilot always felt slow and janky; Zed is much less janky, feels like pretty mature software, and I can just plug in my Gemini API key and use that for free/cheap instead of paying for the editor's own integration.
Overall Zed is super nice and the opposite of janky, but I still found a few of the defaults were off and Python support was still missing in a few key ways for my daily workflow.
I'll second the zed recommendation, sent from my M4 macbook. I don't know why exactly it's doing this for you but mine is idling with ~500MB RAM (about as little as you can get with a reasonably-sized Rust codebase and a language server) and 0% CPU.
I have also really appreciated something that felt much less janky, had better vim bindings, and wasn't slow to start even on a very fast computer. You can completely botch Cursor if you type really fast. On an older mid-range laptop, I ran into problems with a bunch of its auto-pair stuff of all things.
I don't think Zeta is quite up to windsurf's completion quality/speed.
I get that this would go against their business model, but maybe people would pay for this - it could in theory be the fastest completion since it would run locally.
> the fastest completion since it would run locally
We are living in a strange age where local is slower than the cloud, due to the sheer amount of compute we need to do. That compute takes hundreds of milliseconds (if not seconds) on local hardware, making 100ms of network latency irrelevant.
Even for a 7B model, your expensive Mac or 4090 can't beat, for example, a box of 8x A100s running a FOSS serving stack (sglang) with TP=8 on latency.
Running models locally is very expensive in terms of memory and scheduling requirements; maybe instead they should host their model on the Cloudflare AI network, which is distributed all around the world and can offer lower latency.
For the agentic stuff I think every solution can be hit or miss. I've tried claude code, aider, cline, cursor, zed, roo, windsurf, etc. To me it is more about using the right models for the job, which is also constantly in flux because the big players are constantly updating their models and sometimes that is good and sometimes that is bad.
But I daily drive Cursor because the main LLM feature I use is tab-complete, and here Cursor blows the competition out of the water. It understands what I want to do next about 95% of the time when I'm in the middle of something, including comprehensive multi-line/multi-file changes. Github Copilot, Zed, Windsurf, and Cody aren't at the same level imo.
Do they actually improve the model you can get without Cursor? Or does all the development in reality go into Cursor's autocomplete without making it available to Supermaven subscribers? It's hard to tell from their website and the lack of info online.
Aider! Use the editor of your choice and leave your coding assistant separate. Plus, it's open source and will stay like this, so no risk of seeing it suddenly become expensive or disappear.
I used to be religiously pro-Aider. But after a while those little frictions flicking backwards and forwards between the terminal and VS Code, and adding and dropping from the context myself, have worn down my appetite to use it. The `--watch` mode is a neat solution but harms performance. The LLM gets distracted by deleting its own comment.
I suspect that if you're a vim user those friction points are a bit different. Aider's git auto-commit and /undo command are what sell it for me at this current juncture of technology. OpenHands looks promising, though rather complex.
The (relative) simplicity is what sells aider for me (it also helps that I use neovim in tmux).
It was easy to figure out exactly what it's sending to the LLM, and I like that it does one thing at a time. I want to babysit my LLMs and those "agentic" tools that go off and do dozens of things in a loop make me feel out of control.
I like your framing about “feeling out of control”.
For the occasional frontend task, I don’t mind being out of control when using agentic tools. I guess this is the origin of Karpathy’s vibe coding moniker: you surrender to the LLM’s coding decisions.
For backend tasks, which is my bread and butter, I certainly want to know what it’s sending to the LLM so it’s just easier to use the chat interface directly.
This way I am fully in control. I can cherry pick the good bits out of whatever the LLM suggests or redo my prompt to get better suggestions.
So this part of my workflow is intentionally fairly labor intensive because it involves lots of copy-pasting between my IDE and the chat interface in a browser.
From the linked comment:
> Mandatory reminder that "agentic coding" works way worse than just using the LLM directly
just isn't true. If everything was equal, that might possibly be true, but it turns out that system prompts are quite powerful in influencing how an LLM behaves. ChatGPT with a blank user entered system prompt behaves differently (read: poorer at coding) than one with a tuned system prompt. Aider/Copilot/Windsurf/etc all have custom system prompts that make them more powerful rather than less, compared to using a raw web browser, and also don't involve the overhead of copy pasting.
Approximately how much does it cost in practice to use Aider? My understanding is that Aider itself is free, but you have to pay per token when using an API key for your LLM of choice. I can look up for myself the prices of the various LLMs, but it doesn't help much, since I have no intuition whatsoever about how many tokens I am likely to consume. The attraction of something like Zed or Cursor for me is that I just have a fixed monthly cost to worry about. I'd love to try Aider, as I suspect it suits my style of work better, but without having any idea how much it would cost me, I'm afraid of trying.
I'm using Gemini 2.5 Pro with Aider and Cline for work. I'd say when working for 8 full hours without any meetings or other interruptions, I'd hit around $2. In practice, I average at $0.50 and hit $1 once in the last weeks.
I'd be really keen to know more about what you're using it for, how you typically prompt it, and how many times you're reaching for it. I've had some success at keeping spend low but can also easily spend $4 from a single prompt so I don't tend to use tools like Aider much. I'd be much more likely to use them if I knew I could reliably keep the spend down.
I'm using VSC for most edits, tab-completion is done via Copilot, I don't use it that much though, as I find the prediction to be subpar or too wordy in case of commenting.
I use Aider for rubber-ducking and implementing small to mid-scope changes. Normally, I add the required files, change to architect or ask mode (depends on the problem I want to solve), explain what my problem is and how I want it to be solved. If the Aider answer satisfies me, I change to coding mode and allow the changes.
No magic, I have no idea how a single prompt can generate $4. I wouldn't be surprised if I'm only scratching on the surface with my approach though, maybe there is a better but more costly strategy yielding better results which I just didn't realize yet.
Huh, I didn't configure anything for saving, honestly. I just add the whole repo and do my stuff.
How do you get to $10/h? I probably couldn't even provoke this.
Yup, choose your model and pay as you go, like commodities such as rice and water. The others played games with me to minimize context and push cheaper models (3 modes, daily credits, extra charges for the most expensive models, etc.).
Also, the --watch mode is the most productive way of using your editor - no need for extra textboxes with robot faces.
FWIW, Gemini-*, which is available in Aider, isn't pay-as-you-go (PAYG) but post-paid: you get a bill at the end of the month, rather than the OpenAI/others model of charging up credits before you can use the service.
For a time windsurf was way ahead of cursor in full agentic coding, but now I hear cursor has caught up. I have yet to switch back to try out cursor again but starting to get frustrated with Windsurf being restricted to gathering context only 100-200 lines at a time.
So many of the bugs and poor results that it can introduce are simply due to improper context. When forcibly giving it the necessary context you can clearly see it’s not a model problem but it’s a problem with the approach of gathering disparate 100 line snippets at a time.
Also, it struggles with files over 800ish lines which is extremely annoying
We need some smart deepseek-like innovation in context gathering since the hardware and cost of tokens is the real bottleneck here.
Wait, are these 800 lines of code? Am I the only one seeing that as a major code smell? Assuming these are code files, the issue is not AI processing power but rather bread and butter coding practices related to file organisation and modularisation.
For daily work - neither. They basically promote the style of work where you end up with mediocre code that you don't fully understand, and with time the situation gets worse.
I get much better result by asking specific question to a model that has huge context (Gemini) and analyzing the generated code carefully. That's the opposite of the style of work you get with Cursor or Windsurf.
Is it less efficient? If you are paid by LoCs, sure. But for me the quality and long-term maintainability are far more important. And especially the Tab autocomplete feature was driving me nuts, being wrong roughly half of the time and basically just interrupting my flow.
I wrote a simple Python script that I run in any directory that gets the context I usually need and copies it to the clipboard/paste buffer. A short custom script lets you adjust to your own needs.
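A minimal sketch of that kind of script (the extensions, size cap, and `pbcopy` clipboard command are my assumptions - adjust for your project and OS):

```python
#!/usr/bin/env python3
# Rough sketch: collect the files I usually need as context and copy them
# to the clipboard. Extensions, size cap, and pbcopy (macOS) are assumptions.
import pathlib
import subprocess

EXTS = {".py", ".md", ".toml"}   # whatever you usually need
MAX_BYTES = 200_000              # crude guard against blowing up the context window

chunks = []
for path in sorted(pathlib.Path(".").rglob("*")):
    if path.is_file() and path.suffix in EXTS and ".git" not in path.parts:
        chunks.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")

blob = "\n\n".join(chunks)[:MAX_BYTES]
subprocess.run(["pbcopy"], input=blob.encode(), check=True)  # xclip/wl-copy on Linux
print(f"copied {len(blob)} bytes of context")
```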
Legal issues aside (you are the legal owner of that code or you checked with one), and provided it's small enough, just ask an LLM to write a script to do so. If the codebase is too big, you might have luck choosing the right parts. The right balance of inclusions and exclusions can work miracles here.
I’ve been using Zed Agent with GitHub Copilot’s models, but with GitHub planning to limit usage, I’m exploring alternatives.
Now I'm testing Claude Code’s $100 Max plan. It feels like magic - editing code and fixing compile errors until it builds. The downside is I’m reviewing the code a lot less since I just let the agent run.
So far, I’ve only tried it on vibe coding game development, where every model I’ve tested struggles. It says “I rewrote X to be more robust and fixed the bug you mentioned,” yet the bug still remains.
I suspect it will work better for backend web development I do for work: write a failing unit test, then ask the agent to implement the feature and make the test pass.
Also, give Zed’s Edit Predictions a try. When refactoring, I often just keep hitting Tab to accept suggestions throughout the file.
It feels like magic when it works, and it at least gets the code to compile. Other models* would usually return broken code, especially when using a new release of a library. All the models use the old function signatures, but Claude Code then sees the compile error and fixes it.
Compared to Zed Agent, Claude Code is:
- Better at editing files. Zed would sometimes return the file content in the chatbox instead of updating it. Zed Agent also inserted a new function in the middle of the existing function.
- Better at running tests/compiling. Zed struggled with the nix environment and I don't remember it going into the update code -> run code -> update code feedback loop.
With this you can leave Claude Code alone for a few minutes, check back and give additional instructions. With Zed Agent it was more of a constantly monitoring / copy pasting and manually verifying everything.
*I haven't tested many of the other tools mentioned here, this is mostly my experience with Zed and copy/pasting code to AI.
I plan to test other tools when my Claude Code subscription expires next month.
Zed's agentic editing with Claude 3.7 + thinking does what you're describing testing out with the $100 Claude Code tool. Why leave the Zed editor and pay more to do something you can run for free/cheap within it instead?
I'm with Cursor for the simple reason it is in practice unlimited. Honestly the slow requests after 500 per month are fast enough. Will I stay with Cursor? No, I'll switch the second something better comes along.
20€ seems totally subsidized considering the amount of tokens. Pricing cheaply to be competitive but users will jump to the next one when they inevitably hike the price up.
Or when it arbitrarily decides to rewrite half the content on your website and not mention it.
Or, my favorite: when you’ve been zeroing in on something actually interesting and it says at the last minute, “let’s simplify our approach”. It then proceeds to rip out all the code you’ve written for the last 15 minutes and insert a trivial simulacrum of the feature you’ve been working on that does 2% of what you originally specified.
$5 to anyone who can share a rules.md file that consistently guides Sonnet 3.7 to give up and hand back control when it has no idea what it’s doing, rather than churn hopelessly and begin slicing out nearby unrelated code like it’s trying to cut out margins around a melanoma.
I wish it was unlimited for me. I got 500 fast requests, about 500 slow requests, then at some point it started some kind of exponential backoff, and became unbearably slow. 60+ second hangs with every prompt, at least, sometimes 5 minutes. I used that period to try out windsurf, vscode copilot, etc and found they weren't as good. Finally the month refreshed and I'm back to fast requests. I'm hoping they get the capacity to actually become usably unlimited.
Cursor is acceptable because for the price it's unbeatable. Free, unlimited requests are great. But by itself, Cursor is not anything special. It's only interesting because they pay Claude or Gemini from their pockets.
Ideally, things like RooCode + Claude are much better, but you need infinite money glitch.
I built a minimal agentic framework (with editing capability) that works for a lot of my tasks with just seven tools: read, write, diff, browse, command, ask and think.
One thing I'm proud of is the ability to have it be more proactive in making changes and taking next action by just disabling the `ask` tool.
I won't say it is better than any of the VSCode forks, but it works for 70% of my tasks in an understandable manner. As for the remaining stuff, I can always use Cursor/Windsurf in a complementary manner.
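For a sense of scale, here's a toy sketch of how such a seven-tool setup can be wired up; the tool-call format and implementations are my own assumptions, not the actual framework:

```python
# Toy sketch of a seven-tool agent: read, write, diff, browse, command, ask, think.
import difflib
import pathlib
import subprocess
import urllib.request

def read(path): return pathlib.Path(path).read_text()
def write(path, content): pathlib.Path(path).write_text(content); return f"wrote {path}"
def diff(a, b): return "\n".join(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm=""))
def browse(url): return urllib.request.urlopen(url).read().decode(errors="ignore")[:5000]
def command(cmd): return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
def ask(question): return input(f"[agent asks] {question}\n> ")
def think(note): return note   # scratchpad; the model reasons to itself

TOOLS = {"read": read, "write": write, "diff": diff, "browse": browse,
         "command": command, "ask": ask, "think": think}

def dispatch(call, proactive=False):
    """call: {"name": ..., "args": {...}} parsed from the model's output.
    Disabling "ask" (proactive=True) forces the agent to act on its own."""
    if proactive and call["name"] == "ask":
        return "ask tool disabled; decide and proceed"
    return TOOLS[call["name"]](**call["args"])
```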
I love vim, but I am playing with this stuff too...
There are a couple of neovim projects that allow this... Avante comes to mind right now.
I will say this: it is a different thought process to get an LLM to write code for you. And right now, the biggest issue for me is the interface. It is wrong somehow; my attention isn't being directed to the most important part of what is going on...
Cursor: Autocomplete is really good. At the time I compared them, it was without a doubt better than GitHub Copilot autocomplete. Cmd-K - insert/edit snippet at cursor - is good when you use good old Sonnet 3.5.
Agent mode is, honestly, quite disappointing; it doesn't feel like they put a lot of thought into prompting and wrapping LLM calls. Sometimes it just fails to submit code changes, which is especially bad as they charge you for every request. Also I think they over-charge for Gemini, and the Gemini integration is especially poor.
My reference for agent mode is Claude Code. It's far from perfect, but it uses sub-tasks and summarization using smaller haiku model. That feels way more like a coherent solution compared to Cursor. Also Aider ain't bad when you're OK with more manual process.
Windsurf: Have only used it briefly, but agent mode seems somewhat better thought out. For example, they present possible next steps as buttons. Some reviews say it's even more expensive than Cursor in agent mode.
Also something to consider: I have a script I wrote myself which just feeds selected files as context to an LLM and then either writes the response to stdout or extracts a file out of it.
That often seems to be better than using Cursor. I don't really understand why it calls tools when I've selected an entire file to be used as context - tool calls seem to be an unnecessary distraction in this case, and they make requests more expensive. Also, Gemini is less neurotic when I use it with very basic prompts -- either Cursor's prompts make it worse, or the need to juggle tool calls distracts it.
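For reference, the core of that kind of script is only a few lines. A hedged sketch (model name, env var, and the fenced-block extraction heuristic are assumptions):

```python
# Sketch of "feed selected files to an LLM, print the answer or extract a file".
import os
import re
import sys

from openai import OpenAI  # pip install openai

paths = sys.argv[1:]
context = "\n\n".join(f"--- {p} ---\n{open(p).read()}" for p in paths)
task = input("task> ")

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.chat.completions.create(
    model="gpt-4.1",  # placeholder model
    messages=[{"role": "user", "content": f"{context}\n\nTask: {task}"}],
)
answer = resp.choices[0].message.content

# If the reply contains a fenced code block, emit just that (e.g. to redirect
# into a file); otherwise print the whole response to stdout.
blocks = re.findall(r"```[^\n]*\n(.*?)```", answer, re.S)
sys.stdout.write(blocks[0] if blocks else answer)
```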
Since this topic is closely related to my new project, I’d love to hear your opinion on it.
I’m thinking of building an AI IDE that helps engineers write production quality code quickly when working with AI. The core idea is to introduce a new kind of collaboration workflow.
You start with the same kind of prompt, like “I want to build this feature...”, but instead of the model making changes right away, it proposes an architecture for what it plans to do, shown from a bird’s-eye view in the 2D canvas.
You collaborate with the AI on this architecture to ensure everything is built the way you want. You’re setting up data flows, structure, and validation checks. Once you’re satisfied with the design, you hit play, and the model writes the code.
I quite liked the video. Hope you get to launch the product and I could try it out some day.
The only thing I kept thinking about was: if a correction is needed, you have to make it fully by hand - find everything and map it. However, if the first try was way off, I would like to enter the correction I want from a "midpoint". So instead of fixing 50%, I would be left with maybe 10 or 20. Don't know if you get what I mean.
Yes, the idea is to ‘speak/write’ to the local model to fix those little things so you don’t have to do them by hand. I actually already have a fine-tuned Qwen model running on Apple’s MLX to handle some of that, but given the hard YC deadline, it didn’t make it into the demo.
Eventually, you’d say, ‘add an additional layer, TopicsController, between those two files,’ and the local model would do it quickly without a problem, since it doesn’t involve complicated code generation. You’d only use powerful remote models at the end.
The video was a good intro to the concept. As long as it has repeatable memory for the corrections shown in the video, then the answer to your question about being adopted is “yes!”
Recently, Augment Code. But more generally, the "leader" switches so frequently at this point, I don't commit to use either and switch more or less freely from one to another. It helps to have monthly subscriptions and free cancellation policy.
I expect, or hope for, more stability in the future, but so far, from aider to Copilot, to Claude Code, to Cursor/Windsurf/Augment, almost all of them improve (or at least change) fast and seem to borrow ideas from each other too, so any leader is temporary.
Windsurf at the moment. It can now run multiple "flows" in parallel, so I can set one cascade off to look into a bug somewhere while another cascade implements a feature elsewhere in the code base. The LLMs spit out their tokens in the background, and I drop in eventually to review and accept or ask for further changes.
Until you change the model in one of the tabs and all other tabs (and editor instances!) get the model changed, stop what they're doing, lose context, etc. There is also a bug where if you have two editors working on two codebases they get lost and start working on the same thing; I suppose there is some kind of background workspace that gets mixed up.
I feel a bit out of place here, as I’m not a dev… I come from the operational side, but do all my work in Puppet code. I was using Codeium + VSC and life was wonderful. One day, though, everything updated and Codeium was gone in favor of Windsurf and things got crazy. VSC no longer understood Puppet code and didn’t seem to be able to access the language tools from the native Puppet Development Kit plugins either.
The crazy part is my Vim setup has the Codeium plugins all still in place, and it works perfectly. I’m afraid if I update the plugin to a windsurf variant, it will completely “forget” about Puppet, its syntax, and everything it has “learned” from my daily workflow over the last couple years.
VS Code with GitHub Copilot works great, though they are usually a little late to add features compared to Cursor or Windsurf. I use the 'Edit' feature the most.
Windsurf I think has more features, but I find it slower compared to others.
Cursor is pretty fast, and I like how it automatically suggests completion even when moving my cursor to a line of code. (Unlike others where you need to 'trigger' it by typing text first.)
Honorable mention: Supermaven. It was the first and fastest AI autocomplete I used. But it's no longer updated since they were acquired by Cursor.
90% of their features could fit inside a VS Code extension.
There are already a few popular open-source extensions doing 90%+ of what Cursor is doing - Cline, Roo Code (a fork of Cline), Kilo Code (a fork of Roo Code and something I help maintain).
Since installing entirely new software is just downloading Cursor.AppImage from the official website and double-clicking on it, it's not a large hassle for most users.
If you're on Arch, there's even an AUR package, so it's even less steps than that.
AppImages aren't sandboxed and they can access the rest of the system just fine. After all, they're just a regular SquashFS directory that get mounted into a /tmp mount and then executed from there.
OP probably means to keep using vscode. Honestly, best thing you can do is just try each for a few weeks. Feature comparison tables only say so much, particularly because the terminology is still in a state of flux.
I’ve personally never felt at home in vscode. If you’re open to switching, definitely check out Zed, as others are suggesting.
From my experience Claude 3.7 seems to make less mistakes when you provide detailed specs. It saves a lot of back and forth, especially at the beginning, when you're trying to 'de-noise' the idea. I find that the best way is using a hybrid approach: going between iterating over specs to iterating over code, and once I have the MVP I update the specs one last time. I pass these specs to another session and continue working. It solves multiple problems at once: context getting full, LLM getting lost etc.
Also, keep in mind that you can continue iterating over code while you keep the specs up to date. You can then give the final specs to another LLM to resume work or produce another version for a different platform or using a different framework. I do need to update the article to clarify this.
It sounds slow and expensive, but it ended up saving me a lot of tokens and time. Especially when you want to target multiple platforms at once. For me it was liberating, absolutely night and day.
Regardless of which, my favorite model is ChatGPT's. I feel they're the only ones talking to customers. The other models are not as pleasant to work with as a software engineer.
Cursor has for me had the best UX and results until now. Trae's way of adding context is way too annoying. Windsurf has minor UI issues all over. Options that are extensions in VSCode don't cut it in terms of providing fantastic UI/UX because the API doesn't support it.
I really like Zed. Have not tried any of the mentioned by op.
I feel like Zed is getting to a place where it can replace Sublime Text completely (but it's not there yet).
AI aided development has first class support in Zed.
I.e. it's not a "plugin" but a built-in ecosystem developed by the core team.
Speed of iterations on new features is quite impressive.
Their latest agentic editing update basically brought claude code cli to the editor.
Most corporations don't have direct access to arbitrary LLMs, but through Microsoft's GitHub Copilot they do – and you can use models through Copilot and other providers like Ollama – which is great for work.
With their expertise (the team behind pioneering tech like Electron, Atom, Teletype, and Tree-sitter, building their own GPU-based cross-platform UI, etc.) and velocity, it seems they're positioned to outpace the competition.
Personally I'd say that their tech is maybe two orders of magnitude more valuable than windsurf?
I don't dispute that Zed is great, and I actually use it myself, but it's an editor first and foremost. The OP, to me at least, seems to be asking more about the AI agent comparisons.
Yes, very observant: modified forks with their agents built in. Zed does not have any built in, Sublime does not have agents built in, but if you like you can continue this disingenuous discussion.
Zed has it built in, it's called "agentic editing" [0] and behaves like claude code cli and other agents – mcp based editing, iterating on tests until they pass etc. – where you leave it in a background window and can do something else waiting for completion notification or you can follow it to see what changes it is doing.
It's not only that they have it built in but it seems to be currently the best open replacement for tools like claude code cli because you can use arbitrary llm with it, ie. from ollama and you have great extension points (mcp servers, rules, slash commands etc).
Agentic editing was released recently, yes; LLM integration was there for much longer. It supported editing, but it was more manual – the context of the conversation from chat was basically available in inline editing, so you could edit code based on LLM output, but it was a more manual process. Now it's agentic.
Thank you for your kind offer, I shall take you up on it. Zed does have it built in. Now, please continue your disingenuous conversation by repeatedly claiming something that is demonstrably not true.
Nah, I can see a losing argument when I see one. In my eyes, the OP was asking about the LLM/AI side and not the editor. But okay, zed now has one built in. I know now.
Personally, I've been using Cursor since day 1. Lately with Gemini 2.5 Pro. I've also started experimenting with Zed and local models served via ollama in the last couple of days. Unfortunately, without good results so far.
Personally, if you take the time to configure it well, I think Aider is vastly superior. You can have 4 terminals open in a grid and be running agentic coding workflows on them and 4x the throughput of someone in Cursor, whereas Cursor's UI isn't really amenable to running a bunch of instances and managing them all simultaneously. That plus Aider lets you do more complex automated Gen -> Typecheck -> Lint -> Test workflows with automated fixing.
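As a rough illustration of what I mean by those automated workflows - a hedged sketch where the check commands are placeholders and the aider flags (--yes, --message) should be verified against `aider --help` for your version:

```python
# Rough sketch of a gen -> typecheck -> lint -> test loop driving aider non-interactively.
import subprocess

CHECKS = ["mypy .", "ruff check .", "pytest -q"]   # placeholders: typecheck, lint, test

def run(cmd):
    p = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return p.returncode, p.stdout + p.stderr

def agentic_round(task, max_rounds=5):
    prompt = task
    for _ in range(max_rounds):
        subprocess.run(["aider", "--yes", "--message", prompt])  # one non-interactive edit pass
        failures = []
        for cmd in CHECKS:
            code, out = run(cmd)
            if code != 0:
                failures.append(f"$ {cmd}\n{out}")
        if not failures:
            return True                               # everything green, stop
        prompt = "Fix these failures:\n\n" + "\n\n".join(failures)
    return False
```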
You can, but Aider is designed to work in a console and be interacted with through limited screen real estate, whereas cursor is designed to be interacted with through a full screen IDE. Besides the resource consumption issue, Cursor's manual prompts are hard to interact with when the window is tiny because it wants to try and pop up source file windows and display diffs in an editor pane, for instance.
When we're managing 10-20 AI coding agents to get work done, the interface for each is going to need to be minimal. A lot of Cursor's functionality is going to be vestigial at that point; as a tool it only makes sense as a gap-bridger for people who are still attached to manual coding.
I use the Windsurf Cascade plugin in JetBrains IDEs. My current flow is I rough in the outline of what I want and then generally use the plugin to improve what I have by creating tests, making performance improvements, or just making things more idiomatic. I need to invest time to add rules at both a global and project level, which should make the overall experience even better.
I've been flipping between the two, and overall I've found Cursor to be the better experience. Its autocomplete feels better at 'looking ahead' and figuring out what I might be doing next, while Windsurf tends to focus more on repeating whatever I've done recently.
Also, while Windsurf has more project awareness, and it's better at integrating things across files, actually trying to get it to read enough context to do so intelligently is like pulling teeth. Presumably this is a resource-saving measure but it often ends up taking more tokens when it needs to be redone.
Overall Cursor 'just works' better IME. They both have free trials though so there's little reason not to try both and make a decision yourself. Also, Windsurf's pricing is lower (and they have a generous free tier) so if you're on a tight budget it's a good option.
I've had trials for both running and tested both on the same codebases.
Cursor works roughly how I've expected. It reads files and either gets it right or wrong in agent mode.
Windsurf seems restricted to reading files 50 lines at a time, and often will stop after 200 lines [0]. When dealing with existing code I've been getting poorer results than Cursor.
As to autocomplete: perhaps I haven't set up either properly (for PHP) but the autocomplete in both is good for pattern matching changes I make, and terrible for anything that require knowledge of what methods an object has, the parameters a method takes etc. They both hallucinate wildly, and so I end up doing bits of editing in Cursor/Windsurf and having the same project open in PhpStorm and making use of its intellisense.
I'm coming to the end of both trials and the AI isn't adding enough over Jetbrains PhpStorm's built in features, so I'm going back to that until I figure out how to reduce hallucinations.
> I am Claude, an AI assistant created by Anthropic. In this interface, I'm operating as "Junie," a helpful assistant designed to explore codebases and answer questions about projects. I'm built on Anthropic's large language model technology, specifically the Claude model family.
Jetbrains wider AI tools let you choose the model that gets used but as far as I can tell Junie doesn't. That said, it works great.
Amazon Q. Claude Code is great (the best imho, what everything else measures against right now), and Amazon Q seems almost as good and for the first week I've been using it I'm still on the free tier.
The flat pricing of Claude Code seems tempting, but it's probably still cheaper for me to go with usage pricing. I feel like loading my Anthropic account with the minimum of $5 each time would last me 2-3 days depending on usage. Some days it wouldn't last even a day.
I'll probably give OpenAI's Codex a try soon, and also circle back to Aider after not using it for a few months.
I don't know if I misunderstand something with Cursor or Copilot. It seems so much easier to use Claude Code than Cursor, as Claude Code has many more tools for figuring things out. Cursor also required me to add files to the context, which I thought it should 'figure out' on its own.
> I don't know if I misunderstand something with Cursor or Copilot. It seems so much easier to use Claude Code than Cursor, as Claude Code has many more tools for figuring things out. Cursor also required me to add files to the context, which I thought it should 'figure out' on its own.
Cursor can find files on its own. But if you point it in the right direction it has far better results than Claude code.
It went through multiple stages of upgrades and I would say at this stage it is better than Copilot. Fundamentally it is as good as Cursor or Windsurf but lacks some features and cannot match their speed of release. If you're on AWS though, it's a compelling offering.
I remember asking Amazon Q something and it wouldn't reply because of a security policy or something. It was, as far as I can remember, a legit question about an IAM policy I was trying to configure. I went back to Google search and figured it out.
It might seem contrary to the current trend, but I've recently returned to using nvim as my daily driver after years with VS Code. This shift wasn't due to resource limitations but rather the unnecessary strain from agentic features consuming high amounts of resources.
I wish your own coding would just be augmented, like somebody looking over your shoulder - basically somebody helping you figure out stuff faster, update documentation, etc. The problem with current AI coding is that you don't know your code base anymore.
> Neither? I'm surprised nobody has said it yet. I turned off AI autocomplete ...
This represents one group of developers and is certainly valid for that group. To each their own
For another group, where I belong, AI is a great companion! We can handle the noise and development speed is improved as well as the overall experience.
I prefer VSCode and GitHub copilot. My opinion is this combo will eventually eat all the rest, but that's besides the point.
Agent mode could be faster - sometimes its thinking is rather slow - but it's not a big deal. This mode is all I use these days. Integration with the code base is a huge part of the great experience.
The only thing preventing me from keeping on with Cursor/Windsurf is the lack of a sync feature. I use different machines and it's crucial to keep the same exact configuration on each of them :(
I evaluated Windsurf at a friend's recommendation around half a year ago and found that it could not produce any useful behaviors on files above a thousand lines or so. I understand this is mostly a property of the model, but certainly also a property of the approach used by the editor of just tossing the entire file in, yeah? I haven't tried any of these products since then, but it might be worth another shot because Gemini might be able to handle these files.
I’ve only played with Junie and Aider so far and like the approach the Junie agent takes of reading the code to understand it vs the Aider approach of using the repo map to understand the codebase.
I find Windsurf almost unusable. It’s hard to explain but the same models in Zed, Windsurf, Copilot and Cursor produce drastically worse code in Windsurf for whatever reason. The agent also tends to do way more stupid things at random, like wanting to call MCP tools, creating new files, then suddenly forgetting how to use tools and apologizing a dozen times
I can’t really explain or prove it, but it was noticeable enough to me that I canceled my subscription and left Windsurf
Maybe a prompting or setting issue? Too high temperature?
Nowadays Copilot got good enough for me that it became my daily driver. I also like that I can use my Copilot subscription in different places like Zed, Aider, Xcode
I have also observed this with Gemini 2.5 in Cursor vs Windsurf.
Cline seems to be the best thing, I suspect because it doesn't do any dirty tricks with trimming down the context under the hood to keep the costs down. But for the same reason, it's not exactly fun to watch the token/$ counter ticking as it works.
Recently started using Cursor for adding a new feature on a small codebase for work, after a couple of years where I didn't code. It took me a couple of tries to figure out how to work with the tool effectively, but it worked great! I'm now learning how to use it with TaskMaster; it's such a different way to do and play with software. Oh, one important note: I went with Cursor also because of the pricing; despite being confusing in terms of fast vs slow requests, it feels less consumption-based.
VSCode with auto complete and Gemini 2.5 Pro in a standalone chat (pick any interface that works for you, eg librechat, vertex etc). The agents-in-a-IDE experience is hella slow in my opinion.
If you have any intellectual property worth protecting or need to comply with HIPAA, a completely local installation of Cline or Aider or Codeium with LM Studio with Qwen or DeepSeek Coder works well. If you'd rather not bother, I don't see any alternative to GitHub Copilot for Business. Sure, they're slower to catch up to Cursor, but catch up they will.
Neither! Neovim for most of my work and vscode w/ appropriate plugins when it's needed. If I need any LLM assistance I just run Claude Code in the terminal.
Using Windsurf since the start and I am satisfied. Didn't look beyond it. Focused on actually doing the coding. It's impossible to keep up with daily AI news and if something groundbreaking happens it will go viral.
I think with the answer, each responder should include their level of coding proficiency. Or, at least whether they are able to (or even bother to) read the code that the tool generates. Preferences would vary wildly based on it.
I just use Copilot (across VS Code, VS etc), it lets you pick the model you want and it's a fixed monthly cost (and there is a free tier). They have most of the core features of these other tools now.
Cursor, Windsurf et al have no "moat" (in startup speak), in that a sufficiently resourced organization (e.g. Microsoft) can just copy anything they do well.
VS code/Copilot has millions of users, cursor etc have hundreds of thousands of users. Google claims to have "hundreds of millions" of users but we can be pretty sure that they are quoting numbers for their search product.
Windsurf. The context awareness is superior compared to cursor. It falls over less and is better at retrieving relevant snippets of code. The premium plan is cheaper too, which is a nice touch.
I've been using Cursor for several months. Absolutely hate Agent mode - it jumps too far ahead, and its solutions, though valid, can overcomplicate the whole flow. It can also give you a whole bunch of code that you have to accept blindly will work, and is not super great at making good file layouts, etc. I've switched to autocomplete with the ask mode when I'm stuck.
Cursor is good for basic stuff but Windsurf consistently solves issues Cursor fails on even after 40+ mins of retries and prompting changes.
Cursor is very lazy about looking beyond the current context, or even at context at all; sometimes it feels like it's trying to one-shot a guess without looking deeper.
Bad thing about Windsurf is the plans are pretty limited and the unlimited “cascade base” feels dumb the times I used it so ultimately I use Cursor until I hit a wall then switch to Windsurf.
Cursor. Good price, the predictive next edit is great, good enough with big code bases, and with the auto mode I don't even spend all my premium requests.
I've tried VScode with Copilot a couple of times and it's frustrating: you have to point out individual files for edits, and project-wide requests are a pain.
My only pain is the workflow for developing mobile apps where I have to switch back and forth between Android Studio and Xcode as vscode extensions for mobile are not so good
I tested Windsurf last week; it installed all dependencies into my global Python... it didn't know best practices for Python and didn't create any virtual env... I am disappointed. My Cursor experience was slightly better. Still, one issue I had was how to make sure it does not change the part of code I don't want it to change. Every time you ask it to do something for A, it rewrites B in the process, very annoying.
For advanced autocomplete (not code generation, though it can do that too), basic planning, looking things up instead of web search, review & summary, even one-shotting smaller scripts, the 32B Q4 models have proved very good for me (24GB VRAM RTX 3090). All LLM caveats still apply, of course. Note that setting up a local LLM in Cursor is a pain because they don't support localhost; ngrok, or a VPS and reverse SSH, solve that though.
It's not so much that it's slow, it's that local models are still a far cry from what SOTA cloud LLM providers offer. Depending on what you're actually doing, a local model might be good enough.
I am retired now, out of the game, but I also suggest an alternative: running locally with open-codex, Ollama, and the qwen3 models and gemma3, and when necessary use something hosted like Gemini 2.5 Pro without an IDE.
I like to strike a balance between coding from scratch and using AI.
That's the fun part, you can use all of them! And you don't need to use browser plugins or console scripts to auto-retry failures (there aren't any) or queue up a ton of tasks overnight.
Have a 3950X w/ 32GB ram, Radeon VII & 6900XT sitting in the closet hosting smaller models then a 5800X3D/128GB/7900XTX as my main machine.
Most any quantized model that fits in half of the vram of a single gpu (and ideally supports flash attention, optionally speculative decoding) will give you far faster autocompletes. This is especially the case with the Radeon VII thanks to the memory bandwidth.
And I get fast enough autocomplete results for it to be useful. I have an NVIDIA RTX 4060 in a laptop with 8 gigs of dedicated memory that I use for it. I still use Claude for chat (pair programming) though, and I don't really use agents.
While I haven't used Windsurf, I've been using Cursor and I LOVE it: especially the inline autocomplete is like reading my mind and making the work MUCH faster.
I can't say anything about Windsurf (as I haven't tried yet) but I can confidently say Cursor is great.
Currently using cursor. I've found cursor even without the AI features to be a more responsive VS Code. I've found the AI features to be particularly useful when I contain the blast radius to a unit of work.
If I am continuously able to break down my work into smaller pieces and build a tight testing loop, it does help me be more productive.
I'm not sure the answer matters so much. My guess is that as soon as one of them gains any notable advantage over the other, the other will copy it as quickly as possible. They're using the same models under the hood.
I use Windsurf but it's been having ridiculous downtime lately.
I can't use Cursor because I don't use Ubuntu which is what their Linux packages are compiled against and they don't run on my non-Ubuntu distro of choice.
Claude Code is the best so far; I am using the $200 plan. In terms of feature matrix, all tools are almost the same, with some hits and misses, but speed is where Claude Code wins.
Do you think you use more than $200 worth of API credits in a month? I've used both Claude Code and Cursor, and I find myself liking the terminal CLI, but the price is much more than $20 per month for me.
The best thing about Cursor is that for $20 you basically get unlimited requests. I know you get "slower" after a certain amount of requests, but honestly you don't feel it being slow, and reasoning models take so long to answer anyway that you send the prompt and go do other stuff. So I don't think the slowness matters - basically unlimited compute, you know?
the age of swearing allegiance to a particular IDE/AI tool is over. I keep switching between Cursor and GH Copilot and for the most part they are very similar offerings. Then there's v0, Claude (for its Artifacts feature) and Cline which I use quite regularly for different requirements.
I agree with /u/welder. Preferably neither. Both of these are custom and run the risk of being acquired and enshittified in the future.
If you are using VScode, get familiar with cline. Aider is also excellent if you don’t want to modify your IDE.
Additionally, JetBrains IDEs now also have built-in local LLMs and their auto-complete is actually fast and decent. They also added a new chat sidepanel in a recent update.
The goal is NOT to change your workflow or dev env, but to integrate these tools into your existing flow, despite what the narrative says.
I’ve really been enjoying the combination of CodeCompanion with Gemini 2.5 for chat, Copilot for completion, and Claude Code/OpenAI Codex for agentic workflows.
I had always wanted to get comfortable with Vim, but it never seemed worth the time commitment, especially with how much I’ve been using AI tools since 2021 when Copilot went into beta. But recently I became so frustrated by Cursor’s bugs and tab completion performance regressions that I disabled completions, and started checking out alternatives.
This particular combination of plugins has done a nice job of mostly replicating the Cursor functionality I used routinely. Some areas are more pleasant to use, some are a bit worse, but it’s nice overall. And I mostly get to use my own API keys and control the prompts and when things change.
I still need to try out Zed’s new features, but I’ve been enjoying daily driving this setup a lot.
Lately I switched to using a triple monitor setup and coding with both Cursor and Windsurf. Basically, the middle monitor has my web browser that shows the front-end I'm building. The left monitor has Cursor, and right one has Windsurf. I start coding with Cursor first because I'm more familiar with its interface, then I ask Windsurf to check if the code is good. If it is, then I commit. Once I'm done coding a feature, I'll also open VScode in the middle monitor, with Cline installed, and I will ask it to check the code again to make sure it's perfect.
I think people who ask the "either or" question are missing the point. We're supposed to use all the AI tools, not one or two of them.
Why not just write a script that does this but with all of the model providers and requests multiple completions from each? Why have a whole ass editor open just for code review?
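Something along these lines - a sketch assuming OpenAI-compatible endpoints, with the base URLs, model names, and env vars as placeholders for whatever providers you actually use:

```python
# Sketch of the "ask every provider for several completions" idea.
import os

from openai import OpenAI  # pip install openai

PROVIDERS = [
    {"base_url": "https://api.openai.com/v1", "key": os.environ.get("OPENAI_API_KEY"), "model": "gpt-4.1"},
    {"base_url": "https://openrouter.ai/api/v1", "key": os.environ.get("OPENROUTER_API_KEY"), "model": "anthropic/claude-3.7-sonnet"},
]

def review(prompt, samples=2):
    results = []
    for p in PROVIDERS:
        client = OpenAI(base_url=p["base_url"], api_key=p["key"])
        for _ in range(samples):
            resp = client.chat.completions.create(
                model=p["model"],
                messages=[{"role": "user", "content": prompt}],
            )
            results.append((p["model"], resp.choices[0].message.content))
    return results

# Example: have every model review the same diff.
for model, answer in review("Review this diff for bugs:\n" + open("change.diff").read()):
    print(f"===== {model} =====\n{answer}\n")
```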
I'm finding I increasingly produce entire changesets without opening an editor: just `claude code`, or my own cobbled-together version of `claude code`, and `git diff` to preview what's happening. For me, the future of these tools isn't "inside" a text editor. If you want to poke around, my “cobbled‑together Claude Code” lives here: https://github.com/cablehead/gpt2099.nu
It's not just an "editor". Both Windsurf and Cursor do some tricks with context so that the underlying LLM doesn't get confused. Besides, writing a script sounds hard, no need to spend the extra energy when you can simply open a tool. Anyway, that's how I code, feel free to do whatever you prefer.
I use neovim now, after getting tired of the feature creep and the constant chasing of shiny new features.
AI is not useful when it does the thinking for you. It's just advanced snippets at that point. I only use LLMs to explain things or to clarify a topic that doesn't make sense right away to me. That's when it shows its real strength.
I am using both. Windsurf feels more complete and less clunky. They are very close though, and the pace of major updates is crazy.
I don't like CLI-based tools for coding. I don't understand why they are being shilled. Claude Code is maybe better at coding from scratch because it is just raw power, eating tokens like there is no tomorrow, but it is the wrong interface to build anything serious.
Neither? I'm surprised nobody has said it yet. I turned off AI autocomplete, and sometimes use the chat to debug or generate simple code but only when I prompt it to. Continuous autocomplete is just annoying and slows me down.
All this IDE churn makes me glad to have settled on Emacs a decade ago. I have adopted LLMs into my workflow via the excellent gptel, which stays out of my way but is there when I need it. I couldn't imagine switching to another editor because of some fancy LLM integration I have no control over. I have tried Cursor and VS Codium with extensions, and wasn't impressed. I'd rather use an "inferior" editor that's going to continue to work exactly how I want 50 years from now.
Emacs and Vim are editors for a lifetime. Very few software projects have that longevity and reliability. If a tool is instrumental to the work that you do, those features should be your highest priority. Not whether it works well with the latest tech trends.
Ironically LLMs have made Emacs even more relevant. The model LLMs use (text) happens to match up with how Emacs represents everything (text in buffers). This opens up Emacs to becoming the agentic editor par excellence. Just imagine: some macro magic around a defcommand and voila, the agent can do exactly what a user can. If only such a project had funding like Cursor does...
Nothing could be worse for the modern Emacs ecosystem than for the tech industry finance vampires ("VCs," "LPs") to decide there's blood enough there to suck.
Fortunately, alien space magic seems immune, so far at least. I assume they do not like the taste, and no wonder.
Why should the Emacs community care whether someone decides to build a custom editor with AI features? If anything this would bring more interest and development into the ecosystem, which everyone would benefit from. Anyone not interested can simply ignore it, as we do for any other feature someone implements into their workflow.
Elnode should make this very easy, given the triviality of the MCP "protocol."
I would take care. Emacs has no internal boundaries by design and it comes with the ability to access files and execute commands on remote systems using your configured SSH credentials. Handing the keys to an enthusiastically helpy and somewhat cracked robot might prove so bad an idea you barely even have time to put your feet up on the dash before you go sailing through the windshield.
Yeah... I guess it's too niche. Scratch your own itch + FOSS it so the low hundreds of us can have fun or something.
I was exploring using andyk/ht, discussed on HN a few months back, to sit as a proxy my LLM can call while I control it via xterm.js, but I need to figure out how to train the LLM to output keybindings/special keys etc. A promising start nonetheless: I can indeed parse a lot more info than just a command. Just imagine if AI could use all of the shell auto-complete features and have that fed into it...
Maybe I should revisit/clean up that repo and make it public. It feels like with just some training data on special key bindings etc., an LLM should be able to type - even if char by char - faster than a human, to control TUIs.
I'm not sure why you were downvoted. You're right that buffers and everything being programmable makes Emacs an ideal choice for building an AI-first editor. Whether that's something that a typical Emacs user wants is a separate issue, but someone could certainly build a polished experience if they had the resources and motivation. Essentially every Emacs setup is someone's custom editor, and AI features are not different from any other customization.
Emacs diff tools alone is a reason to use the editor. I switch between macOS, Linux, and Windows frequently so settled on Emacs and happy with that choice as well.
I’ve been using Aidermacs to access Aider in Emacs and it works quite well and makes lots of LLMs available. Claude Sonnet 3.7 has been reasonable for code generation, though there are certainly tasks that it seems to struggle on.
Cursor/Windsurf and similar IDEs and plugins are more than autocomplete on steroids.
Sure, you might not like it and think you as a human should write all code, but a frequent experience in the industry in the past months is that productivity in teams using tools like this has greatly increased.
It is not unreasonable to think that someone deciding not to use tools like this will not be competitive in the market in the near future.
I think you’re right, and perhaps it’s time for the “autocomplete on steroids” tag to be retired, even if something approximating that is happening behind the scenes.
I was converting a bash script to Bun/TypeScript the other day. I was doing it the way I am used to… working on one file at a time, only bringing in the AI when helpful, reviewing every diff, and staying in overall control.
Out of curiosity, I threw the whole task over to Gemini 2.5 Pro in agentic mode, and it was able to refine its way to a working solution. The point I’m trying to make here is that it uses MCP to interact with the TS compiler and linters in order to automatically iterate until it has eliminated all errors and warnings. The MCP integrations go further, as I am able to use tools like Console Ninja to give the model visibility into the contents of any data structure at any line of code at runtime too. The combination of these makes me think that TypeScript and its available tooling are particularly suitable for agentic LLM-assisted development.
Quite unsettling times, and I suppose it’s natural to feel disconcerted about how our roles will become different, and how we will participate in the development process. The only thing I’m absolutely sure about is that these things won’t be uninvented with the genie going back in the bottle.
That wasn’t really the point I was getting at, but as you asked…
The reading doesn’t involve much more than a cursory (no pun intended) glance, and I didn’t test more than I would have tested something I had written manually.
Maybe it wasn't your point. But cost of development is a very important factor, considering some of the thinking models burn tokens like no tomorrow. Accuracy is another. Maybe your script is kind of trivial/inconsequential so it doesn't matter if the output has some bugs as long as it seems to work. There are a lot of throwaway scripts we write, for which LLMs are an excellent tool to use.
I use Rider with some built in AI auto-complete. I'd say its hit rate is pretty low!
Sometimes it auto-completes nonsense, but sometimes I think I'm about to tab on auto-completing a method like FooABC and it actually completes it to FoodACD, both return the same type but are completely wrong.
I have to really be paying attention to catch it selecting the wrong one. I really, really hate this. When it works it's great, but every day I'm closer to just turning it off out of frustration.
Arguing that ActiveX or Silverlight are comparable to AI, given the changes it has already brought and is bringing, is definitely a weak argument.
A lot of people are against change because it endangers their routine, way of working, livelihood, which might be a normal reaction. But as accountants switched to using calculators and Excel sheets, we will also switch to new tools.
Where is this 2x, 10x or even 1.5x increase in output? I don't see more products, more features, less bugs or anything related to that since this "AI revolution".
I keep seeing this being repeated ad nauseam without any real backing of hard evidence. It's all copium.
Surely if everyone is so much more productive, a single person startup is now equivalent to 1 + X right?
Please enlighten me as I'm very eager to see this impact in the real world.
> is that productivity in the teams using tools like this has greatly increased
On the short term. Have fun debugging that mess in a year while your customers are yelling at you! I'll be available for hire to fix the mess you made which you clearly don't have the capability to understand :-)
Debugging any system is not easy; it is not like technical debt didn't exist before AI, and people will be writing shitcode in the future as they were in the past. Probably more, but there are also more tools that help with debugging.
Additionally, what you are failing to realise is that not everyone is just vibe coding and accepting blindly what the LLM is suggesting and deploying it to prod. There are actually people with decade+ of experience who do use these tools and who found it to be an accelerator in many areas, from writing boilerplate code, to assisting with styling changes.
In any case, thanks for the heads up, definitely will not be hiring you with that snarky attitude. Your assumption that I have no capability to understand something without any context tells more about you than me, and unfortunately there is no AI to assist you with that.
To be fair, I think the most value is added by Agent modes, not autocomplete. And I agree that AI-autocomplete is really quite annoying, personally I disable it too.
But coding agents can indeed save some time writing well-defined code and be of great help when debugging. But then again, when they don't work on a first prompt, I would likely just write the thing in Vim myself instead of trying to convince the agent.
My point being: I find agent coding quite helpful really, if you don't go overzealous with it.
Are you using these in your day job to complete real world tasks or in greenfield projects?
I simply cannot see how I can tell an agent to implement anything I have to do in a real day job unless it's a feature so simple I could do it in a few minutes. Even those the AI will likely screw it up since it sucks at dealing with existing code, best practices, library versions, etc.
I've found it useful for doing simple things in parallel. For instance, I'm working on a large typescript project and one file doesn't have types yet. So I tell the AI to add typing to it with a description while I go work on other things. I check back in 5-10 mins later and either commit the changes or correct it.
Or if I'm working on a full stack feature, and I need some boilerplate to process a new endpoint or new resource type on the frontend, I have the AI build the api call that's similar to the other calls and process the data while I work on business logic in the backend. Then when I'm done, the frontend API call is mostly set up already
I found this works rather well, because it's a list of things in my head that are "todo, in progress" but parallelizable, so I can easily verify what it's doing.
SOTA LLMs are broadly much better at autonomous coding than they were even a few months ago. But also, it really depends on what it is exactly you're working on, and what tech is involved. Things are great if you're writing Python or TypeScript, less so with C++, and even less so with Rust and other emerging technologies.
The few times I've tried to use an agent for anything slightly complex or on a moderately large code base, it just proceeds to smear poop all over the floor, eventually backing itself into a corner.
I shortcut the "cursor tab" feature and enable or disable it as needed. If only AI were smart enough to learn when I do and don't want it (like Clippy in the MS days); when you are manually toggling it on/off, clear patterns emerge (to me at least) as to when I do and don't want it.
Bottom right says "cursor tab"; you can manually manipulate it there (and snooze it for X minutes - an interesting feature). For binding shortcuts: Command/Ctrl + Shift + P, then look for "Enable|Disable|Whatever Cursor Tab" and set shortcuts there.
Old fashioned variable name / function name auto complete is not affected.
I considered a small macropad to enable / disable with a status light - but honestly don't do enough work to justify avoiding work by finding / building / configuring / rebuilding such a solution. If the future is this sort of extreme autocomplete in everything I do on a computer, I would probably go to the effort.
I can't even get simple code generation to work for VHDL. It just gives me garbage that does not compile. I have to assume this is not the case for the majority of people using more popular languages? Is this because the training data for VHDL is far more limited? Are these "AIs" not able to consume the VHDL language spec and give me actual legal syntax at least?! Or is this because I'm being cheap and lazy by only trying free chatGPT and I should be using something else?
It's all of that, to some extent or another. LLMs don't update overnight and as such lag behind innovations in major frameworks, even in web development. No matter what is said about augmenting their capabilities, their performance using techniques like RAG seems to be lacking. They don't work well with new frameworks either.
Any library that breaks backwards compatibility in major version releases will likely befuddle these models. That's why I have seen them pin dependencies to older versions, and more egregiously, default to using the same stack to generate any basic frontend code. This ignores innovations and improvements made in other frameworks.
For example, in TypeScript there is now a new(ish) validation library called arktype. Gemini 2.5 Pro straight up produces garbage code for it. The type generation function accepts an object/value, but Gemini keeps insisting that it consumes a type.
So Gemini defines an optional property as `a?: string`, which is similar to what you see in TypeScript. But this will fail in arktype, because it needs its input as `'a?': 'string'`. Asking Gemini to check again is a waste of time, and you will need enough familiarity with JS/TS to understand the error and move ahead.
Forcing development into an AI friendly paradigm seems to me a regressive move that will curb innovation in return for boosts in junior/1x engineer productivity.
Yep, management dreams of being able to make every programmer a 10x programmer by handing them an LLM, but the 10x programmers are laughing because they know how far off the rails the LLM will go. Debugging skills are the next frontier.
It's fun watching the AI bros try to spin justifications for building (sorry, vibing) new apps using Ruby for no reason other than that the model has so much content going back to 2004 to train on.
They are probably really good at React. And because that ecosystem has been in a constant cycle of reinventing the wheel, they can easily pump out boilerplate code because there is just so much of it to train from.
The amount of training data available certainly is a big factor. If you’re programming in Python or JavaScript, I think the AIs do a lot better. I write in Clojure, so I have the same problem as you do. There is a lot less HDL code publicly available, so it doesn’t surprise me that it would struggle with VHDL. That said, from everything I’ve read, free ChatGPT doesn’t do as well on coding. OpenAI’s paid models are better. I’ve been using Anthropic’s Claude Sonnet 3.7. It’s paid but it’s very cost effective. I’m also playing around with the Gemini Pro preview.
It's very helpful for low-level chores. The bane of my existence is frontend, and generating UI elements for testing backend work on the fly rocks. I like the analogy of it being a junior dev, perhaps even an intern: you should check their work constantly and give them extremely pedantic instructions.
Same here. It's extremely distracting to see the random garbage that the autocomplete keeps trying to do.
I said this in another comment but I'll repeat the question: where are these 2x, 10x or even 1.5x increases in output? I don't see more products, more features, less bugs or anything related to that since this "AI revolution".
I keep seeing this being repeated ad nauseam without any real backing of hard evidence.
If this was true and every developer had even a measly 30% increase in productivity, it would be like a team of 10 is now 13. The amount of code being produced would be substantially more and as a result we should see an absolute boom in new... everything.
New startups, new products, new features, bugs fixed and so much more. But I see absolutely nothing but more bullshit startups that use APIs to talk to these models with a few instructions.
Please someone show me how I'm wrong because I'd absolutely love to magically become way more productive.
I am but a small humble minority voice here but perhaps I represent a larger non-HN group:
I am not a professional SWE; I am not fluent in C or Rust or bash (or even Typescript) and I don't use Emacs as my editor or tmux in the terminal;
I am just a nerdy product guy who knows enough to code dangerously. I run my own small business and the software that I've written powers the entire business (and our website).
I have probably gotten AT LEAST a 500-1000% speedup in my personal software productivity over the past year that I've really leaned into using Claude/Gemini (amazing that GPT isn't on that list anymore, but that's another topic...). I am able to spec out new features and get them live in production in hours vs. days, and for bigger stuff, days vs. weeks (or even months). It has changed the pace and the way in which I'm able to build stuff. I literally wrote an entire image-editing workflow to go from RAW camera shot to fully processed product image on our ecommerce store that's cut out actual, real dozens of hours of time spent previously.
Is the code I'm producing perfect? Absolutely not. Do I have 100% test coverage? Nope. Would it pass muster if I were a software engineer at Google? Probably not.
Is it working, getting to production faster, and helping my business perform better and insanely more efficiently? Absolutely.
I think that tracks with what I see: LLMs enable non-experts to do something really fast.
If I want to, let's say, create some code in a language I never worked on an LLM will definitely make me more "productive" by spewing out code for me way faster than I could write it. Same if I try to quickly learn about a topic I'm not familiar with. Especially if you don't care about the quality, maintainability, etc. too much.
But if I'm already a software developer with 15 years of experience dealing with technology I use every day, it's not going to increase my productivity in any meaningful way.
This is the dissonance I see with AI talk here. If you're not a software developer the things LLMs enable you to do are game-changers. But if you are a good software developer, in its best days it's a smarter autocomplete, a rubber-duck substitute (when you can't talk to a smart person) or a mildly faster google search that can be very inaccurate.
If you go from 0 to 1 that's literally infinitely better but if you go from 100 to 105, it's barely noticeable. Maybe everyone with these absurd productivity gains are all coming from zero or very little knowledge but for someone that's been past that point I can't believe these claims.
Yeah, I use IntelliJ with the chat sidebar. I don't use autocomplete, except in trivial cases where I need to write boilerplate code. Other than that, when I need help, I ask the LLM and then write the code based on its response.
I'm sure it's initially slower than vibe-coding the whole thing, but at least I end up with a maintainable code base, and I know how it works and how to extend it in the future.
Absolutely hate the agent mode but I find autocomplete with asks to be the best for me. I like to at least know what I'm putting in my codebase and it genuinely makes me faster due to:
1) Stops me overthinking the solution
2) Being able to ask it pros and cons of different solutions
3) multi-x speedup means less worry about throwing away a solution/code I don't like and rewriting / refactoring
4) Really good at completing certain kinds of "boilerplate-y" code
5) Removed need to know the specific language implementation but rather the principle (for example pointers, structs, types, mutexes, generics, etc). My go to rule now is that I won't use it if I'm not familiar with the principle, and not the language implementation of that item
6) Absolute beast when it comes to debugging simple to medium complexity bugs
I'm past the honeymoon stage for LLM autocomplete.
I just noticed CLion moved to a community license, so I re-installed it and set up Copilot integration.
It's really noisy and somehow the same binding (tab complete) for built in autocomplete "collides" with LLM suggestions (with varying latency). It's totally unusable in this state; you'll attempt to populate a single local variable or something and end up with 12 lines of unrelated code.
I've had much better success with VSCode in this area, but the complete suggestions via LLM in either are usually pretty poor; not sure if it's related to the model choice differing for auto complete or what, but it's not very useful and often distracting, although it looks cool.
This is where I landed too. Used Cursor for a while before realizing that it was actually slowing me down because the PR cycle took so much longer, due to all the subtle bugs in generated code.
Went back to VSCode with a tuned down Copilot and use the chat or inline prompt for generating specific bits of code.
Well yes, but I personally would never submit a PR and then use the excuse, "sorry, AI wrote those parts, that's why this PR has more bugs than usual."
All that to say that the base of your argument is still correct: AI really isn't saving all that much time since everyone has to proof-read it so much in order to not increase the number of PR bugs from using it in the first place.
AI autocomplete can be infuriating if, like me, you like to browse the public methods and properties by dotting the type. The AI autocomplete sometimes kicks in and starts writing broken code using suggestions that don't exist, and that prevents quickly exploring the actual methods available.
I have largely disabled it now, which is a shame, because there are also times it feels like magic, and I can see how it could be a massive productivity lever if it required a tighter confidence threshold before kicking in.
I always forget syntax for things like ssh port forwarding. Now just describe it at the shell:
$ ssh (take my local port 80 and forward it to 8080 on the machine betsy) user@betsy
or maybe:
$ ffmpeg -ss 0:10:00 -i somevideo.mp4 -t 1:00 (speed it up 2x) out.webm
I press ctrl+x x and it will replace the English with a suggested command. It's been a total game changer for git, jq, rsync, ffmpeg, regex...
For more involved stuff there's screen-query: confusing crashes, strange terminal errors, weird config scripts; it allows a joint investigation, whereas aider and friends just feel like I'm asking the AI to fuck around.
This never accesses any extra data and works only when explicitly asked? I consider the terminal the most privacy-sensitive part of my setup, and I haven't tried any LLM integration there yet…
I also realized this morning that shell-hook is good enough to correct typos. I have that turned on at the shell level (setopt correct), but sometimes it doesn't work, like here:
git cloen blahalalhah
I did a ctrl+x x and it fixed it. I'm using openrouter/google/gemma-3-27b-it:free via chutes. Not a frontier model in the slightest.
I was 100% in agreement with you when I tried out Copilot. So annoying and distracting. But Cursor’s autocomplete is nothing like that. It’s much less intrusive and mostly limits itself to suggesting changes you’ve already done. It’s a game changer for repetitive refactors where you need to do 50 nearly identical but slightly different changes.
I had turned autocomplete off as well. Way too many times it was just plain wrong and distracting. I'd like it to be turned on for method documentation only, though, where it worked well once the method was completed, but so far I wasn't able to customize it this way.
Having it as tab was a mistake; tab-complete for snippets is fine because it's at the end of a line, but tab-complete in empty text space means you always have to be aware of whether you're in autocomplete context or not before setting an indent.
We have an internal ban policy on copilot for IP reasons and while I was... missing it initially, now just using neovim without any AI feels fine. Maybe I'll add an avante.nvim for a built-in chat box though.
Your comment is about 2 years late. Autocomplete is not the focus of AI IDEs anymore, even though it has gotten really good with "next edit prediction". People these days use AI for the agentic mode.
What folks don't understand, or keep in mind maybe, is that in order for that autocomplete to work, all your code is going up to a third party as you write it or open files. This is one of the reasons I disable it. I want to control what I send via the chat side panel by explicitly giving it context. It's also pretty useless most of the time, generating nonsense and not even consistently either.
Asking HN this is like asking which smartphone to use. You'll get suggestions for obscure Linux-based modular phones that weigh 6 kilos and lack a clock app or wifi. But they're better because they're open source or fully configurable or whatever. Or a smartphone that a fellow HNer created in his basement and plans to sell soon.
Cursor and Windsurf are both good, but do what most people do and use Cursor for a month to start with.
Haha, so on point! In the HN world, backends are written in Rust with formal proofs and frontends are in pure JS and maybe Web Components. In the real world, however, a lot of people are using different tech.
Except for the crowd of extreme purists on HN where the backend is written in their divine C language by programmers blessed with an inability to ever have bugs that make it to production. And where the frontend is pure HTML because JavaScript is the language the devil speaks.
I use Cursor as my base editor + Cline as my main agentic tool. I have not tried Windsurf so alas I can't comment here but the Cursor + Cline combo works brilliantly for me:
* Cursor's Cmk-K edit-inline feature (with Claude 3.7 as my base model there) works brilliantly for "I just need this one line/method fixed/improved"
* Cursor's tab-complete (neé SuperMaven) is great and better than any other I've used.
* Cline w/ Gemini 2.5 is absolutely the best I've tried when it comes to full agentic workflow. I throw a paragraph of idea at it and it comes up with a totally workable and working plan & implementation
Fundamentally, and this may be my issue to get over and not actually real, I like that Cline is a bring-your-own-API-key system and an open source project, because their incentives are to generate the best prompt, max out the context, and get the best results (because everyone working on it wants it to work well). Cursor's incentive is to get you the best results....within their budget (of $.05 per request for the max models and within your monthly spend/usage allotment for the others). That means they're going to try to trim context or drop things or do other clever/fancy cost saving techniques for Cursor, Inc.. That's at odds with getting the best results, even if it only provides minor friction.
Just use codex and machtiani (mct). Both are open source. Machtiani was open sourced today. Mct can find context in a hay stack, and it’s efficient with tokens. Its embeddings are locally generated because of its hybrid indexing and localization strategy. No file chunking. No internet, if you want to be hardcore. Use any inference provider, even local. The demo video shows solving an issue VSCode codebase (of 133,000 commits and over 8000 files) with only Qwen 2.5 coder 7B. But you can use anything you want, like Claude 3.7. I never max out context in my prompts - not even close.
https://github.com/tursomari/machtiani
This sounds really cool. Can you explain your workflow in a bit more detail? i.e. how exactly you work with codex to implement features, fix bugs etc.
Say I'm chatting in a git project directory `undici`. I can show you a few ways how I work with codex.
1. Follow up with Codex.
`mct "fix bad response on h2 server" --model anthropic/claude-3.7-sonnet:thinking`
Machtiani will stream the answer, then also apply git patches suggested in the convo automatically.
Then I could follow up with codex.
`codex "See unstaged git changes. Run tests to make sure it works and fix any problems with the changes if necessary."`
2. Codex and MCT together
`codex "$(mct 'fix bad response on h2 server' --model deepseek/deepseek-r1 --mode answer-only)"`
In this case codex will dutifully implement the changes suggested by mct, saving tokens and time.
The key for the second example is `--mode answer-only`. Without this flagged argument, mct will itself try and apply patches. But in this case codex will do it as mct withholds the patches with the aforementioned flagged arg.
3. Refer codex to the chat.
Say you did this
`mct "fix bad response on h2 server" --model gpt-4o-mini --mode chat`
Here, I used `--mode chat`, which tells mct to stream the answer and save the chat convo, but not to apply git changes (different from --mode answer-only).
You'll see that mct will print out something like
`Response saved to .machtiani/chat/fix_bad_server_resonse.md`
Now you can just tell codex.
`codex "See .machtiani/chat/fix_bad_server_resonse.md, and do this or that...."`
*Conclusion*
The example concepts should cover day-to-day use cases. There are other exciting workflows, but I should really post a video on that. You could do anything with unix philosophy!
How does this compare to aider?
I skipped using aider, but I heard good things. I needed to work with large, complex repos, not vibe codebases. And agents require always top-notch models that are expensive and can't run locally well. So when Codex came out, it skipped to that.
But mct leverages weak models well, doing things not possible otherwise. And it does even better with stronger models: it rewards stronger models, but doesn't punish smaller ones.
So basically, you can save money and do more using mct + codex. But I hear aider is a terminal tool, so maybe try mct + aider?
Totally agree on aligning with the one with clearest incentives here
I also like Cline since it being open source means that while I’m using it I can see the prompts and tools and thus learn how to build better agents.
Cline's agent work is better than Cursor's own?
Cursor does something with truncating context to save costs on their end; you don't get the same with Cline because you're paying for each transaction, so depending on complexity I find Cline works significantly better.
I still use Cursor chat with agent mode though, but I've always been indecisive. Like the others said, it's also nice to see how Cline behaves to assist with creating your own agentic workflows.
> Cursor does something with truncating context to save costs on their end
I have seen this mentioned, but is there actually a source to back it up? I've tried Cline every now and then; while it's great, I don't find it better than Cursor (nor worse in any clear way).
Totally anecdotal of course so take this with a grain of salt, but I've seen and experienced this when Cursor chats start to get very long (eg the context starts to really fill up). It suddenly starts "forgetting" things you talked about earlier or producing code that's at odds with code it already produced. I think it's partly why they suggest but don't enforce starting a new chat when things start to really grow.
It's actually very easy to see for yourself. When the agent "looks" at a file it will say the number of lines it looked at; almost always it's the top 0-250 or 0-500, though it might depend on the model selected and whether MAX mode is used.
Zed. They've upped their game in the AI integration and so far it's the best one I've seen (external from work). Cursor and VSCode+Copilot always felt slow and janky; Zed is much less janky and feels like pretty mature software, and I can just plug in my Gemini API key and use that for free/cheap instead of paying for the editor's own integration.
I gave Zed an in-depth trial this week and wrote about it here: https://x.com/vimota/status/1921270079054049476
Overall Zed is super nice and the opposite of janky, but I still found a few of the defaults were off, and Python support was still missing in a few key ways for my daily workflow.
Out of curiosity, what Python support was missing for you? I'm debating Zed.
It consumes lots of resources on an M4 MacBook. I would love to test it, though, if it didn't freeze my MacBook.
Edit:
With the latest update to 0.185.15 it works perfectly smooth. Excellent addition to my setup.
I'll second the zed recommendation, sent from my M4 macbook. I don't know why exactly it's doing this for you but mine is idling with ~500MB RAM (about as little as you can get with a reasonably-sized Rust codebase and a language server) and 0% CPU.
I have also really appreciated something that felt much less janky, had better vim bindings, and wasn't slow to start even on a very fast computer. You can completely botch Cursor if you type really fast. On an older mid-range laptop, I ran into problems with a bunch of its auto-pair stuff of all things.
Yeah, same. Zed is incredibly efficient on my M1 Pro. It's my daily driver these days, and my Python setup in it is almost perfect.
What’s your Python setup?
In my case this was the culprit: https://github.com/zed-industries/zed/issues/13190 otherwise it worked great mostly.
I am using Zed too, it still has some issues but it is comparable to Cursor. In my opinion they iterate even faster than the VSCode forks.
Yep not having to build off a major fork will certainly help you move fast
Here's a recent Changelog podcast episode about the latest with Zed and its new agentic feature, https://changelog.com/podcast/640.
I just wish they'd release a debugger already. Once it's done I'll be moving to them completely.
Zed doesn't even run on my system and the relevant github issue is only updated by people who come to complain about the same issue.
Don’t use windows? I don’t feel like that’s a terribly uncommon proposition for a dev.
Debian latest stable.
Windows? If so, you can run it, you just have to build it.
Debian latest stable.
Does it have Cursor’s “tab” feature?
Yep: https://zed.dev/blog/edit-prediction
It would be great if there was an easy way to run their open model (https://huggingface.co/zed-industries/zeta) locally ( for latency reasons ).
I don't think Zeta is quite up to windsurf's completion quality/speed.
I get that this would go against their business model, but maybe people would pay for this - it could in theory be the fastest completion since it would run locally.
> the fastest completion since it would run locally
We are living in a strange age where local is slower than the cloud, due to the sheer amount of compute we need. Compute takes hundreds of milliseconds (if not seconds) on local hardware, making 100ms of network latency irrelevant.
Even for a 7B model, your expensive Mac or 4090 can't beat, for example, a box with 8x A100s running a FOSS serving stack (sglang) with TP=8 in latency.
Running models locally is very expensive in terms of memory and scheduling requirements, maybe instead they should host their model on the Cloudflare AI network which is distributed all around the world and can have lower latency
Sort of. The quality is night and day different (Cursor feels like magic, Zed feels like a chore).
I can second this. I really do want to move to Zed full time but the code completion is nowhere near as useful or "smart" as cursor's yet.
Yep I want Zed to win but it has not yet become my daily driver
For the agentic stuff I think every solution can be hit or miss. I've tried claude code, aider, cline, cursor, zed, roo, windsurf, etc. To me it is more about using the right models for the job, which is also constantly in flux because the big players are constantly updating their models and sometimes that is good and sometimes that is bad.
But I daily drive Cursor because the main LLM feature I use is tab-complete, and here Cursor blows the competition out of the water. It understands what I want to do next about 95% of the time when I'm in the middle of something, including comprehensive multi-line/multi-file changes. Github Copilot, Zed, Windsurf, and Cody aren't at the same level imo.
If we’re talking purely auto complete I think Supermaven does it the best.
Cursor bought Supermaven last year.
It still works
Do they actually improve the model you can get without Cursor? Or does all the development in reality go into Cursor's autocomplete without being made available to Supermaven subscribers? It's hard to be sure of that from their website, and there's a lack of info online.
Aider! Use the editor of your choice and keep your coding assistant separate. Plus, it's open source and will stay like this, so there's no risk of seeing it suddenly become expensive or disappear.
I used to be religiously pro-Aider. But after a while those little frictions of flicking backwards and forwards between the terminal and VS Code, and adding and dropping from the context myself, have worn down my appetite to use it. The `--watch` mode is a neat solution but harms performance: the LLM gets distracted by deleting its own comment.
Roo is less solid but better-integrated.
Hopefully I'll switch back soon.
I suspect that if you're a vim user those friction points are a bit different. For me, Aider's git auto-commit and /undo command are what sell it at this current juncture of technology. OpenHands looks promising, though rather complex.
The (relative) simplicity is what sells aider for me (it also helps that I use neovim in tmux).
It was easy to figure out exactly what it's sending to the LLM, and I like that it does one thing at a time. I want to babysit my LLMs and those "agentic" tools that go off and do dozens of things in a loop make me feel out of control.
I like your framing about “feeling out of control”.
For the occasional frontend task, I don’t mind being out of control when using agentic tools. I guess this is the origin of Karpathy’s vibe coding moniker: you surrender to the LLM’s coding decisions.
For backend tasks, which is my bread and butter, I certainly want to know what it’s sending to the LLM so it’s just easier to use the chat interface directly.
This way I am fully in control. I can cherry pick the good bits out of whatever the LLM suggests or redo my prompt to get better suggestions.
How do you get the "good bits" out without a diff/patch file? Or do you ask the LLM for that and apply it manually?
Basically what antirez described about 4 days ago in this thread https://news.ycombinator.com/item?id=43929525.
So this part of my workflow is intentionally fairly labor intensive because it involves lots of copy-pasting between my IDE and the chat interface in a browser.
From the linked comment: > Mandatory reminder that "agentic coding" works way worse than just using the LLM directly
just isn't true. If everything else were equal, that might possibly be true, but it turns out that system prompts are quite powerful in influencing how an LLM behaves. ChatGPT with a blank user-entered system prompt behaves differently (read: poorer at coding) than one with a tuned system prompt. Aider/Copilot/Windsurf/etc. all have custom system prompts that make them more powerful rather than less compared to using a raw web browser, and they also don't involve the overhead of copy-pasting.
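To make that concrete, here is a minimal sketch of what "a tuned system prompt" amounts to in practice. It assumes an OpenAI-compatible chat completions API via the official `openai` Python client; the model name and prompt wording are illustrative, not taken from any of the tools named above.

```python
# Minimal sketch: at this level, the difference between "raw" ChatGPT and a
# coding tool is largely the system message sent before the user's request.
# Assumes an OpenAI-compatible API via the official `openai` client; the
# model name and prompt wording are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are an expert software engineer. Prefer minimal diffs, "
    "keep the existing style, and explain any non-obvious change."
)

def ask(user_request: str, code_context: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"{user_request}\n\n{code_context}"},
        ],
    )
    return response.choices[0].message.content
```

The real tools ship far more elaborate prompts plus tool definitions, but the mechanism is the same: a system message prepended to every request.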
Approximately how much does it cost in practice to use Aider? My understanding is that Aider itself is free, but you have to pay per token when using an API key for your LLM of choice. I can look up for myself the prices of the various LLMs, but it doesn't help much, since I have no intuition whatsoever about how many tokens I am likely to consume. The attraction of something like Zed or Cursor for me is that I just have a fixed monthly cost to worry about. I'd love to try Aider, as I suspect it suits my style of work better, but without having any idea how much it would cost me, I'm afraid of trying.
I'm using Gemini 2.5 Pro with Aider and Cline for work. I'd say when working for 8 full hours without any meetings or other interruptions, I'd hit around $2. In practice, I average at $0.50 and hit $1 once in the last weeks.
Wow my first venture into Claude Code (which completely failed for a minor feature addition on a tiny Swift codebase) burned $5 in about 20 minutes.
Probably related to Sonnet 3.7’s rampant ADHD and less the CLI tool itself (and maybe a bit of LLMs-suck-at-Swift?)
In my testing aider tends to spend about 1/10th the money as claude code. I assume because, in aider, you are explicit about /add and everything
I'd be really keen to know more about what you're using it for, how you typically prompt it, and how many times you're reaching for it. I've had some success at keeping spend low but can also easily spend $4 from a single prompt so I don't tend to use tools like Aider much. I'd be much more likely to use them if I knew I could reliably keep the spend down.
I'll try to elaborate:
I'm using VSC for most edits. Tab completion is done via Copilot; I don't use it that much though, as I find the predictions to be subpar or too wordy in the case of comments. I use Aider for rubber-ducking and implementing small to mid-scope changes. Normally, I add the required files, change to architect or ask mode (depending on the problem I want to solve), and explain what my problem is and how I want it to be solved. If the Aider answer satisfies me, I change to coding mode and allow the changes.
No magic, I have no idea how a single prompt can generate $4. I wouldn't be surprised if I'm only scratching on the surface with my approach though, maybe there is a better but more costly strategy yielding better results which I just didn't realize yet.
This is very inexpensive. What are your workflow and saving techniques? I can spend $10/h or more with very short sessions and few files.
Huh, I didn't configure anything for saving, honestly. I just add the whole repo and do my stuff. How do you get to $10/h? I probably couldn't even provoke this.
I assume we have a very different workflow.
do you use any tool to add the whole repo?
Not sure how that’s possible? Do you ask it one question every hour or so?
Depends entirely on the API.
With deepseek: ~nothing.
is deepseek fast enough for you? For me the API is very slow, sometimes unusable
To be honest I'm using windsurf with openAI/google right now and used deepseek with aider when it was still less crowded.
My only problem was deepseek occasionally not answering at all, but generally it was fast (non thinking that was).
It will tell you how much each request cost you as well as a running total.
You use /tokens to see how many tokens it has in its context for the next request. You manage it by dropping files and clearing the context.
I love Aider, but I got frustrated with its limitations and ended up creating Brokk to solve them: https://brokk.ai/
Compared to Aider, Brokk
- Has a GUI (I know, tough sell for Aider users but it really does help when managing complex projects)
- Builds on a real static analysis engine, so its equivalent of the repomap doesn't get hopelessly confused in large codebases
- Has extremely useful git integration (view git log, right click to capture context into the workspace)
- Is also OSS and supports BYOK
I'd love to hear what you think!
Apart from the GUI, what does it improve on compared to Aider?
Yup, choose your model and pay as you go, like commodities such as rice and water. The others played games with me to minimize context and use cheaper models (such as 3 modes, daily credits, using the most expensive model, etc.).
Also, the --watch mode is the most productive interface: you keep using your own editor, with no need for extra textboxes with robot faces.
FWIW, Gemini-*, which is available in Aider, isn't pay as you go (PAYG) but postpaid, which means you get a bill at the end of the month rather than the OpenAI/others model of charging up credits before you can use the service.
I guess this is a good reason to consider things like openrouter. Turns it into a prepaid service.
For a time windsurf was way ahead of cursor in full agentic coding, but now I hear cursor has caught up. I have yet to switch back to try out cursor again but starting to get frustrated with Windsurf being restricted to gathering context only 100-200 lines at a time.
So many of the bugs and poor results that it can introduce are simply due to improper context. When forcibly giving it the necessary context you can clearly see it’s not a model problem but it’s a problem with the approach of gathering disparate 100 line snippets at a time.
Also, it struggles with files over 800ish lines which is extremely annoying
We need some smart deepseek-like innovation in context gathering since the hardware and cost of tokens is the real bottleneck here.
Wait, are these 800 lines of code? Am I the only one seeing that as a major code smell? Assuming these are code files, the issue is not AI processing power but rather bread and butter coding practices related to file organisation and modularisation.
I agree, but I've worked with many people now who seem to prefer one massive file. Specifically Python and React people seem to do this a lot.
Frustrates the hell out of me as someone who thinks at 300-400 lines generally you should start looking at breaking things up.
You can use the filesystem MCP server and have it use the read-file tool to read the files in full when called.
For daily work - neither. They basically promote the style of work where you end up with mediocre code that you don't fully understand, and with time the situation gets worse.
I get much better result by asking specific question to a model that has huge context (Gemini) and analyzing the generated code carefully. That's the opposite of the style of work you get with Cursor or Windsurf.
Is it less efficient? If you are paid by LoCs, sure. But for me the quality and long-term maintainability are far more important. And especially the Tab autocomplete feature was driving me nuts, being wrong roughly half of the time and basically just interrupting my flow.
I agree! I like local tools, mostly, use Gemini 2.5 Pro when actually needed and useful, and do a lot of manual coding.
But how do you dump your entire code base into Gemini? Literally all I want is a good model with my entire code base in its context window.
I wrote a simple Python script that I run in any directory; it gathers the context I usually need and copies it to the clipboard/paste buffer. A short custom script lets you adjust it to your own needs.
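The script itself isn't shown; a minimal sketch of the idea is below, assuming you just want to walk the current directory, skip the usual junk, and concatenate everything with file headers. The skip/keep lists are assumptions to adapt to your project.

```python
#!/usr/bin/env python3
# Rough sketch of a "dump the repo into one prompt" helper.
# Directory/extension filters are assumptions; adjust to your project.
import os
import sys

SKIP_DIRS = {".git", "node_modules", "dist", "build", "__pycache__", ".venv"}
KEEP_EXTS = {".py", ".ts", ".tsx", ".js", ".go", ".rs", ".md", ".toml", ".json"}

def dump(root: str = ".") -> None:
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune unwanted directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in KEEP_EXTS:
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    body = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            sys.stdout.write(f"\n===== {path} =====\n{body}\n")

if __name__ == "__main__":
    dump(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Piping the output into `pbcopy` (macOS) or `xclip -selection clipboard` (Linux) reproduces the copy-to-clipboard step without any third-party dependency.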
Repomix can be run from the command line
https://github.com/yamadashy/repomix
Legal issues aside (you are the legal owner of that code or you checked with one), and provided it's small enough, just ask an LLM to write a script to do so. If the code base is too big, you might have luck choosing the right parts. The right balance of inclusions and exclusions can work miracles here.
I’ve been using Zed Agent with GitHub Copilot’s models, but with GitHub planning to limit usage, I’m exploring alternatives.
Now I'm testing Claude Code’s $100 Max plan. It feels like magic - editing code and fixing compile errors until it builds. The downside is I’m reviewing the code a lot less since I just let the agent run.
So far, I’ve only tried it on vibe coding game development, where every model I’ve tested struggles. It says “I rewrote X to be more robust and fixed the bug you mentioned,” yet the bug still remains.
I suspect it will work better for backend web development I do for work: write a failing unit test, then ask the agent to implement the feature and make the test pass.
Also, give Zed’s Edit Predictions a try. When refactoring, I often just keep hitting Tab to accept suggestions throughout the file.
Can you say more to reconcile "It feels like magic" with "every model I’ve tested struggles."?
It feels like magic when it works and it at least gets the code to compile. Other models* would usually return broken code, especially when using a new release of a library. All the models use the old function signatures, but Claude Code then sees the compile error and fixes it.
Compared to Zed Agent, Claude Code is:
- Better at editing files. Zed would sometimes return the file content in the chatbox instead of updating it. Zed Agent also inserted a new function in the middle of an existing function.
- Better at running tests/compiling. Zed struggled with the nix environment, and I don't remember it entering the update code -> run code -> update code feedback loop.
With this you can leave Claude Code alone for a few minutes, check back and give additional instructions. With Zed Agent it was more of a constantly monitoring / copy pasting and manually verifying everything.
*I haven't tested many of the other tools mentioned here, this is mostly my experience with Zed and copy/pasting code to AI.
I plan to test other tools when my Claude Code subscription expires next month.
Zed's agentic editing with Claude 3.7 + thinking does what you're describing testing out with the $100 Claude Code tool. Why leave the Zed editor and pay more to do something you can run for free/cheap within it instead?
I'm with Cursor for the simple reason it is in practice unlimited. Honestly the slow requests after 500 per month are fast enough. Will I stay with Cursor? No, ill switch the second something better comes along.
Same. Love the "slow but free" model, I hope they can continue providing it, I love paying only $20/m instead of having a pay by usage.
I've been building SO MANY small apps and web apps in the latest months, best $20/m ever spent.
20€ seems totally subsidized considering the amount of tokens. Pricing cheaply to be competitive but users will jump to the next one when they inevitably hike the price up.
I'm on Cursor with Claude 3.7.
Somehow other models don't work as well with it; "auto" is the worst.
Still, I hate it when it deletes all my unit tests to "make them pass".
Or when it arbitrarily decides to rewrite half the content on your website and not mention it.
Or, my favorite: when you’ve been zeroing in on something actually interesting and it says at the last minute, “let’s simplify our approach”. It then proceeds to rip out all the code you’ve written for the last 15 minutes and insert a trivial simulacrum of the feature you’ve been working on that does 2% of what you originally specified.
$5 to anyone who can share a rules.md file that consistently guides Sonnet 3.7 to give up and hand back control when it has no idea what it’s doing, rather than churn hopelessly and begin slicing out nearby unrelated code like it’s trying to cut out margins around a melanoma.
If you accept a changeset that you don't like, isn't that on you?
It is a time waster in any case.
I wish it was unlimited for me. I got 500 fast requests, about 500 slow requests, then at some point it started some kind of exponential backoff, and became unbearably slow. 60+ second hangs with every prompt, at least, sometimes 5 minutes. I used that period to try out windsurf, vscode copilot, etc and found they weren't as good. Finally the month refreshed and I'm back to fast requests. I'm hoping they get the capacity to actually become usably unlimited.
Cursor is acceptable because for the price it's unbeatable. Free, unlimited requests are great. But by itself, Cursor is not anything special. It's only interesting because they pay Claude or Gemini from their pockets.
Ideally, things like RooCode + Claude are much better, but you need infinite money glitch.
On weekends the slow requests are regularly faster than the paid requests.
I am betting on myself.
I built a minimal agentic framework (with editing capability) that works for a lot of my tasks with just seven tools: read, write, diff, browse, command, ask and think.
One thing I'm proud of is the ability to have it be more proactive in making changes and taking next action by just disabling the `ask` tool.
I won't say it is better than any of the VSCode forks, but it works for 70% of my tasks in an understandable manner. As for the remaining stuff, I can always use Cursor/Windsurf in a complementary manner.
It is open source; have a look at https://github.com/aperoc/toolkami if it interests you.
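As a rough illustration of the shape such a loop can take: the tool names below follow the comment above, but the code is a hypothetical sketch, not taken from the toolkami repo. The "disable `ask` to be more proactive" idea can be as small as dropping one entry from the tool registry (only four of the seven tools are shown).

```python
# Hypothetical sketch of a tiny tool registry for an agent loop; not the
# toolkami implementation. Only read/write/command/ask are shown here.
import subprocess
from pathlib import Path

def read(path: str) -> str:
    return Path(path).read_text()

def write(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} bytes to {path}"

def command(cmd: str) -> str:
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return out.stdout + out.stderr

def ask(question: str) -> str:
    # Lets the model punt a decision back to the human.
    return input(f"[agent asks] {question}\n> ")

TOOLS = {"read": read, "write": write, "command": command, "ask": ask}

def make_registry(proactive: bool) -> dict:
    # Dropping `ask` means the model can never defer to the user,
    # so it keeps acting on its own judgment instead.
    tools = dict(TOOLS)
    if proactive:
        tools.pop("ask")
    return tools
```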
Nearly all of your comments have been self promo, I would chill out a bit
Sometimes I feel like I'm the only one sitting here with vim enjoying myself. Letting this whole AI wave float away.
I don't mind having to learn these new tools, but I don't see any drawbacks in waiting a year or more until it stabilizes.
Same as in the crazy times of frontend libraries when it was a new one every week. Just don't jump on anything, and learn the winner in the end.
Sure, I may not be state of the art. But I can pick up whatever fast. Let someone else do all the experiments.
I love vim, but I am playing with this stuff too...
There are a couple of neovim projects that allow this... avante comes to mind right now.
I will say this: it is a different thought process to get an LLM to write code for you. And right now, the biggest issue for me is the interface. It is wrong somehow; my attention is not being directed to the most important part of what is going on...
You're not the only one. LLMs are hardly intelligent anyway.
Cursor: Autocomplete is really good. At the time I compared them, it was without a doubt better than GitHub Copilot autocomplete. Cmd-K (insert/edit snippet at cursor) is good when you use good old Sonnet 3.5. Agent mode is, honestly, quite disappointing; it doesn't feel like they put a lot of thought into prompting and wrapping LLM calls. Sometimes it just fails to submit code changes, which is especially bad as they charge you for every request. Also, I think they over-charge for Gemini, and the Gemini integration is especially poor.
My reference for agent mode is Claude Code. It's far from perfect, but it uses sub-tasks and summarization using the smaller Haiku model. That feels way more like a coherent solution compared to Cursor. Also, Aider isn't bad when you're OK with a more manual process.
Windsurf: Have only used it briefly, but agent mode seems somewhat better thought out. For example, they present possible next steps as buttons. Some reviews say it's even more expensive than Cursor in agent mode.
Also something to consider: I have a script I wrote myself which just feeds selected files as context to an LLM and then either writes the response to stdout or extracts a file out of it.
That often seems to be better than using Cursor. I don't really understand why it calls tools when I've selected an entire file to be used as context; tool calls seem to be an unnecessary distraction in this case, and they make the calls more expensive. Also, Gemini is less neurotic when I use it with very basic prompts: either Cursor's prompts make it worse, or the need to juggle tool calls distracts it.
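A stripped-down sketch of that kind of script (file list in, answer out) stays very small. This assumes an OpenAI-compatible endpoint via the official `openai` client; the FILE:/END FILE reply convention used for extracting a file is made up for the example, not part of the script described above.

```python
#!/usr/bin/env python3
# Sketch of a "selected files as context" helper in the spirit of the comment
# above (the original script isn't shown). Assumes an OpenAI-compatible API;
# the FILE:/END FILE reply convention below is an invented example.
import re
import sys
from pathlib import Path

from openai import OpenAI

PROMPT_SUFFIX = (
    "\nIf you want to return a whole file, wrap it between lines "
    "'FILE: <path>' and 'END FILE'."
)

def main() -> None:
    question, *files = sys.argv[1:]
    context = "\n\n".join(f"### {p}\n{Path(p).read_text()}" for p in files)
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    reply = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user",
                   "content": f"{question}{PROMPT_SUFFIX}\n\n{context}"}],
    ).choices[0].message.content

    # Either write an extracted file, or just dump the answer to stdout.
    match = re.search(r"^FILE: (\S+)\n(.*?)^END FILE",
                      reply, re.DOTALL | re.MULTILINE)
    if match:
        Path(match.group(1)).write_text(match.group(2))
        print(f"extracted {match.group(1)}")
    else:
        print(reply)

if __name__ == "__main__":
    main()
```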
Since this topic is closely related to my new project, I’d love to hear your opinion on it.
I’m thinking of building an AI IDE that helps engineers write production quality code quickly when working with AI. The core idea is to introduce a new kind of collaboration workflow.
You start with the same kind of prompt, like “I want to build this feature...”, but instead of the model making changes right away, it proposes an architecture for what it plans to do, shown from a bird’s-eye view in the 2D canvas.
You collaborate with the AI on this architecture to ensure everything is built the way you want. You’re setting up data flows, structure, and validation checks. Once you’re satisfied with the design, you hit play, and the model writes the code.
Website (in progress): https://skylinevision.ai
YC Video showing prototype that I just finished yesterday: https://www.youtube.com/watch?v=DXlHNJPQRtk
Karpathy’s post that talks about this: https://x.com/karpathy/status/1917920257257459899
Thoughts? Do you think this workflow has a chance of being adopted?
I quite liked the video. Hope you get to launch the product and I could try it out some day.
The only thing that I kept thinking about was: if a correction is needed, you have to make it fully by hand. Find everything and map it. However, if the first try was way off, I would like to enter a correction from a "midpoint". So instead of fixing 50%, I would be left with maybe 10 or 20%. Don't know if you get what I mean.
Yes, the idea is to ‘speak/write’ to the local model to fix those little things so you don’t have to do them by hand. I actually already have a fine-tuned Qwen model running on Apple’s MLX to handle some of that, but given the hard YC deadline, it didn’t make it into the demo.
Eventually, you’d say, ‘add an additional layer, TopicsController, between those two files,’ and the local model would do it quickly without a problem, since it doesn’t involve complicated code generation. You’d only use powerful remote models at the end.
Looks like an antidote for "vibe coding", like it. When are you planning to release something that could be tried? Is this open source?
I believe we can have a beta release in September, and yes, we plan to open-source the editor.
PS. I’m stealing the ‘antidote to “vibe coding”’ phrase :)
The video was a good intro to the concept. As long as it has repeatable memory for the corrections shown in the video, then the answer to your question about being adopted is “yes!”
It looks interesting, but I couldn't really follow what you were doing in the video or why. And then just as you were about to build, the video ends?
Just watched the demo video and thought it was a very interesting approach to development. I will definitely be following this project. Good luck.
Recently, Augment Code. But more generally, the "leader" switches so frequently at this point, I don't commit to use either and switch more or less freely from one to another. It helps to have monthly subscriptions and free cancellation policy.
I expect, or hope for, more stability in the future, but so far, from aider to Copilot, to Claude Code, to Cursor/Windsurf/Augment, almost all of them improve (or at least change) fast and seem to borrow ideas from each other too, so any leader is temporary.
Windsurf at the moment. It can now run multiple "flows" in parallel, so I can set one cascade off to look into a bug somewhere while another cascade implements a feature elsewhere in the code base. The LLMs spit out their tokens in the background, and I drop in eventually to review and accept or ask for further changes.
Cursor offers this too - open different tabs in chat and ask for different changes; they’ll run in parallel.
Until you change the model in one of the tabs and all other tabs (and editor instances!) get the model changed, stop what they're doing, lose context, etc. There is also a bug where if you have two editors working on two codebases they get lost and start working on the same thing; I suppose there is some kind of background workspace that gets mixed up.
Zed has this background flow as well, you can see in the video [0] from their latest blog post.
[0] https://zed.dev/blog/fastest-ai-code-editor
We are truly living in the future
I feel a bit out of place here, as I’m not a dev… I come from the operational side, but do all my work in Puppet code. I was using Codeium + VSC and life was wonderful. One day, though, everything updated and Codeium was gone in favor of Windsurf and things got crazy. VSC no longer understood Puppet code and didn’t seem to be able to access the language tools from the native Puppet Development Kit plugins either.
The crazy part is my Vim setup has the Codeium plugins all still in place, and it works perfectly. I’m afraid if I update the plugin to a windsurf variant, it will completely “forget” about Puppet, its syntax, and everything it has “learned” from my daily workflow over the last couple years.
Has anyone else seen anything similar?
VS Code with GitHub Copilot works great, though they are usually a little late to add features compared to Cursor or Windsurf. I use the 'Edit' feature the most.
Windsurf I think has more features, but I find it slower compared to others.
Cursor is pretty fast, and I like how it automatically suggests completion even when moving my cursor to a line of code. (Unlike others where you need to 'trigger' it by typing a text first)
Honorable mention: Supermaven. It was the first and fastest AI autocomplete I used. But it's no longer updated since they were acquired by Cursor.
90% of their features could fit inside a VS Code extension.
There are already a few popular open-source extensions doing 90%+ of what Cursor is doing: Cline, Roo Code (a fork of Cline), and Kilo Code (a fork of Roo Code and something I help maintain).
The other 10% being what differentiates them in the market :)
Of course. Are they useful enough though for people to install an entirely new software?
Since installing entirely new software is just downloading Cursor.AppImage from the official website and double-clicking on it, it's not a large hassle for most users.
If you're on Arch, there's even an AUR package, so it's even less steps than that.
appImage is useless for development since it only has access to globally installed development tools and environments.
That's not true.
AppImages aren't sandboxed and they can access the rest of the system just fine. After all, they're just a regular SquashFS image that gets mounted under /tmp and then executed from there.
So you're saying that the Cursor appImage is done poorly? I'd believe that.
I’m curious what the motivation is for all these sub-forks. why not just upstream to cline?
OP probably means to keep using vscode. Honestly, best thing you can do is just try each for a few weeks. Feature comparison tables only say so much, particularly because the terminology is still in a state of flux.
I’ve personally never felt at home in vscode. If you’re open to switching, definitely check out Zed, as others are suggesting.
You need none of these fancy tools if you iterate over specs instead of iterating over code. I explain it all in here: https://www.cleverthinkingsoftware.com/spec-first-developmen...
I think a series of specs as the system instruction could help guide it. You can't just go from spec to app though, at least in my experience.
From my experience Claude 3.7 seems to make fewer mistakes when you provide detailed specs. It saves a lot of back and forth, especially at the beginning, when you're trying to 'de-noise' the idea. I find that the best way is a hybrid approach: going back and forth between iterating over specs and iterating over code, and once I have the MVP I update the specs one last time. I pass these specs to another session and continue working. It solves multiple problems at once: context getting full, the LLM getting lost, etc.
That is slow and expensive.
Also, keep in mind that you can continue iterating over code while you keep the specs up to date. You can then give the final specs to another LLM to resume work or produce another version for a different platform or using a different framework. I do need to update the article to clarify this.
It sounds slow and expensive, but it ended up saving me a lot of tokens and time. Especially when you want to target multiple platforms at once. For me it was liberating, absolutely night and day.
cheaper than the alternatives
Regardless of which, my favorite model is ChatGPT's. I feel they're the only ones talking to customers. The other models are not as pleasant to work with as a software engineer.
Cursor has had the best UX and results for me until now. Trae's way of adding context is way too annoying. Windsurf has minor UI issues all over. Options that are extensions in VSCode don't cut it in terms of providing fantastic UI/UX because the API doesn't support it.
I really like Zed. I have not tried any of the ones mentioned by OP. I feel like Zed is getting to the point where it can replace Sublime Text completely (but it's not there yet).
Zed is an editor first. The OP has mentioned options which are basically AI development "agents".
AI aided development has first class support in Zed.
Ie. it's not a "plugin" but built-in ecosystem developed by core team.
Speed of iterations on new features is quite impressive.
Their latest agentic editing update basically brought claude code cli to the editor.
Most corporations don't have direct access to arbitrary LLMs, but through Microsoft's GitHub Copilot they do – and you can use models through Copilot and other providers like Ollama – which is great for work.
With their expertise (team behind pioneering tech like electron, atom, teletype, tree sitter, building their own gpu based cross platform ui etc.) and velocity it seems that they're positioned to outpace competition.
Personally I'd say that their tech is maybe two orders of magnitude more valuable than windsurf?
I don't dispute that Zed is great – I'm actually using it myself – but it's an editor first and foremost. The OP, to me at least, seems to be asking more about the AI agent comparisons.
Cursor and Windsurf are both forks of VS Code, an editor.
Yes, very observant: modified forks with their agents built in. Zed does not have any built in; Sublime does not have agents built in. But if you like, you can continue this disingenuous discussion.
Zed has it built in, it's called "agentic editing" [0] and behaves like claude code cli and other agents – mcp based editing, iterating on tests until they pass etc. – where you leave it in a background window and can do something else waiting for completion notification or you can follow it to see what changes it is doing.
It's not only that they have it built in but it seems to be currently the best open replacement for tools like claude code cli because you can use arbitrary llm with it, ie. from ollama and you have great extension points (mcp servers, rules, slash commands etc).
[0] https://zed.dev/agentic
I was under impression Zed had native LLM integration, built into the editor?
Yes, it's now built in. I haven't had a chance to use it much yet; it was released fairly recently.
Agentic editing was released recently, yes; LLM integration was there much longer. It supported editing, but it was a more manual process – the context of the chat conversation was available in inline editing, so you could edit code based on LLM output. Now it's agentic.
Thank you for your kind offer, I shall take you up on it. Zed does have it built in. Now, please continue your disingenuous conversation by repeatedly claiming something that is demonstrably not true.
Hello? You're not going to continue?
Nah, I can see a losing argument when I see one. In my eyes, the OP was asking about the LLM/AI side and not the editor. But okay, zed now has one built in. I know now.
Personally, I've been using Cursor since day 1. Lately with Gemini 2.5 Pro. I've also started experimenting with Zed and local models served via ollama in the last couple of days. Unfortunately, without good results so far.
I've created a list of self-hostable alternatives to cursor that I try to keep updated. https://selfhostedworld.com/alternative/cursor/
Personally, if you take the time to configure it well, I think Aider is vastly superior. You can have 4 terminals open in a grid, run agentic coding workflows in each of them, and get 4x the throughput of someone in Cursor, whereas Cursor's UI isn't really amenable to running a bunch of instances and managing them all simultaneously. That plus Aider lets you do more complex automated Gen -> Typecheck -> Lint -> Test workflows with automated fixing; see the sketch below.
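For illustration, a minimal sketch of that Gen -> Typecheck -> Lint -> Test loop in one of those terminals; the flag names assume a recent Aider release, and the tsc/npm commands are placeholders for whatever your project actually uses:

$ aider --model sonnet --auto-lint --lint-cmd "npx tsc --noEmit" --auto-test --test-cmd "npm test" --message "Implement the pagination endpoint and fix anything the checks flag"

Run one of these per worktree/terminal and you get the parallel throughput described above.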
why can't you create separate git worktrees, and open each worktree in a separate IDE window? then you get the same functionality, no?
You can, but Aider is designed to work in a console and be interacted with through limited screen real estate, whereas cursor is designed to be interacted with through a full screen IDE. Besides the resource consumption issue, Cursor's manual prompts are hard to interact with when the window is tiny because it wants to try and pop up source file windows and display diffs in an editor pane, for instance.
When we're managing 10-20 AI coding agents to get work done, the interface for each is going to need to be minimal. A lot of cursor's functionality is going to be vestigial at that point, as a tool it only makes sense as a gap-bridger for people that are still attached to manual coding.
I use the Windsurf Cascade plugin in JetBrains IDEs. My current flow is that I rough in the outline of what I want and then generally use the plugin to improve what I have by creating tests, making performance improvements, or just making things more idiomatic. I need to invest time in adding rules at both a global and project level, which should make the overall experience even better.
I use the same setup, works like a charm.
I've been flipping between the two, and overall I've found Cursor to be the better experience. Its autocomplete feels better at 'looking ahead' and figuring out what I might be doing next, while Windsurf tends to focus more on repeating whatever I've done recently.
Also, while Windsurf has more project awareness, and it's better at integrating things across files, actually trying to get it to read enough context to do so intelligently is like pulling teeth. Presumably this is a resource-saving measure but it often ends up taking more tokens when it needs to be redone.
Overall Cursor 'just works' better IME. They both have free trials though so there's little reason not to try both and make a decision yourself. Also, Windsurf's pricing is lower (and they have a generous free tier) so if you're on a tight budget it's a good option.
Windsurf autocomplete is free.
Cursor autocomplete stops working after trial ends.
I've had trials for both running and tested both on the same codebases.
Cursor works roughly how I've expected. It reads files and either gets it right or wrong in agent mode.
Windsurf seems restricted to reading files 50 lines at a time, and often will stop after 200 lines [0]. When dealing with existing code I've been getting poorer results than Cursor.
As to autocomplete: perhaps I haven't set up either properly (for PHP), but the autocomplete in both is good for pattern-matching changes I make, and terrible for anything that requires knowledge of what methods an object has, the parameters a method takes, etc. They both hallucinate wildly, so I end up doing bits of editing in Cursor/Windsurf and having the same project open in PhpStorm to make use of its intellisense.
I'm coming to the end of both trials and the AI isn't adding enough over Jetbrains PhpStorm's built in features, so I'm going back to that until I figure out how to reduce hallucinations.
0. https://www.reddit.com/r/Codeium/comments/1hsn1xw/report_fro...
Claude Code. And... Junie in Jetbrains IDE. It appeared recently and I'm really impressed by its quality. I think it is on the level of Claude Code.
I think it uses Claude Code by default; it is literally the same thing, with a different (better) interface.
Really interesting. Source?
Junie in Ask mode:
> Which LLM are you?
> I am Claude, an AI assistant created by Anthropic. In this interface, I'm operating as "Junie," a helpful assistant designed to explore codebases and answer questions about projects. I'm built on Anthropic's large language model technology, specifically the Claude model family.
JetBrains' wider AI tools let you choose the model that gets used, but as far as I can tell Junie doesn't. That said, it works great.
That just means it's using Sonnet, not that it's using Claude Code.
Even that doesn't have to be true, LLMs often impersonate other popular models.
usually OpenAI, but yes
Amazon Q. Claude Code is great (the best imho, what everything else measures against right now), and Amazon Q seems almost as good and for the first week I've been using it I'm still on the free tier.
The flat pricing of Claude Code seems tempting, but it's probably still cheaper for me to go with usage pricing. I feel like loading my Anthropic account with the minimum of $5 each time would last me 2-3 days depending on usage. Some days it wouldn't last even a day.
I'll probably give Open AI's Codex a try soon, and also circle back to Aider after not using it for a few months.
I don't know if I misunderstand something with Cursor or Copilot. It seems so much easier to use Claude Code than Cursor, as Claude Code has many more tools for figuring things out. Cursor also required me to add files to the context, which I thought it should 'figure out' on its own.
> I don't know if I misunderstand something with Cursor or Copilot. It seems so much easier to use Claude Code than Cursor, as Claude Code has many more tools for figuring things out. Cursor also required me to add files to the context, which I thought it should 'figure out' on its own.
Cursor can find files on its own. But if you point it in the right direction it has far better results than Claude code.
this is the first time I am seeing someone says good things about Amazon Q
Do they publish any benchmark sheet on how it compares against others?
It is currently in the top 3 on SWE-bench Verified.
It went through multiple stages of upgrades, and I would say at this stage it is better than Copilot. Fundamentally it is as good as Cursor or Windsurf, but it lacks some features and cannot match their speed of release. If you're on AWS, though, it's a compelling offering.
I remember asking Amazon Q something and it wouldn't reply because of a security policy or something. It was, as far as I can remember, a legitimate question about an IAM policy I was trying to configure. I went back to Google search and figured it out.
It might seem contrary to the current trend, but I've recently returned to using nvim as my daily driver after years with VS Code. This shift wasn't due to resource limitations but rather the unnecessary strain from agentic features consuming high amounts of resources.
I wish your own coding would just be augmented, like somebody looking over your shoulder. The problem with the current AI coding is that you don't know your code base anymore. Basically, it should be like somebody helping you figure out stuff faster, update documentation, etc.
"Throughly review my code change (git diff HEAD), {extra context etc}"
> Neither? I'm surprised nobody has said it yet. I turned off AI autocomplete ...
This represents one group of developers and is certainly valid for that group. To each their own
For another group, where I belong, AI is a great companion! We can handle the noise and development speed is improved as well as the overall experience.
I prefer VSCode and GitHub copilot. My opinion is this combo will eventually eat all the rest, but that's besides the point.
Agent mode could be faster, sometimes it is rather slow thinking but not a big deal. This mode is all I use these days. Integration with the code base is a huge part of the great experience
The only thing preventing me from sticking with Cursor/Windsurf is the lack of a sync feature. I use different machines and it's crucial to keep the same exact configuration on each of them :(
I evaluated Windsurf at a friend's recommendation around half a year ago and found that it could not produce any useful behaviors on files above a thousand lines or so. I understand this is mostly a property of the model, but certainly also a property of the approach used by the editor of just tossing the entire file in, yeah? I haven't tried any of these products since then, but it might be worth another shot because Gemini might be able to handle these files.
A year is a long time. Even in the past few months it has improved a lot.
Windsurf has improved a lot in the last few months.
I'm glad there are finally multiple options for agentic coding in JetBrains, so I no longer have to occasionally switch over to VSCode and its various variants.
Copilot at work and Junie at home. I found nothing about my VSCode excursions to be better than Sublime or IntelliJ.
I’ve only played with Junie and Aider so far and like the approach the Junie agent takes of reading the code to understand it vs the Aider approach of using the repo map to understand the codebase.
I find Windsurf almost unusable. It’s hard to explain but the same models in Zed, Windsurf, Copilot and Cursor produce drastically worse code in Windsurf for whatever reason. The agent also tends to do way more stupid things at random, like wanting to call MCP tools, creating new files, then suddenly forgetting how to use tools and apologizing a dozen times
I can’t really explain or prove it, but it was noticeable enough to me that I canceled my subscription and left Windsurf
Maybe a prompting or setting issue? Too high temperature?
Nowadays Copilot got good enough for me that it became my daily driver. I also like that I can use my Copilot subscription in different places like Zed, Aider, Xcode
I have also observed this with Gemini 2.5 in Cursor vs Windsurf.
Cline seems to be the best thing, I suspect because it doesn't do any dirty tricks with trimming down the context under the hood to keep the costs down. But for the same reason, it's not exactly fun to watch the token/$ counter ticking as it works.
Cline?
I love Cline and use it every day. It works the way I think and makes smart decisions about features.
Cline!
... if you can afford it. Pay per token can get expensive with large context.
Recently started using Cursor for adding a new feature to a small codebase for work, after a couple of years in which I didn't code. It took me a couple of tries to figure out how to work with the tool effectively, but it worked great! I'm now learning how to use it with TaskMaster; it's such a different way to do and play with software. Oh, one important note: I went with Cursor also because of the pricing, which, despite being confusing in terms of fast vs. slow requests, feels less consumption-based.
BTW, there's a new OSS competitor in town that hit the front page a couple of days ago - Void: Open-source Cursor alternative https://news.ycombinator.com/item?id=43927926
VSCode with autocomplete and Gemini 2.5 Pro in a standalone chat (pick any interface that works for you, e.g. LibreChat, Vertex, etc). The agents-in-an-IDE experience is hella slow in my opinion.
Plus, it's less about the actual code generation and more about how to use it effectively. I wrote a simple piece on how I use it to automate the boring parts of dev work to great effect: https://quickthoughts.ca/posts/automate-smarter-maximizing-r...
If you have any intellectual property worth protecting or need to comply with HIPAA, a completely local installation of Cline or Aider or Codeium with LM Studio running Qwen or DeepSeek Coder works well. If you'd rather not bother, I don't see any alternative to GitHub Copilot for Business. Sure, they're slower to catch up to Cursor, but catch up they will.
https://github.com/features/copilot/plans?cft=copilot_li.fea...
Which entity is going to steal your IP? Cursor / Windsurf or OpenAI / Anthropic?
Neither! Neovim for most of my work and vscode w/ appropriate plugins when it's needed. If I need any LLM assistance I just run Claude Code in the terminal.
Using Windsurf since the start and I am satisfied. Didn't look beyond it. Focused on actually doing the coding. It's impossible to keep up with daily AI news and if something groundbreaking happens it will go viral.
Neither. Do some real work instead of using some cancerous shitty autocomplete.
I think with the answer, each responder should include their level of coding proficiency. Or, at least whether they are able to (or even bother to) read the code that the tool generates. Preferences would vary wildly based on it.
I just use Copilot (across VS Code, VS etc), it lets you pick the model you want and it's a fixed monthly cost (and there is a free tier). They have most of the core features of these other tools now.
Cursor, Windsurf et al have no "moat" (in startup speak), in that a sufficiently resourced organization (e.g. Microsoft) can just copy anything they do well.
VS code/Copilot has millions of users, cursor etc have hundreds of thousands of users. Google claims to have "hundreds of millions" of users but we can be pretty sure that they are quoting numbers for their search product.
Windsurf. The context awareness is superior compared to cursor. It falls over less and is better at retrieving relevant snippets of code. The premium plan is cheaper too, which is a nice touch.
How about Cursor vs. Windsurf vs. (Claude Desktop + MCP)?
Haven't tried out Cursor / Windsurf yet, but I can see how I can adapt Claude Desktop to specifically my workflow with a custom MCP server.
Latest cursor update where they started charging for tokens is pretty good. I don't use non-MAX mode on cursor anymore
I've been using Cursor for several months. Absolutely hate Agent mode - it jumps too far ahead, and its solutions, though valid, can overcomplicate the whole flow. It can also give you a whole bunch of code that you have to blindly accept will work, and it's not super great at making good file layouts, etc. I've switched to autocomplete plus Ask mode when I'm stuck.
Cursor is good for basic stuff but Windsurf consistently solves issues Cursor fails on even after 40+ mins of retries and prompting changes.
Cursor is very lazy about looking beyond the current context, or even using context at all; sometimes it feels like it's trying to one-shot a guess without looking deeper.
The bad thing about Windsurf is that the plans are pretty limited, and the unlimited "Cascade Base" felt dumb the times I used it, so ultimately I use Cursor until I hit a wall and then switch to Windsurf.
Cursor. Good price, the predictive next edit is great, it's good enough with big code bases, and with auto mode I don't even spend all my premium requests.
I've tried VSCode with Copilot a couple of times and it's frustrating: you have to point out individual files for edits, and project-wide requests are a pain.
My only pain is the workflow for developing mobile apps where I have to switch back and forth between Android Studio and Xcode as vscode extensions for mobile are not so good
I tested Windsurf last week; it installed all dependencies into my global Python. It didn't know best practices for Python and didn't create any virtual env. I am disappointed. My Cursor experience was slightly better. Still, one issue I had was how to make sure it does not change the parts of the code I don't want it to change. Every time you ask it to do something for A, it rewrote B in the process - very annoying.
My best experience so far is v0.dev :)
https://nonbios.ai - [Disclosure: I am working on this.]
- We are in public beta and free for now.
- Fully Agentic. Controllable and Transparent. Agent does all the work, but keeps you in the loop. You can take back control anytime and guide it.
- Not an IDE, so we don't compete with VSCode forks. The interface is just a chatbox.
- More like Replit - but full stack focussed. You can build backend services.
- Videos are up at youtube.com/@nonbios
Notepad++ best
Has anyone had any joy using a local model? Or is it still too slow?
On something like a M4 Macbook Pro can local models replace the connection to OpenAi/Anthropic?
For advanced autocomplete (not code generation, though it can do that too), basic planning, looking things up instead of web search, review & summary, even one-shotting smaller scripts, the 32b Q4 models have proved very good for me (24GB VRAM RTX 3090). All LLM caveats still apply, of course. Note that setting up a local LLM in Cursor is a pain because they don't support localhost; ngrok or a VPS and reverse SSH solve that, though.
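A sketch of that reverse-SSH workaround, in case it's useful; the host and ports are placeholders, 11434 assumes an Ollama-style local server, and Cursor is assumed to let you override the OpenAI-compatible base URL:

$ ssh -N -R 8000:localhost:11434 user@your-vps   # may need GatewayPorts enabled on the VPS
# then point the editor's OpenAI base URL override at http://your-vps:8000/v1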
It's not so much that it's slow, it's that local models are still a far cry from what SOTA cloud LLM providers offer. Depending on what you're actually doing, a local model might be good enough.
I am retired now, out of the game, but I also suggest an alternative: running locally with open-codex, Ollama, and the qwen3 and gemma3 models, and when necessary using something hosted like Gemini 2.5 Pro without an IDE.
I like to strike a balance between coding from scratch and using AI.
Void. I'd rather run my own models locally https://voideditor.com/
Which model are you running locally? Is it faster than waiting for Claudes generation? What gear do you use?
That's the fun part, you can use all of them! And you don't need to use browser plugins or console scripts to auto-retry failures (there aren't any) or queue up a ton of tasks overnight.
Have a 3950X w/ 32GB ram, Radeon VII & 6900XT sitting in the closet hosting smaller models then a 5800X3D/128GB/7900XTX as my main machine.
Most any quantized model that fits in half of the vram of a single gpu (and ideally supports flash attention, optionally speculative decoding) will give you far faster autocompletes. This is especially the case with the Radeon VII thanks to the memory bandwidth.
Not OP but for autocomplete I am running Qwen2.5-Coder-7B and I quantized it using Q2_K. I followed this guide:
https://blog.steelph0enix.dev/posts/llama-cpp-guide/#quantiz...
And I get fast enough autocomplete results for it to be useful. I have an NVIDIA RTX 4060 in a laptop with 8 gigs of dedicated memory that I use for it. I still use Claude for chat (pair programming) though, and I don't really use agents.
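For anyone curious what that quantization step looks like in practice, a rough sketch with llama.cpp's tooling (tool names and paths depend on your build; the linked guide has the real details):

$ python convert_hf_to_gguf.py ./Qwen2.5-Coder-7B --outfile qwen2.5-coder-7b-f16.gguf
$ ./llama-quantize qwen2.5-coder-7b-f16.gguf qwen2.5-coder-7b-q2_k.gguf Q2_K
$ ./llama-server -m qwen2.5-coder-7b-q2_k.gguf --port 8080   # serve it for the editor's completion plugin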
While I haven't used Windsurf, I've been using Cursor and I LOVE it: especially the inline autocomplete is like reading my mind and making the work MUCH faster.
I can't say anything about Windsurf (as I haven't tried yet) but I can confidently say Cursor is great.
Currently using cursor. I've found cursor even without the AI features to be a more responsive VS Code. I've found the AI features to be particularly useful when I contain the blast radius to a unit of work.
If I am continuously able to break down my work into smaller pieces and build a tight testing loop, it does help me be more productive.
How do you define a unit of work for your purposes?
Vs code with agent mode
I'm not sure the answer matters so much. My guess is that as soon as one of them gains any notable advantage over the other, the other will copy it as quickly as possible. They're using the same models under the hood.
Trae.ai actually, otherwise Windsurf
I like cursor, the autocomplete is great most of the time, as others have said use a shortcut to disable it.
The agents are a bit beta, it can’t solve bugs very often, and will write a load of garbage if you let it.
I use Windsurf but it's been having ridiculous downtime lately.
I can't use Cursor because I don't use Ubuntu which is what their Linux packages are compiled against and they don't run on my non-Ubuntu distro of choice.
Considering Microsoft is closing down on the ecosystem, I'd pick VSCode with Copilot over those two.
It's a matter of time before they're shuttered or their experience gets far worse.
Unlikely they'll disappear. I currently use Cursor but am happy to change if a competitor is markedly better
MS is slow to release new features though.
Claude Code is the best so far; I am using the $200 plan. In terms of the feature matrix, all tools are almost the same, with some hits and misses, but speed is where Claude Code wins.
Do you think you use more than $200 worth of API credits in a month? I've used both Claude Code and Cursor, and I find myself liking the terminal CLI, but the price is much more than $20 per month for me.
I use Windsurf and vim. Windsurf is good for pure exploration using a vibe coding style but prefer to hand-code anything that I am going to keep.
Zed! I find it to be less buggy and generally more intuitive to use.
Neither. VS Code or Zed.
neither : my pen is my autocomplete
Personally copilot/code assist for tab autocomplete, if I need longer boilerplate I request it to the LLM. Usually VIM with LSP.
Anything that’s not boilerplate I still code it
The best thing about Cursor is that for $20 you basically get unlimited requests. I know you get "slower" after a certain amount of requests, but honestly you don't feel it being slow, and reasoning models take so long to answer anyway that you send the prompt and go do other stuff. So I don't think the slowness matters - basically unlimited compute, you know?
Whatever the answer is, if you don't like it wait a week. They are constantly going back and forth.
the age of swearing allegiance to a particular IDE/AI tool is over. I keep switching between Cursor and GH Copilot and for the most part they are very similar offerings. Then there's v0, Claude (for its Artifacts feature) and Cline which I use quite regularly for different requirements.
Both, most times one works better than the other.
I have had so much fun lately just with vanilla VS Code and Claude Code. Aider is a close second.
I’m using Github Copilot in VScode Insiders, mostly because I don’t want yet another subscription. I guess I’m missing out.
Still on Codeium, lol! Might give Aider another spin. It has never been quite good for my needs, but tech evolves.
I agree with /u/welder. Preferably neither. Both of these are custom forks and run the risk of being acquired and enshittified in the future.
If you are using VScode, get familiar with cline. Aider is also excellent if you don’t want to modify your IDE.
Additionally, JetBrains IDEs now also have built-in local LLMs, and their autocomplete is actually fast and decent. They have also added a new chat side panel in a recent update.
The goal is NOT to change your workflow or dev env, but to integrate these tools into your existing flow, despite what the narrative says.
I’d just wait a bit. At current rate of progress winner will be apparent sooner rather than later.
VScode + Github Copilot Pro. $10 per month to try out AI code assist is cheap enough
Zed. It is blazing fast.
If you don't mind not having DAP and Windows support, then Zed is great.
I think zed is the answer you're looking for.
Windsurf - the repo code awareness is much higher than Cursor.
Anything which can't exfiltrate your data
Cursor for personal projects and Just Pycharm for work projects.
Hijacking this thread: what's the best AI tool for Neovim?
I’ve really been enjoying the combination of CodeCompanion with Gemini 2.5 for chat, Copilot for completion, and Claude Code/OpenAI Codex for agentic workflows.
I had always wanted to get comfortable with Vim, but it never seemed worth the time commitment, especially with how much I’ve been using AI tools since 2021 when Copilot went into beta. But recently I became so frustrated by Cursor’s bugs and tab completion performance regressions that I disabled completions, and started checking out alternatives.
This particular combination of plugins has done a nice job of mostly replicating the Cursor functionality I used routinely. Some areas are more pleasant to use, some are a bit worse, but it’s nice overall. And I mostly get to use my own API keys and control the prompts and when things change.
I still need to try out Zed’s new features, but I’ve been enjoying daily driving this setup a lot.
Tried them all extensively, Cursor is SOTA
VS Code with Copilot.
This is the way.
Getting great results both in chat, edit and now agentic mode. Don’t have to worry about any blocked extensions in the cat and mouse game with MS.
I'm on Cursor; performance has gone down. Thinking about Windsurf.
Lately I switched to using a triple monitor setup and coding with both Cursor and Windsurf. Basically, the middle monitor has my web browser that shows the front-end I'm building. The left monitor has Cursor, and right one has Windsurf. I start coding with Cursor first because I'm more familiar with its interface, then I ask Windsurf to check if the code is good. If it is, then I commit. Once I'm done coding a feature, I'll also open VScode in the middle monitor, with Cline installed, and I will ask it to check the code again to make sure it's perfect.
I think people who ask the "either or" question are missing the point. We're supposed to use all the AI tools, not one or two of them.
Why not just write a script that does this but with all of the model providers and requests multiple completions from each? Why have a whole ass editor open just for code review?
I'm finding I increasingly produce entire changesets without opening an editor: just `claude code`, or my own cobbled-together version of `claude code`, and `git diff` to preview what's happening. For me, the future of these tools isn't "inside" a text editor. If you want to poke around, my “cobbled‑together Claude Code” lives here: https://github.com/cablehead/gpt2099.nu
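That editor-less loop is roughly the following (a sketch of the workflow described above; `claude` is Claude Code's CLI entry point and the prompt you type is obviously a placeholder):

$ claude        # describe the change in the REPL and let it edit files
$ git diff      # preview everything it touched
$ git add -p && git commit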
It's not just an "editor". Both Windsurf and Cursor do some tricks with context so that the underlying LLM doesn't get confused. Besides, writing a script sounds hard, no need to spend the extra energy when you can simply open a tool. Anyway, that's how I code, feel free to do whatever you prefer.
Using cursor.. pretty good tool. Pick one and start.
pycharm + augment code + Gemini/Claude to generate the prompt for augment code.
What do you use to generate the prompt for Gemini/Claude?
I compare them here https://www.devtoolsacademy.com/blog/cursor-vs-windsurf/
Cursor is a better vibe IMO
Claude Code
I use neovim now, after getting tired of the feature creep and the constant chasing of shiny new features.
AI is not useful when it does the thinking for you. It's just advanced snippets at that point. I only use LLMs to explain things or to clarify a topic that doesn't make sense right away to me. That's when it shows its real strength.
Using AI for autocomplete? I turn it off.
Use vim
vi
Can you add a poll?
Tab complete is a nightmare when it comes to unexpected code.
Zed!! The base editor is just better than VSCode.
and they just released agentic editing.
Amp.
Early access waitlist -> ampcode.com
This is a product by Sourcegraph https://sourcegraph.com who already have a solution in this space.
Is this something wildly different to Cody, your existing solution, or just a "subtle" attempt to gain more customers?
I'd love to try it, could you please share an invite? My email is on my profile page.
Interesting! Do you have an invite to spare? My email is in my bio
Windsurf, no autocomplete.
You should also ask whether people actually used both :)
I prefer Zed
It changes too quickly for this to really matter. just pick the one you think looks and feels better to you
I use Zed.
Cline beats both, and it costs nothing but direct token usage from your LLM provider.
Cursor/Windsurf/et. al. are pointless middlemen.
None
VSCode with Github Copilot!
I am using both. Windsurf feels more complete and less clunky. They are very close though, and the pace of major updates is crazy.
I don't like CLI-based tools for coding. I don't understand why they are being shilled. Claude Code is maybe better at coding from scratch because it is pure raw power, eating tokens like there's no tomorrow, but it is the wrong interface for building anything serious.
cursor.
Neither? I'm surprised nobody has said it yet. I turned off AI autocomplete, and sometimes use the chat to debug or generate simple code but only when I prompt it to. Continuous autocomplete is just annoying and slows me down.
This is the way.
All this IDE churn makes me glad to have settled on Emacs a decade ago. I have adopted LLMs into my workflow via the excellent gptel, which stays out of my way but is there when I need it. I couldn't imagine switching to another editor because of some fancy LLM integration I have no control over. I have tried Cursor and VS Codium with extensions, and wasn't impressed. I'd rather use an "inferior" editor that's going to continue to work exactly how I want 50 years from now.
Emacs and Vim are editors for a lifetime. Very few software projects have that longevity and reliability. If a tool is instrumental to the work that you do, those features should be your highest priority. Not whether it works well with the latest tech trends.
Ironically, LLMs have made Emacs even more relevant. The model LLMs use (text) happens to match up with how Emacs represents everything (text in buffers). This opens up Emacs to becoming the agentic editor par excellence. Just imagine: some macro magic around a defcommand and voila, the agent can do exactly what a user can. If only such a project could have funding like Cursor does...
Nothing could be worse for the modern Emacs ecosystem than for the tech industry finance vampires ("VCs," "LPs") to decide there's blood enough there to suck.
Fortunately, alien space magic seems immune, so far at least. I assume they do not like the taste, and no wonder.
Why should the Emacs community care whether someone decides to build a custom editor with AI features? If anything this would bring more interest and development into the ecosystem, which everyone would benefit from. Anyone not interested can simply ignore it, as we do for any other feature someone implements into their workflow.
What I find interesting is why nobody is building LLMs trained to use the shell and PTY to their full potential.
Right now it's dumb Unix piping only.
I want an AI that can use emacs or vim with me
Elnode should make this very easy, given the triviality of the MCP "protocol."
I would take care. Emacs has no internal boundaries by design and it comes with the ability to access files and execute commands on remote systems using your configured SSH credentials. Handing the keys to an enthusiastically helpy and somewhat cracked robot might prove so bad an idea you barely even have time to put your feet up on the dash before you go sailing through the windshield.
There are tens (maybe low hundreds?) of thousands of people who want that.
Which is exactly why it hasn’t been commercially developed.
Yeah... I guess it's too niche. I guess scratch your own itch + FOSS it so the low hundreds of us can have fun or something.
I was exploring using andyk/ht (discussed on HN a few months back) to sit as a proxy that my LLM can call while I control it at the same time via xterm.js. I still need to figure out how to train the LLM to output keybindings/special keys, etc., but it's a promising start nonetheless: I can indeed parse a lot more info than just a command. Just imagine if AI could use all of the shell's autocomplete features and feed them into itself.
Maybe I should revisit/clean up that repo and make it public. It feels like with just some training data on special key bindings etc., an LLM should be able to type, even if char by char, at a faster speed than a human to control TUIs.
There are several neovim mcp providers that expose bash and neovim as a tool so you can already do that.
I'm not sure why you were downvoted. You're right that buffers and everything being programmable makes Emacs an ideal choice for building an AI-first editor. Whether that's something that a typical Emacs user wants is a separate issue, but someone could certainly build a polished experience if they had the resources and motivation. Essentially every Emacs setup is someone's custom editor, and AI features are not different from any other customization.
Emacs diff tools alone is a reason to use the editor. I switch between macOS, Linux, and Windows frequently so settled on Emacs and happy with that choice as well.
I’ve been using Aidermacs to access Aider in Emacs and it works quite well and makes lots of LLMs available. Claude Sonnet 3.7 has been reasonable for code generation, though there are certainly tasks that it seems to struggle on.
Cursor/Windsurf and similar IDEs and plugins are more than autocomplete on steroids.
Sure, you might not like it and think you as a human should write all the code, but a frequent experience in the industry in recent months is that productivity in teams using tools like this has greatly increased.
It is not unreasonable to think that someone deciding not to use tools like this will not be competitive in the market in the near future.
I think you’re right, and perhaps it’s time for the “autocomplete on steroids” tag to be retired, even if something approximating that is happening behind the scenes.
I was converting a bash script to Bun/TypeScript the other day. I was doing it the way I am used to… working on one file at a time, only bringing in the AI when helpful, reviewing every diff, and staying in overall control.
Out of curiosity, threw the whole task over to Gemini 2.5Pro in agentic mode, and it was able to refine to a working solution. The point I’m trying to make here is that it uses MCP to interact with the TS compiler and linters in order to automatically iterate until it has eliminated all errors and warnings. The MCP integrations go further, as I am able to use tools like Console Ninja to give the model visibility into the contents of any data structure at any line of code at runtime too. The combination of these makes me think that TypeScript and the tooling available is particularly suitable for agentic LLM assisted development.
Quite unsettling times, and I suppose it’s natural to feel disconcerted about how our roles will become different, and how we will participate in the development process. The only thing I’m absolutely sure about is that these things won’t be uninvented with the genie going back in the bottle.
How much did that cost you? How long did you spend reading and testing the results?
That wasn’t really the point I was getting at, but as you asked… The reading doesn’t involve much more than a cursory (no pun intended) glance, and I didn’t test more than I would have tested something I had written manually.
Maybe it wasn't your point. But cost of development is a very important factor, considering some of the thinking models burn tokens like no tomorrow. Accuracy is another. Maybe your script is kind of trivial/inconsequential so it doesn't matter if the output has some bugs as long as it seems to work. There are a lot of throwaway scripts we write, for which LLMs are an excellent tool to use.
I use Rider with some built in AI auto-complete. I'd say its hit rate is pretty low!
Sometimes it auto-completes nonsense, but sometimes I think I'm about to tab on auto-completing a method like FooABC and it actually completes it to FoodACD, both return the same type but are completely wrong.
I have to really be paying attention to catch it selecting the wrong one. I really really hate this. When it works its great, but every day I'm closer to just turning it off out of frustration.
https://alex.party/posts/2025-05-05-the-future-of-web-develo...
Arguing that ActiveX or Silverlight are comparable to AI, seeing what changes it did bring and is bringing, is definitely a weak argument.
A lot of people are against change because it endangers their routine, way of working, livelihood, which might be a normal reaction. But as accountants switched to using calculators and Excel sheets, we will also switch to new tools.
Ahh yes, software development, the discipline that famously has difficult to measure metrics and difficulty with long term maintenance. Months indeed.
Where are these amazing productivity increases?
Where is this 2x, 10x or even 1.5x increase in output? I don't see more products, more features, less bugs or anything related to that since this "AI revolution".
I keep seeing this being repeated ad nauseam without any real backing of hard evidence. It's all copium.
Surely if everyone is so much more productive, a single person startup is now equivalent to 1 + X right?
Please enlighten me as I'm very eager to see this impact in the real world.
I think you’re arguing a straw man
I don’t think the point was “don’t use LLM tools”. I read the argument here as about the best way to integrate these tools into your workflow.
Similar to the parent, I find interfacing with a chat window sufficiently productive and prefer that to autocomplete, which is just too noisy for me.
> is that productivity in the teams using tools like this has greatly increased
On the short term. Have fun debugging that mess in a year while your customers are yelling at you! I'll be available for hire to fix the mess you made which you clearly don't have the capability to understand :-)
Debugging any system is not easy; it's not like technical debt didn't exist before AI. People will be writing shit code in the future as they did in the past - probably more, but there are also more tools that help with debugging.
Additionally, what you are failing to realise is that not everyone is just vibe coding and accepting blindly what the LLM is suggesting and deploying it to prod. There are actually people with decade+ of experience who do use these tools and who found it to be an accelerator in many areas, from writing boilerplate code, to assisting with styling changes.
In any case, thanks for the heads up, definitely will not be hiring you with that snarky attitude. Your assumption that I have no capability to understand something without any context tells more about you than me, and unfortunately there is no AI to assist you with that.
Keep deluding yourself, buddy.
To be fair, I think the most value is added by Agent modes, not autocomplete. And I agree that AI-autocomplete is really quite annoying, personally I disable it too.
But coding agents can indeed save some time writing well-defined code and be of great help when debugging. But then again, when they don't work on a first prompt, I would likely just write the thing in Vim myself instead of trying to convince the agent.
My point being: I find agent coding quite helpful really, if you don't go overzealous with it.
Are you using these in your day job to complete real world tasks or in greenfield projects?
I simply cannot see how I can tell an agent to implement anything I have to do in a real day job unless it's a feature so simple I could do it in a few minutes. Even those the AI will likely screw it up since it sucks at dealing with existing code, best practices, library versions, etc.
I've found it useful for doing simple things in parallel. For instance, I'm working on a large typescript project and one file doesn't have types yet. So I tell the AI to add typing to it with a description while I go work on other things. I check back in 5-10 mins later and either commit the changes or correct it.
Or if I'm working on a full stack feature, and I need some boilerplate to process a new endpoint or new resource type on the frontend, I have the AI build the api call that's similar to the other calls and process the data while I work on business logic in the backend. Then when I'm done, the frontend API call is mostly set up already
I've found this works rather well, because it's a list of things in my head that are "todo, in progress" but parallelizable, so I can easily verify what it's doing.
SOTA LLMs are broadly much better at autonomous coding than they were even a few months ago. But also, it really depends on what it is exactly you're working on, and what tech is involved. Things are great if you're writing Python or TypeScript, less so with C++, and even less so with Rust and other emerging technologies.
I am. I've spent some time developing cursor rules where I describe best practices, etc.
The few times I've tried to use an agent for anything slightly complex or on a moderately large code base it just proceeds to smeer poop all over the floor eventually backing itself into a corner.
I shortcut the "cursor tab" and enable or disable it as needed. If only Ai was smart enough to learn when I do and don't want it (like clippy in the ms days) - when you are manually toggling it on/off clear patterns emerge (to me at least) as to when I do and don't want it.
How do you do that? Sorry if it's obvious - I've looked for this functionality before and didn't spot it
Bottom right says "cursor tab" you can manually manipulate it there (and snooze for X minutes - interesting feature). For binding shortcuts - Command/Ctrl + Shift + P, then look for "Enable|Disable|Whatever Cursor Tab" and set shortcuts there.
Old fashioned variable name / function name auto complete is not affected.
I considered a small macropad to enable / disable with a status light - but honestly don't do enough work to justify avoiding work by finding / building / configuring / rebuilding such a solution. If the future is this sort of extreme autocomplete in everything I do on a computer, I would probably go to the effort.
Thanks!
The thing that bugs me is when I'm trying to use tab to indent with spaces, but I get a suggestion instead.
I tried to disable caps lock, then remap tab to caps lock, but no joy
I can't even get simple code generation to work for VHDL. It just gives me garbage that does not compile. I have to assume this is not the case for the majority of people using more popular languages? Is this because the training data for VHDL is far more limited? Are these "AIs" not able to consume the VHDL language spec and give me actual legal syntax at least?! Or is this because I'm being cheap and lazy by only trying free chatGPT and I should be using something else?
It's all of that, to some extent or another. LLMs don't update overnight and as such lag behind innovations in major frameworks, even in web development. No matter what is said about augmenting their capabilities, their performance using techniques like RAG seems to be lacking. They don't work well with new frameworks either.
Any library that breaks backwards compatibility in major version releases will likely befuddle these models. That's why I have seen them pin dependencies to older versions, and more egregiously, default to using the same stack to generate any basic frontend code. This ignores innovations and improvements made in other frameworks.
For example, in TypeScript there is now a new(ish) validation library called arktype. Gemini 2.5 Pro straight up produces garbage code for it. The type generation function accepts an object/value, but Gemini Pro keeps insisting that it consumes a type.
So Gemini defines an optional property as `a?: string`, which is similar to what you see in TypeScript. But this will fail in arktype, because it needs its input as `'a?': 'string'`. Asking Gemini to check again is a waste of time, and you will need enough familiarity with JS/TS to understand the error and move ahead.
Forcing development into an AI friendly paradigm seems to me a regressive move that will curb innovation in return for boosts in junior/1x engineer productivity.
Yep, management dreams of being able to make every programmer a 10x programmer by handing them an LLM, but the 10x programmers are laughing because they know how far off the rails the LLM will go. Debugging skills are the next frontier.
It's fun watching the AI bros try to spin justifications for building (sorry, vibing) new apps using Ruby for no reason other than that the model has so much content going back to 2004 to train on.
They are probably really good at React. And because that ecosystem has been in a constant cycle of reinventing the wheel, they can easily pump out boilerplate code because there is just so much of it to train from.
The amount of training data available certainly is a big factor. If you’re programming in Python or JavaScript, I think the AIs do a lot better. I write in Clojure, so I have the same problem as you do. There is a lot less HDL code publicly available, so it doesn’t surprise me that it would struggle with VHDL. That said, from everything I’ve read, free ChatGPT doesn’t do as well on coding. OpenAI’s paid models are better. I’ve been using Anthropic’s Claude Sonnet 3.7. It’s paid but it’s very cost effective. I’m also playing around with the Gemini Pro preview.
It completely fails to be helpful as a C/C++ assistant. I don't understand the positivity around it, but it must be trained on a lot of web frameworks.
It's very helpful for low level chores. The bane of my existence is frontend, and generating UI elements for testing backend work on the fly rocks. I like the analogy of it being a junior dev; Perhaps even an intern. You should check their work constantly and give them extremely pedantic instructions
Same here. It's extremely distracting to see the random garbage that the autocomplete keeps trying to do.
I said this in another comment but I'll repeat the question: where are these 2x, 10x or even 1.5x increases in output? I don't see more products, more features, less bugs or anything related to that since this "AI revolution".
I keep seeing this being repeated ad nauseam without any real backing of hard evidence.
If this was true and every developer had even a measly 30% increase in productivity, it would be like a team of 10 is now 13. The amount of code being produced would be substantially more and as a result we should see an absolute boom in new... everything.
New startups, new products, new features, bugs fixed and so much more. But I see absolutely nothing but more bullshit startups that use APIs to talk to these models with a few instructions.
Please someone show me how I'm wrong because I'd absolutely love to magically become way more productive.
I am but a small humble minority voice here but perhaps I represent a larger non-HN group:
I am not a professional SWE; I am not fluent in C or Rust or bash (or even Typescript) and I don't use Emacs as my editor or tmux in the terminal;
I am just a nerdy product guy who knows enough to code dangerously. I run my own small business and the software that I've written powers the entire business (and our website).
I have probably gotten a AT LEAST a 500-1000% speedup in my personal software productivity over the past year that I've really leaned into using Claude/Gemini (amazing that GPT isn't on that list anymore, but that's another topic...) I am able to spec out new features and get them live in production in hours vs. days and for bigger stuff, days vs weeks (or even months). It has changed the pace and way in which I'm able to build stuff. I literally wrote an entire image editing workflow to go from RAW camera shot to fully processed product image on our ecommerce store that's cut out actual, real, dozens of hours of time spent previously.
Is the code I'm producing perfect? Absolutely not. Do I have 100% test coverage? Nope. Would it pass muster if I were a software engineer at Google? Probably not.
Is it working, getting to production faster, and helping my business perform better and insanely more efficiently? Absolutely.
I think that tracks with what I see: LLMs enable non-experts to do something really fast.
If I want to, let's say, create some code in a language I never worked on an LLM will definitely make me more "productive" by spewing out code for me way faster than I could write it. Same if I try to quickly learn about a topic I'm not familiar with. Especially if you don't care about the quality, maintainability, etc. too much.
But if I'm already a software developer with 15 years of experience dealing with technology I use every day, it's not going to increase my productivity in any meaningful way.
This is the dissonance I see with AI talk here. If you're not a software developer the things LLMs enable you to do are game-changers. But if you are a good software developer, in its best days it's a smarter autocomplete, a rubber-duck substitute (when you can't talk to a smart person) or a mildly faster google search that can be very inaccurate.
If you go from 0 to 1 that's literally infinitely better but if you go from 100 to 105, it's barely noticeable. Maybe everyone with these absurd productivity gains are all coming from zero or very little knowledge but for someone that's been past that point I can't believe these claims.
Yeah, I use IntelliJ with the chat sidebar. I don't use autocomplete, except in trivial cases where I need to write boilerplate code. Other than that, when I need help, I ask the LLM and then write the code based on its response.
I'm sure it's initially slower than vibe-coding the whole thing, but at least I end up with a maintainable code base, and I know how it works and how to extend it in the future.
This is the way.
+100. I’ve found the “chat” interface most productive as I can scope a problem appropriately.
Cursor, Windsurf, etc tend to feel like code vomit that takes more time to sift through than working through code by myself.
Absolutely hate the agent mode but I find autocomplete with asks to be the best for me. I like to at least know what I'm putting in my codebase and it genuinely makes me faster due to:
1) Stops me overthinking the solution
2) Being able to ask it pros and cons of different solutions
3) Multi-x speedup means less worry about throwing away a solution/code I don't like and rewriting/refactoring
4) Really good at completing certain kinds of "boilerplate-y" code
5) Removed the need to know the specific language implementation rather than the principle (for example pointers, structs, types, mutexes, generics, etc.). My go-to rule now is that I won't use it if I'm not familiar with the principle itself, as opposed to the language's implementation of it
6) Absolute beast when it comes to debugging simple to medium complexity bugs
That is interesting. Which tech are you using?
Are you getting irrelevant suggestions? Those autocompletes are meant to predict the things you are about to type.
Agreed, I've found for JS the suggestions are remarkably good
I'm past the honeymoon stage for LLM autocomplete.
I just noticed CLion moved to a community license, so I re-installed it and set up Copilot integration.
It's really noisy and somehow the same binding (tab complete) for built in autocomplete "collides" with LLM suggestions (with varying latency). It's totally unusable in this state; you'll attempt to populate a single local variable or something and end up with 12 lines of unrelated code.
I've had much better success with VSCode in this area, but the complete suggestions via LLM in either are usually pretty poor; not sure if it's related to the model choice differing for auto complete or what, but it's not very useful and often distracting, although it looks cool.
This is where I landed too. Used Cursor for a while before realizing that it was actually slowing me down because the PR cycle took so much longer, due to all the subtle bugs in generated code.
Went back to VSCode with a tuned down Copilot and use the chat or inline prompt for generating specific bits of code.
What is a "PR cycle"?
open a pull request, reviewer finds a bug and asks for changes, you make changes and re-request a review...
That's what I was afraid of, I'd never think anyone submitting AI-generated code wouldn't first read it themselves before asking others to review it!
I read it myself. But if one person could catch every bug in code then we wouldn't need PR reviews at all would we?
Well yes, but I personally would never submit a PR where I could use the excuse, "sorry, AI wrote those parts, that's why this PR has more bugs than usual".
All that to say, the basis of your argument is still correct: AI really isn't saving all that much time, since everyone has to proof-read it so carefully just to avoid the increase in PR bugs that would come from using it in the first place.
AI autocomplete can be infuriating if, like me, you like to browse the public methods and properties by dotting the type. The AI autocomplete sometimes kicks in and starts writing broken code using members that don't exist, which gets in the way of quickly exploring the methods that actually do.
I have largely disabled it now, which is a shame, because there are also times it feels like magic, and I can see how it could be a massive productivity lever if it had a tighter confidence threshold before kicking in.
If I can, I map it to ctrl-; so I can bring it up when I need it.
But I found once it was optional I hardly ever used it.
I use Deepseek or others as a conversation partner or rubber duck, but I'm perfectly happy writing all my code myself.
Maybe this approach needs a trendy name to counter the "vibe coding" hype.
"rubber botting"
Agreed. You may like the arms-length stuff here: https://github.com/day50-dev/llmehelp . shell-hook.zsh and screen-query have been life-changing
I always forget syntax for things like ssh port forwarding. Now just describe it at the shell:
$ ssh (take my local port 80 and forward it to 8080 on the machine betsy) user@betsy
or maybe:
$ ffmpeg -ss 0:10:00 -i somevideo.mp4 -t 1:00 (speed it up 2x) out.webm
I press ctrl+x x and it replaces the English with a suggested command. It's been a total game changer for git, jq, rsync, ffmpeg, regex...
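To make that concrete (these are just the standard ssh/ffmpeg spellings you'd expect back, not verbatim output from shell-hook.zsh), the two examples above resolve to roughly:

    # forward local port 80 to port 8080 on betsy (binding local port 80 needs root)
    $ ssh -L 80:localhost:8080 user@betsy
    # 2x speed-up: halve the video timestamps and double the audio tempo
    $ ffmpeg -ss 0:10:00 -i somevideo.mp4 -t 1:00 -filter:v "setpts=0.5*PTS" -filter:a "atempo=2.0" out.webm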
For more involved stuff there's screen-query: confusing crashes, strange terminal errors, weird config scripts. It allows a joint investigation, whereas aider and friends just feel like I'm asking the AI to fuck around.
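If you're curious how the ctrl+x x binding can be wired up, here's a minimal zsh sketch in the same spirit. To be clear, this is an illustration, not the actual shell-hook.zsh, and it assumes some `llm`-style CLI that prints a completion for a prompt:

    # Illustrative ZLE widget: replace the current command line with an LLM-suggested command.
    # `llm` stands in for whatever CLI you use to query a model; adjust to taste.
    _llm_rewrite_buffer() {
      local suggestion
      suggestion=$(llm "Rewrite this as one runnable shell command; output only the command: $BUFFER")
      [[ -n "$suggestion" ]] && BUFFER="$suggestion"
      CURSOR=${#BUFFER}   # leave the cursor at the end of the rewritten line
    }
    zle -N _llm_rewrite_buffer
    bindkey '^Xx' _llm_rewrite_buffer   # ctrl+x x, same binding as above

A widget like this only runs when you invoke it, which is what keeps the whole thing non-agentic.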
Does this access any extra data, or does it work only when explicitly asked? I consider the terminal the most privacy-sensitive part of my setup, and I haven't tried any LLM integration there yet…
It is intentionally non-agentic and only runs when invoked.
As for extra data, it sends uname and the process name it captures (such as "nvim" or "ipython"), and that's it.
I also realized this morning that shell-hook is good enough to correct typos. I have that turned on at the shell level (setopt correct), but sometimes it doesn't work, like here:
git cloen blahalalhah
I did a ctrl+x x and it fixed it. I'm using openrouter/google/gemma-3-27b-it:free via chutes. Not a frontier model in the slightest.
I thought Cursor was dumb and useless too when I was just using autocomplete. It's the "agent chat" on the sidebar that is where it really shines.
I was 100% in agreement with you when I tried out Copilot. So annoying and distracting. But Cursor's autocomplete is nothing like that. It's much less intrusive and mostly limits itself to suggesting changes like the ones you've already made. It's a game changer for repetitive refactors where you need to make 50 nearly identical but slightly different changes.
I had turned autocomplete off as well. Way too many times it was just plain wrong and distracting. I'd like it turned on for method documentation only, though, where it worked well once the method was complete, but so far I haven't been able to customize it that way.
I'd be very surprised if the LLM correctly identifies the "why" that method documentation should capture.
Having it on Tab was a mistake. Tab-complete for snippets is fine because it happens at the end of a line, but tab-complete in empty space means you always have to be aware of whether you're in an autocomplete context before setting an indent.
We have an internal ban on Copilot for IP reasons, and while I was... missing it initially, just using neovim without any AI feels fine now. Maybe I'll add avante.nvim for a built-in chat box, though.
The chat in what tool? Not Cursor nor Windsurf, it sounds like?
You could also use these AI coding features on a plug-and-play basis with an IDE extension.
For example, VS Code has Cline & Kilo Code (disclaimer: I help maintain Kilo).
JetBrains has Junie, Zencoder, etc.
It sometimes works really well, but I have at times been hampered by its autocomplete.
Your comment is about two years late. Autocomplete is not the focus of AI IDEs anymore, even though it has gotten really good with "next edit prediction". These days people use AI for the agentic mode.
Absolutely
Yeah
AI autocomplete is a feature, not a product (to paraphrase SJ)
I can understand Windsurf getting the valuation as they had their own Codeium model
$B for a VSCode fork? Lol
Microsoft always seems to come out the winner - maybe they predicted all this, and that's why they made the core extensions closed source.
What folks don't understand, or maybe don't keep in mind, is that for that autocomplete to work, all your code goes up to a third party as you write it or open files. This is one of the reasons I disable it: I want to control what I send via the chat side panel by explicitly giving it context. It's also pretty useless most of the time, generating nonsense, and not even consistently at that.
Honestly, the only files I like this turned on for are unit tests.
Asking HN this is like asking which smartphone to use. You'll get suggestions for obscure Linux-based modular phones that weigh 6 kilos and lack a clock app or wifi. But they're better because they're open source or fully configurable or whatever. Or a smartphone that a fellow HNer created in his basement and plans to sell soon.
Cursor and Windsurf are both good, but do what most people do and use Cursor for a month to start with.
It's frightening how well you called this, if you scroll down the page literally exactly the dynamic that you mentioned is playing out in real time.
I use Cursor and I like it a lot.
haha, so on point! In the HN world, backends are written in Rust with formal proofs and frontends are in pure JS and maybe Web Components. In the real world, however, a lot of people are using different tech.
Except for the crowd of extreme purists on HN, where the backend is written in their divine C language by programmers blessed with an inability to ever let bugs reach production. And where the frontend is pure HTML, because JavaScript is the language the devil speaks.
"The clock app isn't missing! You just have to cross-compile it from source and flash a custom firmware that allows loading it!"
Surely, you're not the only one here that doesn't share the open source extremist views. HN has a diverse user base.
"It can't make calls yet because we're waiting on a module that doesn't taint the kernel"