For GLM Coding Plan subscribers, quota consumed via Coding Plan for GLM-5.2 in ZCode is discounted by the coefficients below — the same usage draws down less quota, roughly 1.5x the effective allowance.
Peak hours (14:00–18:00 daily) 3x -> 2x
Off-peak (remaining 20 hours) 1x -> 0.67x
I wonder whether that is referring to local time, or CST (UTC+8)?
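For what it's worth, the "roughly 1.5x" figure follows from the quoted coefficients. Here is a rough sketch of the arithmetic, assuming the peak window is defined in CST (UTC+8), which the announcement does not confirm. The function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Coefficients quoted above: (regular, promotional) multipliers applied to usage.
PEAK = (3.0, 2.0)
OFF_PEAK = (1.0, 0.67)

# Assumption: the peak window is defined in China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))


def is_peak(moment: datetime) -> bool:
    """True if the moment falls in 14:00-18:00 in the assumed time zone."""
    return 14 <= moment.astimezone(CST).hour < 18


def allowance_gain(regular: float, promo: float) -> float:
    """How much further the same quota stretches under the promo coefficient."""
    return regular / promo


if __name__ == "__main__":
    print(f"peak gain: {allowance_gain(*PEAK):.2f}x")
    print(f"off-peak gain: {allowance_gain(*OFF_PEAK):.2f}x")
    sample = datetime(2025, 1, 1, 7, 30, tzinfo=timezone.utc)  # 15:30 in UTC+8
    print("peak?", is_peak(sample))
```

Both ratios land near 1.5 (3/2 and 1/0.67), so the headline number holds either way; only the window boundaries depend on the time zone.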
I'm somewhat surprised that this is not open source (from what I can tell). Compare to Mimo Code https://github.com/XiaomiMiMo/MiMo-Code (which is a CLI, while this is a desktop app).
They might be sending some user requests to Anthropic to gather training data for their own models. If they do so, perhaps they need to add some tracer to the requests that they'd prefer to hide.
Source? Or is it "trust me bro"?
I don't even know what I would do with a desktop app. I'm running these things in headless VMs, so I can run them with `--dangerously-skip-permissions` or whatever. I don't trust them, even without that flag, on my desktop/laptop.
Given that there's such severe concern being expressed by Anthropic about Claude being distilled, and the idea that the harness is part of the moat, it doesn't seem super surprising that the other side of that would also try to make it harder for them to tell how well they're doing and what their approach is.
It's only a cli because they yanked out the opencode desktop code. (As well as the opencode go/zen model provider)
Edit: my theory is they wanted to mimic being the primary provider in a quick way with a lot of string replace. Though they could have added opencode back as a regular provider.
Z.ai documents integrations with nearly all the popular CLI-based agents: https://docs.z.ai/devpack/tool/others
If you're already used to your TUI coding agent, you don't need the desktop agent. Although it is nice that it is there for folks who prefer the Codex App/Claude App UI approach.
Yeah, I use GLM 5.2 in OpenCode, running in a Docker container with CodeNomad as the web-based GUI. It works perfectly; I can access it from anywhere, and it runs all models (except for Anthropic's subscriptions).
From your experience, is it comparable to Claude Code with Opus 4.8? How does it feel? How do the two differ?
It's comparable, but not the same.
For some tasks, it's better. Opus refuses tasks for me pretty regularly. GLM 5.2 has never refused a task. So for anything security-related or that touches on topics that trigger Opus's safety guardrails, I use GLM 5.2.
OTOH, for anything related to UI design, I use Opus 4.8. It's much better at taking relatively vague descriptions of user interfaces and a mockup of a related UI and combining them into an immaculate design.
For anything else, I tend to run tasks in Opus and then have GLM review them and write a Markdown file with anything it finds. Then I have Opus review the Markdown file and fix the issues it agrees with. The reason I usually go with Opus 4.8 first is mainly that it's faster. Opus 4.8 is, on average, about twice as fast as GLM 5.2 running on Z.ai's infrastructure for the same task. There's a large variance (sometimes GLM 5.2 is pretty fast and Opus 4.8 is pretty slow), but on average it's a very noticeable difference.
When I run into Anthropic's Quota, I switch to GLM 5.2 rather than Sonnet. I don't think there's much reason to ever use Sonnet for anything if you can use GLM 5.2 instead.
This is all pretty subjective, of course. On average, I think Opus 4.8 is still a better, more reliable, and faster model, but if it went away tomorrow and I only had GLM 5.2, I wouldn't be too sad about it; I'd get things done with GLM 5.2 just fine.
Are you micromanaging your GLM costs? It seems the best bang-for-buck strategy right now is an OpenCode Go subscription to get the subsidized rate, then switching to OpenRouter's model above and beyond that, plus a dual-model strategy: GLM 5.2 for planning and DeepSeek V4 Flash for implementation.
No. I got the yearly highest-end GLM subscription when it was available for a few hundred bucks. I haven't run into quota limits even once.
Thank you, this is the type of hands-on experience report i was looking for.
Also, kudos to the Z.ai team for adding Linux support from day one.
Looks quite pretty! Not sure if I want to try that instead of OpenCode, maybe. OpenCode also has a desktop app. I will admit that I like their TUI better (and honestly more than the Claude Code TUI), but while the desktop version is kinda more basic, it's nice enough: https://opencode.ai/download
That said, it's interesting that they're releasing a bunch of stuff: ZCode, OCR.z.ai, Image.z.ai, Audio.z.ai, AutoClaw and some other stuff that https://chat.z.ai/ links to. That's a lot of stuff for one org to pull off.
Figured I'd try out their Pro coding plan. It doesn't seem to give me that much more quota than Opus (at least given how many tokens are needed to accomplish a given task), but GLM 5.2 in and of itself seems like a beefier Sonnet model, pretty good.
Their TUI is quite heavy and crashes quite often compared to Claude Code.
Which are you talking about? OpenCode or ZCode?
OpenCode
The site is in Chinese (?) and there is no obvious way to switch to English on mobile?
I don't know about on mobile, but on desktop there is an EN / CN button on top.
It gets even wilder when you click on "Join the Linux Beta Group". That leads you to https://www.feishu.cn/download eventually. I have no clue what feishu.cn is and I don't see a language toggle. Sometimes it just seems like chinese companies simply don't want international business.
Dang, can you change the submission url to https://zcode.z.ai/en ?
https://zcode.z.ai/en
There's an `EN` link at top right
Only if you have a wide enough screen. I had to rotate my phone to landscape. Thanks for the pointer!
Does anyone use an agnostic TUI or harness for development tasks that can fairly seamlessly switch between providers?
I'm wanting local context in the spirit of "here are 3 AI providers available, for coding tasks use this one... and for writing prose use this one... and for generating images use this one..." etc.
I’ve written a skill for codex and Claude code that designates an orchestrator on the primary worktree and “workers” on N supporting worktrees.
The supporting worktrees are labeled wb1, wb2…wbn. You run either Claude or Codex in a tab for each worktree.
Then you generally work with the orchestrator on the primary worktree. It delegates tasks to the different workers and answers their smaller questions, surfacing results and assisting them with context clearing when needed.
The orchestrator and workers communicate using a simple shared file system under tmp/* and together they can handle a big and varied workload.
The orchestrator knows which AI client is running in any given worktree, so it would be fairly easy to designate which AI should receive what kind of tasks.
I do have some AI TUI specific instructions, for instance codex is primitive at monitoring compared to CC.
I use iterm2, so I’ve also added iterm2 specific python that allows the orchestrator to “kick” a worker by modifying the input and submitting it.
This also allows the orchestrator to monitor and reset its own context when necessary.
All context resets are handled gracefully, and the continuation prompt and comms history allow workers and orchestrators to restore and continue their work without needing to compact.
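To make the shared-file idea concrete, here is a minimal sketch of how an orchestrator might route tasks to per-worktree inboxes. This is not the actual skill: the directory layout, the `ROUTES` and `WORKERS` tables, and every function name are hypothetical, and only the wb1/wb2 labels come from the description above.

```python
import json
import tempfile
import time
from pathlib import Path

# Hypothetical routing table: which client handles which kind of task.
ROUTES = {"monitoring": "claude", "refactor": "codex"}
# Hypothetical record of which client runs in which worktree.
WORKERS = {"wb1": "claude", "wb2": "codex"}


def pick_worker(kind: str) -> str:
    """Return the first worktree whose client matches the route for this task kind."""
    client = ROUTES[kind]
    for name, running in WORKERS.items():
        if running == client:
            return name
    raise LookupError(f"no worker runs {client}")


def send_task(root: Path, kind: str, prompt: str) -> Path:
    """Write a task file into the chosen worker's inbox and return its path."""
    inbox = root / pick_worker(kind) / "inbox"
    inbox.mkdir(parents=True, exist_ok=True)
    path = inbox / f"{time.time_ns()}.json"
    partial = path.with_suffix(".tmp")
    partial.write_text(json.dumps({"kind": kind, "prompt": prompt}))
    partial.replace(path)  # rename last so a reader never sees a half-written file
    return path


def read_inbox(root: Path, worker: str) -> list[dict]:
    """Read and delete pending tasks for a worker, oldest first."""
    tasks = []
    for path in sorted((root / worker / "inbox").glob("*.json")):
        tasks.append(json.loads(path.read_text()))
        path.unlink()
    return tasks


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())  # stand-in for the shared tmp/ directory
    send_task(root, "refactor", "rename foo to bar")
    print(read_inbox(root, "wb2"))
```

Writing to a temporary name and renaming keeps readers from picking up partial files, which matters when several agents poll the same directory.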
https://opencode.ai/
OpenCode was the first agent harness I used, and I have always liked it. You can configure a wide variety of providers, and it's open source and has a number of core contributors.
The other opinionated option is Pi (the Pi agent harness). This is a great lightweight option and also supports a number of providers. You can also use local model servers.
I’ve been using Crush with OpenRouter and have had good success lately
https://github.com/charmbracelet/crush
I use the one that I've been developing since 2023. It's intended to be used in exactly this spirit! Written in Go, has image support (which has yet to be fleshed out).
It supports MCP (unlike Pi), sandboxing (with user-mode networking), and runs efficiently at huge contexts.
https://codeberg.org/mlow/lmcli
(The screenshot in the folder is a little bit out of date, but is still representative of the overall look)
have used both pi and opencode for the last 6 months, haven't opened a proprietary harness (cc, codex, cursor) in that same amount of time. right now i'm on pi and i can switch seamlessly between any model across any provider i want, even mid session. can even point them at locally running models.
i think people don't realize how much better life is over on this side, cc and codex rely entirely on vendor lock in imo.
You can use Claude Code with a self hosted model no problem. I don't believe you can switch during a session though.
Haha I pretty much commented the same thing one minute apart.
why did you switch from oc to pi?
i like the more minimal design of the tui, feels more integrated with my existing terminal workflows. oc always looked a little out of place. i really like pi's extension ecosystem as well.
I don't find a closed-source Chinese agent system trustworthy.
It is essentially a black box with full user permissions, meaning you are just handing over your entire system to a Chinese-owned server. With OpenCode and its GLM provider, at least I can monitor which files were read, which were edited, and what commands were executed.
Not to mention that Chinese national security laws legally obligate companies to cooperate with state intelligence and counter-espionage efforts [0]. If you have this installed on a corporate workstation, and your company is large enough, the possibility of them spying on you is not just a risk—it's almost a certainty.
[0]: https://en.wikipedia.org/wiki/National_Intelligence_Law_of_t...
I agree. I don't find the US competitors trustworthy either. I think open source is the way here.
If you are not US based that’s not really a big concern.
I think it’s a real concern. Chinese companies are much more closely tied to the state, as in if you decide to go to China one day they might already have all the data on how you have interacted with their models.
The US is certainly inching in that direction, but it’s not like someone from the US government sits at Anthropic’s HQ reading the chats of people of interest to the state.
if you're going to try this one out, don't be surprised to get this message repeatedly, like 4 out of 5 prompts you're trying to send, 24/7, this is gonna be your new friend, then you'll learn to write the only prompt that matters: "retry", "retry", "retry"
Here's the message: "Cannot connect to API: write EPIPE"
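If you are scripting against a flaky endpoint rather than typing "retry" by hand, the usual workaround is a retry loop with exponential backoff. This is a generic sketch, not anything ZCode exposes: the `retry` helper and its parameters are hypothetical, and it relies on Python surfacing EPIPE as `BrokenPipeError`, a subclass of `ConnectionError`.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(call: Callable[[], T], attempts: int = 5, base_delay: float = 0.5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on connection errors with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:  # BrokenPipeError (EPIPE) is a subclass
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the delay can be stubbed out in tests; in normal use the default `time.sleep` applies.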
The English language version is:
https://zcode.z.ai/en/docs/welcome
It's sad to see that the teams that have the most resources that can contribute to development of next-gen harnesses are essentially copying the same exact thing from each other, with no meaningful changes.
And most of the advancement and experimentation happens in some random 0-star github repos.
Could you share some of these 0-star github repos?
They're the ones with the most to prove
UI-wise this looks a lot closer to Codex than Claude Code. It's basically an exact copy of Codex.
I would very much agree. Even the hand icon, the usage in the text field, and the sidebar style are 1:1 identical to Codex. It's a misleading title - it's not close to Claude Code.
As someone who doesn't use these tools, why does every AI company need their own version of Claude Code? Is there more to it than vendor lock-in?
"Quality" of the harness matters a lot to the user experience, and the construction of the harness will depend on the behavior/quirks of the underlying model. So, if you're using Claude Code, you can expect it to work best with Anthropic models, and expect other model-makers to want you to use the harness they've developed.
But mostly vendor lock-in, I imagine.
implementing their own version of steganographic monitoring lol
A joke but also not a joke.
sweet! i'm heavily using glm 5.2 in mouse.dev which is great for mobile. the ui looks really good, similar to cursor agents window etc.
Is it possible to use their subscription pricing with Opencode?
Is this GUI only?
Yes.
it's an electron app, it highlights wrong spelling but doesn't suggest corrections. how does someone exhibit so much incompetence?
Welcome to using v1.0.0 of any product
This comes with a few free credits (after login).
how is this cheaper?
I tried it but went back to OC, which feels smarter.
It does have a 1.5x usage promotion for GLM 5.2 on the coding plan so now is a good time to test it...
GLM-5.2 seems capable. It’s just much slower than Opus.
Telemetry enabled?