When people say "this is as bad as it's gonna be" and "it only gets better from here" about AI, I have Spotify to show them: no, regression is entirely possible! It used to work better, but now it's just confusing.
But hey, they raised rates numerous times to pay for all those nonexistent new features. I cancelled my membership a week or two ago after yet another rate hike.
The really interesting part is what this system runs on.
Spotify built Fleet Management back in 2022 to apply changes across thousands of repos. Half their PRs were already flowing through it before any AI was involved, but only for mechanical stuff like dep bumps and config updates. Claude Code is what let it understand what the code means, not just its structure.
The thing nobody's talking about is that none of this AI automation works without Backstage doing the boring work of tracking who owns what, how things build, and what depends on what.
> a framework for applying code changes across hundreds or thousands of repositories at once
Statements like this raise fair questions. Is there really code duplication across thousands of repos, and why respond to it by increasing the surface area further with bespoke tooling?
Imagine you initialized 10,000 NPM repos identically, all at once. Then had 100 different teams each take 100 of those repos for 10 different projects, and let each repo run for 1,000 commits. How distinct would each of those repos be? How might they have evolved independently? What interesting patterns might each team have adopted to improve the development experience or detect bugs? Which packages, at which versions, might be most popular?
Now imagine you had the tools to do a diff across all those repos simultaneously, and classify, group, and review those patterns. What could you learn about NPM teams and practices?
Now imagine you could pick best of breed, and propagate those back to all the other projects automatically to improve their productivity, security, etc. How fast would your productivity improve and your engineering culture change if everyone could automatically learn the best of what everyone else had to offer?
Companies like Spotify have sophisticated tooling for detecting repo changes and enforcing policy like that, and they run that experiment 1,000 times a day. Small evolutions in what was an identical build script, like a version bump, are detected, and if it passes a threshold it can be rolled out everywhere else immediately.
Having all the copies that you can sync up centrally periodically puts natural selection to work on internal best practices.
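The detect-classify-promote loop described above can be sketched in a few lines. This is a toy illustration, not Spotify's actual Fleet Management implementation: the `fleet` snapshot, the threshold value, and the function names are all invented for the example.

```python
from collections import Counter

# Toy fleet snapshot: repo name -> contents of its build script.
# In reality this would come from scanning thousands of real repos.
fleet = {
    "repo-a": "node 20 && npm ci && npm test",
    "repo-b": "node 20 && npm ci && npm test",
    "repo-c": "node 18 && npm ci && npm test",
}

ADOPTION_THRESHOLD = 0.5  # promote a variant once half the fleet uses it

def classify(fleet):
    """Group repos by the exact variant of their build script,
    most popular variant first."""
    return Counter(fleet.values()).most_common()

def pick_rollout(fleet):
    """If the leading variant has crossed the adoption threshold,
    return the repos that should receive an automated PR moving
    them onto it; otherwise change nothing."""
    (top_variant, count), *_ = classify(fleet)
    if count / len(fleet) >= ADOPTION_THRESHOLD:
        return [name for name, script in fleet.items() if script != top_variant]
    return []
```

Here `pick_rollout(fleet)` flags `repo-c`, the one repo still on the old variant. A real system would diff structured data (lockfiles, CI configs) rather than raw strings, and gate the rollout on tests, but the selection loop is the same shape.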
Basically, things work differently at scale. When the number of developers you employ approaches a meaningful percentage of the total number of developers globally, your internal diversity starts to mirror the global diversity. So you have to manage that diversity. If you freeze policy entirely, you fall behind the global average. If you let things run wild, your company fractures technologically.
So, make 1,000 copies, see what pops up, adopt and enforce things that look good, then do it again. Evolve to the next best place you can be from where you are.
> The thing nobody's talking about is that none of this AI automation works without Backstage doing the boring work of tracking who owns what, how things build, and what depends on what.
Do you mean that Backstage has the metadata like what services call which other services, and AI needs that to make changes safely? Sounds helpful to both AI and human developers ;-)
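Roughly, yes: Backstage components declare an owner and their dependencies in `catalog-info.yaml` (`spec.owner`, `spec.dependsOn`), and an automated change can use that graph to update things in a safe order. A minimal sketch, assuming a hypothetical flattened catalog; the component and team names here are made up:

```python
from graphlib import TopologicalSorter

# Hypothetical slice of a Backstage-style catalog: each component lists
# its owning team and the components it depends on (cf. catalog-info.yaml's
# spec.owner / spec.dependsOn fields).
catalog = {
    "playlist-api": {"owner": "team-a", "depends_on": ["auth-lib"]},
    "search-api":   {"owner": "team-b", "depends_on": ["auth-lib"]},
    "auth-lib":     {"owner": "team-c", "depends_on": []},
}

def rollout_order(catalog):
    """Order components so each is updated only after everything
    it depends on has been updated."""
    graph = {name: set(meta["depends_on"]) for name, meta in catalog.items()}
    return list(TopologicalSorter(graph).static_order())

def owner_for(catalog, component):
    """Which team should review the automated PR for this component."""
    return catalog[component]["owner"]
```

With metadata like this, an agent can update `auth-lib` first, wait for its checks to pass, then move on to the services that depend on it, routing each PR to the owning team for review.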
I wonder how it works and how much heavy lifting "supervising" is doing. Whenever I try to use AI, the outcome is about the same.
It's good at non-critical things like logging or brute force debugging where I can roll back after I figure out what's going on. If it's something I know well, I can coax a reasonable solution out of it. If it's something I don't know, it's easy to get it hallucinating.
It really goes off the rails once the context gets some incorrect information and, for things that I don't understand thoroughly, I always find myself poisoning the context by asking questions about how things work. Tools like the /ask mode in Aider help and I suspect it's a matter of learning how to use the tooling, so I keep trying.
I'd like to know if AI is writing code their best developers couldn't write on their own or if it's only writing code they could write on their own because that has a huge impact on efficiency gains, right? If it can accelerate my work, that's great, but there's still a limit to the throughput which isn't what the AI companies are selling.
I do believe there are gains in efficiency, especially if we can have huge contexts the AI can recall and explain to us, but I'm extremely skeptical of who's going to own that context and how badly they're going to exploit it. There are significant risks.
If someone can do the work of 10 people with access to the lifetime context of everyone that's worked on a project / system, what happens if that context / AI memory gets taken away? In my opinion, there needs to be a significant conversation about context ownership before blindly adopting all these AI systems.
In the context of Spotify in this article, who owns the productivity increase? Is it Spotify, Anthropic, or the developers? Who has the most leverage to capture the gains from increasing productivity?
There's no definitive way to tell if some code is written by AI or not; thus their statement doesn't have to be true. Usage of AI itself is nebulous, it could mean anything from OpenClaw-style automated agents to someone prompting an LLM via chat to an engineer manually writing code after wasting hours trying to get an agent to do it (that still counts as "usage", even if not ultimately productive).
The market currently rewards claims of usage of AI, so everyone is claiming to be using AI. There is no way to prove one way or another, and the narrative will be changed down the line once the market swings.
When it comes to productivity claims, I have yet to see it. If AI is truly providing significant, sustained productivity improvements across the software development lifecycle I'd expect products to be better, cheaper, or get developed faster (all of which happened with other industrial breakthroughs). I do not see that in software at large and especially not in Spotify's case.
The macOS app is literally dead. Before the AI invasion there were frequent updates and features, but now it feels like a corpse - cold and decaying in time.
Spotify hasn't released a new useful feature or improved any existing one since ~2018, IIRC.
They did however, remove many useful ones.
It also took them a solid month a while back to fix sorting podcast episodes by date, which essentially broke the experience.
It is unusable: you cannot go to your music library without being exposed to advertisements.
you know you can pay them money, right?
how does one get used to the taste of leather, rubber, dirt and gravel? is there much nutrition there?
I went through the Anthropic case study, Spotify's engineering blog series, and the full earnings call transcript here https://www.everydev.ai/p/blog-spotify-built-an-ai-coding-ag...
"They actually only generate code and supervise it." I wonder what "supervise" means...
[dead]
Yeah well, it's pretty obvious Spotify isn't doing any development any more, only maintenance.
They’re probably spending all their time on Teams calls like everyone else.
Interesting, I will have to test that. Does Spotify have any security bug bounties? :D
Anna's Archive recently tested that successfully.
Layoffs incoming?