This was really cool. I would love to play some Dave Tipper on this to see what it looks like.
I'm building some music playback software and am currently struggling with the implementation of a spectrum analyzer to visualize the music.
This is incredible stuff and I learned a lot. Well done sir.
PS: also mourning the loss of Fable! It sorted out a 3-month bug-hunt odyssey in 3 days, for a somewhat novel problem in a pretty niche area (DSD DoP audio crackle during certain playback edge cases).
Fable was quite relentless, it was fun watching it work. I described my Lisp interpreter project's short-term plans and long-term roadmap, Fable thought for like 20 minutes then just told me it was all "inevitable" and started working on the stuff. Ever since then I've pictured Fable as some kind of Terminator.
Left me that code and a massive code review that unfortunately didn't contain any of the I/O and memory safety hardening I wanted. I haven't fully reviewed the code yet. I get a little sad when I read it. Not a US citizen so I'm not sure I'll ever get to use a state of the art model again.
> thought for like 20 minutes then just told me it was all "inevitable"
I have in mind an image of ASI as something that's able to seamlessly work across time as if it was weaving cloth. Reasoning about not just first or second order effects, but able to richly play with the nature of causality itself. In the limit, it effects change far into the distant future simply by making only the most minute change in the present then sitting back and waiting for things to play out.
For an AI that can do this, things like "managing subagents" or "context compaction" become child's play. Perhaps we'll know if we're getting close by seeing how well models do at prediction markets.
> As we all know, the foundation of Western diatonic music theory is ¹²√2, the ratio between the frequencies of successive semitones.
Nods knowingly. Yes, of course. I definitely know this.
In a nutshell:
An octave (for example from a C to the next C) is a doubling in frequency. In Western music as commonly tuned (12-tone equal temperament), there are 12 notes per octave: C, C#, D, D#, E, F, F#, G, G#, A, A#, B. Notes are "evenly spaced" within the octave: every note has the same ratio between its frequency and the frequency of the next note. Twelve of those equal steps have to multiply out to 2, hence that ratio is ¹²√2.
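For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes A4 = 440 Hz as the reference pitch (a common convention, not something stated in the article), and the name `note_frequency` is just made up for illustration.

```python
# Assumption: concert pitch A4 = 440 Hz (common convention, not from the article).
A4_HZ = 440.0

# The semitone ratio: twelve equal multiplicative steps that compound to 2.
SEMITONE = 2 ** (1 / 12)


def note_frequency(semitones_from_a4: int) -> float:
    """Frequency of the note a given number of semitones above (or below) A4."""
    return A4_HZ * SEMITONE ** semitones_from_a4


if __name__ == "__main__":
    print(f"semitone ratio: {SEMITONE:.6f}")
    # Walk one octave up from A4, one semitone at a time.
    for step in range(13):
        print(f"{step:2d} semitones above A4: {note_frequency(step):8.2f} Hz")
```

Twelve steps up lands on double the starting frequency, and nine steps down from A4 gives middle C at roughly 261.63 Hz.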
This is amazing.
One of the weaknesses of the video is that the narration shows artifacts of having passed through a text layer. "Bass" is pronounced as the fish at one point. "Wound" is pronounced as the injury. It's clear that these are homographs of the words the script actually intended.
That was my experience with Fable as well. It pulled off my extremely complex project, one that I could squint and see was possible, and actually put mathematical concreteness to things I could only intuit.
On the flip side, visualizers have always fascinated me. I love this one, but one build off I've always wanted to see: analyze the entire file a priori, and then generate the visuals. Sort of like a normalization pass, but getting longer form structures decoded ahead of time could be pretty neat.
I was not expecting the part where Fable produces a passable 3Blue1Brown-style explainer video of the algorithms it just implemented that sounds like it's narrated by a character from Dora the Explorer.
What a strange era we now live in.
Relevant YouTube video about content farming channels creating AI generated math explainers.
https://youtu.be/mRO_QonhC2c
You mean what a strange era an opaque set of administration-approved companies live in...
> steady stream of promotions until they cap out at L5
Am I missing a joke? L5 is just a single promotion away from hiring-out-of-college, at least for the FAANG that I was at.
Only Amazon starts new grads at L4; the rest start them at L3, so L5 would be senior.
Not that 2 promotions is a "steady stream"...
>The writing is also literary. It draws an analogy between the 12 musical pitch classes and the 12 markings on a clock. Noise lingers. Material surges off the rim.
I absolutely hate this revolting writing style by LLMs
Kinda interesting how it's just like an FFT chart in a circle, but perhaps the author is not aware that is the case. Would be curious to know which things were "implementation details" for the fancy AI and which weren't.
I could be wrong, but MilkDrop would already do light FFT analysis for effects, right?
Pretty sure the author is aware. I think the interesting part is that the frequency is logarithmic and one rotation = 2x. This means you can make musical observations about chord qualities from the plot. That's not generally true for FFT plots.
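To make the "one rotation = 2x" point concrete, here is a rough Python sketch of that kind of mapping. It is not the author's code: the function name `pitch_angle` and the choice of middle C as the zero angle are assumptions for illustration.

```python
import math

# Assumption: put C at angle 0, using middle C (about 261.63 Hz at A4 = 440 Hz).
C4_HZ = 440.0 * 2 ** (-9 / 12)


def pitch_angle(freq_hz: float, ref_hz: float = C4_HZ) -> float:
    """Angle in radians on a log-frequency circle where one full turn is one octave."""
    octaves = math.log2(freq_hz / ref_hz)     # distance from the reference, in octaves
    return 2 * math.pi * (octaves % 1.0)      # keep only the fractional turn


if __name__ == "__main__":
    for f in (220.0, 440.0, 880.0):
        # All three are an A, so they land at the same angle.
        print(f"{f:6.1f} Hz -> {math.degrees(pitch_angle(f)):6.1f} degrees")
```

Every doubling of frequency comes back to the same angle, and each semitone is a fixed 30 degree step, which is what lets you read intervals off the plot. A linear-frequency FFT plot has neither property.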
You are right, it is a cool idea in general. But, idk if we are seeing the same thing, in practice it ends up being kinda mushy looking, right? In part, I think, because it's not anchored to a given root note, so at best we end up seeing constantly rotating, slightly different clock arrangements. Even in ideal conditions, anything like, e.g., Cmaj7 to Em is going to look almost the same, which feels off given the perceived harmonic change. I guess I don't know whether the arrangements being the same after transposition is a feature or a bug.
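The Cmaj7 vs Em overlap and the transposition point can both be shown with pitch classes on a 12-position clock. This is only a toy sketch of idealized chords (no overtones, no noise), and the helper names are made up for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_classes(notes):
    """Map note names to pitch-class numbers 0..11 (C = 0)."""
    return frozenset(NOTE_NAMES.index(n) for n in notes)


def transpose(pcs, semitones):
    """Shift every pitch class by the same number of semitones, wrapping at 12."""
    return frozenset((pc + semitones) % 12 for pc in pcs)


def clock_angles(pcs):
    """Clock positions in degrees, 30 degrees per semitone."""
    return sorted(pc * 30 for pc in pcs)


cmaj7 = pitch_classes(["C", "E", "G", "B"])
em = pitch_classes(["E", "G", "B"])

if __name__ == "__main__":
    print("Cmaj7:", clock_angles(cmaj7))
    print("Em:   ", clock_angles(em))
    print("Em is a subset of Cmaj7:", em <= cmaj7)
    print("Cmaj7 up a whole step:", clock_angles(transpose(cmaj7, 2)))
```

Em is exactly Cmaj7 minus the C, so only one spoke changes between them. Transposing a chord just rotates the same shape around the clock, which is the feature-or-bug question.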
I'm also curious about the implementation details. The result is visually beautiful, but the code could be interesting too, at least as a 'Fable historical artifact'. Is it visible on GitHub?