The best presentation I've seen about CPU performance related to pipelining, branch prediction, and speculative execution was Chandler Carruth's "Going Nowhere Faster" presentation at CppCon 2017 [0]. I do recommend watching the whole presentation, but if you watch nothing else then just watch the 5 minutes or so from the linked timestamp.
[0]: https://youtu.be/2EWejmkKlxs?t=2511
It also contains a wonderfully prescient question asked right at the end of the talk: "... the processor gonna speculate, doing some loads out of the bounds of the array, how does it work in the hardware that it doesn't crash?"
It was left unanswered at the time. I believe Spectre had already been discovered but not yet publicly disclosed at that point.
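For anyone wondering about the "how does it not crash" part, here is a minimal toy model in Python, not a description of any particular CPU. It assumes the usual out-of-order approach: a fault is recorded on the in-flight instruction and only raised if that instruction actually retires. The names `Load`, `execute`, and `retire` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Load:
    """One in-flight load; `index` is the array index it wants to read."""
    index: int
    speculative: bool              # issued under an unresolved branch prediction
    value: Optional[int] = None
    fault: bool = False            # recorded at execute time, not raised yet


def execute(load: Load, array: Sequence[int]) -> None:
    """Execute stage: an out-of-bounds access only sets a flag."""
    if 0 <= load.index < len(array):
        load.value = array[load.index]
    else:
        load.fault = True


def retire(load: Load, prediction_correct: bool) -> str:
    """Retire stage: squash wrong-path work, raise faults only for real work."""
    if load.speculative and not prediction_correct:
        return "squashed"          # the fault flag is discarded with the entry
    if load.fault:
        raise IndexError(f"fault at retirement: index {load.index}")
    return "committed"


array = [10, 20, 30]
wrong_path = Load(index=7, speculative=True)   # past the end of the array
execute(wrong_path, array)
outcome = retire(wrong_path, prediction_correct=False)
print(outcome)
```

The mispredicted load is thrown away without ever faulting. What this toy model leaves out is exactly what made Spectre possible: a squashed load can still leave traces in the cache.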
CPUs haven't worked like that in anything but a microcontroller for half a century
Correct (well, maybe not half a century, maybe 30 years or so). I was just about to reply that I'd love a version of this that shows instructions going in and out of a re-order buffer. That would be enlightening.
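As a rough idea of what such a view would be showing, here is a toy reorder buffer in Python. It is only a sketch under heavy assumptions: one instruction dispatched per cycle, made-up latencies, no data dependencies, and hypothetical names throughout (`simulate_rob`, `capacity`).

```python
from collections import deque


def simulate_rob(program, capacity=4):
    """Run (name, latency) pairs through a toy reorder buffer.

    Returns (completion_order, retirement_order).
    """
    pending = deque(program)
    rob = deque()            # entries: [name, cycles_remaining], oldest at the left
    completed, retired = [], []

    while pending or rob:
        # Dispatch: one instruction per cycle enters the ROB in program order.
        if pending and len(rob) < capacity:
            name, latency = pending.popleft()
            rob.append([name, latency])

        # Execute: every in-flight entry makes progress independently.
        for entry in rob:
            if entry[1] > 0:
                entry[1] -= 1
                if entry[1] == 0:
                    completed.append(entry[0])

        # Retire: only the oldest entry may leave, and only once it has finished.
        while rob and rob[0][1] == 0:
            retired.append(rob.popleft()[0])

    return completed, retired


program = [("load", 5), ("add", 1), ("mul", 2), ("sub", 1)]
completed, retired = simulate_rob(program)
print("completed:", completed)
print("retired:  ", retired)
```

With these latencies the add, mul, and sub all finish before the slow load, yet everything still retires in program order because the load sits at the head of the buffer.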
Well, how about the Berkeley Out-of-Order Machine [0] (BOOM)? It's a superscalar, out-of-order RISC-V design (one of the very first ones, in fact), and the documentation is fairly detailed. Read [0] and [1] for the general introduction, and then move down to the "Core Overview" section in the left navbar: "Instruction Fetch", "Branch Prediction", etc.
Also, here [2] is another, much more detailed explanation of an O-o-O implementation of a very simplistic RISC ISA which nonetheless has most of the relevant RISC-V features. There are also some other related texts on this subsite [3], including single-cycle and pipelined implementations, for comparison.
[0] https://docs.boom-core.org/en/latest/sections/intro-overview...
[1] https://docs.boom-core.org/en/latest/sections/intro-overview...
[2] https://user.eng.umd.edu/~blj/risc/RiSC-oo.1.pdf
[3] https://user.eng.umd.edu/~blj/risc/
The tiny MIPS (or compatible) cores in things like cheap router SoCs might still be like that.
If anyone is interested, at https://sonic-rv.ics.jku.at/ we built an educational platform for web-based simulation and visualization of RISC-V processor architectures.
Our pipeline visualization is reconstructed from real RTL traces (you can run your own programs, which are simulated using GHDL).
Under examples you can find several different architectures based on the Harris & Harris book on computer architecture.
Maybe it's just me, but the visualizations do not help me at all.
I am always puzzled by such articles - it's actually very well made, the drawings are good, the little interactive pipeline animations are fine. But in order to follow it you must already know and understand what it's written about, and if you don't, the content is just noise for you.
Hi, I'm the author! Thanks for saying it's well made :).
I actually agree with you, the intended audience isn't someone who has never heard of CPUs before.
I tend to write either for myself (you know the saying: you don't understand something until you try to explain it) or for the person self-studying who is looking for that one explanation where everything finally clicks. I always get a lot out of those types of posts myself, so I like to create them for others too.
You could use colors in the step-by-step simulation to show dependencies. Also show some tooltips/comments when things happen (the ones you described above). Ideally one should be able to press next, next, next in the simulation and understand what happens better than from the paragraph description above.
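To make the coloring idea concrete, here is one possible way to find which instructions depend on which, sketched in Python. It assumes instructions are given as hypothetical `(dest, sources)` tuples and only tracks read-after-write dependencies through registers. It is not how the article's simulation works internally.

```python
def raw_dependencies(program):
    """Map each instruction index to the indices of the producers it reads from.

    program: list of (dest, sources) tuples, in program order.
    """
    last_writer = {}   # register name -> index of the most recent writer
    deps = {}
    for i, (dest, sources) in enumerate(program):
        # Look up producers before recording this instruction's own write.
        deps[i] = sorted({last_writer[s] for s in sources if s in last_writer})
        if dest is not None:
            last_writer[dest] = i
    return deps


program = [
    ("x1", ["x2", "x3"]),   # 0: add x1, x2, x3
    ("x4", ["x1", "x5"]),   # 1: sub x4, x1, x5   reads x1 written by 0
    ("x6", ["x7", "x8"]),   # 2: mul x6, x7, x8   independent of 0 and 1
    ("x9", ["x4", "x6"]),   # 3: add x9, x4, x6   reads x4 from 1 and x6 from 2
]
deps = raw_dependencies(program)
print(deps)
```

A visualization could then give each producer its own color and tint the matching source operand of every consumer, so a stall shows up as a colored operand waiting on a same-colored instruction further down the pipeline.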
The article does say what it expects you to know before reading. However, the link to that prerequisite material is dead.
Author here: thanks for flagging the dead link! Unfortunately, I had to remove it. I couldn't find the original slides.
Now do a dynamically scheduled out-of-order engine with renaming, 20 pipes, speculative execution and hundreds of instructions in flight. I guess you could make a multi-person game for this.
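The renaming part, at least, fits in a few lines. Here is a minimal Python sketch, assuming an unlimited supply of physical registers (real hardware has a finite free list and has to reclaim registers at retirement) and the same hypothetical `(dest, sources)` instruction format:

```python
from itertools import count


def rename(program):
    """Rewrite (dest, sources) tuples to use a fresh physical register per write."""
    fresh = count()                 # endless supply of physical register numbers
    mapping = {}                    # architectural name -> current physical name

    def lookup(reg):
        # A register read before any write gets its own initial physical register.
        if reg not in mapping:
            mapping[reg] = f"p{next(fresh)}"
        return mapping[reg]

    renamed = []
    for dest, sources in program:
        phys_sources = [lookup(s) for s in sources]   # read the old mapping first
        mapping[dest] = f"p{next(fresh)}"             # then allocate a new target
        renamed.append((mapping[dest], phys_sources))
    return renamed


program = [
    ("x1", ["x2", "x3"]),   # x1 = x2 + x3
    ("x4", ["x1"]),         # x4 = f(x1)      true dependency on the line above
    ("x1", ["x5", "x6"]),   # x1 = x5 + x6    reuses the name x1
]
renamed = rename(program)
for before, after in zip(program, renamed):
    print(before, "->", after)
```

After renaming, the two writes to `x1` land in different physical registers, so the third instruction no longer has to wait for the second one to read the old value. Only the true dependency between the first two instructions remains.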