Astro - Hacker News

6 comments

Schlagbohrer a minute ago

Bunnie also wrote up a post about it on his blog: https://www.bunniestudios.com/blog/2026/bio-the-bao-i-o-copr...
theamk 2 days ago

It's a very nice write-up, but this part makes me uneasy:
> So long as all the computation in the loop finishes before the next quantum, the timing requirements [...] are met.
Seems like we are back to cycle counting then? But instead of having just 32 1-IPC instructions, we have up to 4K instructions with varying latencies, and there is a C compiler too, so even if you have enough cycles in the budget now, things might break when the compiler is upgraded.
I am wondering if the original PIO approach was still salvageable if binary compatibility is not a goal. Because while co-processors are useful, people did some amazing things with PIO, like fully software DVI.
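To make the budget concern concrete, here is a minimal sketch of the check that "finishes before the next quantum" implies. All numbers and names (`fits_in_quantum`, the 100 MHz core clock, the 1 MHz quantum rate, the cycle counts) are hypothetical and not taken from the write-up; the point is only that the constraint is a single inequality on the worst-case path, which a compiler upgrade can silently flip.

```python
def cycles_per_quantum(core_hz: int, quantum_hz: int) -> int:
    """Whole core clock cycles available between two quantum ticks."""
    return core_hz // quantum_hz


def fits_in_quantum(worst_case_cycles: int, core_hz: int, quantum_hz: int) -> bool:
    """True if the loop's worst-case path finishes before the next quantum."""
    return worst_case_cycles <= cycles_per_quantum(core_hz, quantum_hz)


# Hypothetical figures: 100 MHz core, quantum ticking at 1 MHz -> 100 cycles of budget.
CORE_HZ = 100_000_000
QUANTUM_HZ = 1_000_000

old_build_cycles = 80    # assumed worst case with the current compiler
new_build_cycles = 105   # assumed worst case after a compiler upgrade

print(fits_in_quantum(old_build_cycles, CORE_HZ, QUANTUM_HZ))  # True
print(fits_in_quantum(new_build_cycles, CORE_HZ, QUANTUM_HZ))  # False
```

In this framing nobody has to count cycles to place each edge, but somebody still has to bound the worst-case path once per build.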
- whstl 31 minutes ago
  
  The previous sentence already answers this:
  > Here, we leverage the “quantum” feature to get exact pulse timings without resorting to cycle-counting
  This is just a hard-real-time constraint that already exists in today’s computers and other devices.
  For example: audio playback and processing are day-to-day operations where hard-real-time guarantees are necessary for uninterrupted playback, and every digital audio device already conforms to them. If the buffer is filled too slowly, you get playback errors.
rasz a day ago

PIOs might be heavier on hardware resources
> The BIO uses 14597 cells, while the PIO uses 39087 cells
and BIO might reach higher clock speeds
> when ported to an ASIC flow, the clock rate achieved by the BIO is over 4x that of a PIO implemented in the same process node.
but BIO is ~15x less efficient per clock. RP2350 is capable of reading IOs at 400 Mbps (https://github.com/gusmanb/logicanalyzer) and bitbanging at 800 Mbps (HSTX). From Bunnie's write-up, BIO needs 700 MHz to do pedestrian 25 MHz SPI.
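For reference, the arithmetic behind the figures quoted in this comment can be sketched as below. It uses only the numbers given above (14597 vs 39087 cells, 700 MHz for 25 MHz SPI); the variable names are illustrative, and the ~15x efficiency estimate is the commenter's own and is not derived here.

```python
# Figures as quoted in the comment above.
bio_cells = 14_597
pio_cells = 39_087
bio_core_hz = 700_000_000
spi_clock_hz = 25_000_000

# How many BIO core clocks elapse per SPI clock cycle at those rates.
core_clocks_per_spi_clock = bio_core_hz // spi_clock_hz

# How much larger the PIO is than the BIO, in cells.
pio_to_bio_cell_ratio = pio_cells / bio_cells

print(core_clocks_per_spi_clock)        # 28
print(round(pio_to_bio_cell_ratio, 2))  # 2.68
```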
fragmede 3 hours ago

dupe of https://news.ycombinator.com/item?id=47459363 ?
jauntywundrkind 3 days ago

Really glad to get this write-up, adds a very nice broad picture & does a good job introducing the queue too.
I'm an unranked unwashed neophyte at hardware design, but I did spend some time looking at BIO. One particular thing that caught my eye a while ago was Stream Semantic Registers, which is an instruction set extension for RISC-V where load and store are implicit, with data pointers that automatically walk on each instruction. This greatly increases code density, allowing for DSP-like capabilities on RISC-V. https://arxiv.org/abs/1911.08356
I forget how exactly I was convinced, but after spending a while chatting with the LLM, I became somewhat convinced that the FIFO queues here give a lot of similar capabilities, with an additional interesting use for decoupling multiple systems: register-mapped data arrays that can be used without having to load/store each word. I felt then and still feel now that I have a good bit to learn about how exactly each of the FIFO registers works, but it was cool to see, and I love this idea of code that can run without having to issue endless load/stores all the time.
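A toy software model of the idea described above, where reading a register implicitly consumes the next word of a stream. This is only an illustration: `StreamRegister`, `dot_explicit`, and `dot_streamed` are hypothetical names, and nothing here reflects the actual BIO FIFO registers or the Stream Semantic Registers hardware. It just counts how many explicit loads the inner loop has to issue in each style.

```python
from collections import deque


class StreamRegister:
    """Toy model: each read pops the next word, so the loop issues no explicit load."""

    def __init__(self, words):
        self._fifo = deque(words)

    def read(self):
        return self._fifo.popleft()


def dot_explicit(a, b):
    """Dot product with one explicit load per operand; returns (result, load count)."""
    acc = 0
    loads = 0
    for i in range(len(a)):
        x = a[i]
        loads += 1
        y = b[i]
        loads += 1
        acc += x * y
    return acc, loads


def dot_streamed(n, reg_a, reg_b):
    """Same dot product, but operands arrive through stream registers."""
    acc = 0
    for _ in range(n):
        acc += reg_a.read() * reg_b.read()
    return acc, 0  # no explicit loads issued by the loop body


print(dot_explicit([1, 2, 3], [4, 5, 6]))                                     # (32, 6)
print(dot_streamed(3, StreamRegister([1, 2, 3]), StreamRegister([4, 5, 6])))  # (32, 0)
```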