Can someone experienced in this space explain this to me in mid-tier engineering terms? Are chiplets physically connected chips that come from different wafers? Or are they designs that can be ‘dragged and dropped’? If they’re physically connected (pre-packaging? Post-packaging?) how is this different than putting them on a board with interconnects?
If it’s design-level, and tapeout needs to happen for a given chiplet design, why does the article mention different process nodes? And also, how is this different than any fab's available IP?
Thanks!
Chiplets are separate dies that come from different wafers. (They used to be called multi-chip modules (MCMs) back in the day, but now they're chiplets.) You can see the chiplets pretty clearly in photos of e.g. AMD Ryzen.
The benefit of multiple dies on one package is that on-package wires are denser and shorter which increases performance. Multiple chiplets on an interposer is even better.
https://imapsource.org/article/128222-enabling-heterogenous-...
Generally, multi-chip modules are multiple die attached to (advanced) PCB backplanes, while CoWoS (Chip-on-Wafer-on-Substrate) and Foveros use (passive) silicon backplanes. That radically increases bump and interconnect density.
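To put a rough number on "radically increases density": connections per unit area scale as the inverse square of the bump pitch, so moving from flip-chip bumps on an organic substrate to microbumps or hybrid bonding on silicon buys orders of magnitude more connections. A minimal back-of-envelope sketch in Python; the pitches are assumed, illustrative ballpark figures, not any specific product's spec:

    # Back-of-envelope only: connections per mm^2 ~ 1 / pitch^2 (square grid).
    # Pitches are assumed, illustrative ballpark figures, not vendor specs.
    pitches_um = {
        "flip-chip bump on organic substrate": 130,
        "microbump on silicon interposer/bridge": 40,
        "hybrid bonding (die-to-die)": 9,
    }

    for name, pitch_um in pitches_um.items():
        pitch_mm = pitch_um / 1000.0
        density_per_mm2 = 1.0 / (pitch_mm ** 2)
        print(f"{name:40s} ~{pitch_um:3d} um pitch -> ~{density_per_mm2:,.0f} connections/mm^2")

That jump of a couple of orders of magnitude is the whole point of putting the backplane in silicon.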
Wait, aren't multi-chip modules assembled out of chiplets?
Sort of - there's some distinction between "multi-chip" and "multi-chiplet". We're just not using the "MCM" term to describe those chips anymore. To some degree, though, it's true in the sense of: "A rose by any other name would still smell as sweet".
The explanation by 'kurthr gives some insight into why we don't use the term MCM for them today: the term would imply they're not integrated as tightly as they actually are.
The best examples of MCMs are probably Intel’s Pentium D and Core 2 Quad. In these older MCM designs, the multiple chips were generally fully working chips in their own right - each had its own last-level (e.g. L3) cache (LLC), and they happened to be manufactured on the same lithography node. When a core on Die A needed data that a core on Die B was working on, Die B had to send the data off the CPU entirely, down the motherboard's Front Side Bus (FSB), into system RAM, and then Die A had to retrieve it from RAM.
IBM's POWER4 and POWER5 MCMs did share L3 cache, though.
So parent was 'wrong' that "chiplets" were ever called MCMs, but right that "chips designed with multiple chiplet-looking things" used to be called MCMs.
Today's 'chiplet' term implies that the pieces aren't fully functional by themselves - they're more like individual "organs". Functionality like I/O, memory controllers, and LLC is split off and manufactured on separate wafers/nodes. In the case of memory controllers that might be a bit confusing, because back in the MCM days those weren't in the same silicon at all - they were a separate chip on the motherboard entirely. But I digress.
Also, MCMs lacked the kind of high-bandwidth, low-latency fabric that lets the CPUs communicate with each other more directly. For the Pentiums, the substrate was organic (the usual green PCB material) with copper traces routed between the dies. For the IBMs, it was an advanced ceramic-glass substrate, which had much higher bandwidth than PCB traces but still needed a lot of space to route all the copper traces (so latency took a hit) and generated a lot of heat. Today we use silicon for those interconnects, which gives excellent bandwidth, latency, and thermal behavior.
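To make that FSB round trip concrete, a rough order-of-magnitude comparison helps; both latency figures below are assumed illustrative values, not measurements of any specific part:

    # Order-of-magnitude illustration only; both figures are assumed, not measured.
    shared_llc_hit_ns = 15        # cross-die access through a shared on-package L3/fabric
    fsb_dram_roundtrip_ns = 150   # Die B -> FSB -> DRAM -> FSB -> Die A on an old MCM

    print(f"Via shared on-package LLC:  ~{shared_llc_hit_ns} ns")
    print(f"Via FSB + system DRAM:      ~{fsb_dram_roundtrip_ns} ns")
    print(f"Rough penalty:              ~{fsb_dram_roundtrip_ns // shared_llc_hit_ns}x")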
Putting aside the terminology, a lot of people would have you believe that chiplets started in 2017 but they existed for 20-30 years before that.
Chiplets are different physical chips, potentially from different nodes or processes, and they're physically connected pre-packaging. Compared to sticking them on a board, you get the shortest interconnects and you aren't limited by pinout density, which should give you the best bandwidth-to-power ratio and better latency.
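As a rough illustration of the bandwidth-to-power point: link power is roughly energy-per-bit times bit rate, and on-package die-to-die links are commonly an order of magnitude cheaper per bit than long board-level SerDes. The pJ/bit figures below are assumed ballpark values, just to show the scale:

    # Rough illustration: W = (pJ/bit) * (bits/s) * 1e-12.
    # Energy-per-bit figures are assumed ballpark values, not any product's spec.
    target_bw_bits_per_s = 1e12 * 8   # 1 TB/s of die-to-die traffic

    links_pj_per_bit = {
        "long board-level SerDes link": 5.0,
        "on-package die-to-die (interposer/bridge)": 0.5,
    }

    for name, pj_per_bit in links_pj_per_bit.items():
        watts = pj_per_bit * 1e-12 * target_bw_bits_per_s
        print(f"{name:42s} ~{pj_per_bit:.1f} pJ/bit -> ~{watts:.0f} W for 1 TB/s")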
The difference is that until recently, consumers could swap/add RAM/NPU/GPU/etc. Now those parts are more often epoxied together into the same package, even if they were produced on different silicon wafers.
For example, the newer Intel Lunar Lake chip packages together a CPU+iGPU+NPU chiplet and two LPDDR5X RAM chiplets [0][1][2]. If laptop manufacturers want to offer different amounts of RAM, they have to buy a different CPU SKU for 16GB vs 32GB. Panther Lake, the succeeding generation, reversed this and will support off-package RAM modules, but it's reasonable to expect that in the long term RAM will generally be on-package for anything that's not a server/HEDT part.
You won't have to worry about making sure the RAM you buy is on the CPU & Motherboard QVL list, but also you won't ever buy RAM by itself at all.
Intel's Clearwater Forest Xeon has 12 CPU chiplets, arranged in groups of 4 with each group sitting on top of one of 3 "base" chiplets, plus 2 I/O chiplets - 17 chiplets in total, depending on how you count the "base" chiplets, which are 'just' fabric, memory controllers, and L3 cache (shared directly by all 4 CPU chiplets in a group). [3] (A rough sketch of the counting is after the links below.)
0: https://www.flickr.com/photos/130561288@N04/albums/721777203...
1: (pages 3 and 4) https://www.intel.com/content/www/us/en/content-details/8244...
2: (also pages 3 and 4, but the technical detail in the rest of these slides is much more detailed) https://hc2024.hotchips.org/assets/program/conference/day1/5...
3: page 12 https://hc2025.hotchips.org/assets/program/conference/day1/1...
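A rough sketch of that counting as data, following the description above (names and structure are illustrative, not Intel's):

    # Illustrative model of the layout described above; identifiers are made up.
    N_BASE = 3          # "base" chiplets: fabric + memory controllers + shared L3
    CPUS_PER_BASE = 4   # each base chiplet carries a group of 4 CPU chiplets on top
    N_IO = 2            # I/O chiplets

    package = {f"base{b}": [f"cpu{b}_{c}" for c in range(CPUS_PER_BASE)]
               for b in range(N_BASE)}

    cpu_total = sum(len(group) for group in package.values())   # 12
    chiplet_total = cpu_total + N_BASE + N_IO                    # 17
    print(f"CPU chiplets: {cpu_total}, base: {N_BASE}, I/O: {N_IO}, total: {chiplet_total}")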
> Are chiplets physically connected chips that come from different wafers?
Yes exactly. Take separate unpackaged chips, and put them all on one shared substrate (a "silicon interposer") that has wires to connect them. There are a bunch of different technologies to connect them. You can even stack dies.
https://www.imec-int.com/en/articles/chiplets-piecing-togeth...
I think typically you wouldn't stack logic dies due to power/cooling concerns (though you totally could). But you can definitely stack RAM (both SRAM and DRAM). Stacking is a fairly new process, though, as far as I understand it.
Let's first produce some more cheap RAM, ok?
Then we can have chiplets.
Intel has such a strong lead here, with Foveros doing stacked chips including RAM (since the amazing, underrated Lakefield), and with really good EMIB.
AMD is pretty famous for multi-chip, but they're only recently starting to do actually advanced integration like Sea-of-Wires between chips. So far most of their chips have had big, hot PHYs to send data back and forth, rather than trying to make multiple chips that can really communicate directly with each other.
Interesting days ahead. The computer is on the chip now. A smaller domain of system building, with many of the same trade-offs & design challenges that it took to build a box.