It's easily imaginable that there are new CPU features that would help with building an efficient Java VM, if that's the CPU's primary purpose. Just off the top of my head, one might want a form of finer-grained memory virtualization that could enable very cheap concurrent garbage collection.
But having Java bytecode as the actual instruction set architecture doesn't sound too useful. It's true that any modern processor has a "compilation step" into microcode anyway, so in an abstract sense, that might as well be some kind of bytecode. But given the high-level nature of Java's bytecode instructions in particular, there are certainly some optimizations that are easy to do in a software JIT, and that just aren't practical to do in hardware during instruction decode.
What I can imagine is a purpose-built CPU that would make the JIT's job a lot easier and faster than compiling for x86 or ARM. Such a machine wouldn't execute raw Java bytecode, but rather something a bit more low-level.
What happened to
1. Sun's JavaStation, 2. ARM's Jazelle, ??? 3. Profit!
Jazelle worked for its target market (or at least, I've never seen anyone claim otherwise).
But its target market wasn't "faster Java". Instead, Jazelle promised better performance and lower power draw than an interpreter, without the memory footprint and complexity of a JIT. It was never meant to be faster than a JIT.
Jazelle made a lot of sense in the early 2000s, when dumb phones were running J2ME applets on devices with only 1-4MB of memory, but we quickly moved on to smartphones with 64MB+ of memory, and it just made more sense to use a proper JIT.
---------
JavaStation might as well have been vaporware. Sure, the product line existed, but the promised "Super JavaStation" with a "Java coprocessor" never arrived, so you were really just paying Sun for a standard computer with Java pre-installed.
I briefly worked in a team that implemented a JVM on a mobile OS (before the iPhone), and one of the senior devs said Jazelle was in effect very inefficient because of all the context switching between ARM mode and Jazelle mode. It turned out a carefully tuned ARM JVM was in practice the best.
It's more like JITs got good.
I never understood why AOT never took off for Java. "Write once, run anywhere" quickly faded as an argument, and the number of platforms that a software package needs to support is rather small.
Because developers don't like to pay for tools.
https://en.wikipedia.org/wiki/Excelsior_JET
https://www.ptc.com/en/products/developer-tools/perc
https://www.aicas.com/products-services/jamaicavm/
It is now getting adopted because GraalVM and OpenJ9 are available for free.
Also, while not being proper Java, Android has done AOT since version 5 and mixed JIT/AOT since version 7.
EDIT: Fixed the sentence regarding Android versions.
You don't have to pay for dotnet AOT.
Developers pay for tools gladly when the pricing model isn’t based on how much money you’re making.
I’m happy to drop a fixed €200/mo on Claude, but I’d never sign paperwork that required us to track user installs and deliver $0.02 per install to someone.
Especially not if those kinds of contracts don't survive an acquisition, because then your acquisition is most likely dead in the water. The acquirer would have to re-negotiate the license, and could easily be screwed over because they have nowhere else to go.
> I never understood why AOT never took off for Java.
GraalVM native images certainly are being adopted; the creation of native binaries via GraalVM is seamlessly integrated into stacks like Quarkus or Spring Boot. One small example would be kcctl, a CLI client for Kafka Connect (https://github.com/kcctl/kcctl/). I guess it boils down to the question of what constitutes "taking off" for you?
But it's also not that native images are unambiguously superior to running on the JVM. Build times definitely leave something to be desired, not all 3rd-party libraries can easily be used, not all GCs are supported, the closed-world assumption is not always practical, and peak performance may also be better with JIT. So the way I see it, AOT-compiled apps are currently seen as a tactical tool by the Java community, utilized when their advantages (e.g. fast start-up) matter.
That said, interesting work is happening in OpenJDK's Project Leyden, which aims to move more work to AOT while being less disruptive to the development experience than GraalVM native binaries. Arguably, if you're using CDS, you are using AOT.
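For anyone who hasn't tried it, here's a rough sketch of the basic workflow with the plain native-image tool (Quarkus and Spring Boot wrap this in their build plugins; the class is made up and the exact invocation may vary between GraalVM versions):

    // Minimal program to illustrate AOT compilation with GraalVM native image.
    // With a GraalVM JDK installed, the build looks roughly like:
    //   javac Hello.java
    //   native-image Hello
    // which produces a standalone executable with fast start-up, at the cost of
    // the longer build times and closed-world restrictions mentioned above.
    public class Hello {
        public static void main(String[] args) {
            System.out.println("Hello from an AOT-compiled binary");
        }
    }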
Well, one aspect is how dynamic the platform is.
It simply defaults to an open world where you could just load a class from any source at any time to subclass something, or straight up apply some transformation to classes as they load via instrumentation. And defaults matter, so AOT compilation is not completely trivial (though it's not too bad either with GraalVM's native image, given that the framework you use (if any) supports it).
Meanwhile most "AOT-first" languages assume a closed-world where everything "that could ever exist" is already known fully.
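To make the open-world point concrete, here's a minimal sketch (the plugin-style class name is made up): the class to load is only known at run time, which is exactly what a closed-world AOT analysis cannot see, and why GraalVM native image needs reflection configuration for cases like this.

    import java.lang.reflect.Constructor;

    // Sketch of Java's open-world default: the class name arrives at run time
    // (config file, network, user input), so no build-time analysis can know
    // the full set of types that might be loaded.
    public class PluginLoader {
        static Runnable load(String className) throws Exception {
            Class<?> clazz = Class.forName(className);            // dynamic class loading
            Constructor<?> ctor = clazz.getDeclaredConstructor(); // reflective lookup
            return (Runnable) ctor.newInstance();                 // reflective instantiation
        }

        public static void main(String[] args) throws Exception {
            load(args[0]).run(); // e.g. "com.example.SomePlugin", unknown until run time
        }
    }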
Except that when they support dynamic linking, they pay the indirect-call cost that JITs can remove.
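Rough sketch of what that means (names made up): a call through an interface is an indirect call; a JIT that observes only one implementation loaded can devirtualize and inline it, whereas an AOT compiler that must allow for implementations arriving via dynamic linking generally has to keep the indirect dispatch.

    interface Sink {
        void accept(int value);
    }

    final class CountingSink implements Sink {
        long count;
        public void accept(int value) { count += value; }
    }

    public class Devirt {
        static void drain(Sink sink) {
            for (int i = 0; i < 1_000_000; i++) {
                sink.accept(i); // virtual dispatch site; a JIT can devirtualize this
            }
        }

        public static void main(String[] args) {
            CountingSink s = new CountingSink();
            drain(s);
            System.out.println(s.count);
        }
    }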
Dynamic class loading is a major issue, and it's an integral feature. Realistically, there are very few cases where AOT and Java make sense.
People want to run things other than Java.
We did see a recent attempt to do hardware-based memory management again with Vypercore, but they ran out of money.
I think part of the problem with any performance-related microarchitectural innovation is that unless you are one of the big players (i.e. Qualcomm, Apple, Intel, AMD, Nvidia) then you already have a significant performance disadvantage just due to access to process nodes and design manpower. So unless you have an absolutely insane performance trick, it's still not going to make sense to buy your chip.
They have the volume as well; if you do carve out a niche, they’ll just add it and roll over you.
That’s held for decades, though I think it only really worked when computers were doubling in speed every 12-18 months. For a while now they’ve been scaling horizontally (more cores) rather than through radical IPC improvements, so we might see the rise of proper co-processors again (though nothing stops the successful ones from getting put on die, which is where the likes of Strix Point are already heading).
(2016)
Related: JOP: A Java Optimized Processor for Embedded Real-Time Systems (2005) [0]
It's an implementation of the Java virtual machine in hardware, also FPGA-based, see chapter 7.1 Hardware Platforms.
[0] https://backend.orbit.dtu.dk/ws/files/4127855/thesis.pdf