It's an interesting idea. I'm butchering TCC (tiny c compiler) for a side project/experiment, and using arenas sped it up 2x. This of course requires the memory limit to be specified in advance, but for my situation that's fine.
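(For the curious: by "arena" I just mean a bump allocator over one big block grabbed up front, so individual frees cost nothing and the whole thing resets in one shot. A minimal sketch in C, not TCC's actual code:)

    #include <stddef.h>
    #include <stdlib.h>

    /* Minimal bump-pointer arena: one allocation up front, individual
       frees are a no-op, and everything is released by a single reset. */
    typedef struct {
        char  *base;
        size_t used;
        size_t cap;
    } Arena;

    static int arena_init(Arena *a, size_t cap) {
        a->base = malloc(cap);      /* the limit fixed in advance */
        a->used = 0;
        a->cap  = cap;
        return a->base != NULL;
    }

    static void *arena_alloc(Arena *a, size_t n) {
        size_t off = (a->used + 15) & ~(size_t)15;  /* 16-byte align */
        if (off + n > a->cap) return NULL;          /* out of budget */
        a->used = off + n;
        return a->base + off;
    }

    static void arena_reset(Arena *a) { a->used = 0; }  /* "free" everything */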
> I feel that a strict separation between O(N) compiler output and O(1) intermediate processing artifacts [...]
I don't follow. He has just said that although the size of the arena is finite, the input and output are unbounded, and the compiler does its work by processing "a sequence of chunks" (i.e. those things that will fit into the finitely sized arena). That's not "O(1) intermediate processing artifacts". It's still O(N).
> [...] can clarify compiler’s architecture, and I won’t be too surprised if O(1) processing in compilers would lead to simpler code
This doesn't seem like an intuitive conclusion at all. There's more recordkeeping needed now and more machinery to implement, and one should expect this scheme to make for things that are neither simple nor easy.
We haven't even gotten around to addressing how "statically allocating" a fixed-size arena that your program necessarily subdivides into pieces (before moving on to the next chunk and doing the same) is just "dynamic allocation with extra steps". (If you want to write/control your own allocator, or just think it would be neat, then fine, but... say that.)
Having said that, if this is really all just a roundabout way to get the Rust people to actually give a damn about memory use and sell the idea that "you really shouldn't require more than 4GB of memory just to bootstrap the compiler and/or build other medium-to-very-large programs," then hey that's great.
How does static allocation avoid wasting memory?
Static memory allocation requires hardcoding an upper limit on the size of everything. For example, if you limit each string to at most 256 bytes, then a string holding only 10 bytes wastes 246 bytes of memory.
If you limit string length to 32 bytes, it wastes less memory, but then a string longer than 32 bytes can't be handled at all.
Your C++ compiler already implements a solution to that: the short string optimization. Strings start out as small inline byte buffers that can easily be passed around. When they grow beyond that, the fixed buffer is swapped out for a pointer to another allocation on the heap. There's no (immediate) reason that allocation has to come from a direct call to the system allocator, though, and it usually doesn't. It can just as easily come from an allocation pool that was initialized at startup.
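Roughly like this, hand-waving over what libstdc++/libc++ actually do (sizes and layout vary; pool_alloc here is a hypothetical stand-in for whatever startup-initialized pool you'd plug in):

    #include <string.h>

    #define SSO_CAP 15  /* bytes stored inline; real implementations vary */

    typedef struct {
        size_t len;
        union {
            char  small[SSO_CAP + 1];  /* inline buffer, NUL-terminated */
            char *heap;                /* used once len > SSO_CAP */
        } u;
    } SsoString;

    extern void *pool_alloc(size_t n);  /* hypothetical startup-time pool */

    static void sso_set(SsoString *s, const char *src) {
        s->len = strlen(src);
        if (s->len <= SSO_CAP) {
            memcpy(s->u.small, src, s->len + 1);  /* stays inline */
        } else {
            s->u.heap = pool_alloc(s->len + 1);   /* spill to the pool */
            memcpy(s->u.heap, src, s->len + 1);
        }
    }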
Even if you needed to hardcode upper size limits, which your compiler already does to some extent (the C/C++ standards anticipate this by specifying minimum translation limits for things like string literal length), you wouldn't actually pay the full price on most systems because of overcommit. There are other downsides to this depending on implementation details, like how you reclaim memory and spawn compiler processes, so I'm not suggesting it as a good idea. It's just possible.
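Concretely, on Linux you can "statically" reserve a comically large arena and only pay for the pages you actually touch (a sketch; MAP_NORESERVE and the overcommit behavior are Linux-specific and tunable):

    #include <sys/mman.h>
    #include <stdio.h>

    int main(void) {
        size_t cap = (size_t)64 << 30;  /* reserve 64 GiB of address space */
        char *arena = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (arena == MAP_FAILED) return 1;
        arena[0] = 'x';   /* faults in one page (~4 KiB), not 64 GiB */
        munmap(arena, cap);
        return 0;
    }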
> if you limit each string to at most 256 bytes, then a string holding only 10 bytes wastes 246 bytes of memory.
No? Unless you limit each string to be exactly 256 bytes, but that's silly.
> If you limit string length to 32 bytes, it wastes less memory, but then a string longer than 32 bytes can't be handled at all.
Not necessarily. Early compilers/linkers routinely did the "only the first 6/8 characters of an identifier are significant" schtick: the rest was simply discarded.
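I.e. the symbol table was just an array of fixed-width entries, and anything past the significant length never got stored in the first place (a sketch, not any particular compiler's code):

    #include <string.h>

    #define SIG_CHARS 8  /* e.g. only the first 8 characters are significant */

    typedef struct {
        char     name[SIG_CHARS + 1];
        unsigned value;
    } Symbol;

    static void symbol_set_name(Symbol *s, const char *ident) {
        strncpy(s->name, ident, SIG_CHARS);  /* silently drop the rest */
        s->name[SIG_CHARS] = '\0';
    }

The obvious downside being that "very_long_name_one" and "very_long_name_two" collide.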
How does this work? Files need to reference other files, e.g. for calling functions from other modules, which means semantic analysis needs both files in memory to check the types. This is especially complicated with mutual recursion across modules (separate compilation doesn't apply here). If you're building a language like C where everything requires forward declarations, then maybe, but anything more modern seems difficult.
Yes, you can dump your IR straight to disk and then stream it back in for further processing. That's how quite a number of compilers (and assemblers) were written back in the 70s, and it was quite painful.
IIRC, Unix's original as works that way: during assembly, the text and data sections are written into separate temporary files, and then they are merged together into the a.out. And yes, it's slow.
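The pattern itself fits in a few lines (hypothetical record layout; the point is that only one record needs to be in memory at a time):

    #include <stdio.h>

    /* Hypothetical fixed-size IR record, so it can be fwrite/fread'd whole. */
    typedef struct {
        int opcode;
        int operands[3];
    } IrRecord;

    /* Pass 1: append records to a temp file as they're produced. */
    static void emit(FILE *tmp, const IrRecord *r) {
        fwrite(r, sizeof *r, 1, tmp);
    }

    /* Pass 2: rewind and stream them back, one record at a time. */
    static void process(FILE *tmp) {
        IrRecord r;
        rewind(tmp);
        while (fread(&r, sizeof r, 1, tmp) == 1) {
            /* ... generate code for a single record ... */
        }
    }

With tmpfile() supplying the scratch file, memory use stays flat no matter how big the input is; the price is all that extra I/O.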