Looking at something like https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin... , this is very obviously specific to the underlying hardware's control model in all sorts of ways. If the hardware has a gazillion control registers, then the software will have to handle a gazillion registers, and will therefore need several gazillion definitions to do it. The hardware won't get any simpler just because the driver takes a long time to compile.
> Compile times increase superlinearly with program size. How many cycles and developer hours are wasted if this duplication is not necessary?
Care to weigh in on the impact on developer hours and cycles of having to debug problems caused by how the autogenerated headers are created?
If the code is not expected to change and has already been through verification and validation, ensuring that it stays the same is a good way to avoid wasting time chasing bugs in code paths that were already verified.
The other time that you saw it was also probably me. It's from this talk, which is about how a large amount of generated protocol buffer code at Google led to a quadratic increase in compile times: https://youtu.be/rHIkrotSwcc?t=720.
TL;DW: with a distributed build system, your compile time is gated by the file with the longest compile time (tail latency). The more files you have, the greater the chance that one of them takes a long time. Code generation tends to produce more files than handwritten source would.
Most users don't use a distributed build system to compile the kernel, so on further thought, in that case compile times probably scale closer to linear with the number of translation units. But wasted cycles are still wasted cycles, and regardless of how exactly compile times scale, you should still consider the cost of longer compile times when you duplicate code.
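The tail-latency effect is easy to see with a toy simulation (all numbers here are made up; lognormally distributed compile times are just an assumption to give the distribution a tail). A fully parallel build finishes when its slowest job does, so the expected wall-clock time is the expected maximum of N draws, which grows with N:

```python
import random

random.seed(0)

def build_time(n_files, trials=500):
    """Simulated wall clock of a fully parallel build: the slowest of
    n_files independent compile jobs, averaged over many trials.
    Per-file compile times are drawn from a lognormal (heavy-ish tail)."""
    total = 0.0
    for _ in range(trials):
        total += max(random.lognormvariate(0, 1) for _ in range(n_files))
    return total / trials

# Mean wall-clock time keeps climbing as the file count grows, even
# though the per-file distribution never changes.
results = {n: build_time(n) for n in (10, 100, 1000)}
for n, t in results.items():
    print(f"{n:5d} files -> simulated build time {t:.2f}")
```

The per-file cost is identical in all three runs; only the count changes, yet the expected slowest job (and hence the build) gets steadily worse.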
Not knowing the context of the quote, I can guess a few causes:
* Compiler optimizations mostly work on a per-function basis, so that the 'n' in O(n) or O(n²) is the size of a function and not the size of the entire codebase. Not all optimization algorithms are linear, and while good superlinear optimizations should have code threshold cutoffs, in practice these may be omitted. Function sizes follow something like a power law distribution, which means larger code bases contain larger functions that are more expensive to optimize.
* For C/C++, you compile code by pasting all the header definitions into your translation unit, and feeding that to the actual lexer/parser/codegen. Larger projects are probably going to have correspondingly larger headers, and files will probably include more of them, so that the actual translation unit sizes are going to be larger even than the on-disk source files would indicate.
* This is more speculative, but larger projects are also probably likelier to blow out the various caches on your computer, so that the operations you figure are O(1) (such as hashtable lookup!) are actually more like O(lg N) in practice. And as you get larger projects that are less likely to fit any cache, the effect of the non-constant time becomes more apparent.
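The first point can be made concrete with a toy cost model (the sizes and the "one linear pass, one quadratic pass" split are hypothetical, not measurements of any real compiler). Under a power-law size distribution, the handful of largest functions dominate the total cost of any quadratic pass:

```python
# Hypothetical codebase: 1000 functions with power-law-ish sizes
# (a few huge functions, a long tail of small ones).
sizes = [10_000 // k for k in range(1, 1001)]

# Model one linear optimization pass and one quadratic one.
linear_cost = sum(n for n in sizes)
quadratic_cost = sum(n * n for n in sizes)

# Share of the quadratic pass spent on the largest 1% of functions.
top10 = sizes[:10]
share = sum(n * n for n in top10) / quadratic_cost
print(f"largest 1% of functions account for {share:.0%} of the quadratic pass")
```

The linear pass is spread fairly evenly across the codebase, but the quadratic pass is almost entirely spent on the few giant functions — which is exactly why larger codebases (with larger outlier functions) compile disproportionately slowly when size cutoffs are missing.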
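The header-pasting point can be sketched the same way. This toy model (file names and line counts are invented) computes how many lines the compiler actually parses for one translation unit: the source file plus every header reached transitively, with include guards modeled by a visited set:

```python
# Hypothetical include graph: header -> (lines, headers it includes).
headers = {
    "util.h": (300, []),
    "core.h": (800, ["util.h"]),
    "api.h":  (500, ["core.h", "util.h"]),
}

def tu_size(source_lines, includes, seen=None):
    """Lines in the effective translation unit: the source plus every
    transitively included header, each counted once (include guards)."""
    seen = set() if seen is None else seen
    total = source_lines
    for h in includes:
        if h in seen:
            continue
        seen.add(h)
        lines, deps = headers[h]
        total += tu_size(lines, deps, seen)
    return total

# A 200-line .c file that includes api.h really compiles as 1800 lines.
print(tu_size(200, ["api.h"]))
```

A 200-line file on disk turns into a 1800-line translation unit once api.h drags in core.h and util.h — so project-wide header growth inflates every translation unit, not just the files that changed.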