This is really cool. As a high-performance computing professional, I've often wondered how much energy is wasted due to inefficient code and how much that is a problem as planetary compute scales up.
For me, it feels like a moral imperative to make my code as efficient as possible, especially when a job will take months to run on hundreds of CPUs.
Years ago, I posted here that there should be some sort of ongoing Green X-Prize for this style of Linux kernel optimization. It's still crazy to me that this doesn't exist.
One would hope, but when I last gave this some thought... If you are in the C-suite, and could deploy your best devs to maybe save some unknown % on energy costs, or have them work on a well-defined new feature that grows your ARR 5% next year, which would you do?
Also, would you share all new found efficiencies with your competitors?
It's a public good where the biz that creates it captures very little of the value it generates, so investment in this kind of optimization is likely far below optimal.
> I've often wondered how much energy is wasted due to inefficient code and how much that is a problem as planetary compute scales up.
I personally believe the majority is wasted. Any code that runs in an interpreted language, with or without JIT/AOT compilation, is at a significant disadvantage. In benchmarks it can be anywhere from 2x to 60x slower than the equivalent optimized compiled code.
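A rough, self-contained illustration of that interpreter overhead (the exact ratio varies by machine and workload; here the C-implemented `sum` builtin stands in for compiled code):

```python
import timeit

N = 1_000_000
data = list(range(N))

def python_loop(xs):
    # Every iteration goes through the bytecode interpreter.
    total = 0
    for x in xs:
        total += x
    return total

# Time the interpreted loop against the C-implemented builtin.
interp = timeit.timeit(lambda: python_loop(data), number=10)
compiled = timeit.timeit(lambda: sum(data), number=10)

print(f"interpreted loop: {interp:.3f}s, builtin sum: {compiled:.3f}s, "
      f"ratio: {interp / compiled:.1f}x")
```

On a typical machine the pure-Python loop lands well into the slow end of that range, and a fully compiled language widens the gap further.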
> it feels like a moral imperative to make my code as efficient as possible
Although we're still talking about fractions of a watt of power here.
> especially when a job will take months to run on hundreds of CPUs.
To the extent that I would say _only_ in these cases are the optimizations even worth considering.
It is unfortunate that many software engineers continue to dismiss this as "premature optimization".
But as soon as I see resource or server costs gradually rising every month (even at idle usage), climbing into the tens of thousands, which is a common occurrence as a system scales, it becomes unacceptable to ignore.
When you achieve expertise you know when to break the rules. Until then it is wise to avoid premature optimization. In many cases understandable code is far more important.
I was working with a peer on a click handler for a web button. The code ran in 5-10ms. You have nearly 200ms budget before a user notices sluggishness. My peer "optimized" the 10ms click handler to the point of absolute illegibility. It was doubtful the new implementation was faster.
Depending on your infrastructure spend and the business's revenue: if the problem is not causing the business to increase spending on infrastructure each month, and there's little to no rise in user complaints about slowdowns, then the "optimization" isn't worth it and is indeed premature.
Most commonly, if costs increase in step with the user count, it becomes an efficiency problem: the scaling is neither good nor sustainable, and that can easily destroy a startup.
In this case, the Linux kernel is directly critical to applications in AI, real-time systems, networking, databases, etc., and performance optimizations there make a massive difference.
This article is a great example of properly using compiler optimizations to significantly improve the performance of a service. [0]
My experience with HPC is only tangential: being a sysadmin for data taking and cluster management on a high-energy physics project. I'm interested in your thoughts on using generative AI to search out potentially power-inefficient code paths in codebases for improvement.
For the completely uninitiated, taking the most critical code paths uncovered via profiling and asking an LLM to rewrite them to be more efficient might give an average user some help with optimization. If your code takes more than a few minutes to run, you definitely should invest in learning how to profile, common optimizations, hardware latencies and bandwidths, etc.
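For anyone starting out, Python's standard-library profiler is enough to find the hot path before handing it to an LLM. A minimal sketch (the `slow_path`/`fast_path` functions are made-up examples, not from the original post):

```python
import cProfile
import io
import pstats

def slow_path(n):
    # Deliberately naive: repeated string concatenation.
    s = ""
    for i in range(n):
        s += str(i)
    return s

def fast_path(n):
    # The idiomatic fix: build the string once with join.
    return "".join(str(i) for i in range(n))

# Profile the naive version and print the top entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
slow_path(20_000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

The profile output points at where the time actually goes, which is a far better prompt for an LLM (or a human) than "make this faster".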
With almost everything I use at the consumer level these days, you can just feel the excessive memory allocations and network latency oozing out of it, signaling the developers' inexperience or lack of effort.