This PR looks very LLM-friendly by nature - it’s essentially translating existing code into SIMD.
LLMs seem to do well at any kind of mapping / translating task, but they seem to have a harder time when you give them either a broader or less deterministic task, or when they don’t have the knowledge to complete the task and start hallucinating.
This PR isn’t a great benchmark of their ability to write typical code.
Sure, but let's still appreciate how awesome it is that this very difficult (for a human) PR is now essentially self-serve.
How much hardware efficiency have we left on the table all these years because people don't like to think about optimal use of cache lines, array alignment, SIMD, etc.? I bet we could double or triple the speeds of all our computers.