Yup. Controversial opinion incoming, but I actually can't wait for the same thing to happen to ARM. The RISC crowd so quickly forgot the purpose of the RISC architecture. Every time I hear someone tell me how the M1 is so fast compared to x86 I just get this picture in my head of a guy in a 1992 Civic DX with a 2 foot wing revving his engine at the Lambo parked next to him. [1]
They argue that we should be benchmarking saturated single-core performance, but then show in their chosen benchmarks that the M1 is very competitive, benchmarking just below an i9 9880 and above a Ryzen 7 4750.
That's a super-fast chip! No excuses like "oh it's Apple's first CPU" are needed - it's right up there.
And in real-world usage it's kind of a useless thing to benchmark. See their note on how they had to wrestle with Cinebench R23 to get it to accept the load: "threads locked to 2, affinity needed to be set after initiating the run but before the benchmark actually started and needed to be reapplied after the first pass", and "a cleaner execution would almost certainly be welcome. It would also allow us to test cores working at their full potential."
It kind of shows how these micro-benchmarks aren't very reflective of real-world use.
RISC = Reduced Instruction Set Computer. Reduced instructions result in (supposedly) reduced complexity per instruction (which has failed over time). While each instruction is "simpler" it takes more instructions to accomplish a task. It would stand to reason that it would take more clock cycles to accomplish the same goal on ARM than x86. [1]
ARM processors regularly outperform x86 in terms of instructions per cycle. They have to in order to do the same amount of work. That is literally the whole idea of RISC. Simpler instructions executed quickly. [2]
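To put numbers on the instructions-versus-cycles argument, here is a minimal sketch of the usual back-of-the-envelope relation: time = instructions / (IPC × clock). Every figure below is made up for illustration and is not measured from any real ARM or x86 chip; the function name is hypothetical as well.

```python
def execution_time_s(instructions: float, ipc: float, clock_hz: float) -> float:
    """Run time in seconds: instructions / (instructions-per-cycle * clock)."""
    cycles = instructions / ipc
    return cycles / clock_hz


# Hypothetical "complex instruction" core: fewer instructions, lower IPC, higher clock.
cisc_like = execution_time_s(instructions=1_000_000_000, ipc=2.0, clock_hz=4.0e9)

# Hypothetical "simple instruction" core: 30% more instructions for the same task,
# higher IPC, lower clock.
risc_like = execution_time_s(instructions=1_300_000_000, ipc=3.25, clock_hz=3.2e9)

print(f"cisc_like: {cisc_like:.3f} s, risc_like: {risc_like:.3f} s")
```

With these assumed inputs both cores finish in the same time. The point is only that a higher instruction count does not by itself imply more clock cycles or a longer run; IPC and clock speed enter the same product.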
But this is not why they are fundamentally not as capable. The reason for that is the pipeline. Or relative lack of pipelining. [3]
But again, you are comparing a Civic to a Lambo. You have a 3.5 W CPU, and per watt you have the more efficient processor. But that's not good enough. ARM folks love to stand on the efficiency soapbox and preach about performance. You want to have the flexibility of out-of-order execution but don't want to admit the shortcomings that come with it. You have the best performance per watt but you want the best overall performance. Which, if you skew every metric in favor of sliced-up processors and non-real-world benchmarks, then sure... Your phone is almost as fast as an entry-level laptop when running native code on a single core. If x86 appears slower than ARM it's only because ARM fans have thrown everything but the kitchen sink into their test bench and found the absolute most favorable conditions possible. Anything but admit that their $1200 cell (which was never designed to be faster than your laptop) is "faster per watt per core per thread" than a $150 x86 processor.
All current Intel/AMD processors (since Intel Core/AMD K6) are RISC cores with a complex instruction decoder in front of them. There's more to a processor core than the instruction set, and both pipelining and out-of-order execution have nothing to do with the instruction set architecture. There have been in-order x86 processors (Transmeta and Via, Intel before the Pentium Pro), and there are plenty of out-of-order RISC processors: POWER4 and up, SPARC T4, RISC-V, many ARM Cortex models.
That article is incredibly stupid. From a user perspective, “single core” performance is irrelevant. What matters is single thread performance, and multi thread performance of the entire processor.
[1] https://wccftech.com/why-apple-m1-single-core-comparisons-ar...