State machines are great for complex situations, but when it comes to performanc...

ashvardanian · on June 21, 2024

In the context of State Machines and Automatas - Intel HyperScan might be a better reference point. But the idea is the same. With a trivial PoC using Python wrappers over SIMD libraries one can get a 3x boost over the native `wc` CLI on a modern CPU, memory-mapping a very average SSD: https://github.com/ashvardanian/StringZilla/tree/main/cli

teo_zero · on June 22, 2024

Sorry, but your wc implementation does nothing to detect words, it just counts the spaces. Of course you don't need a state machine for that!

jxndnxu · on June 20, 2024

My understanding of the article's use of scalable was "fixed overhead more or less regardless of the complexity of the state machine and input" not "fastest implementation available"