Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The example code in the article left out the lookup table to convert each character into a single bit representation. All the tricky xor and popcount stuff could have been a 26 byte array just as easily and been O(n).


There's no lookup table. The example code in the article does the conversion using this expression:

    1 << (s[i + j] as u32 - 'a' as u32)


Yes, for some additional explanation for those that might use it, that's "ASCII math": it assumes all lowercase ASCII Latin letters (which most Unicode encodings including UTF-8 and UTF-32 inherit). ASCII was intentionally designed so that the lowercase letters a-z are in English alphabet order and so subtracting 'a' from the letter gives you an alphabet position number from 0-25. (The same applies in ASCII for the upper case range A-Z and the numerical range 0-9, though doing math between ranges is less fun.)

Then you've just got a standard single 1 left shifted 0-25 positions. (So 'a' is 1 and 'z' is 1*2^25.)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: