Hacker News | Aaryan44's comments

Kayvon and Kunle are amazing - I took CS149 Parallel Programming two quarters ago and loved it :)


lucky! i took it 11 years ago.

would love to revisit the material, especially in this new era of specialized processing units and UMA.


good news - we've included optimized causal and non-causal versions of the flash attention backward pass with TK (ThunderKittens) - would love for you to check them out!

causal: https://github.com/HazyResearch/ThunderKittens/blob/main/exa...

non-causal: https://github.com/HazyResearch/ThunderKittens/blob/main/exa...


Awesome. Do you happen to have a benchmark against the latest (v9.1) cuDNN implementation?


@pama, if useful - here are utilization numbers for our attention backwards kernels (causal and non-causal, head dim = 64): https://github.com/HazyResearch/ThunderKittens/blob/main/att...


amazing work! thank you!


Thanks @lucidrains :)

