Hacker News | njkumarr's comments

demo: https://flappybird.njkumar.com/

blogpost: https://njkumar.com/optimizing-flappy-bird-world-model-to-ru...

I finally got some time to put some development into this: I optimized a Flappy Bird diffusion world model to run at around 30 FPS on my MacBook, and around 12-15 FPS on my iPhone 14 Pro. More details about the optimization experiments are in the blog post above; surprisingly, I trained this model on just a couple hours of Flappy Bird gameplay data and 3-4 days of training on a rented A100.
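For context, the stated frame rates translate directly into a per-frame inference budget. This is just arithmetic derived from the FPS figures above (using 12 FPS as the lower bound quoted for the iPhone):

```python
# Frame-time budget implied by the reported frame rates.
for name, fps in [("MacBook", 30), ("iPhone 14 Pro", 12)]:
    budget_ms = 1000 / fps  # ms available for one full model step per frame
    print(f"{name}: {budget_ms:.1f} ms per frame")
# MacBook: 33.3 ms per frame
# iPhone 14 Pro: 83.3 ms per frame
```

So every optimization has to fit the whole denoise-and-decode step inside roughly a 33 ms window on the laptop.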

World models are definitely going to be really popular in the future, but I think there should be more accessible ways to distribute and run these models, especially as inference becomes more expensive, which is why I went for an on-device approach.

Let me know what you guys think!


Thank you for taking the time to read my article!

For your 2nd point: to clarify, I actually generate 300 new tokens on top of that initial prompt, not just use the short prompt, so with precomputation of the prompt plus token generation it comes out to about 306 tokens.
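The accounting can be sketched like this; the 6-token prompt length is my own inference from 306 − 300, and `generate` is a hypothetical stand-in for prefill plus autoregressive decode, not the actual model code:

```python
def generate(prompt_tokens, n_new=300):
    # Prefill: one pass precomputes state (e.g. a KV cache) for the prompt.
    processed = list(prompt_tokens)   # stand-in for cached prompt tokens
    # Decode: one forward pass per newly generated token.
    for _ in range(n_new):
        processed.append(0)           # placeholder "sampled" token
    return processed

out = generate([0] * 6)               # assumed ~6-token prompt
print(len(out))                       # 306 tokens processed in total
```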

For your 1st and 3rd points, you are definitely correct. Looking back, I probably should have focused on using the torch profiler to track the point where my CPU overhead started to decrease, in order to better assess compute-bound regions in my workflow, rather than doing napkin math on A100 specs.
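For anyone curious, a minimal sketch of that profiling approach with `torch.profiler` (the `Linear` layer is just a toy stand-in for the real model, and this runs CPU-only so it works anywhere):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(512, 512)  # toy stand-in for the diffusion model
x = torch.randn(64, 512)

# Profile a few forward passes and aggregate per-op statistics.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(10):
        y = model(x)

# Ops dominated by dispatch/copy time indicate CPU overhead; ops dominated
# by actual kernel time indicate compute-bound regions.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

On GPU you would add `ProfilerActivity.CUDA` to `activities` and sort by `cuda_time_total` instead.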


lol


i agree, it seems like a pretty insecure thing to say. he is currently an undergrad, so he probably doesn't realize master's programs can actually help someone's career.


hackernews is usually pretty negative and runs with its narrative


this is so dope


agreed

