Hacker News | starzmustdie's submissions
1. A minimal hackable implementation of policy gradients (GRPO, PPO, REINFORCE) (github.com/zafstojano)
1 point by starzmustdie 1 day ago
2. Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning (github.com/open-thought)
1 point by starzmustdie 7 months ago
3. Show HN: Word Game Bench – evaluating language models on word puzzles (wordgamebench.github.io)
1 point by starzmustdie on Aug 30, 2024
4. Show HN: Answers to Chip Huyen's ML Interview Questions (github.com/zafstojano)
3 points by starzmustdie on March 15, 2024
