starzmustdie's submissions

1.		A minimal hackable implementation of policy gradients (GRPO, PPO, REINFORCE) (github.com/zafstojano)
		1 point by starzmustdie 1 day ago
2.		Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning (github.com/open-thought)
		1 point by starzmustdie 7 months ago
3.		Show HN: Word Game Bench – evaluating language models on word puzzles (wordgamebench.github.io)
		1 point by starzmustdie on Aug 30, 2024
4.		Show HN: Answers to Chip Huyen's ML Interview Questions (github.com/zafstojano)
		3 points by starzmustdie on March 15, 2024