They are testing with a different dataset. The authors saying that they have not... | Hacker News

7thpower on Jan 29, 2025 | on: An analysis of DeepSeek's R1-Zero and R1

They are testing with a different dataset. The authors are saying that they have not tested the version of o3 that has not seen the training set.

pertymcpert on Jan 30, 2025

Yeah... the whole point is that you're testing the model on something it hasn't already seen. If the problems were in the training set, then by definition the model has seen them before.
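
To make the held-out idea concrete, here is a rough sketch of checking an evaluation set for overlap with a training set. Everything in it is an assumption for illustration: the function names and the sample problems are made up, and exact matching after normalization is only a crude check that will miss paraphrased or lightly edited problems. It is not how any particular benchmark or lab does this.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting changes don't hide a match."""
    return " ".join(text.lower().split())


def split_by_contamination(train_problems, eval_problems):
    """Split eval problems into those also present in training and those that are held out."""
    seen = {normalize(p) for p in train_problems}
    contaminated = [p for p in eval_problems if normalize(p) in seen]
    held_out = [p for p in eval_problems if normalize(p) not in seen]
    return contaminated, held_out


# Hypothetical data: the first eval problem is a reformatted copy of a training problem.
train = ["Rotate the grid 90 degrees.", "Fill the largest enclosed region."]
evals = ["rotate the grid  90 degrees.", "Mirror the pattern across the diagonal."]

contaminated, held_out = split_by_contamination(train, evals)
print("seen in training:", contaminated)
print("held out:", held_out)
```

Only the score on the held-out list says anything about how the model handles problems it has not seen.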
