This is so interesting, I am curious as to why, can you (or anyone) please provide any resources or insightful comments about it, they would really help a ton out here, thanks!
Gpt3 was trained on completion data so it likely saw lots of raw chess games layed out in whatever standard format moves are listed in, while 3.5 was post trained on instruct data (talking back and forth) which would have needed to explicitly include those chess games as conversational training data for it to retain as much as it would otherwise
As far as I remember, it's post-training that kills chess ability for some reason (GPT-3 wasn't post-trained).