This causes it to get the question wrong for me, when testing, and only if I manually prompt normal CoT does it get it right.
Is there any papers showing a merit to this approach? It seems extremely counter-intuitive.
This causes it to get the question wrong for me, when testing, and only if I manually prompt normal CoT does it get it right.
Is there any papers showing a merit to this approach? It seems extremely counter-intuitive.