o3 can now play GeoGuessr better than master-level human players. It can also beat master-level Codeforces competitive programmers. I wouldn't discount the ability of AI to make sense of images far better than humans possibly could, all the while beating them at logical thinking, especially in a restricted domain like driving.
AI isn’t magic. If there isn’t enough information in the inputs, you can’t expect reliable results. It’s the same principle in all of software: garbage in, garbage out.
If there simply isn’t enough visual information, vision-only will fail.
That is not a debunking... That's someone running a similar experiment and getting a different result. That would debunk the claim that Teslas can never detect a painted wall. It does not debunk the claim that Teslas will sometimes fail to detect a painted wall.
And in a safety-critical system, the distinction is not mere pedantry.
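To put a rough number on that distinction, here is a minimal sketch. It assumes every test run is an independent trial with the same failure probability, which is an idealization and not something anyone in this thread has measured. The function name is hypothetical, made up for illustration.

```python
def failure_rate_upper_bound(clean_trials: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-trial failure probability after observing
    `clean_trials` successes and zero failures.

    Assumes independent, identically distributed trials (an idealization).
    """
    if clean_trials < 1:
        raise ValueError("need at least one trial")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # The chance of n clean runs in a row is (1 - p) ** n.
    # Solve (1 - p) ** n = 1 - confidence for p to get the bound.
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_trials)


if __name__ == "__main__":
    for n in (1, 10, 100, 300):
        print(n, failure_rate_upper_bound(n))
```

Under those assumptions, a single clean run only bounds the failure rate at 95%, so it rules out almost nothing; it takes roughly 300 clean runs to push the bound below 1%.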
Theoretically, if there's not enough visual information for AI drivers, then there's not enough visual information for human drivers, and that's a problem with the road. (To be sure, roads like this do occasionally exist: e.g. merging onto a higher-speed thoroughfare from a lower level, with a very short distance between "where you're in a position to see the merging traffic (and not that much of it)" and "where the roads have fully merged (and there's no shoulder)".)