Formal proofs have so much potential in this context
We're at the point of diminishing returns from scaling, and RL is the only way to see meaningful improvements
Very hard to improve much via RL unless there's some way to tell if the code works without requiring compilation
Logic-based languages like Prolog take this to the logical extreme, would love to see people revisit that idea
Whereas the benchmark gains seen with new OpenAI, Grok and Claude models don't feel accompanied by any vibe improvement
Absurd to say DeepSeek is CCP-controlled while ignoring the govt connection here
There has never been a shred of evidence from security researchers, model analysis, benchmarks, etc. that supports this.
It's a complete delusion in every sense.