On the other hand, I've been using Claude Code for the past several months at work, on several C++ projects. It's been fine at understanding C++; it just generates a lot of boilerplate, doesn't follow DRY, and gets persnickety with tests.
I've started adding this to all of my new conversations and it seems to help:
You are a principal software engineer. I report to you. Do not modify files. Do not write prose. Only provide observations and suggestions so that I can learn from you.
My question to the LLM then follows in the next paragraph. Forgoing most of the LLM's code-writing capabilities in favor of observations and ideas seems to be a much better choice for productivity. It can still lead me down rabbit holes or in wrong directions, but at least I don't have to deal with 10 pages of prose in its output or 50 pages of ineffectual code.
Gemini & ChatGPT have not done well for me at writing or analyzing rendering code such as OpenGL, either. They're also not good at explaining many algorithms. And for some of the classical algorithms, like cascaded shadow mapping, even the human-written articles and example source code I found are wrong or incomplete.
Learning "the old ways" is certainly valuable, precisely because both the AIs and the available resources are bad at them.
I've been recently working with Opus 4.5 and GPT-5.2. Both have been unable to migrate a project from ARB shaders to OpenGL 3.3 and GLSL. And I don't mean migrating the shaders themselves, just changing all the boring glue code that tells the application to compile, use, and manage GLSL programs instead of feeding the ARB shaders directly (roughly the kind of setup sketched below).
No matter how I sliced it, I could not get a simple cube to have the shadows as described in the paper.
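For reference, the glue code I mean is roughly this; a minimal sketch of the standard GL 3.3 path, assuming a context and a function loader (GLAD, GLEW, whatever the project uses) are already set up, with error handling reduced to logging:

    // Compile one GLSL stage and log failures. Replaces the old
    // glGenProgramsARB / glBindProgramARB / glProgramStringARB path.
    #include <cstdio>

    static GLuint CompileStage(GLenum type, const char* src) {
        GLuint shader = glCreateShader(type);
        glShaderSource(shader, 1, &src, nullptr);
        glCompileShader(shader);
        GLint ok = GL_FALSE;
        glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
        if (!ok) {
            char log[1024];
            glGetShaderInfoLog(shader, sizeof(log), nullptr, log);
            std::fprintf(stderr, "shader compile failed:\n%s\n", log);
        }
        return shader;
    }

    // Link a vertex + fragment pair into a program object.
    GLuint BuildProgram(const char* vsSrc, const char* fsSrc) {
        GLuint vs = CompileStage(GL_VERTEX_SHADER, vsSrc);
        GLuint fs = CompileStage(GL_FRAGMENT_SHADER, fsSrc);
        GLuint prog = glCreateProgram();
        glAttachShader(prog, vs);
        glAttachShader(prog, fs);
        glLinkProgram(prog);
        GLint ok = GL_FALSE;
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
        if (!ok) {
            char log[1024];
            glGetProgramInfoLog(prog, sizeof(log), nullptr, log);
            std::fprintf(stderr, "program link failed:\n%s\n", log);
        }
        glDeleteShader(vs);  // flagged for deletion; freed when the program is
        glDeleteShader(fs);
        return prog;
    }

    // At draw time, glUseProgram(prog) plus glGetUniformLocation/glUniform*
    // replace the glProgramEnvParameter4fARB-style parameter plumbing.

The calls themselves are standard; the tedium in a port like this is usually rewiring uniforms and vertex attributes everywhere the old ARB parameter slots were assumed.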
I've also recently tried to get Opus 4.5 to move the Job system from Doom 3 BFG to the original codebase. Clean clone of dhewm3, pointed Opus to the BFG Job system codebase, and explained how it works. I have also fed it the Fabien Sanglard code review of the job system: https://fabiensanglard.net/doom3_bfg/threading.php
I did that because, well, I had ported this job system before and knew it was something pretty "pluggable" that could be implemented by an LLM. Both have failed. I have yet to find a model that can do this.
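For anyone who hasn't looked at one, a job system in the abstract is a small thing. This is a generic sketch of the shape (worker threads draining a queue of std::function jobs), not the idTech BFG parallel job list itself, just to show why it feels "pluggable":

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Minimal thread-pool-style job system: Submit() enqueues work,
    // a fixed set of worker threads drains the queue until shutdown.
    class JobSystem {
    public:
        explicit JobSystem(unsigned workers = std::thread::hardware_concurrency()) {
            for (unsigned i = 0; i < workers; ++i)
                threads_.emplace_back([this] { WorkerLoop(); });
        }
        ~JobSystem() {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                done_ = true;
            }
            cv_.notify_all();
            for (auto& t : threads_) t.join();
        }
        void Submit(std::function<void()> job) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                jobs_.push(std::move(job));
            }
            cv_.notify_one();
        }

    private:
        void WorkerLoop() {
            for (;;) {
                std::function<void()> job;
                {
                    std::unique_lock<std::mutex> lock(mutex_);
                    cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                    if (done_ && jobs_.empty()) return;
                    job = std::move(jobs_.front());
                    jobs_.pop();
                }
                job();  // run outside the lock so workers don't serialize
            }
        }

        std::vector<std::thread> threads_;
        std::queue<std::function<void()>> jobs_;
        std::mutex mutex_;
        std::condition_variable cv_;
        bool done_ = false;
    };

The real BFG system is more structured than this (batched job lists that get submitted and waited on as a unit), but the overall shape is the same.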
It's funny; I've also been trying to use AI to implement (simpler) shadow mapping code, and it has failed. I eventually formed a very solid understanding of the problem domain myself and achieved my goals with hand-written code.
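For what it's worth, the core of a basic shadow map is small: a depth-only pass from the light's point of view, then a depth comparison in the lighting pass. A generic GL 3.3 sketch of the depth target setup (not my actual code; error and completeness checks omitted):

    // Depth-only render target for the shadow pass. Assumes a GL 3.3 core
    // context and function loader are already initialized.
    GLuint CreateShadowTarget(GLsizei size, GLuint& depthTexOut) {
        glGenTextures(1, &depthTexOut);
        glBindTexture(GL_TEXTURE_2D, depthTexOut);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, size, size,
                     0, GL_DEPTH_COMPONENT, GL_FLOAT, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

        GLuint fbo = 0;
        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                               GL_TEXTURE_2D, depthTexOut, 0);
        glDrawBuffer(GL_NONE);   // depth only, no color attachment
        glReadBuffer(GL_NONE);
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        return fbo;
    }

    // Pass 1: bind this FBO, set the viewport to size x size, render the
    // scene with the light's view-projection matrix.
    // Pass 2: bind the default framebuffer, bind depthTexOut, and in the
    // fragment shader compare each fragment's light-space depth against the
    // stored depth (with a small bias) to decide shadowed vs lit.

The fiddly parts are usually the light-space projection bounds and the depth bias, not the GL plumbing.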
I might try to implement this paper, great find! I love this 2000-2010 stuff
In what scenarios are they terrible? I hope not every scenario. I've found Codex adequate for refactoring and unit tests. I've not used it in anger to write any significant new code.
I suppose part of the problem is that training a model on publicly available C++ isn't going to be great because syntactically broken code gets posted to the web all the time, along with suboptimal solutions. I recall a talk saying that functional languages are better for agents because the code published publicly is formally correct.
I use ChatGPT with C++, but in a very limited manner. So far it has been an overall win. I watch the code very closely, of course, and usually end up doing a few iterations (mostly optimizing for speed, reliability, and concurrency).
I use Claude to generate C++23, and it usually performs well. It takes a bit of nudging to stop it repeating itself, to get it to reuse existing functionality, and to keep it from altering huge portions without running tests. But generally it is helpful and knows what to do.
I had the same experience. The C++ often doesn't even compile, or I have to keep telling it "use C++23 features". I tried to learn OpenGL with it. That worked out, in a way, since I had to spot the errors myself :D
Have you tried to do any OpenGL or Vulkan work with it? Very frustrating.
React and HTML, though, pretty awesome.