Having worked with a greenfield project that has a significant amount of LLM output in it, I'm not sure I agree. There are all sorts of weird patterns, insufficient permission checking, weird tests that don't actually test anything, etc. It's like building a house on sand.
I've used Claude to create copies of my tests, except instead of testing feature X, they test feature Y. That has worked reasonably well, except that it has still copied tests in from somewhere else too. But the general vibe I get is that it's better at copying shit than creating it from scratch.
That's why I never have it generate any significant amount of code for those. I get juuuuuust enough to start understanding the structure and procedure, and which parts of the API docs I should even be looking at for the problem at hand, and I take it from there on my own. I need the lay of the land, not an entire architecture. I build that part myself.
This is where we as software engineers need to be on the ball - just because an LLM wrote it doesn't mean it's right, and it doesn't mean we can let go of all the checks and balances and best practices we've developed over decades.
Set up tooling like tests and linters and the like. Set rules. Mandate code reviews. I've been using LLMs to write tests and frequently catch them writing tests that don't actually have any valuable assertions. It only takes a minute to fix these.
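To give a concrete (entirely hypothetical) sketch of what that failure mode looks like: apply_discount and both tests below are invented for illustration, not taken from a real codebase. The first pytest-style test exercises the code but pins nothing down, so it passes no matter what the function returns; the second is the one-minute fix.

    # Hypothetical function under test, invented for this example.
    def apply_discount(price: float, percent: float) -> float:
        return price * (1 - percent / 100)

    # The kind of test LLMs frequently emit: it runs the code,
    # but the assertion is satisfied by almost any return value.
    def test_apply_discount_runs():
        result = apply_discount(price=100, percent=10)
        assert result is not None  # passes even if the math is wrong

    # The one-minute fix: assert the actual expected value.
    def test_apply_discount_value():
        assert apply_discount(price=100, percent=10) == 90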
> Set up tooling like tests and linters and the like. Set rules. Mandate code reviews. I've been using LLMs to write tests and frequently catch them writing tests that don't actually have any valuable assertions. It only takes a minute to fix these.
You can do all that, but it still remains a case of "I'm only interested in the final result".
When I read LLM-generated systems (not single functions), they look very ... alien to me. Even juniors don't put together systems that have this uncanny-valley feel to them.
I suppose the best way to describe it would be to say that everything lacks coherency, and if you are one of those logical-minded people who likes things that make sense, it's not fun wading through a field of Chesterton's Fences as your full-time job.
I noticed this too. Everything lacks consistency, wrapped in headings designed to make me feel good. It's uncomfortable reading one thing that seems so right, followed by code that feels wrong, while my usual instincts about why it's wrong help less because of how half-right it all is.
(But still, LLMs have helped me investigate and write code that is beyond me)
> (But still, LLMs have helped me investigate and write code that is beyond me)
They haven't done that yet[1], but they have sped things up via rubber-ducking, and with documentation (OpenSSL's documentation is very complete, very thorough, but also completely opaque).
------------------------------------
[1] I have a project in the far future where they will help me do that, though. It all depends on whether I can get additional financial security so I can dedicate some time to a new project :-(