It’s also just fluff, and straight up wrong in parts. This wasn’t checked by a human, or at least not by a human who understands enough to catch the inaccuracies. For example, for “Plan-then-execute” (which is presented as some sort of novel pattern rather than literally just how Claude Code works right out of the box) it says:
“Plan phase – The LLM generates a fixed sequence of tool calls before seeing any untrusted data
Execution phase – A controller runs that exact sequence. Tool outputs may shape parameters, but cannot change which tools run”
But of course the agent doesn’t plan an exact fixed sequence of tool calls and rigidly stick to it, as it’s going to respond to the outputs which can’t be known ahead of time. Anyone who’s watched Claude work has seen this literally every day.
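To be concrete about what the article is claiming, here's a toy sketch of "plan-then-execute" taken literally (all names here are hypothetical, not from the article or any real framework): the tool sequence is frozen at plan time, and untrusted outputs may only fill in parameters afterwards. Which is exactly what real agents don't do.

```python
# Hypothetical sketch of a literal "plan-then-execute" controller.
# Tool names are frozen at plan time; outputs of earlier steps may
# only fill in parameters of later steps, never change which tools run.
from dataclasses import dataclass, field

@dataclass
class PlannedStep:
    tool: str                                    # frozen at plan time
    params: dict = field(default_factory=dict)   # may be filled from prior outputs

def execute_plan(plan, tools):
    """Run the exact planned sequence; outputs shape params, not control flow."""
    outputs = []
    for step in plan:
        # Look up the tool by its frozen name -- no output from a
        # previous step can add, drop, or reorder steps.
        result = tools[step.tool](**step.params)
        outputs.append(result)
        # Later steps may reference this output via a placeholder string.
        placeholder = f"$output[{len(outputs) - 1}]"
        for later in plan:
            for k, v in later.params.items():
                if v == placeholder:
                    later.params[k] = result
    return outputs
```

An agent that can react to unexpected tool output (a failing test, a file that doesn't exist) by choosing a *different* next tool is, by construction, outside this model.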
This is just more slop making it to the top of HN because people out of the loop want to catch up on agents and bookmark any source that seems promising.
For me it's not really the length; I'm surprised that on HN of all places you'd think it's that. Beyond the obvious em-dashes (yes, I know people have always used em-dashes, but one every other sentence does catch the eye, right?), it's the confident, completely self-assured phrasing like
"which is why you get that "barrage of micro ideas" breaking through during focus."
This for me was the biggest tell. Unless OP gives their credentials, who talks like this? The smartest people I know in related fields go out of their way to avoid sounding like they can diagnose your exact subjective experience from so little information. The complete absence of reasonable hedging is the giveaway for me.
Mind explaining the process you tried? As someone who’s generally not had any issue getting LLMs to sort out my side projects (ofc with my active involvement as well), I really wonder what people who report these results are trying. Did you just open a chat with Claude Code and try to get a single context window to one-shot it?
So I can tell you don’t use these tools, or at least not much, because at the speed of development with them you’ll be knee deep in tech debt in a day, not a month; but as a corollary, you can have the same agentic coding tools undergo the equivalent of weeks of addressing tech debt the next day. Granted, I think this applies to greenfield, AI-first projects that work this way from the get-go and with few humans in the loop (human-to-human communication definitely becomes the rate-limiting step). But I imagine that’s not the nature of your work.
Yes, if I went hard on something greenfield I'm sure I'd be knee deep in tech debt in less than a day.
That being said, given the quality of code these things produce, I just don't see that ever stopping being the case. These things require a lot of supervision and at some point you are spending more time asking for revisions than just writing it yourself.
There's a world of difference between an MVP, which, in the right domain, you can now get done much faster, and a finished product.
I think you missed your parent post's phrase "in the specific areas _I_ work in" ... LLMs are a lot better at CRUD and boilerplate than novel hardware interfaces and a bunch of other domains.
But why would it take a month to generate significant tech debt in novel domains? It would accrue even faster there, right? The main idea I wanted to get across is that iteration speed is much faster, so what's "tech debt" on the first pass can be addressed much faster in future passes, which happen on the order of days rather than sprints in the older paradigm. Yes, the first iterations will have a bunch of issues, but if you keep your hands on the controller you can get things to a decent state quickly. One of the biggest gaps I see in devs using these tools is what they do after the first pass.
Also, even for novel domains, using tools like deep research, plus these tools' ability to straight up search the internet (including public repos) during the planning phase, is a huge level up. You should be planning before implementing, right? You're not just opening a window and asking in a few sentences for a vaguely defined final product, I hope.
If there are repos, papers, articles, etc. on your novel domain out there, there's a path to a successful research -> plan -> implement -> iterate loop imo, especially once you get better at giving the tools ways to evaluate their own results, rather than going back and forth yourself for hours telling them "no, this part is wrong, no, now this part is wrong", etc.
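The "give the tools ways to evaluate their own results" part can be sketched as a simple harness. This is a hypothetical illustration, not any specific product's API: `run_agent` and `run_tests` stand in for your coding tool and whatever machine-checkable evaluation you set up (tests, linters, benchmarks).

```python
# Rough sketch of an implement -> evaluate -> revise loop, so failures
# feed back to the agent automatically instead of via a human relay.
# `run_agent` and `run_tests` are hypothetical stand-ins.
def iterate_until_green(task, run_agent, run_tests, max_passes=5):
    feedback = None
    for _ in range(max_passes):
        patch = run_agent(task, feedback)   # implement (or revise)
        ok, report = run_tests(patch)       # machine-checkable evaluation
        if ok:
            return patch
        feedback = report                   # feed failures back, not prose
    raise RuntimeError("no passing patch within budget")
```

The point is that the revision loop runs on test reports rather than on you typing "no, this part is wrong" over and over.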
> shutting down all communications and power are our only defense against a runaway AI system
Wouldn't a centralized ability to shut down all communications and power also be one of the most vulnerable targets for a runaway AI to attack, though? Seems like a double-edged sword if I've ever seen one.
Sorry if this is obvious, but are there actually any systems that "choose one over the other"? My impression's always been it was either vision + LIDAR, or vision alone. Are there any examples of LIDAR alone?
Don't even the ones which are vision + LIDAR ultimately have to pick a priority when the two disagree: what do you do if LIDAR says the path is blocked and vision says it's clear, or vice versa? Think of the edge cases where LIDAR takes sprinkler mist for a solid object and wants to swerve around it, or where vision takes an optical illusion for a real path rather than a brick wall.
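The dilemma boils down to a policy choice when sensors disagree. A toy sketch (hypothetical names, not any real autonomy stack) of the two obvious tie-breakers:

```python
# Toy sketch of the arbitration dilemma: when lidar and vision disagree,
# *some* policy has to break the tie. Names here are hypothetical.
from enum import Enum

class Reading(Enum):
    CLEAR = "clear"
    BLOCKED = "blocked"

def conservative_fusion(lidar: Reading, vision: Reading) -> Reading:
    """Brake if either sensor reports an obstacle.
    Failure mode: phantom braking on sprinkler mist."""
    if Reading.BLOCKED in (lidar, vision):
        return Reading.BLOCKED
    return Reading.CLEAR

def vision_priority_fusion(lidar: Reading, vision: Reading) -> Reading:
    """Trust vision on disagreement.
    Failure mode: driving into an optical illusion."""
    return vision
```

Either way, one sensor's false positives or the other's false negatives win; "fusion" just moves the choice around.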
Since the current traffic infrastructure was built for human drivers with vision, you’ll probably need some form of vision to navigate today’s roads. The only way I could picture lidar only working would be on a road system specially made for machine driving.
Roomba (specifically the brand of the American company iRobot) only added lidar in 2025 [1]. The earliest Roombas navigated by touch (bumping into walls), and later ones by cameras.
But if you use "roomba" as a generic term for robot vacuum, then yes: the Chinese companies Ecovacs and Xiaomi introduced lidar-based robot vacuums in 2015 [2].
> Earliest Roombas navigated by touch (bumping into walls)
My ex got a Roomba in the early 2010s and it gave me an irrational but everlasting disdain for the company.
They kept mentioning their "proprietary algorithm" like it was some amazing futuristic thing, but watching it just bump into something and turn, bump into something else and turn, bump into something again and turn again, etc ... made me hate that thing.
Now when my dog can't find her ball and starts senselessly roaming in all the wrong directions in a panic, I call it Roomba mode.
In my experience, multi-agent orchestration frameworks usually accomplish vague, unnoticeable, or straight up worse results compared to just getting used to the vanilla tools before impulsively installing the daily flavor of "I made Claude Code better". I'm guessing you've probably noticed by now that these come out daily. That said, a look at the repo shows they do at least partly use sub-agents in the way most people are starting to realize is (currently, at least) most helpful imo: managing context bloat in the main chat. Not a fan of wishfully creating "expert" agents which amount to little more than prompts asking Claude to do a good job at the task; I'm honestly not sure why that couldn't just be a slash command at that point.
Thanks for the reply! Yeah, it was the code in the repo that prompted me to ask the question. It looks well written, with correct-looking abstractions, so it feels worth a try if you've only been using vanilla Claude Code. But trying everything is time consuming.
Friend, you are putting too much effort into debating a topic that is implicitly banned on this website. This post has already been hidden from the front page. Hacker News is openly hostile to anything that even mildly paints a handful of billionaires in a poor light. But let's continue to deify Dang as the country openly descends into madness.
I see it back now too, despite it being removed earlier. Do you have faith in the HN algo? Position 22, despite having more votes and comments and being more recent than all of the posts above it?
lol. Always fun to watch HN remove highly relevant topics from the top of the front page. To their credit, they usually give us about an hour to discuss before doing so. How kind of them.