> But on the nuclear issue, it's not a good sign that he's explicitly saying that this AGI future is a threat to nuclear deterrence and the triad. Like, where do you go up from there? That's the highest level of alarm that any government can have. This isn't a boy crying wolf, it's the loudest klaxon you can possibly make.
This is not new. Anthropic has raised these concerns in the system cards for previous versions of Opus/Sonnet. Maybe in somewhat drier terms, and buried in a 100+ page PDF, but they have raised the risk of either:
a) a small group of bad actors with access to frontier models and the technical know-how (both how to get an LLM to bypass its restrictions and how to make and source weapons) turning that into dirty bombs / small nuclear devices, plus where to deploy them.
b) the bigger, more sci-fi threat of a fleet of agents going rogue, perhaps on the orders of a nation state, to do the same.
I think option (a) is much more frightening and likely. Option (b) makes for better sci-fi thrillers, though it could still happen in 5-30ish(??) years.
Also, many CLIs act differently when attached to a terminal (TUI/interactive) versus not. So you'd run into issues there, where Claude could only exercise the non-interactive paths.
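A minimal sketch of why that happens: many CLIs branch on whether stdout is a TTY, so output captured by an agent looks nothing like what a human sees (Ruby here, but the pattern is the same in any language):

```ruby
# The same CLI behaves differently depending on whether stdout
# is attached to a terminal or is being piped/captured.
if $stdout.tty?
  # Interactive: colors, and possibly prompts or progress bars.
  puts "\e[32mready\e[0m"
else
  # Non-interactive (what an agent sees): plain output, no prompting.
  puts "ready"
end
```

Run it directly and the output is green; pipe it through `cat` and it's plain, and only the plain side is what Claude can test.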
Is there a term for AI-fueled dev psychosis? "AI architecture astronaut"? There should be one if not. Or maybe just AI-fueled hucksterism...
I recognize 100% that a tool to manage AI agents with long-term context tracking is going to be a big thing. Many folks have written versions of this already. But mashing together the complexity of k8s with a hodgepodge of LOTR and Mad Max references is not it.
It's like the complexity of J2EE combined with AI-fueled solipsism and a microdosing mushroom regimen gone off the rails. What even are all the layers of abstraction here? And to build what? What actual apps or systems has this thing built? AFAICT it has built Gas Town, and nothing else. Not surprising that it has eaten its own tail.
The amount of jargon, AI art, pop-culture references, and excessive complexity going on here is truly amazing, and I would assume it's satire if I didn't know Yegge's style and previous writings. It's like someone looked at the pile of overlapping and confusing tools Anthropic has released around Claude Code and said, "Hold my beer, hand me 3 Red Bulls and a shot of espresso, I can top that!"
I do think a friend of mine nailed it though with this quote:
"This whole "I'm using agents to write so much software" building-in-public trend, but without actually showing what they built, reminds me of the people selling courses on stock trading or drop shipping."
The number of get-rich-quick schemes around any new tech is boundless. As Yegge himself points out toward the end of the post, you'd be surprised what you can pull off with a ridiculous blog post, a big-tech reputation, and excessive-LOC dev tools in a hype-driven market. How could it be wrong if it aligns so closely with so many CEOs' dreams?
The performance of hardware today is even more mind-boggling when compared to what most people (SRE managers, devs, CTOs) are willing to pay for in cloud compute.
Even more so in the context of dev 'remote workstations'. I benchmarked AWS instances that were at least 5x slower than an average M1 MacBook and easily cost hundreds of dollars per dev per month, while the MacBook was a sunk cost!
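For rough, purely illustrative numbers (not quoted pricing): a cloud workstation at $1-2/hour, 8 hours a day, 22 working days a month, comes to roughly $175-350 per dev per month, every month; the laptop that outruns it was paid for once.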
These exist, typically made by Panasonic or Sony, and cost upwards of 20k USD. HDTVTest has compared them to the top consumer OLED TVs in the past. Film studios use the reference models for their editing and mastering work.
Sony specifically targets the reference monitors with the final calibration on their top TVs, assuming you are in Cinema or Dolby Vision mode, or whatever they call it this year.
_low_type_ is still early days, but I think this approach is clearly the future of Ruby typing. If it gets baked into the language, with full "compile"-time support and minimal performance impact, it will be amazing: https://github.com/low-rb/low_type
Previously, RBS-inline was the closest answer to typed Ruby; it was the JSDoc of Ruby. When I recently stumbled upon low_type and tried it out in irb, it finally felt like "this is it, this is the TS of Ruby", with runtime validation to boot.
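For contrast with what makes this feel ergonomic: classic RBS keeps signatures in a separate sig file, divorced from the code it describes (a minimal illustrative sketch, not from any particular project):

```rbs
# sig/user.rbs -- signatures live apart from the implementation,
# roughly the way .d.ts files sit beside JavaScript in TypeScript land.
class User
  def initialize: (name: String) -> void
  def greet: () -> String
end
```

RBS-inline at least moves these annotations into comments next to the methods, which is what earned it the JSDoc comparison above.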
I like it and it deserves attention, especially for those seeking typed Ruby. With this you can finally experience it, and the syntax feels more ergonomic than Sorbet's.
It is definitely better than RBS and Sorbet. But unless GitHub, 37signals, or Shopify decides to use it, it is highly unlikely that Ruby Core will consider it.
Of the three, I think Shopify is the most likely. There may be additional usefulness in terms of ZJIT.
Yeah, this is very real, and I think it can inflict paralysis on programmers with a certain level of experience and 'I know better' syndrome. Or even an 'it _might_ be better' type of syndrome.
Sometimes, you might really know better, and it doesn't matter. You build the thing with the wrong tools, with a crummy framework, with a product at the end that will probably not succeed. But that is okay, hopefully you learn something and your team and your org learn something.
And if not, that is okay; sometimes it's just a job, and you need a paycheck and a place to be from 9 to 5.
This is why I love the bootstrapping stories here on HN.
Like one anecdote where they were building an "app" for automatic hotel reservations IIRC.
The "app" was a form that fed into a Google Sheet, where the founders themselves took the data and called the hotels.
When they got some income, they automated small bits of the process one by one.
Sometimes it's good to just have _something_ out there and see if there's a market for it before carefully crafting a beautiful piece of software nobody will ever use. It all depends on whether you're doing it to solve a problem for someone or for the process of writing code. Both are perfectly valid reasons.
It's totally fine to prototype, but you need to take care when you try to morph a prototype into a real product.
Very often people just take the shortest path from A to B, every single time. So you start with a reasonably shoddy prototype, but then you add some small feature, repeat 1000 times, and now you still have a shoddy prototype, except it's a pretty big project, and it's all completely cursed because at no point did anyone do any actual software engineering. And of course it's now too big to rewrite or fix, so the only way forward is to keep building on this completely broken mess.
At some point you need to scrap the prototype and replace it with something proper, or at least have a solid plan for how you're going to morph the prototype into something proper. This is often challenging and time-consuming work, so a lot of developers never really do it. They just execute the shortest path to get each new feature implemented, over and over, for years, while the number of bugs keeps increasing and velocity keeps decreasing, because nothing makes sense, everything is more difficult than it should be, etc.
But... you have to give the MCP server the creds somehow. Maybe it's via a file on disk (bad), maybe via an env var (less bad). Maybe you do it via a password CLI that you biometrically auth to, which involves a timeout of some sort for security, but that often means you can't leave an agent unattended.
In any case, how is any of this better than a CLI? CLIs have the same access models and tradeoffs, and a persistent agent will plumb the depths of your file system and environment to find a token to do a thing if your prompt was “do a thing, use tool/mcp/cli”.
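To make the tradeoff concrete, here's a hedged sketch of the "less bad" env-var route, assuming 1Password's `op` CLI; the vault path and the stdio server binary (`my-mcp-server`) are made up for illustration:

```ruby
# Fetch a token from a password-manager CLI at launch time, so it never
# sits in a dotfile. The op:// path below is illustrative.
token = `op read "op://dev/github/token"`.strip

# Inject it into the child process's environment only -- not the parent
# shell, not a file on disk. Still "less bad" rather than good: an agent
# running inside that process tree can read it right back out of `env`.
system({ "GITHUB_TOKEN" => token }, "my-mcp-server", "--stdio")
```

Which is the point: the CLI version of this launch is identical, so the MCP indirection buys you nothing on the credential front.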
I think the answer is that it doesn't fit any definition of a _good_ monitoring stack, but we are stuck with it. It has largely become the blessed protocol, specification, and standard for OSS monitoring along every axis (logging, tracing, collection, instrumentation, etc.)... it's a bit like the efforts that resulted in J2EE and EJBs back in the day, only more diffuse and with more varied implementations.
And we don't really have a simpler alternative in sight... at least in the Java days there was disgust and a reaction, via Struts, Spring, EJB3+, and of course other languages and communities.
Not sure exactly how we got into such an over-engineered monoculture in operations, monitoring, and deployment for 80%+ of the industry (k8s + Grafana/Loki/Tempo + endless supporting tools or flavors), but it is a really sad state.
Then you have endless implementations handling bits and pieces of various parts of the spec, and of course the tools to actually ingest, analyze, and report on it all.