The flip side is that if you use more source generation, the code may end up more terse/"prettier" where it matters, and you avoid the reflection hit.
AI agents seem fairly good at generating source generators, so there doesn't seem to be a reason not to use them.
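A minimal sketch of the trade-off being described, with invented names: the reflection-based path resolves members at runtime on every call, while a source generator can emit the direct-access code up front (which is also AOT/trimming friendly).

```csharp
// Hypothetical illustration: reflection-based member access vs. the kind of
// code a source generator could emit instead. All names are invented.
using System;
using System.Collections.Generic;
using System.Reflection;

public record Point(int X, int Y);

public static class Demo
{
    // Reflection: looks up properties at runtime on every call.
    public static string DumpWithReflection(object obj)
    {
        var parts = new List<string>();
        foreach (PropertyInfo p in obj.GetType().GetProperties())
            parts.Add($"{p.Name}={p.GetValue(obj)}");
        return string.Join(", ", parts);
    }

    // What a generator could emit: direct property access, no runtime lookup.
    public static string DumpGenerated(Point p) => $"X={p.X}, Y={p.Y}";
}
```

The generated version is both faster and friendlier to the reader, which is the "terse/prettier where it matters" point above.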
One thing that I have found is that the platforms are surprisingly poor at consistently implementing MCP, which is actually a pretty simple protocol.
Take Codex, for example: it does not support the MCP prompts spec[0][1], which is quite powerful because it removes a lot of friction around deploying and synchronizing SKILL.md files. It also allows customization of virtual SKILL.md files, since the markdown can be composed on the server.
It baffles me that such a simple protocol and such a powerful capability are not supported by Codex. If anyone from OpenAI is reading this, I would love to understand the reasoning behind the poor support.
I have a somewhat more practical approach here (write-up at some point): the most important thing is to rethink how you instruct the agents, and not to rely only on your existing codebase, because: 1) it may contain legacy practices, 2) it is a reflection of many hands, 3) the result becomes very random depending on which files the agent picks up.
Instead, you should approach it as if instructing the agent to write "perfect" code (whatever that means in the context of your patterns and practices, language, etc.).
How should exceptions be handled? How should parameters be named? How should telemetry and logging be added? How should new modules be added? What are the exact steps?
Do not let the agent randomly pick from your existing codebase unless it is already highly consistent; tell it exactly what "perfect" looks like.
If it's easier for a human to read and grasp, it will use less context and be less error-prone for the LLM. If the entities are better isolated, you also save context and time when making changes, since the area of effect is contained.
Clean code matters because it saves cycles and tokens.
If you're going to generate the code anyway, why not generate "pristine" code? Why would you want the agent to generate shitty code?
> Anthropic is an AI company that builds one of the most capable AI assistants in the world. Their support system is a Fin AI chatbot that can’t actually help you.
This really cuts to the reality of AI hype: no, agents are not nearly as capable as OpenAI, Anthropic, etc. need you (or rather your C-suite, itching to fire you) to believe. They really, really need you to believe the hype. How can you tell? Cases like this, and the fact that there are 5,000 open bugs, constant regressions, and ignored feature requests in the CC repo. The fact that Codex doesn't fully implement the simple, well-defined MCP spec for prompts. The fact that even CC has gaps in its MCP implementation... a spec that they created!
If the progenitors with functionally infinite tokens can't get this basic stuff right, everything else they are doing is just blowing smoke. I don't care if you can ship a kernel compiler or a janky "browser"; how about just make your software work? The smartest guys in this space, engineers making 7 figures in TC, with billions in capital, unlimited tokens, and access to the best models cannot make a simple customer support chatbot work.
But you! You're expected to deliver that customer support agent that's going to allow them to cut 500 people from payroll. You'll have it by Monday, right?
What if they built their company with poor support precisely so they don't have to be held to any standard? But other companies historically have a good reputation for customer support, and maybe AI can help them easily automate 80% of the easiest requests.
Hear me out: what if a lot of the hype they are selling you is performative marketing that they absolutely need your C-suite to believe so they can cut more headcount? Then spend a bunch of time generating piles of code that is human unmaintainable because now you're using AI code reviewers, AI testers, AI QA. Then thrash around using more tokens when it invariably causes production issues and no one can read the code anymore except for their latest and greatest models with 1m context window.
Clearly they have sales and other teams as the important people within the company, with customer services being down the pecking order.
They don't need AI to automate their customer service requests; they just need decent forms with a standard-issue helpdesk system. It takes some work to get right, but anyone with experience building customer support services will be able to do that, and put most of the customer service team out of work!
The problem is that the Law of The Instrument applies:
> It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.
So we have some AI 'hammer' going on here, and it is the wrong tool.
At a guess, 80% of the customer service requests are going to be billing related, with some need to provide refunds or free credits. Get the form right so it shows the right boxes and these 'easy wins' can show up as a big list that a customer service person has to glance over before hitting the 'refund everyone' button. You need the human there to take responsibility, plus they can work on the 20% of other tickets, once they have spent ten minutes clearing down the refunds/extra credits requests.
Google don't sell much to end customers, therefore no support. If I search Google for how to remove fonts from my computer that are not latin, and their AI bot gives me an answer that zaps my whole computer, I can't complain and ask for a refund because I never paid anything in the first place. Google do not need to speak to a single customer.
Meanwhile, Anthropic has a commercial product with billing. They prefer not to do customer service, but that is stupid. Every contact with a customer, and every friendly customer service interaction, is an opportunity to sell more or at least to not have them hate you. This is why companies should do customer service; however, they also need to put CS at the heart of the org chart and acknowledge that a well-run CS department raises revenue and is not a cost.
Those are already automated by making your first question "Did you plug it in?", followed by "Did you actually plug it in?". Or industry equivalent. It's not like there wasn't any research into this in the past century.
It's really a bit fascinating. I've had Claude one-shot complex functionality... and I've had it be unable to debug its own .mcp.json file effectively.
Agents are very capable. Their implementation matters. I doubt many support agents have access to editing user records, so even if they can accept responsibility they won't be able to make any radical changes to your account to fix those.
It's not an AI problem per se; it's a product problem.
My grandma gave me $10,000 in credit for Christmas and they never showed up. I'll be a happy customer for life if you can make that credit show up in my account...
It only has a ~1 in 20,000 chance of working but at scale it'll go through!
I can see where he's coming from. For example, `dynamic` was initially introduced to support COM interop when Office add-in functionality was introduced. Should I use it in my web API? I can, but I probably shouldn't.
`.ConfigureAwait(bool)` is another where it is relevant, but only in some contexts.
This is precisely because the language itself operates in many runtime scenarios.
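A minimal sketch of both points, with invented names: `dynamic` defers member binding to runtime via the DLR (the right tool for COM interop, rarely for a typed web API), and `ConfigureAwait(false)` only matters where a `SynchronizationContext` is in play.

```csharp
// Sketch with invented names. `dynamic` resolves members at runtime;
// ConfigureAwait controls whether an await resumes on the captured context.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ContextDemo
{
    private static readonly HttpClient Http = new();

    public static void DynamicBinding()
    {
        dynamic d = "hello";
        int len = d.Length; // bound at runtime; a typo compiles, then throws
        Console.WriteLine(len);
    }

    // Library code: resuming on the caller's context is unnecessary, and
    // opting out avoids deadlocks if a UI caller blocks on the task.
    public static async Task<string> FetchAsync(string url)
    {
        var response = await Http.GetAsync(url).ConfigureAwait(false);
        return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
    }

    // In a UI event handler (WinForms/WPF) you want the *default*,
    // context-capturing await, so the code after it can touch UI controls:
    //   var body = await FetchAsync("https://example.com");
    //   label.Text = body; // safe: resumed on the UI thread
}
```

On a thread-pool thread (e.g. typical ASP.NET Core request processing) there is no context to capture, which is why the relevance is entirely scenario-dependent.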
I guess that's a good point. I admit I haven't used or seen `dynamic` in so long that I completely forgot about it.
But I'm not sure that's really a problem. Does the OP expect everyone to use an entirely different language in every single context? I have web applications and desktop applications that interact with Office and share common code.
Even `dynamic` is pretty nice as far as weird dynamic language features are concerned.
Interestingly enough `.ConfigureAwait(bool)` is entirely the opposite of `dynamic` -- it's not a language feature at all but instead a library call. I could argue that might instead be better as a keyword.
> That’s correct, most of ASP.NET Core doesn’t use ConfigureAwait(false) and that was an explicit decision because it was deemed unnecessary. There are places where it is used though, like calls to bootstrap ASP.NET Core (using the host) so that scenarios you mention work. If you were to host ASP.NET Core in a WinForms or WPF application, you would end up calling StartAsync from the UI thread and that would do the right thing and use ConfigureAwait(false) internally. Request processing on the other hand is dispatching to the thread pool so unless some other component explicitly set a SynchronizationContext, requests are running on thread pool threads.
>
> Blazor on the other hand does have a SynchronizationContext when running inside of a Blazor component.
So I bring this up as a case of how supporting multiple platforms and runtime scenarios does indeed add some layer of complexity.
> It is a library call, but one that is tied to the behavior of a language feature (async/await).
This is a good example of C#'s light touch on language design. Async/await creates a state machine out of your methods, but that's all it does. The language itself delegates entirely to the platform/framework for the implementation. You can swap in your own implementation (just as is possible with this union feature).
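A small illustration of that delegation, with invented names: the compiler doesn't hard-code `Task`; anything exposing the awaitable pattern (a `GetAwaiter()` whose result has `IsCompleted`, `OnCompleted`, and `GetResult`) can be awaited.

```csharp
// Sketch of the "light touch": a custom awaitable type. The compiler only
// builds the state machine and binds to this pattern by shape, not by type.
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public readonly struct Delay
{
    private readonly int _ms;
    public Delay(int ms) => _ms = ms;

    // Looked up by pattern; here we simply reuse Task's awaiter.
    public TaskAwaiter GetAwaiter() => Task.Delay(_ms).GetAwaiter();
}

public static class Program
{
    public static async Task Main()
    {
        await new Delay(10); // awaiting a custom type; no compiler changes needed
        Console.WriteLine("done");
    }
}
```

This is the same mechanism that lets frameworks define their own awaitables (and their own `SynchronizationContext` behavior) without any language change.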
> So I bring this up as a case of how supporting multiple platforms and runtime scenarios does indeed add some layer of complexity.
I agree that's true. A language that doesn't support multiple platforms and runtime scenarios can, indeed, be simpler. However that doesn't make the task simpler -- now you just have to use different languages entirely with potentially different semantics. If your task is just one platform and one runtime scenario, the mental cost here is still low. You don't actually need to know those other details.
> This is a good example of C# light-touch on language design.
Is it? F# code doesn't even need ConfigureAwait(false): one simply uses backgroundTask{} instead of task{} to ignore SynchronizationContext.Current. This didn't require any language design changes at all (both are computation expressions), but it would for C#, precisely because C# delegates this choice to the framework.
dynamic was also added as part of the DLR, initially designed for IronPython and IronRuby support.
This inspired the invokedynamic bytecode in the JVM, which has brought many benefits and much more use than the original .NET features, e.g. how lambdas get generated.
This is the general pattern of how the C# team operates, IME.
"Never let perfect be the enemy of good"
Very much what I've seen from them over the years as they iterate on and improve features and propagate them through the platform. AOT is an example: they ship the feature first and then incrementally move first-party packages over to support it. Runtime `async` is another example.
In the meantime I still haven't done any project with nullable reference types, because the ecosystem has yet to catch up. The same applies to ValueTask for async code.
Which part of the ecosystem is blocking your projects from using nullable references? I find them very helpful, but the projects were all newer or migrated to new SDK.
But now you can only call methods that are available on both T and IEnumerable<T>; you have no way of knowing which it actually is. (You would know if these were sum types.)
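A minimal sketch of the problem, with invented names: an API that returns "either a T or an IEnumerable<T>" typed as their only common supertype (`object`) loses the distinction until runtime, and nothing forces the caller to handle both cases.

```csharp
// Hypothetical API illustrating the loss of type information. With a closed
// union (int | IEnumerable<int>), the compiler could check exhaustiveness;
// with object, it cannot.
using System;
using System.Collections.Generic;
using System.Linq;

public static class Search
{
    // One match returns the item; otherwise the whole sequence of matches.
    public static object Find(IEnumerable<int> xs, Func<int, bool> pred)
    {
        var hits = xs.Where(pred).ToList();
        return hits.Count == 1 ? (object)hits[0] : hits;
    }
}

public static class Program
{
    public static void Main()
    {
        var result = Search.Find(new[] { 1, 2, 3 }, x => x > 1);
        // Caller must pattern-match at runtime; forgetting a case compiles fine.
        var text = result switch
        {
            int one => $"one: {one}",
            IEnumerable<int> many => $"many: {many.Count()}",
            _ => "the compiler can't prove this arm is unreachable"
        };
        Console.WriteLine(text);
    }
}
```

With sum types, the `_` arm (and the possibility of returning anything else) would disappear entirely.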