
"Vigilante Road Repair" - cool band name - I call it!

And cameras inside the office

Ah, a British convergence! That phrase always makes me think of this now (from the Vicar of Dibley): https://youtu.be/37ficiqoE6U

RIP Emma Chambers


"Lemon's attorney, Abbe Lowell, said last week that a magistrate judge rejected charges against Lemon. A source told ABC News that Bondi last week was "enraged" at the magistrate judge's decision to not charge the journalist."

And then governments use this data, but can wash their hands of it saying "we didn't collect it"

> then governments use this data, but can wash their hands of it saying "we didn't collect it"

These are CMS and HHS data. The government literally collected it. On government forms.

This thread is Exhibit A for how the tech-privacy community so often trips itself up. We have abuse of government data at hand. It’s clear. It’s sharp. Nobody denies the government has the data, how they got the data or how they’re using it.

So instead we go into parallel construction and advertising dragnets and a bunch of stuff that isn’t clear cut, isn’t relevant, but is someone’s pet itch that has to be scratched.


Yes, retroactively manufactured cause for a warrant to find only the information you want.

Also, don't forget that profit maximization means selling to the highest bidder, which might not be US govt. Certainly, there is means, motive, and opportunity for individuals with access to sell this info to geopolitical adversaries, and it is BY FAR the easiest way for adversaries to acquire it.

It has happened before and it will happen again.


It means selling to all bidders, since it's information and not a tangible asset.

They've stopped obtaining warrants. ICE claims they can enter homes forcefully without a judge-signed warrant. Judges have released at least one victim seized this way.

Can you provide a news link to this? As I understand it, courts have historically followed the precedent that “you can’t suppress the body”, meaning even if the method of an arrest is illegal, you don’t have to let the person go if their arrest is otherwise valid.


I wasn’t clear. I’m asking for a news link showing that judges have released folks whose arrests were otherwise valid but whose means of arrest were invalid.

They didn't have a valid warrant. Without a judge's review, they broke down his door and entered, armed, and abducted him.

I understand, but do you have a news link to where the judge released him?


Isn't that moving the goalposts? The comment you were asking under specifically said these arrests were made without obtaining a warrant.

ICE uses administrative warrants; and while administrative warrants do not allow for seizures inside a home, see my comment about the legal argument of “you can’t suppress the body” for why there’s not a whole lot that can be done if they do decide to kick down your door. The latest Serious Trouble podcast goes into this at the 12 minute mark. https://www.serioustrouble.show/p/120-days

In this case the story didn’t make it clear whether or not they even had an administrative warrant. I’d be interested to find out if they did.


This statement is true. If you are downvoting because it is incorrect, I'd appreciate an explicit correction. Other posters provided links in this thread.

* https://www.wired.com/story/us-judge-rules-ice-raids-require...

* https://www.minnpost.com/metro/2026/01/judge-orders-release-...

> A federal judge in Minnesota on Thursday ordered the release of a Liberian man four days after heavily armed immigration agents broke into his home using a battering ram and arrested him.

> U.S. District Judge Jeffrey Bryan said in his ruling that the agents violated Garrison Gibson’s Fourth Amendment rights against unlawful search and seizure.


The ironic thing is that Palantir has been operationalizing data gathered by the NSA and reselling it as "AI targeting" to another country's military. But yes, usually the loophole goes the other way.

Maybe what we're really seeing now though is the feedback loop, the information laundering industrial complex that is the surveillance economy.


Source? My understanding was that palantir didn't take ownership of data themselves but rather came in and set up a new system for the org to use.

"Allow us to use your data to improve our service." ...by selling your data to improve our service's profitability.

Dreamhost

Some quotes from the article stand out:

"Claude after working for some time seem to always stop to recap things"

Question: Were you running out of context? That's why certain frameworks like intentional compaction are being worked on. Large codebases have specific needs when working with an LLM.

"I've never interacted with Rust in my life"

:-/

How is this a good idea? How can I trust the generated code?


The author says that he runs both the reference implementation and the new Rust implementation through 2 million (!) randomly generated battles and flags every battle where the results don't line up.

This is the key to the whole thing in my opinion.

If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.
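
To make that concrete, the check itself is conceptually tiny. A minimal differential-testing sketch in Rust, with made-up function names (this illustrates the idea, not the author's actual harness, which presumably drives the TypeScript reference separately):

    // Differential-testing sketch: run both implementations on the same seed
    // and record every seed that diverges, so each failure stays replayable.
    fn reference_battle(seed: u64) -> String {
        // Stand-in for invoking the reference implementation with this seed.
        format!("outcome-for-{}", seed)
    }

    fn ported_battle(seed: u64) -> String {
        // Stand-in for the new Rust implementation under test.
        format!("outcome-for-{}", seed)
    }

    fn main() {
        let total = 2_000_000u64;
        let mut mismatches = Vec::new();
        for seed in 0..total {
            if reference_battle(seed) != ported_battle(seed) {
                mismatches.push(seed);
            }
        }
        println!(
            "pass rate: {:.4}%",
            100.0 * (total - mismatches.len() as u64) as f64 / total as f64
        );
    }

The saved seeds then double as a regression suite: once a divergence is fixed, the same seed can be replayed forever.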


Fuzzing handles the logic verification, but I'd be more worried about the architectural debt of mapping GC patterns to Rust. You often end up with a mess of Arc/Mutex wrappers and cloning just to satisfy the borrow checker, which defeats the purpose of the port.

That will vary depending on how the code is architected to begin with, and the problem domain. Single-ownership patterns can be refactored into Rust ownership, and a good AI model might be able to spot them even when not explicitly marked in the code.

For some problems dealing with complex general graphs, you may even find it best to use a Rust-based general GC solution, especially if it can be based on fast concurrent GC.
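
As a toy contrast (these types are made up for illustration, not taken from the port): a mechanical translation of a GC'd object graph reaches for Arc<Mutex<...>> handles everywhere, while a single-ownership refactor lets the container own its members and hand out borrows, which is usually what the original code meant anyway.

    #![allow(dead_code)]
    use std::sync::{Arc, Mutex};

    // Mechanical translation of a GC'd object graph: shared mutable handles,
    // plus locking even where no concurrency exists.
    struct PokemonShared {
        hp: u32,
    }
    struct TeamShared {
        members: Vec<Arc<Mutex<PokemonShared>>>,
    }

    // Single-ownership version: the team owns its members outright and
    // callers borrow them.
    struct Pokemon {
        hp: u32,
    }
    struct Team {
        members: Vec<Pokemon>,
    }

    impl Team {
        fn heal_all(&mut self, amount: u32) {
            for p in &mut self.members {
                p.hp += amount;
            }
        }
    }

    fn main() {
        let mut team = Team {
            members: vec![Pokemon { hp: 50 }, Pokemon { hp: 70 }],
        };
        team.heal_all(10);
        assert_eq!(team.members[0].hp, 60);
    }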


Yeah and he claims a pass rate of 99.96%. At that point you might be running into bugs in the original implementation.

Not really. Due to combinatorial explosion, some paths are hard to hit randomly in this kind of source code. I would rather see 99% code coverage of the reference implementation after 2M random battles than a 99% pass rate.

I don't know anything about Pokemon, but I briefly looked at the code. "weather" seemed like a self contained thing I could potentially understand. Looking at https://github.com/vjeux/pokemon-showdown-rs/blob/master/src...

> NOTE: ignoringAbility() and abilityState.ending not fully implemented

So it is almost certain that even with a 99.96% pass rate, the testing never hit a battle with a weather-suppressing Pokemon whose ability was being ignored. A code-coverage-driven testing loop would have found and fixed this one easily.


Good catch. I should really look at the code before commenting on it.

I'm very skeptical, but this is also something that's easy to compare using the original as a reference implementation, right? Providing lots of random input and fixing any disparities is a classic approach for rewriting/porting a system.

This only works up to a certain point. Given that the author openly admits they don't know/understand Rust, there is a really high likelihood that the LLM made all kinds of mistakes that would be avoided, and the dev is going to be left flailing about trying to understand why they happen/what's causing them/etc. A hand-rewrite would've actually taught the author a lot of very useful things I'm guessing.

It seems like they have something like differential fuzzing to guarantee identical behavior to the original, but they still are left with a codebase they cannot read...

Hopefully they have a test suite written by QA, otherwise they're for sure going to have a buggy mess on their hands. People need to learn that if you must rewrite something (often you don't actually need to), then an incremental approach is best.

1 month of Claude Code would be an incremental approach

It would honestly try to one-shot the whole conversion in a 30-minute autonomous session.


> often you don't actually need to

Feels like this one is always a mistake that needs to be made for the lesson to be learned.


At this point it seems pretty clear that all projects ported from Ruby to Python, then Python to Typescript, must now be ported to Rust. It will solve almost all problems of the tech industry…

His goal was to get a faster oracle that encoded the behavior of Pokemon that he could use for a different training project. So this project provides that without needing to be maintainable or understandable itself.

Back of the envelope, they'll need to use this on the order of a billion times to break even, under the (laughable) assumption that running Claude Code uses compute comparable to the computer he's running his code on. So more like hundreds of billions or trillions, I'd guess.

I think it could work if they have tests with good coverage, like the "test farm" described by someone who worked at Oracle.

My answer to this is to often get the LLMs to do multiple rounds of code review (depending on the criticality of the code, doing reviews on every commit; but this was clearly a zero-impact hobby project).

They are remarkably good at catching things, especially if you do it every commit.


> My answer to this is to often get the LLMs to do multiple rounds of code review

So I am supposed to trust the machine, that I know I cannot trust to write the initial code correctly, to somehow do the review correctly? Possibly multiple times? Without making NEW mistakes in the review process?

Sorry no sorry, but that sounds like trying to clean a dirty floor by rubbing more dirt over it.


It sounds to me like you may not have used a lot of these tools yet, because your response sounds like pushback around theoreticals.

Please try the tools (especially either Claude Code with Opus 4.5, or OpenAI Codex 5.2). Not at all saying they're perfect, but they are much better than you currently think they might be (judging by your statements).

AI code reviews are already quite good, and are only going to get better.


Why is the go-to always "you must not have used it" in lieu of the much more likely experience of having already seen and rejected first-hand the slop that it churns out? Synthetic benchmarks can rise all they want; Opus 4.5 is still completely useless at all but the most trivial F# code and, in more mainstream affairs, continues to choke even on basic ASP.NET Core configuration.

About a year ago they sucked at writing elixir code.

Now I use them to write nearly 100% of my elixir code.

My point isn’t a static “you haven’t tried them”. My point is, “try them every 2-3 months and watch the improvements, otherwise your info is outdated”


> It sounds to me like you may not have used a lot of these tools yet

And this is more and more becoming the default answer I get whenever I point out obvious flaws of LLM coding tools.

Did it occur to you that I know these flaws precisely because I work a lot with, and evaluate the performance of, LLM based coding tools? Also, we're almost 4y into the alleged "AI Boom" now. It's pretty safe to assume that almost everyone in a development capacity has spent at least some effort evaluating how these tools do. At this point, stating "you're using it wrong" is like assuming that people in 2010 didn't know which way to hold a smartphone.

Sorry no sorry, but when every criticism of a tool elicits the response that people are not using it well, then maybe, just maybe, the flaw is not with all those people, but with the tool itself.


Spending 4 years evaluating something that’s changing every month means almost nothing, sorry.

Almost every post exalting these models’ capabilities talks about how good they’ve gotten since November 2025. That’s barely 90 days ago.

So it’s not about “you’re doing it wrong”. It’s about “if you last tried it more than 3 months ago, your information is already outdated”


> Spending 4 years evaluating something that’s changing every month means almost nothing, sorry.

No need to be sorry. Because, if we accept that premise, you just countered your own argument.

If me evaluating these things for the past 4 years "means almost nothing" because they are changing sooo rapidly... then by the same logic, any experience with them also "means almost nothing". If the timeframe before said experience becomes irrelevant is as short as 90 days, then there is barely any difference between someone with experience and someone just starting out.

Meaning, under that premise, as long as I know how to code, I can evaluate these models, no matter how little I use them.

Luckily for me though, that's not the case anyway because...

> It’s about “if you last tried it more than 3 months ago,

...guess what: I try these almost every week. It's part of my job to do so.


Implementation -> review cycles are very useful when iterating with CC. The point of the agent reviewer is not to take the place of your personal review, but to catch any low hanging fruit before you spend your valuable time reviewing.

> but to catch any low hanging fruit before you spend your valuable time reviewing.

And that would be great, if it weren't for the fact that I also have to review the reviewer's review. So even for the "low hanging fruit", I need to double-check everything it does.

Which kinda eliminates the time savings.


That is not my perspective. I don't review every review, instead use a review agent with fresh context to find as much as possible. After all automated reviews pass, I then review the final output diff. It saves a lot of back and forth, especially with a tight prompt for the review agent. Give the reviewer specific things to check and you won't see nearly as much garbage in your review.

Well, you can review its reasoning. And you can passively learn enough about, say, Rust to know if it's making a good point or not.

Or you will be challenged to define your own epistemic standard: what would it take for you to know if someone is making a good point or not?

For things you don't understand enough to review as comfortably, you can look for converging lines of conclusions across multiple reviews and then evaluate the diff between them.

I've used Claude Code a lot to help translate English to Spanish as a hobby. Not being a native Spanish speaker myself, there are cases where I don't know the nuances between two different options that otherwise seem equivalent.

Maybe I'll ask 2-3 Claude Code sessions to compare the difference between two options in context and pitch me a recommendation, and I can drill down into their claims infinitely.

At no point do I need to go "ok I'll blindly trust this answer".


Wait until you start working with us imperfect humans!

Humans do have capacity for deductive reasoning and understanding, at least. Which helps. LLMs do not. So would you trust somebody who can reason or somebody who can guess?

People work differently than LLMs: they find things we don't, and the reverse is also obviously true. As an example, a stack use-after-free was found in a large monolithic C++98 codebase at my megacorp. None of the static analyzers caught it; even after modernizing it and getting clang-tidy's modernize checks to pass, nothing found it. ASan would have found it if a unit test had covered that branch. As a human I found it, but mostly because I knew there was a problem to find. An LLM found and explained the bug succinctly. Having an LLM be a reviewer for merge requests makes a ton of sense.

> How is this a good idea? How can I trust the generated code?

You don't. The LLM wrote the code and it's absolutely right. /s

What could possibly go wrong?


Same way you trust any auto translation for a document. You wrote it in English (or whatever language you’re most proficient in), but someone wants it in Thai or Czech, so you click a button and send them the document. It’s their problem now.

As much as I wanted to roll my eyes, this did give me a chuckle.


Next time take a long exposure picture with your phone. You might be able to see it that way.


They originally wanted to make a helicopter. (Not kidding)


More like an ultra-light plane / helicopter hybrid or so, isn't it?


Sounds like an autogyro, indeed one of the lightest and cheapest ways to build a plane. They are basically planes with a free-spinning helicopter rotor for lift.

