Hacker News | zer00eyz's comments

On top of that there are probably a few more hits for the containers, VMs and hypervisor, and all those pods have monitoring etc. All the layers of abstraction are just turtles all the way down, giving the illusion of being easier while adding complexity and cost/overhead.

It is a security product, so unless they want to deal with the data egress charges, it's probably better to keep it in AWS. That's the nasty double-edged sword of "cloud", and how we're all getting locked in.

All the bits on their own seem to make perfect sense, but it's become apparent that the orchestra has been blindfolded and given noise-canceling headphones.


And to market their AI security product.

The above comment needs to be higher.

If we had a black-box programming language and handed it over to this system, it would never be able to do anything with it past its context window.

Hey kids, I hear you like agents, so we made an agent write agents till we got better agents.


> something competitive with Nvidia for AI training

Apple is counting on something else: model shrink. Everyone is now looking at "how do we make these smaller".

At some point a beefy Mac Studio and the "right sized" model is going to be what people want. Apple dumped a 4-pack of them into the hands of a lot of tech influencers a few months back and they were fairly interesting (expensive, though).


> Apple is counting on something else: model shrink

The most powerful AI interactions I've had involved giving a model a task and then fucking off. At that point, I don't actually care if it takes 5 minutes or an hour. I've queued up a list of background tasks it can work on, and that I can circle back to when I have time. In that context, smaller isn't even the virtue at hand; user patience is. Having a machine that works on my bullshit questions and modelling projects at one tenth the speed of a datacentre could still work out to being a good deal even before considering the privacy and lock-in problems.


What "tooling" do you use to let AIs work unattended for long periods?

> What "tooling" do you use to let AIs work unattended for long periods?

Claude and Kagi Assistant. I tried tooling up a multi-model environment in Ollama and it was annoying. It's just searching the web, building models and then running a test suite against the model to refine it.


Cool? And it has nothing to do with what kind of consumer hardware Apple should sell. If your use case is literally "bigger model better", then you should always use the cloud. No matter how much computing power Apple squeezes into their devices, it won't be a mighty data center.

For running the model once it’s been trained, all a datacenter does is give you lower latency. Once the devices have a large enough memory to host the model locally, then the need to pay datacenter bills is going to be questioned. I’d rather run OpenClaw on my device, plugged into a local LLM, than rely on OpenAI or Claude.

> At some point a beefy Mac Studio and the "right sized" model is going to be what people want.

It's pretty clear that this isn't going to happen any time soon, if ever. You can't shrink the models without destroying their coherence, and this is a consistently robust observation across the board.


I don’t think it’s about literally shrinking the models via quantization, but rather training smaller/more efficient models from scratch

Smaller models have gotten much more powerful in the last 2 years. Qwen 3.5 is one example of this. The cost/compute requirements of running the same level of intelligence are going down.


I have said for a while that we need a sort of big-little-big model situation.

The inputs are parsed with a large LLM. This gets passed on to a smaller, hyper-specific model. That outputs to a large LLM to make it readable.

Essentially you can blend two model types: Probabilistic Input > Deterministic Function > Probabilistic Output. Have multiple little deterministic models that are chosen for specific tasks. Now all of this is VERY easy to say, and VERY difficult to do.

But if it could be done, it would basically shrink all the models needed. You don't need a huge input/output model if it is more of an interpreter.
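A minimal sketch of the "big-little-big" shape described above. All three stages are stand-in stub functions, not real model APIs; the point is just the control flow: large model parses, small task-specific model does the deterministic work, large model renders the answer.

```python
# Hypothetical big-little-big pipeline. Each stage is a plain function
# standing in for a model call; none of these are real APIs.

def big_parse(user_input: str) -> dict:
    """Large LLM (stubbed): turn free-form input into a structured request."""
    return {"task": "convert", "value": user_input.strip()}

def little_solve(request: dict) -> dict:
    """Small, hyper-specific, near-deterministic model (stubbed)."""
    return {"result": request["value"].upper()}

def big_render(output: dict) -> str:
    """Large LLM (stubbed): turn the structured result back into prose."""
    return f"Here you go: {output['result']}"

def pipeline(user_input: str) -> str:
    # Probabilistic input -> deterministic function -> probabilistic output
    return big_render(little_solve(big_parse(user_input)))
```

In a real system the middle stage could be swapped per task from a registry of small models, which is where the promised shrink would come from.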


There are no practically useful small models, including Qwen 3.5. Yes, the small models of today are a lot more interesting than the small models of 2 years ago, but they remain broadly incoherent beyond demos and tinkering.

Yes, but bigger models are still more capable. Models shrinking (iso-performance) just means that people will train and use more capable models with a longer context.

Of course they are! Both are important and will be around and used for different reasons

Cheaper than you’d expect, though. You could get a nice setup for $20-40k six months ago. As far as enterprise investments go, that’s a rounding error.

Not all enterprises are the same. I imagine many companies have departments optimizing for local optimums, so someone who could get more productivity out of the hardware might not have access to it, because the department doing hardware acquisition is measured in isolation.

I think it’s a little unnecessary to lecture somebody on HN about how enterprises come in different shapes and sizes. It’s pretty clear what I’m implying here if you aren’t actively trying to assume the most reduced, least charitable version of my statement.

Drop that down to 5k, and make it useful.

Give every iPhone family an in-house Siri that will deal with canceling services and pursuing refunds.

Your customer screw-up results in your site getting an agent-driven DDoS on its CS department till you give in.

Siri: "Hey User, here's your daily update, I see you haven't been to the gym, would you like me to harass their customer service department till they let you out of their onerous contract?"


I’m running a modest setup using a Mistral model (24B) on a 9070 (AMD) and 32GB of RAM. It was an $1800 machine at the time I built it. It ultimately boils down to what you want to do with it. For me, it’s basically a drafting tool. I use it to break through writer’s block, iterate, or just throw out some ideas. Sometimes I summarize, but that can be hit or miss.

I don’t need the latest and greatest, and I fine-tuned LM Studio enough that I get acceptable results in 30 to 90 seconds that help me keep moving ahead. I am not a software engineer, and I'm definitely not as much of a “coder” as the average person on HN. So if I can do it for less than $2000, I bet a lot of smarter/more experienced coders could see great results for $5000.

You can get an M3 Ultra Mac Studio with 96GB of RAM for $4000. If you’re willing to go up to $6k, it’s 256GB. Wayyyyy more firepower than my setup. I imagine plenty powerful for a lot of people.


Next up:

Spend $1.99 and get a chest full of Anthropic emeralds that you can redeem for Claude Chests, and a chance at winning a million more tokens.

Or watch this 3-minute ad for 1000 tokens.

I did not think this day would come this soon, but I assure you that Anthropic has no moat.


> if you've ever tried telling a toddler "no"

Parenting is rough! Good for you, for sticking to your guns.

> The plaintiff, Kaley, started using YouTube at age 6 and Instagram at 11.

Who was at the wheel here? If we call up all of Kaley's teachers from this time frame and ask them "were Kaley's parents checked out", what do you think the answer would be? For as bad as education has gotten, I sympathize with teachers because parents have gotten FAR worse.

It's not like we don't know these things about people's behavior on devices... maybe it's something that should be talked about in school, along with how credit works and how to file taxes.

Do we need to tell parents "it's 10am, have your kids touched grass yet?"... "It's 10pm, did you take the tablet and phone away so they go the fuck to sleep?"

"Touch grass" as a meme/slang is literally people poking fun at the constantly online. It's "hazing" and "bullying" to drive social correction.


For 30 years (the 60's to the 90's) we told parents "It's 10pm, do you know where your kids are?" with an ad on TV. We came home to empty houses and let ourselves in with a key around our necks.

Now, we call the police, and arrest parents, if kids are outside, unsupervised. https://www.cnn.com/2024/12/22/us/mother-arrested-missing-so...

When I was a child in the 80s and 90s, we had "jobs" as kids... mowing lawns, paper routes and so on. Now if you go offer to mow your neighbor's lawn, the cops get called: https://www.fox8live.com/2023/07/26/officer-surprises-young-...

Parents are afraid to let their kids out of their sight, and they tend to look down on those of us who have been pragmatic because we understand the data (and not the fear).

Talk to anyone who is Gen X and they will tell you that we basically got thrown outside all day (and had fun). Parents can't say "go outside and play", so kids end up getting handed devices... and they are going to play and explore and do the dumb things that get them in trouble.

> those child safety systems we put in place

Except we have denormalized things that SHOULD be perfectly fine. And as fewer kids get to go outside unattended with friends, it pushes their peers to go "online" to socialize.

Maybe the government needs to run commercials: "It's 10am, why isn't your child outside playing with the neighbor kids, unsupervised?"


> r + 1 == count ? 0 : r + 1

It is legal!

But it looks like it takes a few more steps when compiled.


> but it’s also a token fire lol.

I get much better results by keeping Claude much, much more task-focused. I only want it to ever make the smallest possible change.

There seems to be a fair bit of research to back this up: https://medium.com/design-bootcamp/when-more-becomes-less-wh...

It may also be why people seem to find "swarms" of agents so effective. You have one agent ingesting what you're describing. Then it delegates a task to another agent with the minimal context to get the job done.

I would be super curious about the quality of output if you asked it to write out prompts for the day's work, and then fed them in clean, one at a time.
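The delegate-with-minimal-context idea can be sketched in a few lines. This is purely illustrative: `plan_prompts` and `run_subagent` are placeholder functions standing in for real model calls, and a real orchestrator would have a large model write the prompts rather than splitting a string.

```python
# Illustrative only: an "orchestrator" writes small, self-contained
# prompts; each is then run in a fresh context with no shared history.

def plan_prompts(day_description: str) -> list[str]:
    # Stand-in for the planning model: one tightly scoped prompt per task.
    return [f"Do exactly this and nothing else: {t.strip()}"
            for t in day_description.split(";") if t.strip()]

def run_subagent(prompt: str) -> str:
    # Placeholder for a fresh model session per prompt (minimal context).
    return f"done: {prompt}"

def run_day(day_description: str) -> list[str]:
    # Each prompt is fed in "clean", one at a time.
    return [run_subagent(p) for p in plan_prompts(day_description)]
```

The key property is that no sub-task ever sees the accumulated context of the others, which is the effect the research linked above attributes the quality gains to.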


I also find value in minimizing step width so that seems to track.

On this particular project, there are a lot of moving parts and we are, in many cases, not just green-fielding, we are making our own dirt… so it’s a very adaptive design process. Sometimes planning is possible, but often we cannot plan very far ahead, so we keep things extremely modular.

We’ve had to design our own protocols for control planes and time synchronization so power consumption can be minimized, for example, and in the process make it compatible with sensor-swarm management. Then add connection limits imposed by the hardware, asymmetric communication requirements, and getting a swarm of systems to converge on sub-millisecond synchronized data collection and delivery when sensors can reboot at any time… as you can imagine, this involves a good bit of IRL experimentation because the hardware is also a factor (and we are also having to design and build that).

It’s very challenging but also rewarding. It’s amazing for a small team to be able to iterate this fast. In our last major project it was much, much slower and more tedious. The availability of AI has shifted the entire incentive structure of the development process.


> Animal Care and Control Team (ACCT) Philly sent her some pics of the pooch in question.

100 percent this dog is named after a bullet.

Because that's how Philly rolls.

In case you don't know:

HitchBOT got murdered in Philly https://en.wikipedia.org/wiki/HitchBOT

Bill Burr's Philly set: https://www.reddit.com/r/cowboys/comments/1il5msw/in_honor_o...

Don't get me wrong Philly is great, but Philly is... something.


