hasperdi's comments | Hacker News

Even LinkedIn is now down. Opening linkedin.com gives me a 500 server error with a Cloudflare banner at the bottom. Quite embarrassing.


At least they were available when Front Door was down!


Sure, Bun has its benefits, but I don't see the strategic reason why Anthropic is doing this.


Apparently Claude Code being built on Bun was considered a good enough reason? But it looks more strategic for Bun since they’re VC-backed and get a good exit:

> Claude Code ships as a Bun executable to millions of users. If Bun breaks, Claude Code breaks. Anthropic has direct incentive to keep Bun excellent.

> Bun's single-file executables turned out to be perfect for distributing CLI tools. You can compile any JavaScript project into a self-contained binary—runs anywhere, even if the user doesn't have Bun or Node installed. Works with native addons. Fast startup. Easy to distribute.

> Claude Code, FactoryAI, OpenCode, and others are all built with Bun.

> Over the last several months, the GitHub username with the most merged PRs in Bun's repo is now a Claude Code bot. We have it set up in our internal Discord and we mostly use it to help fix bugs. It opens PRs with tests that fail in the earlier system-installed version of Bun before the fix and pass in the fixed debug build of Bun. It responds to review comments. It does the whole thing.

> This feels approximately a few months ahead of where things are going. Certainly not years.

> We've been prioritizing issues from the Claude Code team for several months now. I have so many ideas all the time and it's really fun. Many of these ideas also help other AI coding products.

> Instead of putting our users & community through "Bun, the VC-backed startup tries to figure out monetization" – thanks to Anthropic, we can skip that chapter entirely and focus on building the best JavaScript tooling.

https://bun.com/blog/bun-joins-anthropic
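For context, the single-file compile they're describing looks roughly like this (file and binary names here are hypothetical; `bun build --compile` is the documented entry point):

    bun build ./cli.ts --compile --outfile mycli
    ./mycli   # runs even on machines without Bun or Node installed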


Turn every potentially useful development tool into some LLM hype bullshit to grow the bubble.


Same thought, and I can't wait for a video with a very confused Theo...

I mean, it's likely very important for them to have a fast and sandboxed code executor available. But it's not like Bun would fight against improvements there or refuse paid work on specific areas, right?


and it can be faster if you can get an MoE model of it


"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating.

(commentary: things are really moving too fast for the layperson to keep up)


As pointed out by a sibling comment, MoE consists of a router and a number of experts (e.g. 8). These experts can be imagined as parts of the brain with specializations, although in reality they probably don't work exactly like that. They aren't separate models; they are components of a single large model.

Typically, input gets routed to a small number of experts, e.g. the top 2, leaving the others inactive. This reduces the number of activations and hence the processing requirements.

Mixtral is an example of a model designed like this. Clever people have also created converters to transform dense models into MoE models. These days many popular models are available in an MoE configuration as well.
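A minimal toy sketch of that top-2 routing, in Python with made-up sizes (not any real model's code):

    import numpy as np

    def moe_forward(x, router_w, experts, top_k=2):
        logits = x @ router_w                # router scores every expert
        top = np.argsort(logits)[-top_k:]    # keep only the top-k experts
        w = np.exp(logits[top] - logits[top].max())
        w /= w.sum()                         # softmax over the chosen experts only
        # only the selected experts run; the rest stay inactive
        return sum(wi * experts[i](x) for wi, i in zip(w, top))

    rng = np.random.default_rng(0)
    d = 16
    experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(8)]
    router_w = rng.normal(size=(d, 8))
    y = moe_forward(rng.normal(size=d), router_w, experts)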


That's not really a good summary of what MoEs are. You can think of them more as sublayers that get routed through (like how the brain only lights up certain pathways) rather than as actual separate models.


The gain from MoE is that you can have a large model that's efficient; it lets you decouple parameter count from computation cost. I don't see how anthropomorphizing MoE <-> brain affords insight deeper than 'less activity means less energy used'. These are totally different systems; IMO the shallow comparison muddies the water and does a disservice to both fields of study. There's been plenty of research showing there's redundancy in MoE models, e.g. Cerebras has a paper[1] where they selectively prune half the experts with minimal loss across domains -- I'm not sure you could disable half the brain without noticing a stupefying difference.

[1] https://www.cerebras.ai/blog/reap
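To put rough numbers on that decoupling (all figures made up, not any specific model's):

    # hypothetical MoE: 8 experts of 7B params each, top-2 routing,
    # plus ~2B shared params (attention, embeddings)
    shared, per_expert, n_experts, top_k = 2e9, 7e9, 8, 2
    total = shared + n_experts * per_expert   # ~58B params stored
    active = shared + top_k * per_expert      # ~16B params touched per token
    print(f"total {total/1e9:.0f}B, active {active/1e9:.0f}B")

You pay memory for all ~58B, but per-token compute scales with the ~16B that are active.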


> I don't see how anthropomorphizing MoE <-> brain affords insight deeper than 'less activity means less energy used'.

I'm not saying it is a perfect analogy, but it is by far the most familiar one for people to describe what sparse activation means. I'm no big fan of over-reliance on biological metaphor in this field, but I think this is skewing a bit on the pedantic side.

re: your second comment about pruning, not to get in the weeds but I think there have been a few unique cases where people did lose some of their brain and the brain essentially routed around it.


All modern models are MoE already, no?


That's not the case. Some are dense and some are hybrid.

MoE is not the holy grail either; there are drawbacks, e.g. less consistency and expert under-/over-use.


>90% of inference hardware is faster if you run an MoE model.


DeepSeek is already an MoE


With quantization and conversion to an MoE model... it can be a fast walk


One thing the Chinese are really good at is cost innovation: reducing costs in as many ways as possible to make their products affordable for the majority. They aim for good-enough quality.

I bet the sales ratio of the Chinese vs. the German sharpeners exceeds 20:1.


If they did adopt JSON then it wouldn't be XMPP anymore.

As a user, I don't care much. But my experience with XMPP is that it was not as solid as other solutions, including closed-source ones. It could have been issues in the clients' implementations, but overall it wasn't great.


> my experience with XMPP is that it was not as solid as other solutions, including closed-source ones.

Considering that WhatsApp is essentially an old/non-federating fork of ejabberd (from where this blog post originates), I think XMPP is doing alright in that regard.


Looking at how many years ago the fork happened, you can easily guess that the protocol no longer looks anything like XMPP.


For sure, it got continuously refined for WA-specific needs over the years, but XMPP is very much still there, lurking in the shadows: https://www.sciencedirect.com/science/article/abs/pii/S17422...


Extensible, extend thyself.


"not as solid" - please explain.


Fun fact: you can use Ghostty and vibe-code the shader you want. In fact, the other day I used Claude Code to create a custom CRT shader for me.
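If anyone wants to try the same, Ghostty picks up a Shadertoy-style GLSL file via its custom-shader config option; the path below is just an example:

    # ~/.config/ghostty/config
    custom-shader = ~/.config/ghostty/shaders/crt.glsl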


> .001% higher than the flu

That isn't true. Just going by this paper, https://pmc.ncbi.nlm.nih.gov/articles/PMC9115089, COVID-19 killed roughly five to seven times more hospitalized older adults than influenza did.

Anecdotally, my uncle and several friends' relatives died from COVID during those lockdowns. I don't know of anyone in my extended circle who died of the flu.


Most people are not a "hospitalized older adult". Yet they treated everyone, regardless of age, gender, or health, as if they were on death's door. The lockdowns were absolutely overkill and went on far, far too long.


I just tried that on their playground:

7B: Hi! I'm Olmo 3, an AI assistant created by the non-profit organization Ai2. I'm here to help with questions, ideas, or tasks you have—just let me know what you need! How can I assist you today? Rawr!

32B: Hi! I'm Olmo, a helpful AI assistant built by the Allen Institute for AI (Ai2). My knowledge is up to December 2024, and I'm designed to assist with a wide range of tasks. How can I help you today?


If you share that API key here...

- we all get to use a free LLM

- they'll learn to do it properly next time

So it's a win-win situation.


If I instead don't, and let you know that the key is there in the source code, hopefully at least one deserving person might learn how to look through source code better, and none of the lazy people get it :)

So no.

