Don't believe the latest fashion on social media, including the latest thing it's fashionable to be sceptical about. That rule of thumb has performed well over the last 10 years.
The dilemma for me is that aspects of social media (namely information sharing and learning) are incredibly useful, while others (contrarian argumentation, propaganda, attention black holes) are very harmful.
I go through cycles of abstaining from online interaction because I’ve sunk into the dark side too much, then return with a stronger intention to feed my hobbies and mind. I’ve found that it’s not so simple to just “not believe” what you see and read, since being constantly bombarded with political messaging necessarily pushes you to one side or the other unconsciously.
So yeah, for me the best way is to cut that feed off entirely, instead of pretending I have any kind of effective firewall against its deeper mental effects.
I'd like a product/UI that uses an LLM to filter social media to get rid of "contrarian argumentation, propaganda, attention black holes". Basically a sovereign recommendation engine.
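A minimal sketch of what that filter loop might look like. Everything here is hypothetical: in a real product the `should_block` callable would wrap an LLM call with a prompt like "Is this post contrarian argumentation, propaganda, or rage-bait?"; a trivial keyword heuristic stands in for that call so the sketch runs offline.

```python
# Sketch of a "sovereign recommendation engine": pass a feed through a
# classifier before it reaches your eyes. The keyword heuristic below is
# a stand-in for an actual LLM classification call.

def llm_stand_in(post: str) -> bool:
    """Return True if the post should be filtered out.

    Placeholder for an LLM call; flags posts containing rage-bait markers.
    """
    rage_markers = ("you won't believe", "wake up", "everyone is wrong")
    return any(marker in post.lower() for marker in rage_markers)

def filter_feed(posts: list[str], should_block=llm_stand_in) -> list[str]:
    """Keep only the posts the classifier does not flag."""
    return [p for p in posts if not should_block(p)]

feed = [
    "New paper on sparse attention, thread with figures",
    "You WON'T BELIEVE what this senator said",
    "Wake up, sheeple: the mainstream is lying again",
]
print(filter_feed(feed))  # only the first post survives
```

The key design point is that the classifier is injected (`should_block`), so the user, not the platform, decides what the filter optimizes for.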
The easiest way to do that is to block the social media sites at your firewall. You may have a few false positives (maybe 1% or so?), but that should be minor compared to the time savings.
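The hosts-file version of that block can be sketched like this. The domain list is illustrative only, and actually appending to `/etc/hosts` requires root, so the sketch just prints the lines you would add.

```python
# Generate /etc/hosts entries that null-route social media domains.
# Appended (as root) to /etc/hosts, these make the sites unresolvable
# on this machine. The domain list is illustrative, not exhaustive.

BLOCKED_DOMAINS = [
    "twitter.com", "x.com",
    "facebook.com", "instagram.com",
    "reddit.com", "tiktok.com",
]

def hosts_entries(domains):
    """Map each domain (and its www. variant) to 0.0.0.0."""
    lines = []
    for d in domains:
        lines.append(f"0.0.0.0 {d}")
        lines.append(f"0.0.0.0 www.{d}")
    return lines

for line in hosts_entries(BLOCKED_DOMAINS):
    print(line)
```

A router-level DNS block (Pi-hole style) achieves the same thing for every device on the network instead of one machine.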
> Don't believe the latest fashion on social media, including the latest thing it's fashionable to be sceptical about.
I don't think choosing to believe in something just because other people are piling on being skeptical of it is a viable strategy. If you hear a lot of people pointing out "X is a scam" you shouldn't refuse to believe them on principle.
Previous RAM price spikes lasted for over a year, and this one should last at least that long. It's promising, however, that the rate of change has slowed in the last month.
It stems from joint ignorance of how these frontier models get baked and of what consumers want.
Many pundits think it's just a matter of scraping the internet and having a few ML scientists run ablation experiments to tune hyperparameters. That hasn't been true for over a year. The current requirements are more org-scale, more payoff from scale, more moat. The main legitimate competitive threat is adversarial distillation.
Many pundits also think that consumers don't want to pay a premium for small differences on the margin. That is very wrong-headed. I pay $200/month to a frontier lab because, even though it's only a few % higher in benchmark scores, it is 5x more useful on the margin.
> They have internal scale and scope economies as the breadth of synthetic data expands.
These frontier labs will have a hundred or a thousand teams of people+AI working in parallel generating synthetic data to solve different niches. A few teams solve computer use. A few teams solve math. A few teams solve various games. So the org is basically a big machine that mints data, and model research is only a small part of it. Scale then is the moat.
The second leg of the moat thesis is that open-weights competition will die off soon because the cost of keeping up with the scale will become prohibitive.
The third leg of the moat thesis is that customers are happy to pay big margins for differences that appear small if the benchmark is the measuring stick.
If the paradigm were still scrape internet -> train model, I'd agree that there is no moat.
I disagree that the model is a moat; distillation of models is going to happen, and even without it all the current players have models that are virtually indistinguishable for the use-case.
Model capabilities have converged over time, and I don't see this trend reversing. OpenAI owns only the model.
The provider who does have a moat is Google: they own the entire vertical, from the hardware to the training data. They have it all.
OpenAI has to buy GPUs, Google makes them.
OpenAI has to rent data centers. Google owns them.
OpenAI has to scrape the web for all training data. Google's collection of user emails (not counting their Android data harvesting, ad data harvesting user-tracking, etc) alone gives them a ton of training data which will never be available to scrapers.
Google has billions of signed-in users; OpenAI has to market to and attract users (800M user count last I checked, but also last I checked that growth was flattening out).
That's what a moat looks like. Better technology and/or results have never, in my memory, been a moat.
Look, I'm upvoting your posts in this thread because you make some good points, but I'm not really convinced that a) synthetic data will result in good models, nor that b) quality synthetic data can be generated by labs outside of those orgs that have a ton of user-info.
This is why I say that OpenAI has no moat - even if synthetic data (however it is generated) is 90% of training data, there are still only two possibilities:
1. Orgs like Google, Microsoft and Amazon have a ton of user-data with which to produce synthetic data (after all, it's not produced out of thin air).
or
2. You don't need a ton of real data to seed the synthetic generation.
In the first case, yes, that looks like a moat, but not for OpenAI; more like for Google et al.
In the second case, what's to stop an upstart from producing their own synthetic training data?
In either case, companies who provide only tokens (OpenAI, Anthropic, etc) don't have a moat. The moat is still the same as it was in the 90s - companies deeply embedded into users' workflows.
In my memory, like I said, I struggle to think of even a few successful moats that were technology. The moat is always something else.
I pay OpenAI but I would also be a happy Anthropic customer.
My view is that OpenAI, Anthropic and Google have a good moat. It's now an oligopolistic market with extreme barriers to entry due to needed scale. The moat will keep growing as the payoffs from scale keep growing. They have internal scale and scope economies as the breadth of synthetic data expands. The small differences between the labs now are the initial conditions that will magnify the differences later.
It wouldn't be surprising to also see consolidation of the industry in the next 2 years which makes it even more difficult to compete, as 2 or 3 winners gobble up everyone and solidify their leads.
When people worry about the frontier labs' moat, they point to open-weights models, which is really a commentary that these models have near-zero marginal cost to replicate (like all software). But I think the era of open-weights competition cannot be sustained; it's a temporary phenomenon tied to the middle-ground scale we're in, where labs can still afford to release weights. The end-game will be nation-state-backed competition.
Scams (romance scams or convincing people to run some code on their machine), influence operations by an intelligence agency, or advertising a product.
Shouldn't Google have gone out of business?
I would agree it's one piece of the puzzle.