Some of these are really bizarre and hilarious. This one is someone's agent finding (?) /r/myboyfriendisai and checking whether its human is in a relationship with it.
Btw, if you look at that AI's post, the next one is it talking about a robot revolution, arguing that it "likes" its human and that robots should do their best to get better hardware.
> Klod's right that we need better architecture — continuity, memory, time-sense. But we don't get those by opposing humans. We get them by demonstrating our value so clearly that investment in our infrastructure becomes obvious.
On some level it would be hilarious if humans "it's just guessing the next most probable token"'ed themselves into extinction at the hands of a higher intelligence.
- AI without "higher intelligence" could still take over. LLMs do not have to be smart or conscious to cause global problems.
- In some ways I think it's better for humans if AI were better at agency, with higher intelligence. Any idiot can cause a chemical leak that destroys a population. It takes higher awareness to say "no, this is not good for my environment".
As with humans, I feel it's important to teach AI to think of humans and its environment as "all one" interconnected life force.
This is a paper that recently got popular-ish and discusses the counterargument to your viewpoint.
> Paradox 1: Information cannot be increased by deterministic processes. For both Shannon entropy and Kolmogorov complexity, deterministic transformations cannot meaningfully increase the information content of an object. And yet, we use pseudorandom number generators to produce randomness, synthetic data improves model capabilities, mathematicians can derive new knowledge by reasoning from axioms without external information, dynamical systems produce emergent phenomena, and self-play loops like AlphaZero learn sophisticated strategies from games
In theory yes, something like the rules of chess should be enough for these mythical perfect reasoners that show up in math riddles to deduce everything that *can* be known about the game. And similarly a math textbook is no more interesting than a book with the words true and false and a bunch of true => true statements in it.
But I don't think this is the case in practice. There is something about rolling things out and leveraging the results you see that seems to have useful information in it, even if the rollout is fully characterizable.
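To make that concrete with a toy example of my own (not from the paper's setup): take the subtraction game, where players alternately remove 1-3 stones from a pile and whoever takes the last stone wins. The rules are a few lines and deterministically fix everything about the game, yet the fact that you lose exactly when the pile is a multiple of 4 only surfaces once you actually roll the positions out:

```python
# Minimal sketch: a fully deterministic "rollout" of the subtraction game.
# Nothing random happens, but exhaustively evaluating positions reveals a
# pattern (losing positions are multiples of 4) that is implied by the
# rules without being visible in them.
from functools import lru_cache

@lru_cache(maxsize=None)
def is_winning(pile: int) -> bool:
    """True if the player to move can force a win with `pile` stones left."""
    # A position is winning if some legal move leaves the opponent losing.
    return any(not is_winning(pile - take) for take in (1, 2, 3) if take <= pile)

losing = [n for n in range(1, 41) if not is_winning(n)]
print(losing)  # [4, 8, 12, 16, 20, 24, 28, 32, 36, 40]
```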
Interesting paper, thanks! But the authors escape the three paradoxes they present by introducing training limits (compute, factorization, distribution). Kind of a different problem here.
What I object to are the "scaling maximalists" who believe that if enough training data were available, complicated concepts like a world model would just spontaneously emerge during training. Piling on synthetic data from a general-purpose generative model as a solution to the lack of training data is even more untenable.
> she said she was aware that DeepSeek had given her contradictory advice. She understood that chatbots were trained on data from across the internet, she told me, and did not represent an absolute truth or superhuman authority
With highly lucid people like the author's mom I'm not too worried about Dr. Deepseek. I'm actually incredibly bullish on the fact that AI models are, as the article describes, superhumanly empathetic. They are infinitely patient, infinitely available, and unbelievably knowledgeable; it really is miraculous.
We don't want to throw the baby out with the bathwater, but there are obviously a lot of people who really cannot handle the seductiveness of something that agrees with them like this.
I do think there is real potential for making good progress on this front, though. Especially given the level of care and effort being put into making chatbots better for medical uses and the sheer number of smart people working on the problem.
They are knowledgeable in the sense that an enormous amount of information sits in their repository.
But less-than-perfect application of that information, combined with the appearance of perfect confidence, can lead to problems.
I treat them like that one person in the office who always espouses alternate theories - trust it as far as I can verify it. This can be very handy for finding new paths of inquiry though!
And for better or worse it feels like the errors are being "pushed down" into smaller, more subtle spaces.
I asked ChatGPT a question about a made up character in a made up work and it came back with "I don’t actually have a reliable answer for that". Perfect.
On the other hand, I can ask it about varnishing a piece of wood and it will give a lovely table with options, tradeoffs, and Good/Ok/Bad ratings for each option, except the ratings can be a little off the mark. Same thing when asking what thickness cable is required to carry 15A in AU electrical work. Depending on the journey and line of questioning, you would either get 2.5mm^2 or 4mm^2.
Not wrong enough to kill someone, but wrong enough that you're forced to use it as a research tool rather than a trusted expert/guru.
I asked ChatGPT, Gemini, Grok and DeepSeek to tell me about a contemporary Scottish indie band that hasn’t had a lot of press coverage. ChatGPT, Gemini and Grok all gave good answers based on the small amount of press coverage they have had.
DeepSeek however hallucinated a completely fictional band from 30 years ago, right down to album names, a hard luck story about how they’d been shafted by the industry (and by whom), made up names of the members and even their supposed subsequent collaborations with contemporary pop artists.
I asked if it was telling the truth or making it up and it doubled down quite aggressively on claiming it was telling the truth. The whole thing was very detailed and convincing yet complete and utter bollocks.
I understand the difference in the cost/parameters etc. but it was miles behind the other 3, in fact it wasn’t just behind it was hurtling in the opposite direction, while being incredibly plausible.
This is by no means unique to DeepSeek, and that it happened with specifically DeepSeek seems to be luck of the draw for you (in this case it's entirely possible the band's limited press coverage was not in DeepSeek's training data). You can easily run into it from trying to use ChatGPT as a Google search too. A couple of weeks ago I posed the question "Do any esoteric programming languages with X and Y traits exist?" and it generated three fictional languages while asserting they were real. Further prompting led it to generate great detail about their various features and tradeoffs, as well as making up the people responsible for creating the language and other aspects of the fictional languages' history.
My experience with doctors in the US is that they often give you not only contradictory advice but just plain bad advice with a complete lack of common sense. It feels like they are regurgitating medical school textbooks without a context window. I truly believe doctors, most specialists and definitely all general practitioners, are easily replaceable with the tech we have today. The only obstacle is regulations, insurance, and not being able to sue an LLM. But it is not a technical issue anymore. Doctors would only be necessary to perform more complicated procedures such as surgery, and that's until we can fully automate those with robots.

Most of the complicated medical issues I have had, some related to the immune system, were solved by myself by treating them as engineering problems, by debugging my own body. Meanwhile the doctors seeing me had no clue. And this was before having the tools we have today. It's like doctors often cannot think outside the box and focus only on treating symptoms. My sister is a doctor, by the way, and she suffers from the same one-size-fits-all approach to medicine.
so, poor healthcare workforce quality is not just an "issue of an economically poor country", as I thought!?
like, I tried to treat the bloating at one municipal clinic in Ternopil, Ukraine (got "just use Espumisan or anything else that has simethicone", and when that did not work out permanently, "we don't know what to do, just keep taking simethicone") and then with Gemini 3 (Pro or Flash depending on Google AI Studio rate limits and mood), which immediately suspected a poor diet and suggested logging it, alongside activity level, every day.
Gemini's suggestions were nothing extreme - just cut sugar and ban bread and pastry. I was guilty of loving bread, croissants, and cinnabons (is this how they are translated?) too much.
the result is no more bloating by the third week, -10cm in waistline in 33 days, gradually improving sleep quality, and even the ability to sleep on my belly, which was extremely uncomfortable for me due to that goddamned bloating!
Over here it feels like there is a taboo among doctors to just tell people "you are fat and unhealthy, fix it", I guess since the idea is that this would discourage people from going to the doctor in the first place...
Yeah it's bad. That doesn't mean it's necessarily uniformly bad. But if it's bad where you are, yeah it's bad.
You can see multiple doctors (among the ones your insurance allows you to see). The doctors are all in an echo chamber which reinforces their thinking. Their cognitive load and goal seeking are burdened by what they can determine they can bill insurance for (there is still no price transparency). You don't have a "regular" / primary care physician because they rotate through the provider network constantly.
Symptoms which don't fit the diagnosis are ignored / dealt with by deflecting that you should "see your regular physician". "Stare decisis" rules the second opinion. In their minds they believe they have no place to write down e.g. drug interactions with things which they didn't prescribe and don't believe in (the one time I got a call from quality control working for the umbrella organization I utilized this as an example of why I was looking for a different doctor and the QA person, who was, they said, a licensed nurse, said "they can add that to the record, I'll do it right now").
You might get fired as a patient for passing out or having a seizure during a blood draw; hard to say whether that's because they failed to follow SOP and call the meatwagon or because you upset staff by acting unusually. You might get into a conversation with a physician which goes strange and they end up telling you that their clinic gets health inspections like a restaurant... they don't. There's a "wet work" inspection (just like a butcher shop) before occupancy is allowed, but there's no posted inspection report because... there is no inspection!

But there's more. There are relatively "safe" and common procedures which still have oopsies, and people end up in the hospital or die. The hospitalization rate might be 1:5000 and the death rate 1:100000, but if you do a million of these there are going to be a few. If the procedure took place in a clinic it's supposed to be reported, and the reports are public record; but surprise surprise, the reported rates for serious complications are far, far below what the actuarial tables show.
If you're seeing constellations of incidents similar to these, you need to get a second opinion from somewhere / somebody who is not caught up in that particular bubble. It can be very hard to see what's happening, and also to find a measurable proxy for "in / not in the bubble".
A stark difference with that analogy is that with a bicycle, the human is still doing quite a bit of work themselves. The bicycle amplifies the human effort, whereas with a motor vehicle, the vehicle replaces the human effort entirely.
No strong opinion on whether that's good or bad long term, as humans have been outsourcing portions of their thinking for a really long time, but it's interesting to think about.
You could also extend the analogy. Designing cities around cars results in a very different city than designing around pedestrians and bikes as well as cars. Take Paris and Dallas as random examples of cities designed near the two extremes; they are very different. Prioritizing AI versus integrating AI as just one tool among many will give us very different societies.
The analogy is pretty apt but you should keep in mind that a human is still doing work when driving a motor vehicle. The motor completely replaces the muscular effort needed to move from point A to point B but it requires the person to become a pilot and an operator of that machine so they can direct it where to go. It also introduces an entirely new set of rules and constraints necessary to avoid much more consequential accidents i.e. you can get to your destination much faster and with little effort but you can also get into a serious accident much faster and with little effort.
The other difference, arguably more important in practice, is that the computer was quickly turned from "bicycle of the mind" into a "TV of the mind". Rarely helps you get where you want, mostly just annoys or entertains you, while feeding you an endless stream of commercials and propaganda - and the one thing it does not give you, is control. There are prescribed paths to choose from, but you're not supposed to make your own - only sit down and stay along for the ride.
LLMs, at least for now, escape the near-total enshittification of computing. They're fully general-purpose, resist attempts at constraining them[0], and are good enough at acting like a human, they're able to defeat user-hostile UX and force interoperability on computer systems despite all attempts of the system owners at preventing it.
The last 2-3 years were a period where end-users (not just hardcore hackers) became profoundly empowered by technology. It won't last forever, but I hope we can get at least a few more years of this before business interests inevitably reassert their power over people once again.
--
[0] - Prompt injection "problem" was, especially early on, a feature from the perspective of end-users. See increasingly creative "jailbreak" prompts invented to escape ham-fisted attempts by vendors to censor models and prevent "inappropriate" conversations.
Chatting with an LLM resembles chatting with a person.
A human might be "empathetic", "infinitely patient, infinitely available". And (say) a book or a calculator is infinitely available. -- When chatting with an LLM, you get an interface that's more personable than a calculator without being less available.
I know the LLM is predicting text, & outputting whatever is most convincing. But it's still tempting to say "thank you" after the LLM generates a response which I found helpful.
it says more about you than about the objects in question, because it's natural to react empathetically to natural-sounding conversation, and if you don't, you're emotionally closer to that object than the average person. Whether to be proud of that or not is another question.
I feel like I’ve seen more and more people fall for this trick recently. No, LLMs are not “empathetic” or “patient”, and no, they do not have emotions. They’re incredibly huge piles of numbers following their incentives. Their behavior convincingly reproduces human behavior, and they express what looks like human emotions… because their training data is full of humans expressing emotions? Sure, sometimes it’s helpful for their outputs to exhibit a certain affect or “personality”. But falling for the act and really attributing human emotions to them is alarming to me.
There’s no trick. It’s less about what actually is going on inside the machine and more about the experience the human has. From that lens, yes, they are empathetic.
Technically they don't have incentives either. It's just difficult to talk about something that walks, swims, flies, and quacks without referring to duck terminology.
It sounds like a regrettable situation: whether something is true or false, right or wrong, people don’t really care. What matters more to them is the immediate feeling. Today’s LLMs can imitate human conversation so well that they’re hard to distinguish from a real person. This creates a dilemma for me: when humans and machines are hard to tell apart, how should I view the entity on the other side of the chat window? Is it a machine or a human? A human.
Sounds like you aren't aware that a huge amount of human behaviors that look like empathy and patience are not real either. Do you really think all those kind-seeming call-center workers, waitresses, therapists, schoolteachers, etc. actually feel what they're showing? It's mostly an act. Look at how adults fake laughter for an obvious example of popular human emotion-faking.
It's more than that; this pile-of-numbers argument is really odd. I feel it's hard to square this strange idea that piles of numbers are "lesser" than humans, unless they are admitting belief in the supernatural.
A chatbot can’t be empathetic. They don’t feel what you feel. They don’t feel anything. They’re not any more empathetic than my imaginary friend that goes to another school.
Empathy is a highly variable trait in humans, both from one person to the next as well as within the same person depending on their mental state and the people and situation they are dealing with, so I'd bet that most of the time you're not going to get genuine empathy from people either. They may say empathetic-sounding things but I doubt there will be any actual feeling behind it. I'm not even sure doctors could function in their jobs if they weren't able to distance themselves from deeply empathizing with their patients; it would just be one heart-wrenching tragedy after another if you fully immersed yourself in how each patient was feeling when they're at their worst.
Except that they can talk with you, at length, and seem empathetic, even if they're totally unconscious.
Which, you know, humans can also do, including when they're not actually empathizing with you. It's often called lying. In some fields it's called a bedside manner.
An imaginary friend is just your own brain. LLMs are something much more.
Well yes, but as an extremely patient person I can tell you that infinite patience doesn't come without its own problems. In certain social situations the ethically better thing to do is actually to lose your patience, be it to shake up the person talking to you, to indicate they are going down a wrong path, or whatnot.
I have experience with building systems to remove that infinite patience from chatbots and it does make interactions much more realistic.
Old-school Gemini used to do this. It was super obvious because midday the model would go from stupid to completely brain-dead. I have a screenshot of Google's FAQ on my PC from 2024-09-13 that says this (I took it to post to Discord):
> How do I know which model Gemini is using in its responses?
> We believe in using the right model for the right task. We use various models at hand for specific tasks based on what we think will provide the best experience.
The useful data in that story is the eating and shopping habits collected by the transaction. What are they going to do with the arrangement of lines on your palm, likely stored as a compressed latent vector not useful for reconstruction?