Think Siri+ChatGPT trained on all your email, documents, browser history, messages, movements, everything. All local, no cloud, complete privacy.
"Hey Siri, I had a meeting last summer in New York about project X, could you bring up all relevant documents and give me a brief summary of what we discussed and decisions we made. Oh and while you're at it, we ate at an awesome restaurant that evening, can you book a table for me for our meeting next week."
All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I’m having trouble connecting to the network” or playing a random song from Apple Music.
Just last night, we were entertaining our toddler with animal sounds. It worked with “Hey Siri, what does a goat sound like?”, then we were able to do horse, cow, sheep, boar, and it somehow got tripped up on pig, for which it responded with the Wikipedia entry and told us to look at the phone for more info.
You’ve touched on what is probably the biggest reason I don't use Siri more: Apple does not limit it to what’s important to me as a user.
I have thousands of contacts, lots of photos, videos, and emails, all in Apple’s first-party apps and yet Siri is more likely to respond with a popular song or listing of news articles that’s only tangentially connected to my request.
This becomes more complicated when Siri is the interface on a HomePod in a shared area. Whose data and preferences should be used? Ideally it would recognise different voices and give that person's data priority, but how much can/should be shared between users? And where does that data live? It shouldn't be on the HomePod itself, so it would have to task the phone with finding the answer. I'm sure something good could be done here, but it wouldn't be easy.
>All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I’m having trouble connecting to the network” or playing a random song from Apple Music.
Well, this is about adding ChatGPT-level smartness to Siri, not just the semi-dumb assistant of yore.
> I’m feeling nostalgic. Make me a playlist with 25 mellow indie rock songs released between 2000 and 2010 and sort them by release year, from oldest to most recent.
This doesn't just return a list of songs; it will create the playlist for you in Music.
> Check the paragraphs of text in my clipboard for grammar mistakes. Provide a list of mistakes, annotate them, and offer suggestions for fixes.
> Summarize the text in my clipboard
> Go back to the original text and translate it into Italian
I haven't tried it myself, but it has other integrations like "live text" where your phone can pull text out of an image and then could send that to GPT to be summarized.
Version 1.0.2 makes improvements for using it via Siri including on HomePod.
Today I asked Siri for the weather this week. She said daytime ranges from 31C to 23C, so I then asked "on what day is the temperature 31 celsius?". And, of course, what I got back was "it's currently twenty seven degrees".
The weather ones are so annoying: "Is it going to rain today?". "It looks like it's going to rain today". "What time is it going to rain today?". "It looks like it's going to rain today".
It seems ironic, then, that this specific thing failed spectacularly for me today. Siri put the text "set a timer for 15 minutes" into the text field of a reminder. I have no clue why, and no timer was set.
But you know what? Still better than Alexa for managing my smart home stuff. By miles and miles, IMO.
And god help you if you give up halfway through a command with a prompt. “Cancel”, “stop” and “nevermind” don’t work for half of that for some reason, so you have to walk up and tap the HomePod to cancel.
> All of my current experience with Siri tells me there is a 50-50 chance of the result coming back as “Sorry, I’m having trouble connecting to the network” or playing a random song from Apple Music.
Meanwhile, Google and Amazon have decided that the data center costs of their approach just aren't worth it.
>Google Assistant has never made money. The hardware is sold at cost, it doesn't have ads, and nobody pays a monthly fee to use the Assistant. There's also the significant server cost to process all those voice commands, though some newer devices have moved to on-device processing in a stealthy cost-cutting move. The Assistant's biggest competitor, Amazon Alexa, is in the same boat and loses $10 billion a year.
Yes. I don't understand the criticism of the current Siri in this context; the point of a language model on the device would be to derive intent and convert a colloquial command into a computer instruction.
Siri was so good before iOS 13. I'm not sure what they did in that release, but it went from around 90-95% accuracy and 80-90% contextual understanding down to 70% and 75% respectively.
As someone who dictates more than half of their messages and is an incredibly heavy user of Siri for performing basic tasks, I really noticed this sudden decline in quality, and it's never got back up there. In fact, iOS 16 really struggles with many basic words. Before iOS 13, I would likely have been able to dictate these two paragraphs without any errors; as it is, I've just had to edit them in five places.
I thought the lack of ability to execute on current “easy” queries would indicate something about ability to execute something as complicated as figuring out the restaurant you ate at and making a reservation. At least anytime in the next few years.
I don’t think it does. This isn’t a hypothetical Siri v2 with some upgrades; it’s a hypothetical LLM chatbot speaking with Siri’s voice. I recall one of the first demonstrations of Bing’s abilities was someone asking it to book him a concert where he wouldn’t need a jacket. It searched the web for concert locations, searched the web for weather information, picked a location that fit the constraint, and gave the booking link for that specific ticket. If you imagine an Apple LLM that has local rather than web search, it seems obvious that this exact ability that LLMs have to follow complicated requests and “figure things out” would be perfectly suited to reading your emails and figuring out which restaurant you mean. With Apple Pay integration it could also go ahead and book for you.
Certainly not the only place, but you’re very right that it does house a large population of commenters like me who enjoy the “sport” of “being correct on the internet”.
And yet the parent makes a very specific (and correct) comment: that this won't be Siri with some upgrades, but Siri in name only, with a totally different architecture.
Whereas yours and your sibling comment are just irrelevant meta-comments.
Siri today is built on what’s essentially completely different concepts from something like ChatGPT.
There are demos of using ChatGPT to turn normal English into Alexa commands and it’s pretty flawless. If you assume Apple can pretty easily leverage LLM tech on Siri and do it locally via silicon in the M3 or M4, it’s only a matter of chip lead time before Siri has multiple orders of magnitude improvement.
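The "English to structured command" step those demos rely on can be sketched roughly like this. Everything here is illustrative: the prompt wording, the JSON schema, and the device names are made up, and the actual LLM call is left out (the sketch only shows the prompt template and the reply parsing an integration would need):

```python
import json

# Hypothetical prompt template asking an LLM to translate a natural-language
# utterance into a structured smart-home command. Schema is invented for
# illustration; a real Alexa/Siri integration would define its own.
PROMPT = """Translate the user's request into JSON with keys
"action", "device", and "value". Respond with JSON only.
Request: {utterance}"""

def build_prompt(utterance: str) -> str:
    """Fill the template with the user's spoken request."""
    return PROMPT.format(utterance=utterance)

def parse_llm_reply(reply: str) -> dict:
    """Extract the JSON command from the LLM's reply, tolerating any
    surrounding prose the model might add despite instructions."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in reply")
    return json.loads(reply[start : end + 1])

# Example of what a model reply might look like, and how it parses:
reply = 'Sure! {"action": "set_brightness", "device": "living_room_lights", "value": 40}'
command = parse_llm_reply(reply)
print(command)
# {'action': 'set_brightness', 'device': 'living_room_lights', 'value': 40}
```

The point of the demos is that the fuzzy part (understanding arbitrary phrasing) moves into the LLM, while the existing assistant backend only ever sees well-formed commands like the parsed dict above.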
That experience likely isn’t transferable to Siri, which has deeper problems. People, me included, have reported problems with Siri: for example, after setting it to transcribe what they and Siri say as on-screen text, they can show that the input “Please add milk to the shopping list” results in Siri responding, in writing, “I do not understand what speaker you refer to.”
Problems like these could likely be overcome, but preparing better input would probably not address their root cause.
Microsoft's voice assistant was just as dumb as Siri, but ChatGPT is another thing entirely. It most likely won't even be the same team at all.
So nothing about their prior ability, or lack thereof, to make Siri smart means anything about their ability to execute if they add a large LLM in there.
I love Steve Jobs' "bicycle for the mind" metaphor, and what you describe is the best possible example of this concept. A computer that does that would enable us to do so much more.
This is the sort of AI I want; a true personal assistant, not a bullshit generator.
It appears that we are tantalizingly close to having the perfect voice assistant. But for some inexplicable reason, it does not exist yet. Siri was introduced over a decade ago, and it seems that its development has not progressed as anticipated. Meanwhile, language models have made significant advancements. I am uncertain as to what is preventing Apple, a company with boundless resources, from enhancing Siri. Perhaps it is the absence of competition and the duopoly maintained by Apple and Google, both of whom seem reluctant to engage in a competitive battle within this domain.
It is probably a people problem. The people who really understood Siri have probably left, the managers left running it are scored primarily on not making any mistakes and staying off the headlines. Any engineers who understand what it would take to upgrade it aren't given the resources and spend their days on maintenance tasks that nobody really sees.
It's more likely a perverse incentive problem. Voice activated "assistants" weren't viewed as assistance for end users. They were universally viewed as one of two things: A way of treating the consumer as a product, or a feature check-box.
That Siri went from useful to far less useful had more to do with the aim to push products at you rather than actually accomplishing the task you set for Siri. If Apple actually delivers an assistant that works locally, doesn't make me the product, and generally makes it easier to accomplish my tasks, then that's a product worth paying for.
When anyone asks "who benefits from 'AI'?" the answer is almost invariably "the people running the AI." Microsoft and OpenAI get more user data, and subscriptions. Google gets another vehicle for attention-injection. But if I run Vicuna or Alpaca (or some eventual equivalent) on my hardware, I can ensure I get what I need, and that there's much less hijacking of my intentions.
So Microsoft, if you're listening: I don't want Bing Chat search, I want Cortana Local.
When was Siri ever useful? I have yet to encounter a voice "assistant" that can do more than search Google and set timers reliably, and Siri itself can't even do those very well.
I use it around 50-100 times per day. Mostly playing music, sending messages, controlling lights in the home, weather, timers, and turning on/off/opening apps on the TV.
There are definite frustrations, mostly around playing music. Around 5% of the time, Siri will play the wrong album or artist because the artist name sounds like some other album name, or vice versa. I wish it used my Music playback history here to figure out which one I meant.
Doing what Siri does is not rocket science. It’s a simple intent-based system: you give it patterns for understanding intents, and it triggers some API based on them.
Once you have the intent parsing, it should just be a matter of throwing manpower at it and giving it better intents.
Yes, I have experience with building on top of such a system.
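A minimal sketch of the kind of intent-pattern system described above (the patterns, intent names, and fallback are invented for illustration; this is the shape of such systems, not how Siri actually works internally):

```python
import re

# Each intent is a pattern mapped to an intent name. A production system
# would use trained classifiers and slot-filling models, but the shape is
# the same: match an utterance to an intent, extract slots, call an API.
INTENTS = [
    (re.compile(r"set (?:a )?timer for (?P<minutes>\d+) minutes?"), "set_timer"),
    (re.compile(r"add (?P<item>.+) to the shopping list"), "add_to_list"),
    (re.compile(r"what does (?:a )?(?P<animal>\w+) sound like"), "animal_sound"),
]

def parse_intent(utterance: str):
    """Return (intent_name, slots) for the first matching pattern."""
    text = utterance.lower().strip()
    for pattern, name in INTENTS:
        m = pattern.search(text)
        if m:
            return name, m.groupdict()
    # Unmatched utterances fall through to a generic handler -- which is
    # where the unwanted song or news-article results tend to come from.
    return "fallback_web_search", {}

print(parse_intent("Set a timer for 15 minutes"))
# ('set_timer', {'minutes': '15'})
print(parse_intent("Please add milk to the shopping list"))
# ('add_to_list', {'item': 'milk'})
```

The brittleness discussed in the replies below follows directly from this design: every new phrasing needs a new pattern, and every pattern added risks colliding with an existing one.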
But the group managing Siri has probably been gutted over the past 10 years, and while the core is always simple, the integrations and the QA testing to make sure it all keeps working are probably brittle and time-consuming, and the core code is likely highly-patched spaghetti at this point.
It would be easy to write Siri again and make it a hundred times better, if you could start all over and only write the core features, and not have to validate against the whole product/feature matrix.
The problem with the rewrite of course would be that you won't be able to deliver that minimal viable product any more and you will have 10 years worth of product requirements and user expectations that you MUST hit for the 1.0 release (which must be a 1.0 and not an 0.1).
I've worked on lots of "simple" and "not rocket science" systems that were 10-years old, and it is always incredibly difficult due to the state of the code, the lack of resources, and the organizational inertia.
This is already evident with Stable Diffusion, which an M2 is fully capable of running offline.
Anything that can be done to reduce the need to “dial out” for processing protects the individual.
It erodes the ability of business and governmental organizations to use knowledge of otherwise private matters to target and influence.
The potential of moving a high-quality LLM like GPT to the edge to answer everyday questions reminds me of my move from Google to DDG as my default search engine.
Except it’s even a bigger deal than that. It reduces private data exhaust from search to zero, making going to the net a backup plan instead of a necessity.
Apple delivering this on device is a major threat to OpenAI, which will have to provide some LLM model with training that Apple can’t or won’t.
Savvy users will grow leery of having to send queries over the wire, feeding someone else valuable data (as ShareGPT has demonstrated).
Even then, Apple will likely choose to, or be forced to, open up on-device AI to user-contributed additions like LoRAs, which raises the question: why does OpenAI need to exist?
Also fascinating is the potential to do this at the server level for enterprise. If Apple produced a stack for enterprise training, it could replace generalized data-compute needs, shifting IT back to local machines or the intranet.
Apparently, you are not an actual user of Siri, because I get jack shit out of her. Speech-to-text is infinitely worse than in the first week Siri was released.
Yes and we should also have EU regulators at every design meeting for every company. They did such a good job with the GDPR making the user experience better on the web
Yes, alas they didn't leave room for a 'cookie preferences' cookie, so that whenever I choose the option 'reject all', it's of course going to ask me again, every time I visit the website.
That said, their intentions were good. I'm always horrifically amazed at the number of cookies used whenever I see the preferences popup. I honestly had no idea how many tracking cookies were used by the average website.
>Think Siri+ChatGPT trained on all your email, documents, browser history, messages, movements, everything. All local, no cloud, complete privacy.
That sounds absolutely horrifying if you remove the "all local" part. And that part's a pipe dream anyway. Plus, when using a model you'd basically become subservient/limited to the type of data in the model, which would necessarily abide by Apple's TOS, so a couple of hundred million people would be the Apple TOS in human form. I don't understand why Apple fanboys don't get this. Apple is pretty shoddy where privacy is concerned. Are these Apple employees making these posts?
Fat chance Apple will allow us to do this locally. More likely: upgrade to Apple Cloud Plus to get these features. But yeah, I've also dreamt of what my Apple hardware could do.
"Hey Siri, I had a meeting last summer in New York about project X, could you bring up all relevant documents and give me a brief summary of what we discussed and decisions we made. Oh and while you're at it, we ate at an awesome restaurant that evening, can you book a table for me for our meeting next week."