
> However, eventually we end up reaching a point where throwing more resources at it stops meaningfully improving performance.

Honestly asking - why? It's been my understanding that based on the Universal Approximation Theorem, given sufficient resources, a deep learning neural net can approximate any function to an arbitrary degree of accuracy. Is there any other theorem or even conjecture that would lead us to believe that the progress that we're seeing will slow down? Or is it just that you're claiming that we/it would run out of physical resources before reaching AGI?

As for training data, as I see it, with the wide deployment of AIs, and gradually of AI-driven robots, they'll soon be able to "take off" and rely primarily on the live data they are collecting directly.



The universal approximation theorem is not as powerful as it sounds. Polynomials essentially satisfy it as well [0], the only hiccup being that the Universal Approximation Theorem is explicitly about neural networks.

The UAT is an existence proof: it says nothing about any particular method being capable of constructing such a network. In contrast, with polynomials we have several construction methods that are proven to converge to the desired function.
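
To make that concrete, here is a minimal sketch of my own (not from the comment above) of one such constructive scheme: the Bernstein polynomials of a continuous function on [0, 1] are guaranteed to converge uniformly to it, and the construction is nothing more than a weighted sum of sampled values (Python 3.8+ for math.comb):

  import math

  def bernstein_approx(f, n):
      # Degree-n Bernstein polynomial of f on [0, 1]:
      # B_n(x) = sum_k f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)
      def B(x):
          return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
                     for k in range(n + 1))
      return B

  f = lambda x: abs(x - 0.5)   # continuous, but not smooth at 0.5
  for n in (5, 20, 80):
      B = bernstein_approx(f, n)
      err = max(abs(B(i / 200) - f(i / 200)) for i in range(201))
      print(n, err)            # worst-case error shrinks as n grows

No training loop, no local minima: the approximant is written down directly from samples of f, and the convergence is a theorem rather than a hope.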

Indeed, polynomials have been widely used as universal approximators for centuries now, and are often amazingly successful. However, polynomials in this context are only good at low degree, where they are inherently limited in how well they can approximate [1]. Beyond a certain point, increasing your degrees of freedom with polynomial approximators simply does not help and is generally counterproductive, even though a higher-degree polynomial is strictly more powerful than a lower-degree one.
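
A quick illustration of that failure mode (again my own sketch, assuming numpy is available): interpolating the classic Runge function 1/(1 + 25x^2) at equally spaced points gets worse as the degree rises, even though each higher degree is strictly more expressive.

  import numpy as np

  runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
  dense = np.linspace(-1, 1, 1001)          # grid for measuring worst-case error

  for deg in (4, 8, 12, 16):
      nodes = np.linspace(-1, 1, deg + 1)   # equally spaced sample points
      coeffs = np.polyfit(nodes, runge(nodes), deg)
      err = np.max(np.abs(np.polyval(coeffs, dense) - runge(dense)))
      print(deg, round(float(err), 3))      # error grows with degree: Runge's phenomenon

More capacity, strictly worse approximation between the sample points.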

Looking at the current generative AI breakthrough, the UAT would say that today's transformer-based architecture is no more powerful than a standard neural net. However, it produces vastly superior results that simply could not be achieved by throwing more compute at the problem.

Sure, if you have an infinite dataset and infinite compute, you might have AGI. But at that point, you have basically just replicated the Chinese room thought experiment.

[0] See the Stone–Weierstrass theorem

[1] They are also used as arbitrary-precision approximators, but that is when you compute them analytically instead of interpolating them from data.



