They are discussing a very specific approach and a paper that lays out the issues with pursuing this one specific type of generative AI. It's not about AI in general. The headline is a bit click-baity.
While the paper demonstrated strongly diminishing returns from adding more data to modern neural networks used as image classifiers, the video host is explaining how the same effect may apply to any neural-network-based system built on modern transformers.
While there are technically methods of generative AI that don’t use a neural network, they haven’t made much progress in recent decades and aren’t what most people mean when they hear or say generative AI. As such, I would say the title is accurate enough for a video meant for a general audience, though “Is there a fundamental limit to modern neural networks?” might be more technically correct.
I think most people underestimate how big of a deal it’s going to be when this tech is pervasive in things like search engines or digital assistants. There are many times when I can’t figure out the right combination of words to put into a search engine to find the results I’m after. ChatGPT is already my go-to when I want to figure out a movie or song from some random combination of foggy memories. Imagine how much that is going to transform how we interact with data and information after 10 more years of CPU/GPU innovations and chat applications that have actually been designed for information retrieval.
Full disclosure, I didn’t watch the video. I just can’t imagine that that headline isn’t going to look silly in 30 years.
Alternate theory: we'll look back on this the same way we look back on the claims that IBM Watson was intelligent, or the claims in the 60s, 70s, 80s, 90s, 2000s, and 2010s that <insert technology x> was going to make computers truly intelligent.
Great video, thanks! Regarding the overrepresentation of certain concepts/things, I have been disappointed by generative AI from day one. If you want it to draw you something obscure, it fails miserably and tries to fall back on stuff it knows. There are also all the discriminatory biases generative AI has about different people because of lacking data sets. It is very obvious that it cannot "outperform" its own data input (like the exciting curve in the video) but that it will rather stagnate.
I think that's a good question. And a nice video. The findings in the paper seem to arrive at that conclusion, and we might need to find a better approach. Mind that (as he pointed out) it doesn't rule out growth in AI. It just hints at probable stagnation with the current methods. I'm already fascinated by the current tech and the new possibilities. But AI is really hyped as of now, and I, too, think we should take the claims of the big AI companies with a grain of salt. I'm sure the scientists at OpenAI are already concerned with exactly this as they do research for the next generations of ChatGPT. It's a bit of a bummer that lots of the research gets done behind closed doors and we're going to have to wait a bit longer to find out.
A simple path forward is to go from classifying single elements of training data to classifying multiple elements and their relationships in the training data.
Slightly less simple is to gather orders of magnitude more data by just hooking the input to an IRL robot.
Another step is for the NN to control the robot, decide which parts of the data require refinement, and focus on those.
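A rough sketch of that last step, in the simplest form I can think of: rank samples by how unsure the model is and refine the least certain ones first. Everything here is made up for illustration (the function names, the sample labels, the probabilities), and a real robot-in-the-loop setup would obviously be far more involved.

```python
def uncertainty(p):
    """Score how unsure a binary prediction is: 0.0 at p=0 or p=1, 1.0 at p=0.5."""
    return 1.0 - abs(2.0 * p - 1.0)


def pick_for_refinement(predictions, k):
    """Return the k sample ids whose predicted probability is least certain."""
    ranked = sorted(
        predictions,
        key=lambda sample_id: uncertainty(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs: probability that each sample belongs to some class.
predictions = {"cup": 0.97, "aardvark": 0.55, "chair": 0.08, "theremin": 0.41}

# The model is confident about the cup and the chair, so those are skipped.
print(pick_for_refinement(predictions, 2))  # ['aardvark', 'theremin']
```

The idea is only that new data collection gets pointed at what the model handles worst instead of at more of what it already knows.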
There are a lot of ways to improve data acquisition still on the table; it isn't going to stop at creating large corpora and having humans fine-tune them.
It's a "push as much data as a baby gets to train its NN" step, which is several orders of magnitude more, and more focused, than any training dataset in existence right now.
Even with diminishing returns, it's bound to get better results.
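To put toy numbers on "diminishing returns but still better", here is a minimal sketch. It assumes, purely for illustration, that accuracy grows with the logarithm of the dataset size; the constants `a` and `b` are invented and not taken from the paper or the video.

```python
import math


def accuracy(n_samples, a=0.10, b=0.08):
    """Hypothetical log-linear curve: a + b * log10(n_samples), capped at 1.0."""
    return min(1.0, a + b * math.log10(n_samples))


def gain_from_10x(n_samples):
    """Accuracy gained by multiplying the dataset size by 10."""
    return accuracy(10 * n_samples) - accuracy(n_samples)


def gain_per_added_sample(n_samples):
    """Average accuracy gained per extra sample when going from n to 10n samples."""
    return gain_from_10x(n_samples) / (9 * n_samples)


for n in (10**3, 10**6, 10**9):
    print(
        f"{n:>13,} samples: accuracy {accuracy(n):.2f}, "
        f"gain per added sample {gain_per_added_sample(n):.1e}"
    )
```

Under that assumption, every tenfold increase in data buys the same fixed bump in accuracy, so results keep improving, but each extra sample is worth far less than the one before it.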
A simple path forward is to go from classifying single elements of training data to classifying multiple elements and their relationships in the training data.
Training data already has multiple labels.
Slightly less simple is to gather orders of magnitude more data by just hooking the input to an IRL robot.
An entire point of the paper and video is that massive increases in training set size are showing diminishing returns.
Another step is for the NN to control the robot, decide which parts of the data require refinement, and focus on those.