As a mathematician dabbling in AI: we told you so
As a random schmuck who got the gist of it: i told you so
As me: yep, called it
… yes? This has been known since the beginning. Is it news because someone finally convinced Sam Altman?
Neural networks are universal approximators. “The approximation is wrong sometimes!” is… what approximations are. The chatbot is not an oracle. It’s still bizarrely flexible, for a next-word-guesser, and it’s right often enough for these fuckups to become a problem.
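If “next-word-guesser” sounds abstract, here’s a toy sketch of the idea (made-up vocabulary and probabilities, not any real model’s output):

```python
import random

# Toy next-token distribution for the prompt "The capital of France is".
# These probabilities are invented for illustration; a real LLM computes
# them with a softmax over ~100k tokens.
next_token_probs = {
    "Paris": 0.92,   # usually right...
    "Lyon": 0.05,    # ...but wrong answers never have exactly zero probability
    "Berlin": 0.03,
}

tokens, weights = zip(*next_token_probs.items())
print(random.choices(tokens, weights=weights, k=1)[0])
# Most runs print "Paris". Some runs don't. That's the hallucination story
# in miniature: a good estimator is still just an estimator.
```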
What bugs me are the people going ‘see, it’s not reasoning.’ As if reasoning means you’re never wrong. Humans never misremember, or confidently espouse total nonsense. And we definitely understand brain chemistry and neural networks well enough to say none of these bajillion recurrent operations constitute the process of thinking.
Consciousness can only be explained in terms of unconscious events. Nothing else would be an explanation. So there is some sequence of operations which constitutes a thought. Computer science lets people do math with marbles, or in ternary, or on paper, so it doesn’t matter how exactly that work gets done.
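Concretely: the same sum works in decimal, in ternary, or on paper, because the computation doesn’t care about its substrate. A toy check, with a hand-rolled base-3 converter just for illustration:

```python
def to_ternary(n: int) -> str:
    # Write a non-negative integer in base 3.
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 3)
        digits.append(str(r))
    return "".join(reversed(digits))

a, b = 17, 25
# Same addition, different substrate for the digits: int(..., 3) reads
# the base-3 strings back, and the sum agrees with plain decimal.
assert int(to_ternary(a), 3) + int(to_ternary(b), 3) == a + b
print(to_ternary(a + b))  # "1120", i.e. 42 in base 3
```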
Though it’s probably not happening here. LLMs are the wrong approach.
I think it’s a bit overzealous to say LLMs are the wrong approach. It is possible the math behind them would be useless to a true AI, but as far as I can tell, the only definitive statement we can make right now is that they can’t be the whole approach. Still, you’re absolutely right that there is a huge set of operations we haven’t figured out yet if we want a genuine AI.
My understanding of consciousness is that it isn’t one single sequence of operations, but a set of simultaneous ones. There’s a ton of stuff all happening at once for each of our senses. Take sight, for example. Just to see things, we need to measure light intensity, color definition, and spatial relationships, and then mix all that together in a meaningful way. Then we have to balance each of our senses, decide which one to focus on in the moment, or even focus on several at once. And that hasn’t even touched on thoughts, emotions, social relationships, unconscious bodily functions, or the systems that let us switch things back and forth between conscious and unconscious, like breathing, or blinking, or walking, and so on. There are hundreds, maybe thousands of operations happening in our brains simultaneously at any given moment.
So, without a doubt, LLMs aren’t the most energy-efficient way to do pattern recognition. But I find it hard to believe that a strong pattern-recognition system would be fully unusable inside a greater system. If/when we figure the rest out, I’m sure an LLM could be used as a piece of a much greater puzzle… if we wanted to burn all that energy.
To the network, matrix operations are simultaneous: one multiply updates every unit in a layer at once. Not that I believe parallelism is necessary. Embodiment is more plausible, but we just don’t know.
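To be concrete about “simultaneous”: one matrix product updates a whole layer in a single operation (plain numpy, toy sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=512)          # activations from the previous layer
W = rng.normal(size=(1024, 512))  # weights into the next layer

# One matrix-vector product updates all 1024 units at once. The math has
# no loop over "neurons"; whether the hardware runs the multiplies in
# parallel or one at a time is invisible to the network itself.
h = np.maximum(W @ x, 0.0)  # a ReLU layer
print(h.shape)  # (1024,)
```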
We do know LLMs kinda suck. They’re a bad first attempt. Diffusion emerged at the same time, and now generates plausible video faster than a camera can record it. Those models are witchcraft and the witchcraft keeps getting stronger. LLMs hit a plateau because this is what they do. An ungodly amount of money has been spent chasing marginal improvements at ballooning scale. Higher abstraction will likely emerge as companies shrink models down and train for more epochs… but they’ll always be more chatbot than person.
They do simultaneous workloads, but each workload performs essentially the same function. A person, on the other hand, is a huge variety of differing functions all working in tandem; a set of vastly different operations that are inextricably linked together to form who we are. No single form of generative algorithm is anything more than a tiny fraction of what we think of as a conscious being. They might perform the one operation they’re meant for on a much greater scale than we do, but we are nowhere near linking all those pieces together in a meaningful, cohesive stream.
Edit: think of these algorithms like adding more cores to a CPU. Sure, you can process workloads simultaneously, but each workload is interchangeable and can be arbitrarily assigned to any core or thread. Not so with people. Every single operation is assigned to a specialized part of the brain that only does that one specific type of operation. And you can’t just swap out the RAM or GPU; every piece is wired to interact only with the other pieces it was grown together with.
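A rough sketch of the “interchangeable cores” half of that analogy, using Python’s stock thread pool (the scheduler decides which worker runs what, not the code):

```python
from concurrent.futures import ThreadPoolExecutor

def workload(n: int) -> int:
    # Any worker can run any task; no thread is "the vision core"
    # or "the language core".
    return sum(i * i for i in range(n))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Eight identical jobs go to whichever of the four workers is free.
    # The brain has no equivalent move: you can't schedule a smell onto
    # the visual cortex.
    results = list(pool.map(workload, [10_000] * 8))

print(results[0] == results[7])  # True: interchangeable work, identical output
```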
sucks when the pesky limits of physical reality keep your AI superintelligences from actually being intelligent.