Yes! This is a brilliant explanation of why language use is not the same as intelligence, and why LLMs like ChatGPT are not intelligent. At all.
I like to approach it by building up to the technology step by step.
Is a book sentient? It is capable of providing recorded knowledge, in the form of a sequence of symbols, on a specific subject at a level of proficiency far above the reader’s. But no, it’s static information that originated from a human.
Is a library sentient? It allows for systematic retrieval of knowledge on a vast number of subjects, far beyond what any human is capable of knowing. But no, it’s just a static categorization of documents curated by a human.
Is a search engine sentient? It allows for automatic retrieval of highly relevant knowledge based on a query from a human. But no, it’s just token-based pattern matching to find similar documents.
So why is an LLM suddenly sentient? It’s able to produce highly relevant sequences of words, based on recorded knowledge, specifically tailored to the sequences of words around it. But it’s just a probability engine that finds highly relevant token sequences matching the context around it.
The underlying mechanism simply has no concept of a world view or a mental model of the physical or metaphysical world around it. It’s basically a magic book that allows you to retrieve information from any document ever written in a way tailored to a document you wrote.
Yes. LLMs generate texts. They don’t use language. Using a language requires an understanding of the subject one is going to express. LLMs don’t understand.
I guess you’re right, but I find this a very interesting point nevertheless.
How can we tell? How can we tell that we use and understand language? How would that be different from an arbitrarily sophisticated text generator?
For the sake of the comparison, we should talk about the presumed intelligence of other people, not our (“my”) own.
In the case of current LLMs, we can tell. These LLMs are not black boxes to us. It is hard to follow the threads of their decisions because these decisions are just some hodgepodge of statistics and randomness, not because they are very intricate thoughts.
We probably can’t compare the outputs, but we can compare the learning. Imagine a human who had consumed all the literature, ethics, history, and every other kind of text that these LLMs have: no amount of trick questions would fool him into endorsing racial cleansing or any similarly disturbing idea. LLMs read so much, and learned so little.
This gets to the core of the issue. LLMs are a model of the statistical relationships between words in texts, in a very large number of dimensions. The intelligence they appear to exhibit is that which existed in their source material in the first place. They don’t have a model of the world itself. Consider how Midjourney can produce photorealistic images of people, yet very often gets hands wrong. How is that? It’s because when you train on images, you get a statistical representation of what hands look like without the world model that lets you know that hands have five fingers and how they’re arranged. AIs like this are very clever copiers. They are not intelligent.
While that’s true, we have to allow for the fact that our own intelligence, at some point, is an encoded model of the world around us. Probably not through something as rigid as precise statistics, but our consciousness is somehow an emergent phenomenon of the chemical reactions in our brains that on their own have no real understanding of the world either.
I do have to wonder if at some point, consciousness will spontaneously emerge as we make these models bigger and more complex and, maybe more importantly, start layering specialized models on top of each other that handle specific tasks and then hand the result back to another model, creating feedback loops. I’m imagining a neural network that is trained on something extremely abstract, like figuring out from the raw input data which specialist model would be best suited to process that data, and then, based on the result, which model would be best suited to refine it. Something we train to basically be an executive function with a bunch of submodels available to it.
Could something like that become conscious without realizing it’s “communicating” with us? The program executing the LLM might reflexively process data without any concept that it’s text, but still be emergently complex enough when reflecting its own processes to the point of self awareness. It wouldn’t realize the data represents a link to other conscious beings.
As a metaphor, you could teach a very smart dog how to respond to certain, basic arithmetic problems. They would get stuff wrong the moment you prompted them to do something out of their training, and they wouldn’t understand they were doing math even when they got it “right”, but they would still be sentient, if not sapient, despite that.
It’s the opposite side of the philosophical zombie. A philosophical zombie behaves exactly as a human would, but is a surface-level automaton with no inner life.
But I propose that we also consider the inverse philosophical zombie: an entity that behaves like an automaton, but has an inner life that has not recognized its input data as evidence of an external world outside its own bounds. Something that might not even recognize it’s executing a program, the same way we aren’t consciously aware of the chemical reactions our brain is executing to make us think.
I don’t believe current LLMs are anywhere near complex enough to give rise to that sort of thing, but they are also still pretty early in their development and haven’t started to be heavily layered and interconnected the way I think they’ll end up.
At the very least it makes for a fun sci-fi premise.
I can mostly follow, just want to exclude the last paragraph which contains assumptions about a black box.
That being said, how is the human brain different from what you describe?
You think by processing the probabilistic association between word sequences? Humans think through world models. We have imagination, a physical and metaphysical simulation of the world around us. Absolutely none of that is involved in how LLMs work. There’s a lot to be said for the utility of associating knowledge embedded in symbols, and having a magic book that can retrieve pre-existing information in context is incredibly useful. I think it will have an impact on the level of the printing press and the internet. But just because it’s incredibly useful at retrieving knowledge doesn’t mean it works anything like a human brain.
Sorry, I could have been more clear. I did not mean to equate current LLMs with human brains. The question was rather:
Can’t we describe the workings of (other) human brains in much the same way as you did before? Or where exactly is the difference that sets us apart?
AIs which can and need to interact with the physical world have those, too. Naturally, an AI which is restricted to language has much less necessity and opportunity to develop these, much like our brain area for smell is probably not so good at estimating velocities and catching a ball.
I think your approach of demystifying technology is valid and worthwhile. I’m just not sure it does what you maybe think it does: highlight the difference from our intelligence.
We know the math and the mechanisms of how LLMs work. The only thing we don’t understand is the significance and capabilities of the probabilistic associations they ascribe to symbol sequences.
While we don’t know how a human brain works in detail, we do know how a human brain tackles problem solving because we’re sentient beings and we can be introspective about how we think through a problem.
We can look at how vectors flow through a neural network (remember, LLMs don’t even have a concept of words; they transform tokens into vectors and then build mathematical associations between those vectors, so it’s all numbers) and we can see from the data that there’s nothing resembling a world simulation in how it actually works.
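To make the "it’s all numbers" point concrete, here is a minimal sketch of the token-to-vector step. The three-word vocabulary and the two-dimensional embedding values are made up for illustration; real models use learned subword tokenizers and vectors with thousands of dimensions.

```python
# Hypothetical toy vocabulary: each known word maps to an integer token id.
vocab = {"the": 0, "cat": 1, "sat": 2}

# Hypothetical embedding table: one small vector per token id.
# In a real model these numbers are learned during training.
embeddings = [
    [0.1, 0.3],  # "the"
    [0.9, 0.2],  # "cat"
    [0.4, 0.8],  # "sat"
]

def encode(text):
    """Turn whitespace-separated words into token ids."""
    return [vocab[word] for word in text.split()]

def embed(token_ids):
    """Look up the vector for each token id."""
    return [embeddings[i] for i in token_ids]

ids = encode("the cat sat")
vectors = embed(ids)
print(ids)      # token ids, not words
print(vectors)  # vectors, which are all the network ever operates on
```

Past the lookup, the network never sees the words themselves, only the ids and their vectors.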
Also keep in mind that the LLMs you interact with don’t even learn from your interactions. The data is all baked in at training time. If you turn the temperature of the LLM output generation down to zero, it will come up with the same highest-probability answer every time. The more you learn about how they work under the hood, the clearer it becomes that there is no there there when it comes to sentience.
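The temperature point can be sketched in a few lines. This is a toy illustration rather than any particular model’s sampler: the vocabulary, the scores, and the `sample_token` helper are all made up, and temperature zero is treated as plain greedy selection.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (logits) at a given temperature."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Otherwise scale the scores and sample from the softmax distribution.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    weights = [math.exp(x - top) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

vocab = ["cat", "dog", "fish"]  # hypothetical vocabulary
logits = [2.0, 1.5, 0.1]        # hypothetical scores for the next token

rng = random.Random(0)
greedy = {vocab[sample_token(logits, 0, rng)] for _ in range(100)}
sampled = {vocab[sample_token(logits, 1.0, rng)] for _ in range(100)}
print(greedy)   # a single token, no matter how many times we ask
print(sampled)  # several different tokens
```

At temperature zero the randomness disappears entirely, so the variety you see in normal use comes from the sampling step, not from the model changing its mind.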
I will say that I do think the capabilities and significance of symbol association and pattern matching have been wildly underestimated. Word sequences need to follow a pattern to make sense, and if you stumble upon the right sequence of words, that sequence could be incredibly impactful, and it doesn’t really matter how you came up with it. If you were to pull words out of a hat at random, there’s an infinitesimally small chance that you’d get a sequence of words that happens to expose the secrets of the universe. LLMs improve on that immensely, in that they use probability to reduce that sequence space to the set of word sequences that make sense. In that reduced space are generative sequences that may produce real value, and we can keep making that space more and more relevant and useful.
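As a rough illustration of how much pattern constraints shrink the sequence space, here is a toy sketch. It assumes a six-word corpus and a crude rule (a sequence "makes sense" only if every adjacent word pair appears in the corpus); an actual LLM uses learned probabilities over far longer contexts, not a hard filter like this.

```python
from itertools import product

# Hypothetical one-sentence corpus standing in for the training text.
corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))

# Word pairs that actually occur next to each other in the corpus.
seen_pairs = set(zip(corpus, corpus[1:]))

def plausible(seq):
    """True if every adjacent pair in seq was seen in the corpus."""
    return all(pair in seen_pairs for pair in zip(seq, seq[1:]))

# "Words out of a hat": every possible three-word sequence.
all_sequences = list(product(vocab, repeat=3))
# The reduced space: only sequences that follow observed patterns.
plausible_sequences = [s for s in all_sequences if plausible(s)]

print(len(all_sequences), len(plausible_sequences))
```

Even with five distinct words and three-word sequences, the filter discards the vast majority of combinations, and the gap grows rapidly with vocabulary size and sequence length.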