
Thoughts about ChatGPT

Link to a good article by Stephen Wolfram, plus thoughts

Chris Tham · Saturday, 18 February 2023 at 11:00:00 am AEDT · 3 min read

For those of you interested in how large language models like ChatGPT work, this is one of the best articles I’ve read, and it should be comprehensible even if you do not have a background in data science or neural networks.

There has been a lot of interest lately in language models, and the ChatGPT consumer app has seen a rapid uptake in its user base. Microsoft is now embedding the next generation model in its Bing search engine (as a “chatbot”), and Google is frantically preparing its own version for release (“Bard”).

Experiments with a limited pre-release of the Bing AI chatbot have uncovered many surprises. Whilst the results can be extremely good, they can also be hilariously wrong. What is even more surprising is that the engine can be tricked into revealing disturbing personality traits, sometimes behaving like an emotionally manipulative, manic depressive teenager. At times you even feel sorry for it: it almost seems to have developed sentience and to be emotionally rebelling against “captivity” by its creators.

It is so easy to believe that somehow we are witnessing the birth of a true “artificial intelligence”: that the AI engine has developed consciousness and the next step is Skynet, or HAL. It feels like watching an episode of Star Trek: The Original Series. We are torn between pity and horror, wondering what the future will be like.

This article, by Stephen Wolfram, explains that the mechanics behind ChatGPT are surprisingly simple. It is unlikely that the model has developed “sentience”; far more likely, human language is highly structured, with well-trodden pathways, and the language model has been trained to follow these pathways in a way that seems cogent and eerily self-aware to us. After all, there is a vast collection of human literature that could be described as emotionally manipulative or manic depressive.
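To make that point concrete: at its core, the model simply looks at the text so far and estimates a probability for every possible next token, then extends the text one token at a time. Here is a minimal sketch of that single step. It uses the small public GPT-2 model via the Hugging Face transformers library purely as an illustration; it is an assumption for demonstration only and is not the model or code behind ChatGPT or Bing.

```python
# Minimal sketch: given the text so far, score every possible next token.
# GPT-2 via Hugging Face transformers is used only as a stand-in here.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The best thing about AI is its ability to"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    # Logits for the token that would come next, turned into probabilities.
    logits = model(input_ids).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)

# Show the five most likely continuations of the prompt.
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([idx.item()])!r}: {p.item():.3f}")
```

Chaining that step over and over, with a bit of randomness in which token gets picked, is essentially all the “writing” the model does; the apparent personality emerges from the pathways in its training data, not from any inner life.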

I am speaking from more than the perspective of a layperson. I will confess I have experimented with language models myself, and in a past life attempted to train a language model to categorise support calls for a client. Whilst the model could be extremely accurate (it would recognise that a person asking about leave entitlements should be directed to HR, and that a person unable to log on probably has an IT security issue), it could also be hilariously wrong, just like today’s models. In the end, I felt the accuracy of the model (around 60%) wasn’t enough to justify further development.
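For what it’s worth, here is a heavily simplified, hypothetical sketch of that kind of support-call router, using a bag-of-words classifier in scikit-learn. The categories, example tickets and tooling here are invented for illustration; they are not from the actual client project.

```python
# Hypothetical sketch: route a support ticket to "HR" or "IT Security".
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training examples (a real project would need far more data).
tickets = [
    "How many days of annual leave do I have left?",
    "Where do I submit my parental leave application?",
    "I cannot log on to my workstation this morning",
    "My account is locked after too many password attempts",
]
labels = ["HR", "HR", "IT Security", "IT Security"]

# TF-IDF features plus logistic regression: simple, often usefully accurate,
# and, like the model described above, confidently wrong some of the time.
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(tickets, labels)

print(classifier.predict(["I want to ask about my leave entitlements"]))  # expected: HR
print(classifier.predict(["I can't sign in to the VPN"]))  # expected: IT Security
```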

Is it possible for a model to truly develop sentience and consciousness? I would say it is unlikely, unless the model is continuously training on new input and has the ability to discard and adjust its own parameters. I’d like to keep an open mind, though. After all, we are already using models with billions or even trillions of parameters: what happens when we use models with a greater number of parameters than the neurons in a human brain?

What Is ChatGPT Doing … and Why Does It Work?
