7 questions about ChatGPT and LLMs to make you sound like an expert at parties

Dr Rosemary Francis
6 min read · Jul 31, 2023


OK, maybe you go to better parties than me, but it seems that this year I cannot meet up with friends without generative AI and Large Language Models (LLMs) coming up in conversation. Once everyone has shared a picture of Bart Simpson riding a unicorn while playing the ukulele, the topic tends to turn more technical. Sometimes also more ethical. So here are a handful of questions to ask, or to answer, in order to impress your friends when the conversation inevitably turns this way.

What are Large Language Models?

Large Language Models (LLMs) are trained on large bodies of text to produce artificial intelligence (AI) models that can pass as human. People often cite the Turing Test as the measure of success: a machine passes if its responses cannot be distinguished from those of a human.

Google has led the way in developing LLMs, but many others have followed. Most people are now familiar with ChatGPT, in which an LLM is used to back a service that lets us ask questions about the world (or at least the world as represented by the body of text used to build the model).
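To make that concrete, here is a minimal sketch of generating text with a small open model via the Hugging Face transformers library. The model name ("gpt2") is an illustrative stand-in, not the model behind ChatGPT, and its output will be far less polished.

```python
# A minimal text-generation sketch using Hugging Face transformers.
# "gpt2" is a small illustrative model, not the model behind ChatGPT.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model continues the prompt one token at a time, each token drawn
# from a probability distribution learned from its training text.
result = generator("Large language models are", max_new_tokens=30)
print(result[0]["generated_text"])
```

Under the hood this is essentially all an LLM does: predict a plausible next token, append it, and repeat. Everything else is built on top of that loop.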

Where did the data come from to train them?

LLMs are limited by the data sets used to train them. A lot of data is needed for training, so most organisations scrape public bodies of text such as Wikipedia, LinkedIn, WordPress, Flickr and other public sites and blogs. The body of text used to train a model obviously affects its style and worldview, and many people have raised issues of diversity, or rather a lack of diversity, in the data used to train most of the large language models we have today.

The largest publicly available bodies of text are overwhelmingly contributed by white men, many living in the Global North. Wikipedia, for example, is mainly written by white men aged 25–35 in the United States, and as such it is missing large swaths of knowledge and world viewpoints. Until recently the page on Black history redirected to a page about the history of African Americans.

I don’t imagine that the model designers deliberately chose to include data from such a narrow slice of humanity, but a large body of text is needed to train models, and so the models have been built with what is currently available.

How are LLMs different from other human-like chatbots?

We’ve had chatbots and little helpers like Clippy the paperclip for many years, but large language models represent a new frontier in what computers can achieve. What distinguishes LLMs is the ability to hold multi-turn interactions, where there is more than one round of prompt and reply. Some can even hold extended conversations.

In order to hold a conversation, it is necessary to track objects or people through the conversation so that, for example, a woman mentioned at the beginning can simply be referred to as “she” much later on. Humans track these references (what linguists call coreference) quite easily, but until the advent of LLMs, AIs could not. It is this and other advances that make it so hard to distinguish the new language models from humans, and it is this that makes them so powerful and so dangerous.
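To see how that works in practice, here is a hedged sketch of a multi-turn exchange using the OpenAI Python client; the model name is an illustrative choice, and any chat-style LLM API follows the same pattern. The key detail is that the whole message history is resent with every turn, which is what lets the model resolve “she” back to the person named earlier.

```python
# Multi-turn chat sketch: the full history travels with every request, so the
# model can resolve "she" to the woman mentioned in an earlier turn.
# Model name is illustrative; assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()
messages = [
    {"role": "user", "content": "Ada Lovelace wrote about Babbage's Analytical Engine."},
]

reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
messages.append({"role": "assistant", "content": reply.choices[0].message.content})

# Without the earlier turns kept in `messages`, "she" would be unresolvable.
messages.append({"role": "user", "content": "What is she best known for?"})
reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(reply.choices[0].message.content)
```

The model itself is stateless between requests; the appearance of memory comes entirely from replaying the conversation so far.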

Can we remove bias and bad behaviour from the chat models?

The problems with large language models have been well documented. As well as more obvious abuses of the technology to oppress, control, or disenfranchise people, models that have been deployed for seemingly good purposes can display toxic behaviour or simply generate incorrect results that go unchecked.

One way to try to get around these problems is fine-tuning, sometimes called model specialisation: training the model further on a second, more specialised data set. This is a necessary step when productising a model to perform a task, such as automated customer service, that needs domain-specific knowledge. It is also an opportunity to provide human-guided feedback (as in reinforcement learning from human feedback, or RLHF) or to apply rule sets to try to eliminate some of the undesired behaviour from LLMs. You can, for example, train a model to use gender-neutral pronouns such as “they” when talking about someone of unknown gender, instead of defaulting to “he” for everyone as people often do in English. You can also provide feedback about other offensive language to improve the model.
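As a rough illustration, here is a minimal fine-tuning sketch using the Hugging Face Trainer. The base model, the corpus file name, and the hyperparameters are all illustrative assumptions, not a production recipe.

```python
# Fine-tuning sketch: specialise a small base model on a second, domain-specific
# corpus. Model name, file path, and hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A hypothetical curated corpus, e.g. vetted customer-service transcripts.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# mlm=False gives next-token-prediction labels, matching how LLMs are trained.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="specialised-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

Feedback-based methods such as RLHF work differently, scoring whole responses rather than next tokens, but a specialisation step like the one above is the usual starting point.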

It is not possible to completely eliminate all forms of conscious or unconscious bias from the model because it is not possible to do that for human-generated data. Models are further limited by the data that has been used to train them in the same way that one person is only the sum of their experiences. Model training can only go so far to change a model that has very strong biases in its training set.

What are LLMs good for?

So far LLMs have been suggested for applications such as improving customer service and understanding large bodies of text, but we haven’t even scratched the surface of what this amazing technology can do. As it develops it will affect more and more aspects of our lives, including education at all levels, personal lifestyles, and the workplace. LLMs will accelerate drug discovery and the development of life-saving medical procedures, and they’ll push the envelope in ways we haven’t previously been capable of.

Because AI can find patterns in data sets that are too large or too complex for human brains to effectively process, it will quickly expose knowledge that can benefit our world right now, instead of years down the road when we may have figured things out for ourselves. While LLMs aren’t a fix-all for the future, they’re an extremely promising technology that we’re privileged to watch develop from its infancy.

When thinking about how we should use LLMs, it is important to understand that they are really telling a story in response to a prompt. So when you ask ChatGPT, “How are we able to see in colour?”, what you are really saying is, “Tell me a story about human colour perception”. That story will be inspired by the data set used to train the model, but it won’t be filtered by the same cultural or moral constraints that we operate within as humans. It won’t be too worried about whether the answer is correct, and the model has no concept of the consequences of its actions.
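You can see the storytelling for yourself: with sampling enabled, the same question produces a different answer each time. A hedged sketch, reusing the small illustrative model from earlier; the temperature value is an arbitrary choice.

```python
# The same question, asked twice with sampling enabled, yields two different
# "stories". The model generates plausible text, not verified facts.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "How are we able to see in colour? "

for run in range(2):
    out = generator(prompt, max_new_tokens=40, do_sample=True, temperature=0.9)
    print(f"--- answer {run + 1} ---")
    print(out[0]["generated_text"])
```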

Because of the uncertainty in any response from an LLM, it is important to use LLMs only for tasks where a certain level of error can be tolerated, or where the response can be easily verified, contested, and corrected. This is a good policy to apply to all uses of technology in which a machine makes a decision that affects a living being in some way. As machines expand their influence over our lives, it becomes increasingly critical to maintain an understanding of their limitations and control over the way they interpret and act on data.

Who benefits from large language models?

Just as the training sets do not represent the full spectrum of humankind, it is likely that the applications of LLMs will not be evenly distributed to all. Most technology companies are guilty of what is called digital colonialism, where data, labour, and natural resources are mined from the Global South to create technology that largely benefits the lucky 1% of humanity in wealthier countries. So while I’d like to see better diversity in the data used to train models, I’d also like to see the benefits of that data being shared more equally.

Will the models continue to improve?

The more data and voices we can pour into these large language models, the more refined they will become. We are still developing the science and will certainly get better. One big problem for the technology right now is self-reference: an increasing amount of online content is itself generated by LLMs, and it is no longer possible to train a new LLM on public data without including large amounts of text generated by another LLM. Over time this will degrade the quality of the models.
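Here is a toy illustration of that feedback loop, with the obvious caveat that a one-dimensional Gaussian is a crude stand-in for an LLM: repeatedly fitting a model to samples drawn from the previous fit tends to narrow the distribution, losing the tails where rare knowledge lives.

```python
# Toy "model collapse" sketch: fit a Gaussian, sample from the fit, refit.
# On average the fitted spread shrinks generation by generation, losing the
# tails of the distribution; a crude analogue of LLMs trained on LLM output.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=0.0, scale=1.0, size=50)  # the original "human" data

for generation in range(31):
    mu, sigma = data.mean(), data.std()  # "train" a model on the current data
    if generation % 5 == 0:
        print(f"generation {generation:2d}: std = {sigma:.3f}")
    data = rng.normal(loc=mu, scale=sigma, size=50)  # next gen sees only synthetic data
```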

So the initial success of generative AI may be its downfall, as online content becomes polluted with generated data. Or it might usher in a new era of language models, as we are forced to think harder about the sources of our training data, improving both diversity and data provenance.


Dr Rosemary Francis

Chief Scientist for HPC at Altair. Fellow of the Royal Academy of Engineering. Member of the Raspberry Pi Foundation. Entrepreneur. Mum. Windsurfer.