There is no doubt that AI is having a huge impact on the world. Statements from the likes of McKinsey and Goldman Sachs have predicted huge economic effects, though some more recent studies, such as one from the MIT Sloan School of Management, are more nuanced. ChatGPT claimed 800 million users as of mid-2025 and is currently the fifth most visited website, ahead of Wikipedia. According to Crunchbase, 45% of all venture capital was directed to AI investments in 2024, rising to 53% in Q1 2025.
The high level of interest and excitement around AI has yet to translate into public trust. Only 46% of people trust AI, according to a large 2025 survey by KPMG and the University of Melbourne of 48,000 people across 47 countries, even though the same survey found that 66% of respondents used AI regularly. Interestingly, this represents a significant drop in trust from an earlier edition of the same survey.
There are many factors contributing to this. Large language models (LLMs) underlie all the major generative AI platforms, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini and Meta’s Llama. These are probabilistic creatures, using layers of neural networks with billions of parameters, trained on massive swathes of data, to come up with answers to the questions people pose. An LLM can write an essay or a poem, draw an image or create a video, write program code or compose a song. Its answers are fluent and usually plausible, but they are not necessarily factual. From time to time, LLMs fabricate answers or respond with nonsense (“hallucinations”), which is a problem in situations that require reliability. If you don’t like an AI poem, no harm is done, but if you rely on an LLM to write your court submission to a judge and it hallucinates fake precedent cases, then you may have a problem. One database that tracks legal AI issues already lists over two hundred court cases in which LLMs have misbehaved in this way. A 2025 investigation by the BBC found that over half of AI answers to factual questions about the news were flawed, containing factual errors, made-up BBC quotes or real BBC quotes that had been doctored. With general hallucination rates of around 20% or more, and by some measures getting worse, this is a major issue for trust in AI.
A second element of trust is explainability and transparency. If an AI gives you an answer to a question, can it at least explain its logic, or the sources it used? There are various claims in this regard, such as asking an AI to show its chain of reasoning step by step. Do this and an LLM will indeed produce a sequence of logic, but that explanation is itself a fiction: plausible though it may seem, it bears no relationship to the model’s internal processes. Artificial neural networks are inherently black boxes whose processing simply cannot be analysed in those terms. Researchers have tried highlighting which “tokens” or inputs most influence a particular output, but such attribution does not capture anything resembling a reasoning process.
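To make the attribution idea concrete, here is a minimal sketch of one common approach, occlusion: mask each input token in turn and measure how much the output moves. Everything in it, from the toy vocabulary to the random scoring model, is invented purely for illustration; real attribution tools work on genuine models, but they share the same limitation noted in the comments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model: random embeddings plus a linear scoring head.
VOCAB = ["the", "cat", "sat", "on", "mat", "[MASK]"]
EMBED = {tok: rng.normal(size=8) for tok in VOCAB}
W = rng.normal(size=8)

def score(tokens):
    """Toy 'model output': one scalar from the mean of the token embeddings."""
    pooled = np.mean([EMBED[t] for t in tokens], axis=0)
    return float(W @ pooled)

def occlusion_attribution(tokens):
    """Influence of each position = how much the score drops when it is masked."""
    base = score(tokens)
    return [(t, base - score(tokens[:i] + ["[MASK]"] + tokens[i + 1:]))
            for i, t in enumerate(tokens)]

prompt = ["the", "cat", "sat", "on", "the", "mat"]
for token, influence in occlusion_attribution(prompt):
    print(f"{token:>4s}  {influence:+.3f}")
# The scores say which tokens move the output most, but reveal nothing about
# any step-by-step "reasoning" -- which is exactly the limitation described above.
```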
An LLM’s answer to a prompt is driven by massive pattern matching over its training data. The model translates text into “tokens”, each mapped to a numerical representation called a vector embedding. These embedded tokens pass through “transformer layers”, each of which has a self-attention mechanism that allows the model to weight the importance of each token relative to the others, linking related words or concepts in the input prompt. The results are further processed by a neural network, and the whole process is repeated across dozens of layers, each refining the representation further. The model then predicts the next token in the sequence, one at a time: it chooses a likely next word, appends it to its answer, and continues word by word until the answer is complete. What is happening is pattern recognition, not logical deduction. You can try to improve the outcomes by supplying external documents at query time (retrieval-augmented generation), but the underlying process is identical. This is why an LLM’s answers cannot be explained in the sense that a human could explain the step-by-step logic they used to solve a problem. LLMs are opaque. The maxim “never apologise, never explain” has been attributed to British Prime Minister Benjamin Disraeli, and LLMs could be said to follow it, at least in part: they frequently apologise when an error is pointed out, but they are fundamentally incapable of explaining why they made it.
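The loop described above can be sketched in a few dozen lines. The code below is a deliberately tiny, untrained toy, with a made-up vocabulary, random weights and a single attention layer, so its output is gibberish; it exists only to show the shape of the process: embed the tokens, mix them with self-attention, score every possible next token, sample one, append it and repeat.

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny stand-in for an LLM. Real models use billions of parameters and
# dozens of transformer layers; the shape of the generation loop is the same.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
D = 16
EMBED = rng.normal(size=(len(VOCAB), D))        # token -> vector ("embedding")
W_Q, W_K, W_V = (rng.normal(size=(D, D)) for _ in range(3))
W_OUT = rng.normal(size=(D, len(VOCAB)))        # vector -> scores over the vocabulary

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def next_token_probs(token_ids):
    x = EMBED[token_ids]                        # embed the prompt so far
    q, k, v = x @ W_Q, x @ W_K, x @ W_V
    attn = softmax(q @ k.T / np.sqrt(D))        # self-attention: weight each token
    x = attn @ v                                #   relative to the others
    logits = x[-1] @ W_OUT                      # score every possible next token
    return softmax(logits)

def generate(prompt_ids, n_tokens=5, temperature=1.0):
    ids = list(prompt_ids)
    for _ in range(n_tokens):
        p = next_token_probs(np.array(ids))
        p = softmax(np.log(p) / temperature)    # temperature reshapes the distribution
        ids.append(rng.choice(len(VOCAB), p=p)) # sample the next token, one at a time
    return [VOCAB[i] for i in ids]

print(generate([0, 1], n_tokens=4))             # e.g. starting from "the cat"
```

Nothing in that loop resembles a chain of logical steps; the model only ever asks “what token is likely to come next?”.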
Other issues may also be affecting public trust in AI. One is the vast processing power required to both train and run LLMs, which is driving a boom in data centre construction with its own environmental impact: around 3% of all electricity is now consumed by data centres, and that share is growing rapidly. Another is the concern that AI may be taking away jobs, although here the picture is complex, with new jobs being created as well as some disappearing. A third is bias: AI models are heavily dependent on their training data, and have been shown to reproduce in their answers whatever biases exist in that data.
A key factor that may undermine public trust in AI is the inconsistency of LLMs. They are often used to answer questions, perhaps to research some facts or analyse some data. However, if you ask an LLM the same question many times, you will not necessarily get the same answer each time: they are probabilistic in nature. Many people seem unaware of this basic truth. People are used to computer systems behaving reliably and consistently, giving the same answer to a calculation time after time, day after day. That is not what LLMs do. Even the very latest models do not have high consistency rates: one study of LLMs answering medical questions found consistency rates of 45% to 75% across the models tested, and other studies have found similar results.
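A toy sketch of why this happens: the model ends each prompt with a probability distribution over possible continuations and, at typical settings, samples from it rather than always taking the top choice. The candidate answers and probabilities below are invented purely for illustration.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng()

# Invented example: suppose that after a medical prompt the model's probability
# distribution over candidate answers looks like this (numbers are made up).
candidates = ["Drug A", "Drug B", "Drug C"]
probs = np.array([0.55, 0.30, 0.15])

def ask(temperature=1.0):
    """Sample one answer, as a default (non-zero temperature) setting would."""
    p = np.exp(np.log(probs) / temperature)
    p /= p.sum()
    return rng.choice(candidates, p=p)

runs = [ask() for _ in range(20)]
top_answer, top_count = Counter(runs).most_common(1)[0]
print(runs)
print(f"Most common answer '{top_answer}' appeared in {top_count}/20 runs")
# At temperature 0 the top candidate would be chosen every time, but default
# settings sample, so the "same question" yields different answers run to run.
```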
These problems of reliability, bias, consistency and more are a major obstacle to the deployment of AI across society. Governments and businesses seek to gain productivity advantages from AI, but this will be an uphill struggle if the underlying technology is flawed. A critical element is AI education. LLMs in particular are a relatively new phenomenon outside the world of AI research, and most people are only beginning to get used to their quirks and capabilities. There is no doubt that generative AI has a wide range of applications, but much of the key to success will be applying it where it is well suited and avoiding or restricting it where it is not. We now have the bizarre situation in which 15% of US employees pretend to use AI in order to please their boss.
History offers no shortage of promising technologies that were rejected by the public over trust issues. The airship industry was derailed by the Hindenburg disaster; asbestos was once a widely used construction product before its dangers emerged; the nuclear industry was seriously set back by Three Mile Island, Chernobyl and Fukushima. So far, AI has not had such a dramatic moment, though there have been plenty of embarrassing incidents. Only through much better education of the public and of politicians about how AI works can we expect better regulation and more productive use of generative AI. Generative AI is a technology of great promise, but public trust is a fragile thing.