Why are humans smarter than llamas: limitations of modern AI

In the past 30 years, Artificial Intelligence (AI) has improved by leaps and bounds. Suitably programmed computers are very well adapted to particular tasks and can outperform humans at many of them. On the flip side, AI still falls short of human performance in many respects. A human can see a single instance of a new animal and identify that same animal in a different scenario with relative ease (so-called one-shot learning), whereas current AIs would need thousands, if not millions, of examples to learn the same thing. A human can learn one task and then move on to learning another, whereas this seems to cause AI great trouble. A human brain consumes approximately 0.3 kWh a day, whereas ChatGPT requires about the same amount of energy to answer a single request. For many tasks, even state-of-the-art AI has a long way to go to outperform humans.

Why is that? After all, most modern AI architectures are based on neural networks (NNs), and these were originally designed to simulate a human brain. In this article, Francesco Di Lallo takes a look at some of the challenges in realising an ‘artificial general intelligence’:

How brains and NNs work

The brain works by receiving some input, for example from the eyes, which then provides some electrical stimulation to neurons in the visual cortex. Once a neuron has hit a certain threshold of stimulation, it activates and sends a signal to other neurons connected to it via synapses. This process then repeats as electrical signals are fired through the brain. Eventually, the input signal is processed, and a decision is reached. For example: input – lion running towards you; decision – run away.

An NN works in a somewhat similar way. It receives an input, such as an image, which passes through a series of nodes (neurons) interconnected by synapses. Each of these nodes and synapses modifies the incoming signal until it reaches the output, where a decision is made. For example, if the input is an image of a lion, you want the network to accurately classify it as a lion rather than as a frog.

This description of the brain and neural network is obviously an over-simplification, but it is sufficient to discuss the topic at hand.
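
For readers who would like to see this made concrete, the sketch below (in Python) shows the ‘forward pass’ just described: a made-up two-layer network turning a handful of input numbers into two class scores. The sizes, weights and input values are arbitrary placeholders, not a trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# A tiny, made-up network: 4 input features -> 3 hidden units -> 2 output scores
# ("lion" and "frog"). The weights are random placeholders, not a trained model.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)

x = np.array([0.9, 0.1, 0.4, 0.7])    # a stand-in for image features
hidden = relu(W1 @ x + b1)            # each hidden "neuron" sums its weighted inputs
scores = W2 @ hidden + b2             # output scores for the two classes
prediction = ["lion", "frog"][int(np.argmax(scores))]
print(scores, prediction)
```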

A neural network learns by seeing lots and lots of examples and making tiny adjustments to various parameters in its network. In the lion-versus-frog scenario, the network sees numerous pictures of lions and frogs; for each one it makes a guess and compares that guess with the right answer. If the guess is correct, it will ‘reinforce’ the neural pathways that produced it; if the guess is incorrect, it will ‘weaken’ those same pathways. This process is called back-propagation and is similar to how computer scientists in the 1960s understood the brain to learn.
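
As an illustration of that guess-compare-adjust loop, here is a minimal sketch of training a tiny network with back-propagation on made-up data standing in for ‘lion’ and ‘frog’ examples. The data, network size and learning rate are all arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two clusters of points standing in for "lion" (label 1) and "frog" (label 0).
X = np.vstack([rng.normal(1.5, 0.5, size=(50, 2)),
               rng.normal(-1.5, 0.5, size=(50, 2))])
y = np.concatenate([np.ones(50), np.zeros(50)])

# One hidden layer of 8 units, initialised with small random weights.
W1, b1 = rng.normal(scale=0.5, size=(8, 2)), np.zeros(8)
w2, b2 = rng.normal(scale=0.5, size=8), 0.0
lr = 0.1   # how big each "tiny adjustment" is

def forward(X):
    Z1 = X @ W1.T + b1                            # weighted sums in the hidden layer
    H = np.maximum(0.0, Z1)                       # ReLU activations
    p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))      # predicted probability of "lion"
    return Z1, H, p

for epoch in range(200):
    Z1, H, p = forward(X)          # forward pass: make a guess for every example

    # Backward pass (back-propagation): compare the guesses with the right answers
    # and nudge every parameter slightly in the direction that reduces the error.
    dz2 = (p - y) / len(y)
    grad_w2, grad_b2 = H.T @ dz2, dz2.sum()
    dZ1 = np.outer(dz2, w2) * (Z1 > 0)
    grad_W1, grad_b1 = dZ1.T @ X, dZ1.sum(axis=0)

    w2 -= lr * grad_w2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

_, _, p = forward(X)
print("training accuracy:", ((p > 0.5) == y).mean())
```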

Can we throw bigger computers and more data at the problem?

The human brain has approximately 100 billion neurons and 100 trillion synapses. Loosely speaking, that could be compared to a model with approximately 100 trillion trainable parameters. For comparison, Meta’s ‘Llama 3’ – the namesake of this article – ‘only’ has 70 billion trainable parameters. Surely the solution is to scale up Llama 3, or some other model, to 100 trillion parameters to achieve the same intelligence as a human?

Not quite.

A fundamental theorem in machine learning is that neural networks can deal with arbitrarily complicated tasks. More precisely, a network with even a single hidden layer can approximate any continuous function to any desired accuracy (a result due to Cybenko in 1989). Therefore, in theory, given any ‘well-behaved’ task, a sufficiently large NN will be able to solve it. While mathematicians might be content with the theoretical existence of an appropriate neural network, the challenge lies in actually building such a network. This proves to be easier said than done. There are limitations to how effective modern architectures and learning algorithms can be.
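
Before turning to those limitations, here is a toy illustration of the ‘universal approximation’ idea: a network with a single hidden layer learning to approximate a simple continuous function such as sine. This sketch uses the scikit-learn library; the width of the layer and the training settings are arbitrary choices.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Approximate a continuous function (here sin) with a single hidden layer,
# in the spirit of the universal approximation theorem.
X = np.linspace(-np.pi, np.pi, 2000).reshape(-1, 1)
y = np.sin(X).ravel()

net = MLPRegressor(hidden_layer_sizes=(200,), activation="tanh",
                   max_iter=5000, random_state=0)
net.fit(X, y)

print("max absolute error:", np.max(np.abs(net.predict(X) - y)))
```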

One such limitation is the vanishing gradient problem. This occurs in deep networks: the error signal that back-propagation sends backwards through the layers shrinks as it goes, so the parameters that occur early in the network become exceedingly difficult to adjust. For a deep enough network, adding more and more layers (giving rise to more parameters) is fruitless because the early parameters become effectively untrainable. There are ways to try to circumvent this issue, such as batch normalisation and gradient clipping, but the issue persists. Simply making a network bigger is not going to solve the problem of artificial general intelligence.
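
The sketch below gives a feel for the effect using a deliberately deep chain of single-unit sigmoid layers (the depth and weights are arbitrary). With these weights, the gradient signal heading towards the earlier layers is multiplied by a factor of at most 0.25 per layer, so it shrinks geometrically and the earliest layers barely learn.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

depth = 50                   # a deliberately deep chain of single-unit layers
weights = np.ones(depth)     # all weights set to 1 for simplicity

# Forward pass through the chain, remembering every activation.
a, activations = 0.5, []
for w in weights:
    a = sigmoid(w * a)
    activations.append(a)

# Backward pass: the gradient flowing towards earlier layers is multiplied by
# sigmoid'(z) * w at every layer; with these weights that factor is at most 0.25.
grad = 1.0
for layer in reversed(range(depth)):
    grad *= activations[layer] * (1 - activations[layer]) * weights[layer]
    if layer in (depth - 1, depth // 2, 0):
        print(f"gradient signal reaching layer {layer}: {grad:.3e}")
```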

To add to the problem, there are further practical restrictions on NNs’ effectiveness, such as the amount of available training data and computing resources. It is estimated that if ChatGPT had been trained on a single GPU, it would have taken approximately 355 years.

Since NNs require large amounts of data to train them, the availability of good data has become paramount. Google’s recent AI Overview tool is a case in point. Fuelled by information and data it has found on the web, it shows a quick summary of answers to search questions at the top of Google Search. However, it will be clear to anyone who has spent any time on the internet that the web is full of nonsense. Consequently, AI Overview helpfully suggested improving health by “eat[ing] at least one small rock a day”, without realising that it had pulled that quote from the satirical newspaper The Onion. It is not only a matter of having a lot of data: the data must also be reliable for an AI to train on it effectively.

Rigid computers versus malleable brains

The brain has the ability – known as neuroplasticity – to rewire itself and behave in new ways. Compared to the human brain, most modern NNs have a rigid structure: they do not have the ability to ‘grow’ new nodes or form new synaptic connections. Techniques such as forget gates and weight pruning exist, but they do not yet come close to the neuroplasticity of the brain. Neuroplasticity seems to be a fundamental aspect of human learning and intelligence, and the comparative rigidity of NNs seems to be a severe limitation on their ability to mimic humans.
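
As an aside, weight pruning in its simplest form just zeroes out the weakest connections in a weight matrix. The sketch below applies it to a random matrix purely for illustration; real pruning schemes are applied to trained networks and are usually followed by fine-tuning.

```python
import numpy as np

# Magnitude-based weight pruning: remove the weakest connections.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))                # stand-in for a layer's weight matrix

threshold = np.quantile(np.abs(W), 0.5)    # prune the weakest 50% of connections
mask = np.abs(W) >= threshold
W_pruned = W * mask

print(f"connections kept: {mask.sum()} of {W.size}")
```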

By construction, an NN fits the problem it has been given. An NN that has been trained to read handwriting will not be able to accurately classify animals without substantial re-training; a human is able to do both with relative ease. A lot of research is being done in multi-task learning but, beyond the naïve implementation of having different networks for different tasks, not much progress has been made.

One obvious issue facing a more generalisable network is simply how to feed it the information: if you want an AI that can both play Tetris and translate English into French, the network needs to be able to receive a series of words as input as well as an image – and that turns out not to be easy.

Other attempts at training multi-task NNs have run into the difficulty of retaining old information while learning new information. This came to light when DeepMind attempted to train an NN to play 57 different Atari 2600 games. In March 2020, DeepMind managed to train an NN that performed above the human baseline in all 57 games – but the challenges in doing so highlight the scale of the ‘general intelligence’ problem.
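
This ‘forgetting’ effect is easy to reproduce on toy data. In the sketch below, a small network is trained on one made-up task, then trained only on a second task, after which its accuracy on the first task collapses back towards chance. The tasks, network size and settings are invented purely for illustration; the example uses scikit-learn.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Task A: the label depends only on the first feature.
# Task B: the label depends only on the second feature.
X_a = rng.normal(size=(500, 2))
y_a = (X_a[:, 0] > 0).astype(int)
X_b = rng.normal(size=(500, 2))
y_b = (X_b[:, 1] > 0).astype(int)

net = MLPClassifier(hidden_layer_sizes=(16,), learning_rate_init=0.01, random_state=0)

# Learn task A first.
net.partial_fit(X_a, y_a, classes=[0, 1])
for _ in range(300):
    net.partial_fit(X_a, y_a)
print("task A accuracy after learning A:", net.score(X_a, y_a))

# Now learn task B, with no further exposure to task A.
for _ in range(300):
    net.partial_fit(X_b, y_b)
print("task A accuracy after learning B:", net.score(X_a, y_a))
```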

Do neural networks really learn like the brain?

Solving the problem of NN architecture is not the end of the story. Another major limitation is how the network is trained. NNs rely on an omniscient external algorithm that tells the network the correct answer, how close to or far from that answer its guess was, and how to go about changing the network parameters.

Without entering a theological discussion, it would appear that this is not what happens in the brain, which has no such rigid environment providing the exact error signals required for back-propagation. Brains may instead have some self-regulatory process within them that can assign credit to a ‘good answer’. A recent Oxford study proposed a new mechanism called ‘prospective configuration’, illustrated with the example of a bear turning up at a river to fish. On seeing the river, the bear has learnt to expect to hear water and smell salmon. One day, however, the bear arrives at the river with blocked ears and so cannot hear the water. Under classical back-propagation, the parameters linking the visual and auditory neurons would be reduced, but doing so would also reduce the parameters between the visual and olfactory neurons. This reduction would cause catastrophic interference and compromise the bear’s expectation of smelling the salmon on its next visit.

This is obviously not what happens in real life. Instead, the study proposes that the learning algorithm first lets neurons settle into a prospective configuration: a configuration that reduces interference between pieces of information during learning, so that weight modifications do not disrupt what has already been learnt quite so drastically. Prospective configuration is a relatively new idea, but it already shows promising improvements in biologically relevant scenarios such as online learning (where the network learns immediately after each experience), continual learning with multiple tasks, learning in changing environments, learning with a limited number of training examples (including one-shot learning), and reinforcement learning. It will be interesting to see this area develop further.

The future

Contemporary AIs perform well at individual, specific tasks, whether that is playing a game of chess, translating an email into a different language, or finding coffee shops near you, but they are not ‘generally intelligent’. There are at least two obstacles in current AI research on the path to artificial general intelligence: NNs’ lack of neuroplasticity, and the difficulty of training them efficiently. Overcoming either of these obstacles is likely to have a big impact on progress towards artificial general intelligence.

At Barker Brettell we are following developments in AI and machine learning with great interest. We regularly deal with inventions at the cutting edge of AI and machine learning, so if you need advice on protecting intellectual property relating to AI and machine learning, please get in touch with us.
