Capturing AI’s Attention: Physics of Repetition, Hallucination, Bias and Beyond

We derive a first-principles physics theory of the AI engine at the heart of LLMs’ ‘magic’ (e.g. ChatGPT, Claude): the basic Attention head. The theory allows a quantitative analysis of outstanding AI challenges such as output repetition, hallucination and harmful content, and bias (e.g. from training and fine-tuning). Its predictions are consistent with large-scale LLM outputs. Its 2-body form suggests why LLMs work so well, but hints that a generalized 3-body Attention would make such AI work even better. Its similarity to a spin-bath means that existing Physics expertise could immediately be harnessed to help Society ensure AI is trustworthy and resilient to manipulation.

Frank Yingjie Huo, Neil F. Johnson

Read preprint >>
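
For readers unfamiliar with the object being analysed in the paper above, the short Python sketch below implements a standard single scaled-dot-product Attention head. It is not the paper's derivation; it simply makes visible the pairwise query-key products that the abstract's "2-body form" refers to. All dimensions and weights are arbitrary illustrative choices.

# A minimal sketch of a single "basic Attention head" (scaled dot-product
# self-attention). NOT the paper's derivation: it only illustrates the
# pairwise (2-body) query-key interactions the abstract refers to.
# Dimensions and weights are arbitrary illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def attention_head(X, W_q, W_k, W_v):
    """Single-head self-attention over token embeddings X of shape (n, d)."""
    Q = X @ W_q                                  # queries, shape (n, d_k)
    K = X @ W_k                                  # keys,    shape (n, d_k)
    V = X @ W_v                                  # values,  shape (n, d_v)
    d_k = Q.shape[-1]
    # Pairwise (2-body) token-token scores: every query meets every key.
    scores = Q @ K.T / np.sqrt(d_k)              # shape (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                           # attention-weighted mix of values

# Toy usage: 5 tokens with 8-dimensional embeddings.
n, d, d_k, d_v = 5, 8, 4, 4
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(d, d_k)),
                 rng.normal(size=(d, d_k)),
                 rng.normal(size=(d, d_v)))
out = attention_head(X, W_q, W_k, W_v)
print(out.shape)   # (5, 4): one context-mixed vector per token
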

Coevolution of network and attitudes under competing propaganda machines

NPJ Complexity

Politicization of the COVID-19 vaccination debate has led to polarization of opinions on the topic. We present a theoretical model of this debate on Facebook, in which agents form opinions through information they receive from other agents with flexible opinions and from politically motivated entities such as media or interest groups. The model captures the co-evolution of opinions and network structure under similarity-dependent social influence, together with random network rewiring and random opinion change. We show that attitudinal polarization can be avoided if agents (1) connect to agents across the full opinion spectrum, (2) receive information from many sources before changing their opinions, (3) frequently change opinions at random, and (4) frequently connect to friends of friends. High Kleinberg authority scores among politically motivated media, together with two network components of comparable size, can indicate the onset of attitudinal polarization.

Mikhail Lipatov, Lucia Illari, Neil Johnson, Sergey Gavrilets

View article >>
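
The sketch below is a minimal, hypothetical agent-based simulation in the spirit of the model described above: agents hold opinions in [-1, 1], are influenced by sufficiently similar neighbours and by two fixed "propaganda" nodes, occasionally change opinion at random, and rewire either to a friend of a friend or to a random agent. The specific update rules, parameter values, and the Erdos-Renyi starting graph are assumptions made for illustration, not the authors' exact model.

# Illustrative opinion/network coevolution sketch (assumptions noted above).
import random
import networkx as nx

random.seed(1)

N_AGENTS = 200
P_EDGE = 0.05            # Erdos-Renyi edge density (assumption)
SIMILARITY_CUTOFF = 0.8  # only sufficiently similar neighbours influence (assumption)
INFLUENCE = 0.1
P_RANDOM_OPINION = 0.01  # rule (3): occasional random opinion change
P_TRIADIC = 0.5          # rule (4): rewire to a friend of a friend
STEPS = 2000

G = nx.erdos_renyi_graph(N_AGENTS, P_EDGE, seed=1)
opinion = {i: random.uniform(-1, 1) for i in G.nodes}
propaganda = [-1.0, 1.0]   # two fixed-opinion "propaganda machines" (assumption)

for _ in range(STEPS):
    i = random.choice(list(G.nodes))

    # Opinion update: move toward the average of similar neighbours
    # plus one randomly chosen propaganda source.
    sources = [opinion[j] for j in G.neighbors(i)
               if abs(opinion[j] - opinion[i]) < SIMILARITY_CUTOFF]
    sources.append(random.choice(propaganda))
    opinion[i] += INFLUENCE * (sum(sources) / len(sources) - opinion[i])

    if random.random() < P_RANDOM_OPINION:          # rule (3)
        opinion[i] = random.uniform(-1, 1)

    # Rewiring: drop one neighbour, then connect to a friend of a friend
    # or to a random non-neighbour.
    nbrs = list(G.neighbors(i))
    if nbrs:
        G.remove_edge(i, random.choice(nbrs))
        if random.random() < P_TRIADIC:              # rule (4)
            candidates = {k for j in nbrs for k in G.neighbors(j)} - {i} - set(nbrs)
        else:
            candidates = set(G.nodes) - {i} - set(nbrs)
        if candidates:
            G.add_edge(i, random.choice(list(candidates)))

# Crude polarization proxy: spread of opinions at the end of the run.
print(max(opinion.values()) - min(opinion.values()))
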