- Capturing AI’s Attention: Physics of Repetition, Hallucination, Bias and Beyond
We derive a first-principles physics theory of the AI engine at the heart of LLMs’ ‘magic’ (e.g. ChatGPT, Claude): the basic Attention head. The theory allows a quantitative analysis of outstanding AI challenges such as output repetition, hallucination and harmful content, and bias (e.g. from training and fine-tuning). Its predictions are consistent with large-scale LLM outputs. Its 2-body form suggests why LLMs work so well, but hints that a generalized 3-body Attention would make such AI work even better. Its similarity to a spin-bath means that existing Physics expertise could immediately be harnessed to help Society ensure AI is trustworthy and resilient to manipulation.
Frank Yingjie Huo, Neil F. Johnson
- Preventing the Spread of Online Harms: Physics of Contagion across Multi-Platform Social Media and Metaverses
We present a minimal yet empirically-grounded theory for the spread of online harms (e.g. misinformation, hate) across current multi-platform social media and future Metaverses. New physics emerges from the interplay between the intrinsic heterogeneity among online communities and platforms, their clustering dynamics generated through user-created links and sudden moderator shutdowns, and the contagion process. The theory provides an online ‘R-nought’ criterion to prevent system-wide spreading; it predicts re-entrant spreading phases; it establishes the level of digital vaccination required for online herd immunity; and it can be applied at multiple scales.
Chen Xu, Pak Ming Hui, Om K. Jha, Chenkai Xia, Neil F. Johnson
- Online Group Dynamics Reveal New Gel Science
A better understanding of how support evolves online for undesirable behaviors such as extremism and hate could help mitigate future harms. Here we show how the highly irregular growth curves of groups supporting two high-profile extremism movements can be accurately described if we generalize existing gelation models to account for the facts that the number of potential recruits is time-dependent and humans are heterogeneous. This leads to a novel generalized Burgers equation that describes these groups’ temporal evolution, and predicts a critical influx rate for potential recruits beyond which such groups will not form. Our findings offer a new approach to managing undesirable groups online (and, more broadly, managing the sudden appearance and growth of large macroscopic aggregates in a complex system) by manipulating their onset and engineering their growth curves.
Pedro D. Manrique, Sara El Oud, Neil F. Johnson