The era of predictive AI is almost over


Essay by Dean W. Ball: “Artificial intelligence is a Rorschach test. When OpenAI’s GPT-4 was released in March 2023, Microsoft researchers triumphantly, and prematurely, announced that it possessed “sparks” of artificial general intelligence. Cognitive scientist Gary Marcus, on the other hand, argued that Large Language Models like GPT-4 are nowhere close to the loosely defined concept of AGI. Indeed, Marcus is skeptical of whether these models “understand” anything at all. They “operate over ‘fossilized’ outputs of human language,” he wrote in a 2023 paper, “and seem capable of implementing some automatic computations pertaining to distributional statistics, but are incapable of understanding due to their lack of generative world models.” The “fossils” to which Marcus refers are the models’ training data — these days, something close to all the text on the Internet.

This notion — that LLMs are “just” next-word predictors based on statistical models of text — is so common now as to be almost a trope. It is used, both correctly and incorrectly, to explain the flaws, biases, and other limitations of LLMs. Most importantly, it is used by AI skeptics like Marcus to argue that there will soon be diminishing returns from further LLM development: We will get better and better statistical approximations of existing human knowledge, but we are not likely to see another qualitative leap toward “general intelligence.”

There are two problems with this deflationary view of LLMs. The first is that next-word prediction, at sufficient scale, can lead models to capabilities that no human designed or even necessarily intended — what some call “emergent” capabilities. The second problem is that increasingly — and, ironically, starting with ChatGPT — language models employ techniques that combust the notion of pure next-word prediction of Internet text…(More)”