Article by Cal Newport: “Much of the euphoria and dread swirling around today’s artificial-intelligence technologies can be traced back to January, 2020, when a team of researchers at OpenAI published a thirty-page report titled “Scaling Laws for Neural Language Models.” The team was led by the A.I. researcher Jared Kaplan, and included Dario Amodei, who is now the C.E.O. of Anthropic. They investigated a fairly nerdy question: What happens to the performance of language models when you increase their size and the intensity of their training?
Back then, many machine-learning experts thought that, after they had reached a certain size, language models would effectively start memorizing the answers to their training questions, which would make them less useful once deployed. But the OpenAI paper argued that these models would only get better as they grew, and indeed that such improvements might follow a power law—an aggressive curve that resembles a hockey stick. The implication: if you keep building larger language models, and you train them on larger data sets, they’ll start to get shockingly good. A few months after the paper, OpenAI seemed to validate the scaling law by releasing GPT-3, which was ten times larger—and leaps and bounds better—than its predecessor, GPT-2.
Suddenly, the theoretical idea of artificial general intelligence, which performs as well as or better than humans on a wide variety of tasks, seemed tantalizingly close. If the scaling law held, A.I. companies might achieve A.G.I. by pouring more money and computing power into language models. Within a year, Sam Altman, the chief executive at OpenAI, published a blog post titled “Moore’s Law for Everything,” which argued that A.I. will take over “more and more of the work that people now do” and create unimaginable wealth for the owners of capital. “This technological revolution is unstoppable,” he wrote. “The world will change so rapidly and drastically that an equally drastic change in policy will be needed to distribute this wealth and enable more people to pursue the life they want.”
It’s hard to overstate how completely the A.I. community came to believe that it would inevitably scale its way to A.G.I. In 2022, Gary Marcus, an A.I. entrepreneur and an emeritus professor of psychology and neural science at N.Y.U., pushed back on Kaplan’s paper, noting that “the so-called scaling laws aren’t universal laws like gravity but rather mere observations that might not hold forever.” The negative response was fierce and swift. “No other essay I have ever written has been ridiculed by as many people, or as many famous people, from Sam Altman and Greg Brockman to Yann LeCun and Elon Musk,” Marcus later reflected. He recently told me that his remarks essentially “excommunicated” him from the world of machine learning. Soon, ChatGPT would reach a hundred million users faster than any digital service in history; in March, 2023, OpenAI’s next release, GPT-4, vaulted so far up the scaling curve that it inspired a Microsoft research paper titled “Sparks of Artificial General Intelligence.” Over the following year, venture-capital spending on A.I. jumped by eighty per cent…(More)”.