Debugging Tech Journalism


Essay by Timothy B. Lee: “A huge proportion of tech journalism is characterized by scandals, sensationalism, and shoddy research. Can we fix it?”

In November, a few days after Sam Altman was fired — and then rehired — as CEO of OpenAI, Reuters reported on a letter that may have played a role in Altman’s ouster. Several staffers reportedly wrote to the board of directors warning about “a powerful artificial intelligence discovery that they said could threaten humanity.”

The discovery: an AI system called Q* that can solve grade-school math problems.

“Researchers consider math to be a frontier of generative AI development,” the Reuters journalists wrote. Large language models are “good at writing and language translation,” but “conquering the ability to do math — where there is only one right answer — implies AI would have greater reasoning capabilities resembling human intelligence.”

This was a bit of a head-scratcher. Computers have been able to perform arithmetic at superhuman levels for decades. The Q* project was reportedly focused on word problems, which have historically been harder than arithmetic for computers to solve. Still, it’s not obvious that solving them would unlock human-level intelligence.

The Reuters article left readers with a vague impression that Q* could be a huge breakthrough in AI — one that might even “threaten humanity.” But it didn’t provide readers with the context to understand what Q* actually was — or to evaluate whether feverish speculation about it was justified.

For example, the Reuters article didn’t mention research OpenAI published last May describing a technique for solving math problems by breaking them down into small steps. In a December article, I dug into this and other recent research to help illuminate what OpenAI is likely working on: a framework that would enable AI systems to search through a large space of possible solutions to a problem…(More)”.
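The general idea — breaking a problem into small steps and searching over the space of possible step sequences — can be illustrated with a toy sketch. This is purely an assumption-laden illustration of step-by-step search, not OpenAI’s actual system or anything from the Reuters report: it finds a chain of simple arithmetic moves (a made-up problem) by expanding the most promising partial solutions first.

```python
import heapq

def solve(start, target, max_steps=10):
    """Toy step-by-step search (illustrative only, not OpenAI's method).

    Find a sequence of small steps ("+3" or "*2") that turns `start`
    into `target`, exploring the most promising partial solutions first.
    """
    # Priority queue of (heuristic score, steps so far, current value);
    # lower score = closer to the target, so it is expanded sooner.
    frontier = [(abs(target - start), [], start)]
    seen = {start}
    while frontier:
        _, steps, value = heapq.heappop(frontier)
        if value == target:
            return steps  # a complete chain of steps
        if len(steps) >= max_steps:
            continue  # abandon partial solutions that run too long
        # Expand the partial solution by one small step in each direction.
        for label, nxt in (("+3", value + 3), ("*2", value * 2)):
            if nxt not in seen and nxt <= 2 * target + 3:
                seen.add(nxt)
                heapq.heappush(frontier, (abs(target - nxt), steps + [label], nxt))
    return None  # no step sequence found within the budget
```

For example, `solve(2, 19)` returns a list of steps that can be replayed to verify the answer — the point being that each individual step is trivial, and the hard part is searching the space of step sequences.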