Article by Carl Zimmer: “Scientists publish more than 10 million studies and other publications a year. Some of those findings will add to humanity’s storehouse of knowledge. But some will be wrong.
To assess a study, scientists can replicate it to see if they get the same result. But seven years ago, a team of hundreds of scientists set out to find a faster way to judge the reliability of new research. They built artificial intelligence systems to predict whether studies would hold up to scrutiny.
The project, funded by the Defense Advanced Research Projects Agency, or DARPA, was called Systematizing Confidence in Open Research and Evidence — SCORE, for short. The idea came from Adam Russell, then a program manager for the agency. He envisioned generating a kind of credit score for science.
“People can say, ‘Hey, this is likely to be robust, we can premise a policy on it,’” said Dr. Russell, who is now at the University of Southern California. “‘But this? Nah, this might make for a book in the airport.’”
The SCORE team inspected hundreds of studies, running many of them again, to better understand what makes research hold up. Now it is publishing a raft of papers on those efforts.
For now, a scientific credit score remains a dream, the researchers say. Artificial intelligence cannot make reliable predictions…
For more than 15 years, some scientists have been trying to change the culture. Among them is Brian Nosek, a psychologist and co-founder of the Center for Open Science, who started by documenting the extent of the problem. In the early 2010s, Dr. Nosek and his colleagues replicated 100 psychology studies and matched the original results only 39 percent of the time.
In another project, Dr. Nosek teamed up with cancer biologists to replicate 50 experiments on animals and human cells. Fewer than half of the results withstood their scrutiny…(More)”.