The free tool, which creates what the team calls TLDRs (the common Internet acronym for ‘Too long, didn’t read’), was activated this week for search results at Semantic Scholar, a search engine created by the non-profit Allen Institute for Artificial Intelligence (AI2) in Seattle, Washington. For the moment, the software generates sentences only for the ten million computer-science papers covered by Semantic Scholar, but papers from other disciplines should be getting summaries in the next month or so, once the software has been fine-tuned, says Dan Weld, who manages the Semantic Scholar group at AI2…
Weld was inspired to create the TLDR software in part by the snappy sentences his colleagues share on Twitter to flag up articles. Like other language-generation software, the tool uses deep neural networks trained on vast amounts of text. The team trained it on tens of thousands of research papers paired with their titles, so that the network could learn to generate concise sentences. The researchers then fine-tuned the software to summarize content by training it on a new data set of a few thousand computer-science papers with matching summaries, some written by the papers’ authors and some by a class of undergraduate students. The team has gathered training examples to improve the software’s performance in 16 other fields, with biomedicine likely to come first.
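The two-stage recipe described above — plentiful (paper, title) pairs first, then a smaller set of (paper, summary) pairs — can be sketched in a few lines. This is an illustrative reconstruction, not AI2’s actual pipeline; the field names and toy records are hypothetical:

```python
# Hypothetical sketch of the two-stage training data described above.
# Stage 1: (paper text -> title) pairs teach the network to compress.
# Stage 2: (paper text -> TLDR) pairs specialize it for summaries.
def make_pairs(papers, target_field):
    """Build (source, target) pairs for sequence-to-sequence training."""
    return [(p["body"], p[target_field]) for p in papers if target_field in p]

papers = [
    {"body": "We propose a sparse attention mechanism for long documents ...",
     "title": "Efficient Attention for Long Documents",
     "tldr": "A sparse attention variant cuts memory use on long inputs."},
    {"body": "We survey transfer learning across NLP benchmarks ...",
     "title": "A Survey of Transfer Learning"},  # no TLDR: stage 1 only
]

stage1 = make_pairs(papers, "title")   # large, cheap supervision
stage2 = make_pairs(papers, "tldr")    # small, high-quality supervision
print(len(stage1), len(stage2))        # 2 1
```

The same model would first be trained on the abundant `stage1` pairs, then fine-tuned on the scarcer `stage2` pairs.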
The TLDR software is not the only scientific summarizing tool: since 2018, the website Paper Digest has offered summaries of papers, but it seems to extract key sentences from text, rather than generate new ones, Weld notes. TLDR can generate a sentence from a paper’s abstract, introduction and conclusion. Its summaries tend to be built from key phrases in the article’s text, so are aimed squarely at experts who already understand a paper’s jargon. But Weld says the team is working on generating summaries for non-expert audiences.
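The extraction-versus-generation distinction is worth making concrete. A minimal extractive baseline (roughly the approach attributed to Paper Digest, and not AI2’s method) scores each sentence by how frequent its words are in the whole document and returns the top sentence verbatim; an abstractive system like TLDR instead writes a new sentence. A toy sketch of the extractive side:

```python
import re
from collections import Counter

def extractive_summary(text):
    """Return the single sentence whose words are, on average, most
    frequent across the document. The sentence is copied verbatim --
    extraction, in contrast to abstractive generation of new text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sent):
        toks = re.findall(r"[a-z']+", sent.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    return max(sentences, key=score)

doc = ("Neural summarization generates new sentences. "
       "Extraction copies key sentences from text. "
       "Summarization of papers helps researchers.")
print(extractive_summary(doc))
```

Because the output is always a verbatim sentence from the source, an extractive summary inherits the paper’s jargon, which is one reason abstractive tools aim to rewrite for broader audiences.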