Academic writing is getting harder to read—the humanities most of all


The Economist: “Academics have long been accused of jargon-filled writing that is impossible to understand. A recent cautionary tale was that of Ally Louks, a researcher who set off a social media storm with an innocuous post on X celebrating the completion of her PhD. If it was Ms Louks’s research topic (“olfactory ethics”—the politics of smell) that caught the attention of online critics, it was her verbose thesis abstract that further provoked their ire. In two weeks, the post received more than 21,000 retweets and 100m views.

Although the abuse directed at Ms Louks reeked of misogyny and anti-intellectualism—which she admirably shook off—the reaction was also a backlash against an academic use of language that is removed from normal life. Inaccessible writing is part of the problem. Research has become harder to read, especially in the humanities and social sciences. Though authors may argue that their work is written for expert audiences, much of the general public suspects that some academics use gobbledygook to disguise the fact that they have nothing useful to say. The trend towards more opaque prose hardly allays this suspicion…(More)”.

Behaviour-based dependency networks between places shape urban economic resilience


Paper by Takahiro Yabe et al: “Disruptions, such as closures of businesses during pandemics, not only affect businesses and amenities directly but also influence how people move, spreading the impact to other businesses and increasing the overall economic shock. However, it is unclear how much businesses depend on each other during disruptions. Leveraging human mobility data and same-day visits in five US cities, we quantify dependencies between points of interest encompassing businesses, stores and amenities. We find that dependency networks computed from human mobility exhibit significantly higher rates of long-distance connections and biases towards specific pairs of point-of-interest categories. We show that using behaviour-based dependency relationships improves the predictability of business resilience during shocks by around 40% compared with distance-based models, and that neglecting behaviour-based dependencies can lead to underestimation of the spatial cascades of disruptions. Our findings underscore the importance of measuring complex relationships in patterns of human mobility to foster urban economic resilience to shocks…(More)”.
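The excerpt describes dependency networks inferred from same-day visits to pairs of points of interest. A minimal sketch of that general idea — not the authors' exact method; the function name, input format, and the conditional-visit metric are illustrative assumptions:

```python
from collections import defaultdict
from itertools import combinations

def covisit_dependency(visits):
    """Estimate directed dependency weights between points of interest (POIs)
    from same-day visit records.

    visits: iterable of (user_id, day, poi) tuples.
    Returns dict: (poi_a, poi_b) -> share of poi_a's visit-days on which the
    same user also visited poi_b (a simple co-visit dependency proxy).
    """
    # Group each user's visited POIs by day
    by_user_day = defaultdict(set)
    for user, day, poi in visits:
        by_user_day[(user, day)].add(poi)

    poi_days = defaultdict(int)   # user-days on which each POI was visited
    pair_days = defaultdict(int)  # user-days on which both POIs were visited
    for pois in by_user_day.values():
        for p in pois:
            poi_days[p] += 1
        for a, b in combinations(sorted(pois), 2):
            pair_days[(a, b)] += 1
            pair_days[(b, a)] += 1

    return {(a, b): pair_days[(a, b)] / poi_days[a] for (a, b) in pair_days}
```

With edges weighted this way, a simulated closure of one POI can be propagated along outgoing edges to estimate the spatial cascade of a disruption — the effect the paper finds distance-based models underestimate.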

How Your Car Might Be Making Roads Safer


Article by Kashmir Hill: “Darcy Bullock, a civil engineering professor at Purdue University, turns to his computer screen to get information about how fast cars are traveling on Interstate 65, which runs 887 miles from Lake Michigan to the Gulf of Mexico. It’s midafternoon on a Monday, and his screen is mostly filled with green dots indicating that traffic is moving along nicely. But near an exit on the outskirts of Indianapolis, an angry red streak shows that cars have stopped moving.

A traffic camera nearby reveals the cause: A car has spun out, causing gridlock.

In recent years, vehicles that have wireless connectivity have become a critical source of information for transportation departments and for academics who study traffic patterns. The data these vehicles emit — including speed, how hard they brake and accelerate, and even if their windshield wipers are on — can offer insights into dangerous road conditions, congestion or poorly timed traffic signals.

“Our cars know more about our roads than agencies do,” said Dr. Bullock, who regularly works with the Indiana Department of Transportation to conduct studies on how to reduce traffic congestion and increase road safety. He credits connected-car data with detecting hazards that would have taken years — and many accidents — to find in the past.

The data comes primarily from commercial trucks and from cars made by General Motors that are enrolled in OnStar, G.M.’s internet-connected service. (Drivers know OnStar as the service that allows them to lock their vehicles from a smartphone app or find them if they have been stolen.) Federal safety guidelines require commercial truck drivers to be routinely monitored, but people driving G.M. vehicles may be surprised to know that their data is being collected, though it is indicated in the fine print of the company’s privacy policy…(More)”.

How Years of Reddit Posts Have Made the Company an AI Darling


Article by Sarah E. Needleman: “Artificial-intelligence companies were one of Reddit’s biggest frustrations last year. Now they are a key source of growth for the social-media platform. 

These companies have an insatiable appetite for online data to train their models and display content in an easy-to-digest format. In mid-2023, Reddit, a social-media veteran and IPO newbie, turned off the spigot and began charging some businesses for access to its data. 

It turns out that Reddit’s ever-growing 19-year warehouse of user commentary makes it an attractive resource for AI companies. The platform recently reported its first quarterly profit as a publicly traded company, thanks partly to data-licensing deals it made in the past year with OpenAI and Google.

Reddit Chief Executive and co-founder Steve Huffman has said the company had to stop giving away its valuable data to the world’s largest companies for free. 

“It is an arms race,” he said at The Wall Street Journal’s Tech Live conference in October. “But we’re in talks with just about everybody, so we’ll see where these things land.”

Reddit’s huge amount of data works well for AI companies because it is organized by topics and uses a voting system instead of an algorithm to sort content quality, and because people’s posts tend to be candid.

For the first nine months of 2024, Reddit’s revenue category that includes licensing grew to $81.6 million from $12.3 million a year earlier.

While data-licensing revenue remains dwarfed by Reddit’s core advertising sales, the new category’s rapid growth reveals a potentially lucrative business line with relatively high margins.

Diversifying away from a reliance on advertising, while tapping into an AI-adjacent market, has also made Reddit attractive to investors who are searching for new exposure to the latest technology boom. Reddit’s stock has more than doubled in the past three months.

The source of Reddit’s newfound wealth is the burgeoning market for AI-useful data. Reddit’s willingness to sell its data to AI outfits makes it stand out, because there is only a finite amount of data available for AI companies to gobble up for free or purchase. Some executives and researchers say the industry’s need for high-quality text could outstrip supply within two years, potentially slowing AI’s development…(More)”.

Setting the Standard: Statistical Agencies’ Unique Role in Building Trustworthy AI


Article by Corinna Turbes: “As our national statistical agencies grapple with new challenges posed by artificial intelligence (AI), many agencies face intense pressure to embrace generative AI as a way to reach new audiences and demonstrate technological relevance. However, the rush to implement generative AI applications risks undermining these agencies’ fundamental role as authoritative data sources. Statistical agencies’ foundational mission—producing and disseminating high-quality, authoritative statistical information—requires a more measured approach to AI adoption.

Statistical agencies occupy a unique and vital position in our data ecosystem, entrusted with creating the reliable statistics that form the backbone of policy decisions, economic planning, and social research. The work of these agencies demands exceptional precision, transparency, and methodological rigor. Implementation of generative AI interfaces, while technologically impressive, could inadvertently compromise the very trust and accuracy that make these agencies indispensable.

While public-facing interfaces play a valuable role in democratizing access to statistical information, statistical agencies need not—and often should not—rely on generative AI to be effective in that effort. For statistical agencies, an extractive AI approach, which retrieves and presents existing information from verified databases rather than generating new content, offers a more appropriate path forward. By pulling from verified, structured datasets and providing precise, accurate responses, extractive AI systems can maintain the high standards of accuracy required while making statistical information more accessible to users who may find traditional databases overwhelming. An extractive, rather than generative, approach allows agencies to modernize data delivery while preserving their core mission of providing reliable, verifiable statistical information…(More)”

Bad data costs Americans trillions. Let’s fix it with a renewed data strategy


Article by Nick Hart & Suzette Kent: “Over the past five years, the federal government lost $200 billion to $500 billion per year to fraud and improper payments — that’s up to $3,000 taken from every working American’s pocket annually. Since 2003, these preventable losses have totaled an astounding $2.7 trillion. But here’s the good news: We already have the data and technology to largely eliminate this waste in the years ahead. The operational structure and legal authority to put these tools to work protecting taxpayer dollars need to be refreshed and prioritized.

The challenge is straightforward: Government agencies often can’t effectively share and verify basic information before sending payments. For example, federal agencies may not be able to easily check if someone is deceased, verify income or detect duplicate payments across programs…(More)”.
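The duplicate-payment check the authors describe is, at its core, a cross-program join on recipient and payment period. A toy sketch of that idea — the record fields and function name are illustrative, not any agency's actual schema:

```python
from collections import defaultdict

def find_duplicate_payments(payments):
    """Flag recipients paid by more than one program for the same period --
    the kind of cross-program verification the article says agencies often
    cannot run before sending payments.

    payments: list of dicts with keys: recipient_id, program, period, amount.
    Returns {(recipient_id, period): [payment records]} for overlaps.
    """
    by_key = defaultdict(list)
    for p in payments:
        by_key[(p["recipient_id"], p["period"])].append(p)
    # Keep only keys where two or more distinct programs paid out
    return {k: v for k, v in by_key.items()
            if len({p["program"] for p in v}) > 1}
```

In practice the hard part is not the join itself but the data-sharing agreements and identity resolution that let agencies run it across program boundaries, which is the gap the authors argue a renewed data strategy should close.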

Data for Better Governance: Building Government Analytics Ecosystems in Latin America and the Caribbean


Report by the World Bank: “Governments in Latin America and the Caribbean face significant development challenges, including insufficient economic growth, inflation, and institutional weaknesses. Overcoming these issues requires identifying systemic obstacles through data-driven diagnostics and equipping public officials with the skills to implement effective solutions.

Although public administrations in the region often have access to valuable data, they frequently fall short in analyzing it to inform decisions. The cost of this gap is substantial: inefficiencies in procurement, misdirected transfers, and poorly managed human resources result in an estimated waste of 4% of GDP, equivalent to 17% of all public spending.

The report “Data for Better Governance: Building Government Analytical Ecosystems in Latin America and the Caribbean” outlines a roadmap for developing government analytics, focusing on key enablers such as data infrastructure and analytical capacity, and offers actionable strategies for improvement…(More)”.

AI, huge hacks leave consumers facing a perfect storm of privacy perils


Article by Joseph Menn: “Hackers are using artificial intelligence to mine unprecedented troves of personal information dumped online in the past year, along with unregulated commercial databases, to trick American consumers and even sophisticated professionals into giving up control of bank and corporate accounts.

Armed with sensitive health information, calling records and hundreds of millions of Social Security numbers, criminals and operatives of countries hostile to the United States are crafting emails, voice calls and texts that purport to come from government officials, co-workers or relatives needing help, or familiar financial organizations trying to protect accounts instead of draining them.

“There is so much data out there that can be used for phishing and password resets that it has reduced overall security for everyone, and artificial intelligence has made it much easier to weaponize,” said Ashkan Soltani, executive director of the California Privacy Protection Agency, the only such state-level agency.

The losses reported to the FBI’s Internet Crime Complaint Center nearly tripled from 2020 to 2023, to $12.5 billion, and a number of sensitive breaches this year have only increased internet insecurity. The recently discovered Chinese government hacks of U.S. telecommunications companies AT&T, Verizon and others, for instance, were deemed so serious that government officials are being told not to discuss sensitive matters on the phone, some of those officials said in interviews. A Russian ransomware gang’s breach of Change Healthcare in February captured data on millions of Americans’ medical conditions and treatments, and in August, a small data broker, National Public Data, acknowledged that it had lost control of hundreds of millions of Social Security numbers and addresses now being sold by hackers.

Meanwhile, the capabilities of artificial intelligence are expanding at breakneck speed. “The risks of a growing surveillance industry are only heightened by AI and other forms of predictive decision-making, which are fueled by the vast datasets that data brokers compile,” U.S. Consumer Financial Protection Bureau Director Rohit Chopra said in September…(More)”.

Scientists Scramble to Save Climate Data from Trump—Again


Article by Chelsea Harvey: “Eight years ago, as the Trump administration was getting ready to take office for the first time, mathematician John Baez was making his own preparations.

Together with a small group of friends and colleagues, he was arranging to download large quantities of public climate data from federal websites in order to safely store them away. Then-President-elect Donald Trump had repeatedly denied the basic science of climate change and had begun nominating climate skeptics for cabinet posts. Baez, a professor at the University of California, Riverside, was worried the information — everything from satellite data on global temperatures to ocean measurements of sea-level rise — might soon be destroyed.

His effort, known as the Azimuth Climate Data Backup Project, archived at least 30 terabytes of federal climate data by the end of 2017.

In the end, it was an overprecaution.

The first Trump administration altered or deleted numerous federal web pages containing public-facing climate information, according to monitoring efforts by the nonprofit Environmental Data and Governance Initiative (EDGI), which tracks changes on federal websites. But federal databases, containing vast stores of globally valuable climate information, remained largely intact through the end of Trump’s first term.

Yet as Trump prepares to take office again, scientists are growing more worried.

Federal datasets may be in bigger trouble this time than they were under the first Trump administration, they say. And they’re preparing to begin their archiving efforts anew.

“This time around we expect them to be much more strategic,” said Gretchen Gehrke, EDGI’s website monitoring program lead. “My guess is that they’ve learned their lessons.”

The Trump transition team didn’t respond to a request for comment.

Like Baez’s Azimuth project, EDGI was born in 2016 in response to Trump’s first election. They weren’t the only ones…(More)”.

AI Investment Potential Index: Mapping Global Opportunities for Sustainable Development


Paper by AFD: “…examines the potential of artificial intelligence (AI) investment to drive sustainable development across diverse national contexts. By evaluating critical factors, including AI readiness, social inclusion, human capital, and macroeconomic conditions, we construct a nuanced and comprehensive analysis of the global AI landscape. Employing advanced statistical techniques and machine learning algorithms, we identify nations with significant untapped potential for AI investment.

We introduce the AI Investment Potential Index (AIIPI), a novel instrument designed to guide financial institutions, development banks, and governments in making informed, strategic AI investment decisions. The AIIPI synthesizes metrics of AI readiness with socio-economic indicators to identify and highlight opportunities for fostering inclusive and sustainable growth. The methodological novelty lies in the weight selection process, which combines statistical modeling with an entropy-based weighting approach. Furthermore, we provide detailed policy implications to support stakeholders in making targeted investments aimed at reducing disparities and advancing equitable technological development…(More)”.
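The excerpt does not detail the AIIPI's entropy-based weighting, but the standard entropy weight method for composite indices works as follows: indicators that vary more across countries carry more information and therefore receive larger weights. A stdlib-only sketch under that assumption (not the paper's exact procedure):

```python
import math

def entropy_weights(matrix):
    """Entropy weight method for a composite index.

    matrix: rows = countries, columns = indicators, values min-max
    normalised to [0, 1]. Returns one weight per indicator, summing to 1;
    indicators with lower entropy (more cross-country variation) get more.
    """
    n = len(matrix)       # number of countries (must be >= 2)
    m = len(matrix[0])    # number of indicators
    divergences = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col) or 1.0
        p = [x / total for x in col]
        # Shannon entropy of the column, scaled to [0, 1] by ln(n)
        e = -sum(x * math.log(x) for x in p if x > 0) / math.log(n)
        divergences.append(1.0 - e)  # higher = more informative indicator
    s = sum(divergences) or 1.0
    return [d / s for d in divergences]
```

The country-level index is then the weighted sum of each country's normalised indicators; the entropy step only replaces hand-picked weights with data-driven ones.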