Paper by Eric Martínez, Francis Mollica and Edward Gibson: “Whereas principles of communicative efficiency and legal doctrine dictate that laws be comprehensible to the common world, empirical evidence suggests legal documents are largely incomprehensible to lawyers and laypeople alike. Here, a corpus analysis (n = 59 million words) first replicated and extended prior work revealing laws to contain strikingly higher rates of complex syntactic structures relative to six baseline genres of English. Next, two preregistered text generation experiments (n = 286) tested two leading hypotheses regarding how these complex structures enter into legal documents in the first place. In line with the magic spell hypothesis, we found people tasked with writing official laws wrote in a more convoluted manner than when tasked with writing unofficial legal texts of equivalent conceptual complexity. Contrary to the copy-and-edit hypothesis, we did not find evidence that people editing a legal document wrote in a more convoluted manner than when writing the same document from scratch. From a cognitive perspective, these results suggest law to be a rare exception to the general tendency in human language toward communicative efficiency. In particular, these findings indicate law’s complexity to be derived from its performativity, whereby low-frequency structures may be inserted to signal law’s authoritative, world-state-altering nature, at the cost of increased processing demands on readers. From a law and policy perspective, these results suggest that the tension between the ubiquity and impenetrability of the law is not an inherent one, and that laws can be simplified without a loss or distortion of communicative content…(More)”.
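The corpus analysis behind this finding comes down to counting how often complex syntactic structures occur per sentence in legal texts versus baseline genres. As a purely illustrative sketch (not the authors’ pipeline, and using a crude embedded-clause proxy rather than their specific measures), one could estimate such a rate with an off-the-shelf dependency parser; spaCy and its en_core_web_sm model are assumed to be installed:

```python
# Illustrative sketch (not the authors' pipeline): estimate the rate of
# embedded clauses per sentence as a rough proxy for syntactic complexity.
# Assumes spaCy and the en_core_web_sm model are installed.
import spacy

nlp = spacy.load("en_core_web_sm")

# Dependency labels that typically mark clauses nested inside other clauses.
EMBEDDED_CLAUSE_DEPS = {"relcl", "acl", "advcl", "ccomp", "csubj"}

def embedded_clause_rate(text: str) -> float:
    """Return the mean number of embedded clauses per sentence."""
    doc = nlp(text)
    sentences = list(doc.sents)
    if not sentences:
        return 0.0
    embedded = sum(1 for tok in doc if tok.dep_ in EMBEDDED_CLAUSE_DEPS)
    return embedded / len(sentences)

legal = ("The party of the first part, who, having been notified, "
         "failed to appear, shall forfeit the deposit that was paid.")
plain = "If you were notified and did not appear, you lose your deposit."

print(f"legal-style text: {embedded_clause_rate(legal):.2f} embedded clauses/sentence")
print(f"plain-style text: {embedded_clause_rate(plain):.2f} embedded clauses/sentence")
```

Run over whole genres rather than toy sentences, a comparison of this kind is what lets the paper contrast legal language with baseline English at corpus scale.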
The Power of Volunteers: Remote Mapping Gaza and Strategies in Conflict Areas
Blog by Jessica Pechmann: “…In Gaza, increased conflict since October 2023 has caused a prolonged humanitarian crisis. Understanding the impact of the conflict on buildings has been challenging, since pre-existing datasets from artificial intelligence and machine learning (AI/ML) models and OSM were not accurate enough to create a full building footprint baseline. The area’s buildings were too dense, and information on the ground was impossible to collect safely. In these hard-to-reach areas, HOT’s remote and crowdsourced mapping methodology was a good fit for collecting detailed information visible on aerial imagery.
In February 2024, after consultation with humanitarian and UN actors working in Gaza, HOT decided to create a pre-conflict dataset of all building footprints in the area in OSM. HOT’s community of OpenStreetMap volunteers did all the data work, coordinating through HOT’s Tasking Manager. The volunteers made meticulous edits to add missing data and to improve existing data. Due to protection and data quality concerns, only expert volunteer teams were assigned to map and validate the area. As in other areas that are hard to reach due to conflict, HOT balanced the data needs with responsible data practices based on the context.
Comparing AI/ML with human-verified OSM building datasets in conflict zones
AI/ML is becoming an increasingly common and quick way to obtain building footprints across large areas. Sources for automated building footprints range from worldwide datasets by Microsoft or Google to smaller-scale open community-managed tools such as HOT’s new application, fAIr.
Now that HOT volunteers have completely updated and validated all OSM buildings in visible imagery pre-conflict, OSM has 18% more individual buildings in the Gaza Strip than Microsoft’s ML buildings dataset (estimated 330,079 buildings vs 280,112 buildings). However, in contexts where there has not been a coordinated update effort in OSM, the numbers may differ. For example, in Sudan, where there has not been a large organized editing campaign, there are just under 1,500,000 buildings in OSM, compared to over 5,820,000 buildings in Microsoft’s ML data. It is important to note that the ML datasets have not been human-verified and their accuracy is not known. Google Open Buildings has over 26 million building features in Sudan, but on visual inspection, many of these features are noise in the data that the model incorrectly identified as buildings in the uninhabited desert…(More)”.
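As a rough illustration of this kind of dataset comparison (a hedged sketch, not HOT’s actual workflow; the file names are hypothetical placeholders and GeoPandas is assumed to be installed), counting and sanity-checking footprints from two layers covering the same area might look like this:

```python
# Hedged sketch: compare building counts between a human-verified OSM extract
# and an ML-generated footprint layer for the same area of interest.
# File names are hypothetical placeholders; assumes GeoPandas is installed and
# both layers use (or can be reprojected to) a common CRS.
import geopandas as gpd

osm_buildings = gpd.read_file("gaza_osm_buildings.geojson")  # human-verified OSM export
ml_buildings = gpd.read_file("gaza_ml_buildings.geojson")    # ML-derived footprint layer

# Align coordinate reference systems before any spatial comparison.
ml_buildings = ml_buildings.to_crs(osm_buildings.crs)

osm_count = len(osm_buildings)
ml_count = len(ml_buildings)
print(f"OSM buildings: {osm_count:,}")
print(f"ML buildings:  {ml_count:,}")
print(f"OSM has {100 * (osm_count - ml_count) / ml_count:.1f}% more features than the ML layer")

# A simple plausibility check on the ML layer: tiny slivers are often noise.
ml_area = ml_buildings.to_crs(epsg=3857).geometry.area  # rough areas in m^2
print(f"ML footprints under 10 m^2 (possible noise): {(ml_area < 10).sum():,}")
```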
Relational ethics in health care automation
Paper by Frances Shaw and Anthony McCosker: “Despite the transformative potential of automation and clinical decision support technology in health care, there is growing urgency for more nuanced approaches to ethics. Relational ethics is an approach that can guide the responsible use of a range of automated decision-making systems including the use of generative artificial intelligence and large language models as they affect health care relationships.
There is an urgent need for sector-wide training and scrutiny regarding the effects of automation using relational ethics touchstones, such as patient-centred health care, informed consent, patient autonomy, shared decision-making, empathy and the politics of care.
The purpose of this review is to offer a provocation for health care practitioners, managers and policy makers to consider the use of automated tools in practice settings and examine how these tools might affect relationships and hence care outcomes…(More)”.
Governing mediation in the data ecosystem: lessons from media governance for overcoming data asymmetries
Chapter by Stefaan Verhulst in Handbook of Media and Communication Governance edited by Manuel Puppis, Robin Mansell, and Hilde Van den Bulck: “The internet and the accompanying datafication were heralded to usher in a golden era of disintermediation. Instead, the modern data ecology witnessed a process of remediation, or ‘hyper-mediation’, resulting in governance challenges, many of which underlie broader socioeconomic difficulties. In particular, the rise of data asymmetries and silos creates new forms of scarcity and dominance with deleterious political, economic and cultural consequences. Responding to these challenges requires a new data governance framework, focused on unlocking data and developing a more data pluralistic ecosystem. We argue for regulation and policy focused on promoting data collaboratives, an emerging form of cross-sectoral partnership; and on the establishment of data stewards, individuals/groups tasked with managing and responsibly sharing organizations’ data assets. Some regulatory steps are discussed, along with the various ways in which these two emerging stakeholders can help alleviate data scarcities and their associated problems…(More)”
Civic Monitoring for Environmental Law Enforcement
Book by Anna Berti Suman: “This book presents a thought-provoking inquiry demonstrating how civic environmental monitoring can support law enforcement. It provides an in-depth analysis of applicable legal frameworks and conventions such as the Aarhus Convention, with an enlightening discussion on the civic right to contribute environmental information.
Civic Monitoring for Environmental Law Enforcement discusses multi- and interdisciplinary research into how civil society uses monitoring techniques to gather evidence of environmental issues. The book argues that civic monitoring is a constructive approach for finding evidence of environmental wrongdoings and for leveraging this evidence in different institutional fora, including judicial proceedings and official reporting for environmental protection agencies. It also reveals the challenges and implications associated with a greater reliance on civic monitoring practices by institutions and society at large.
Adopting original methodological approaches to drive inspiration for further research, this book is an invaluable resource for students and scholars of environmental governance and regulation, environmental law, politics and policy, and science and technology studies. It is also beneficial to civil society actors, civic initiatives, legal practitioners, and policymakers working in institutions engaged in the application of environmental law…(More)”
Using AI to Map Urban Change
Brief by Tianyuan Huang, Zejia Wu, Jiajun Wu, Jackelyn Hwang, Ram Rajagopal: “Cities are constantly evolving, and better understanding those changes facilitates better urban planning and infrastructure assessments and leads to more sustainable social and environmental interventions. Researchers currently use data such as satellite imagery to study changing urban environments and what those changes mean for public policy and urban design. But flaws in the current approaches, such as inadequately granular data, limit their scalability and their potential to inform public policy across social, political, economic, and environmental issues.
Street-level images offer an alternative source of insights. These images are frequently updated and high-resolution. They also directly capture what’s happening on a street level in a neighborhood or across a city. Analyzing street-level images has already proven useful to researchers studying socioeconomic attributes and neighborhood gentrification, both of which are essential pieces of information in urban design, sustainability efforts, and public policy decision-making for cities. Yet, much like other data sources, street-level images present challenges: accessibility limits, shadow and lighting issues, and difficulties scaling up analysis.
To address these challenges, our paper “CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series” introduces a multicity dataset of labeled street-view images and proposes a novel artificial intelligence (AI) model to detect urban changes such as gentrification. We demonstrate the change-detection model’s effectiveness by testing it on images from Seattle, Washington, and show that it can provide important insights into urban changes over time and at scale. Our data-driven approach has the potential to allow researchers and public policy analysts to automate and scale up their analysis of neighborhood and citywide socioeconomic change…(More)”.
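For readers curious about the mechanics, the general idea behind this kind of change detection can be sketched as comparing deep image features of the same location across time. The following is an illustrative sketch only, not the CityPulse model; it assumes PyTorch and torchvision, and the image paths and threshold are hypothetical placeholders:

```python
# Illustrative sketch of the general idea behind street-view change detection
# (not the CityPulse model itself): embed two images of the same location taken
# years apart with a pretrained CNN and flag large feature distances as
# candidate urban change. Assumes PyTorch and torchvision; image paths and the
# threshold are hypothetical placeholders.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet50_Weights.DEFAULT
backbone = models.resnet50(weights=weights)
backbone.fc = torch.nn.Identity()   # drop the classifier; keep 2048-d features
backbone.eval()

preprocess = weights.transforms()   # the resize/normalize pipeline the weights expect

def embed(path: str) -> torch.Tensor:
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0)

before = embed("corner_2014.jpg")   # hypothetical street-view image, earlier year
after = embed("corner_2022.jpg")    # same location, later year

distance = 1 - torch.nn.functional.cosine_similarity(before, after, dim=0)
CHANGE_THRESHOLD = 0.3              # would be tuned on labeled pairs in practice
print(f"feature distance: {distance:.3f}")
print("candidate change" if distance > CHANGE_THRESHOLD else "likely unchanged")
```

In practice, a model like the one described in the paper is trained on labeled before/after pairs rather than relying on a hand-set threshold, but the time-series comparison of street-level views is the core idea.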
This is AI’s brain on AI
Article by Alison Snyder: “Data to train AI models increasingly comes from other AI models in the form of synthetic data, which can fill in chatbots’ knowledge gaps but also destabilize them.
The big picture: As AI models expand in size, their need for data becomes insatiable — but high quality human-made data is costly, and growing restrictions on the text, images and other kinds of data freely available on the web are driving the technology’s developers toward machine-produced alternatives.
State of play: AI-generated data has been used for years to supplement data in some fields, including medical imaging and computer vision, that use proprietary or private data.
- But chatbots are trained on public data collected from across the internet that is increasingly being restricted — while at the same time, the web is expected to be flooded with AI-generated content.
Those constraints and the decreasing cost of generating synthetic data are spurring companies to use AI-generated data to help train their models.
- Meta, Google, Anthropic and others are using synthetic data — alongside human-generated data — to help train the AI models that power their chatbots.
- Google DeepMind’s new AlphaGeometry 2 system that can solve math Olympiad problems is trained from scratch on synthetic data…(More)”
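For a sense of what “using synthetic data alongside human-generated data” can mean in practice, here is a minimal, hedged sketch of the mixing step; the generator function is a hypothetical placeholder rather than any particular provider’s API, and the ratio shown is arbitrary:

```python
# Minimal sketch of mixing synthetic and human-written training data.
# `generate_synthetic_example` is a hypothetical placeholder for a call to an
# existing model; the point is the controlled mix ratio and the provenance tag,
# not any particular provider's API.
import random

def generate_synthetic_example(seed_text: str) -> dict:
    # Placeholder: in practice this would prompt an existing model to produce
    # a paraphrase, a question-answer pair, etc., based on the seed text.
    return {"text": f"[synthetic variation of] {seed_text}", "source": "synthetic"}

def build_training_mix(human_examples: list[str], synthetic_ratio: float = 0.3) -> list[dict]:
    """Blend human data with model-generated data at a fixed ratio."""
    human = [{"text": t, "source": "human"} for t in human_examples]
    n_synthetic = int(len(human) * synthetic_ratio / (1 - synthetic_ratio))
    synthetic = [generate_synthetic_example(random.choice(human_examples))
                 for _ in range(n_synthetic)]
    mix = human + synthetic
    random.shuffle(mix)
    return mix

corpus = ["Example human-written passage one.", "Example human-written passage two.",
          "Example human-written passage three."]
for row in build_training_mix(corpus, synthetic_ratio=0.25):
    print(row["source"], "|", row["text"])
```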
A.I. May Save Us, or May Construct Viruses to Kill Us
Article by Nicholas Kristof: “Here’s a bargain of the most horrifying kind: For less than $100,000, it may now be possible to use artificial intelligence to develop a virus that could kill millions of people.
That’s the conclusion of Jason Matheny, the president of the RAND Corporation, a think tank that studies security matters and other issues.
“It wouldn’t cost more to create a pathogen that’s capable of killing hundreds of millions of people versus a pathogen that’s only capable of killing hundreds of thousands of people,” Matheny told me.
In contrast, he noted, it could cost billions of dollars to produce a new vaccine or antiviral in response…
In the early 2000s, some of us worried about smallpox being reintroduced as a bioweapon if the virus were stolen from the labs in Atlanta and in Russia’s Novosibirsk region that have retained the virus since the disease was eradicated. But with synthetic biology, now it wouldn’t have to be stolen.
Some years ago, a research team created a cousin of the smallpox virus, horse pox, in six months for $100,000, and with A.I. it could be easier and cheaper to refine the virus.
One reason biological weapons haven’t been much used is that they can boomerang. If Russia released a virus in Ukraine, it could spread to Russia. But a retired Chinese general has raised the possibility of biological warfare that targets particular races or ethnicities (probably imperfectly), which would make bioweapons much more useful. Alternatively, it might be possible to develop a virus that would kill or incapacitate a particular person, such as a troublesome president or ambassador, if one had obtained that person’s DNA at a dinner or reception.
Assessments of ethnic-targeting research by China are classified, but they may be why the U.S. Defense Department has said that the most important long-term threat of biowarfare comes from China.
A.I. has a more hopeful side as well, of course. It holds the promise of improving education, reducing auto accidents, curing cancers and developing miraculous new pharmaceuticals.
One of the best-known benefits is in protein folding, which can lead to revolutionary advances in medical care. Scientists used to spend years or decades figuring out the shapes of individual proteins, and then a Google initiative called AlphaFold was introduced that could predict the shapes within minutes. “It’s Google Maps for biology,” Kent Walker, president of global affairs at Google, told me.
Scientists have since used updated versions of AlphaFold to work on pharmaceuticals including a vaccine against malaria, one of the greatest killers of humans throughout history.
So it’s unclear whether A.I. will save us or kill us first…(More)”.
Supporting Scientific Citizens
Article by Lisa Margonelli: “What do nuclear fusion power plants, artificial intelligence, hydrogen infrastructure, and drinking water recycled from human waste have in common? Aside from being featured in this edition of Issues, they all require intense public engagement to choose among technological tradeoffs, safety profiles, and economic configurations. Reaching these understandings requires researchers, engineers, and decisionmakers who are adept at working with the public. It also requires citizens who want to engage with such questions and can articulate what they want from science and technology.
This issue offers a glimpse into what these future collaborations might look like. To train engineers with the “deep appreciation of the social, cultural, and ethical priorities and implications of the technological solutions engineers are tasked with designing and deploying,” University of Michigan nuclear engineer Aditi Verma and coauthors Katie Snyder and Shanna Daly asked their first-year engineering students to codesign nuclear power plants in collaboration with local community members. Although traditional nuclear engineering classes avoid “getting messy,” Verma and colleagues wanted students to engage honestly with the uncertainties of the profession. In the process of working with communities, the students’ vocabulary changed; they spoke of trust, respect, and “love” for community—even when considering deep geological waste repositories…(More)”.
Is peer review failing its peer review?
Article by First Principles: “Ivan Oransky doesn’t sugar-coat his answer when asked about the state of academic peer review: “Things are pretty bad.”
As a distinguished journalist in residence at New York University and co-founder of Retraction Watch – a site that chronicles the growing number of papers being retracted from academic journals – Oransky is better positioned than just about anyone to make such a blunt assessment.
He elaborates further, citing a range of factors contributing to the current state of affairs. These include the publish-or-perish mentality, chatbot ghostwriting, predatory journals, plagiarism, an overload of papers, a shortage of reviewers, and weak incentives to attract and retain reviewers.
“Things are pretty bad and they have been bad for some time because the incentives are completely misaligned,” Oransky told FirstPrinciples in a call from his NYU office.
Things are so bad that a new world record was set in 2023: more than 10,000 research papers were retracted from academic journals. In a troubling development, 19 journals closed after being inundated by a barrage of fake research from so-called “paper mills” that churn out the scientific equivalent of clickbait, and one scientist holds the current record of 213 retractions to his name.
“The numbers don’t lie: Scientific publishing has a problem, and it’s getting worse,” Oransky and Retraction Watch co-founder Adam Marcus wrote in a recent opinion piece for The Washington Post. “Vigilance against fraudulent or defective research has always been necessary, but in recent years the sheer amount of suspect material has threatened to overwhelm publishers.”..(More)”.