Will Artificial Intelligence Replace Us or Empower Us?


Article by Peter Coy: “…But A.I. could also be designed to empower people rather than replace them, as I wrote a year ago in a newsletter about the M.I.T. Shaping the Future of Work Initiative.

Which of those A.I. futures will be realized was a big topic at the San Francisco conference, which was the annual meeting of the American Economic Association, the American Finance Association and 65 smaller groups in the Allied Social Science Associations.

Erik Brynjolfsson of Stanford was one of the busiest economists at the conference, dashing from one panel to another to talk about his hopes for a human-centric A.I. and his warnings about what he has called the “Turing Trap.”

Alan Turing, the English mathematician and World War II code breaker, proposed in 1950 to evaluate the intelligence of computers by whether they could fool someone into thinking they were human. His “imitation game” led the field in an unfortunate direction, Brynjolfsson argues — toward creating machines that behaved as much like humans as possible, instead of like human helpers.

Henry Ford didn’t set out to build a car that could mimic a person’s walk, so why should A.I. experts try to build systems that mimic a person’s mental abilities? Brynjolfsson asked at one session I attended.

Other economists have made similar points: Daron Acemoglu of M.I.T. and Pascual Restrepo of Boston University use the term “so-so technologies” for systems that replace human beings without meaningfully increasing productivity, such as self-checkout kiosks in supermarkets.

People will need a lot more education and training to take full advantage of A.I.’s immense power, so that they aren’t just elbowed aside by it. “In fact, for each dollar spent on machine learning technology, companies may need to spend nine dollars on intangible human capital,” Brynjolfsson wrote in 2022, citing research by him and others…(More)”.

AI Is Bad News for the Global South


Article by Rachel Adams: “…AI’s adoption in developing regions is also limited by its design. AI designed in Silicon Valley on largely English-language data is not often fit for purpose outside of wealthy Western contexts. The productive use of AI requires stable internet access or smartphone technology; in sub-Saharan Africa, only 25 percent of people have reliable internet access, and it is estimated that African women are 32 percent less likely to use mobile internet than their male counterparts.

Generative AI technologies are also predominantly developed using the English language, meaning that the outputs they produce for non-Western users and contexts are oftentimes useless, inaccurate, and biased. Innovators in the global south have to put in at least twice the effort to make their AI applications work for local contexts, often by retraining models on localized datasets and through extensive trial and error practices.

Where AI is designed to generate profit and entertainment only for the already privileged, it will not be effective in addressing the conditions of poverty and in changing the lives of groups that are marginalized from the consumer markets of AI. Without a high level of saturation across major industries, and without the infrastructure in place to enable meaningful access to AI by all people, global south nations are unlikely to see major economic benefits from the technology.

As AI is adopted across industries, human labor is changing. For poorer countries, this is engendering a new race to the bottom where machines are cheaper than humans and the cheap labor that was once offshored to their lands is now being onshored back to wealthy nations. The people most impacted are those with lower education levels and fewer skills, whose jobs can be more easily automated. In short, much of the population in lower- and middle-income countries may be affected, severely impacting the lives of millions of people and threatening the capacity of poorer nations to prosper…(More)”.

Behaviour-based dependency networks between places shape urban economic resilience


Paper by Takahiro Yabe et al: “Disruptions, such as closures of businesses during pandemics, not only affect businesses and amenities directly but also influence how people move, spreading the impact to other businesses and increasing the overall economic shock. However, it is unclear how much businesses depend on each other during disruptions. Leveraging human mobility data and same-day visits in five US cities, we quantify dependencies between points of interest encompassing businesses, stores and amenities. We find that dependency networks computed from human mobility exhibit significantly higher rates of long-distance connections and biases towards specific pairs of point-of-interest categories. We show that using behaviour-based dependency relationships improves the predictability of business resilience during shocks by around 40% compared with distance-based models, and that neglecting behaviour-based dependencies can lead to underestimation of the spatial cascades of disruptions. Our findings underscore the importance of measuring complex relationships in patterns of human mobility to foster urban economic resilience to shocks…(More)”.
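
To make the idea of a behaviour-based dependency concrete, here is a minimal sketch of one plausible way to turn same-day visits into directed dependency weights. The definition used here (the share of visitor-days at place i that also include place j) is an assumption for illustration, not necessarily the paper's exact formula, and the `dependency_weights` function and the toy visit records are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def dependency_weights(visits):
    """Return {(i, j): weight}, where weight is the share of visitor-days
    at place i that also include a visit to place j (assumed definition)."""
    # Group the places each person visited on each day.
    places_by_person_day = defaultdict(set)
    for person, day, place in visits:
        places_by_person_day[(person, day)].add(place)

    visit_days = defaultdict(int)     # visitor-days per place
    co_visit_days = defaultdict(int)  # visitor-days shared by an ordered pair
    for places in places_by_person_day.values():
        for place in places:
            visit_days[place] += 1
        for i, j in permutations(places, 2):
            co_visit_days[(i, j)] += 1

    return {(i, j): n / visit_days[i] for (i, j), n in co_visit_days.items()}


# Hypothetical records: (person_id, day, place)
visits = [
    ("a", 1, "cafe"), ("a", 1, "office"),
    ("b", 1, "cafe"), ("b", 1, "office"),
    ("c", 1, "office"), ("c", 1, "gym"),
    ("d", 1, "office"),
]
weights = dependency_weights(visits)
```

In this toy data every cafe visitor also goes to the office, but only half of the office visitors go to the cafe, so the weights are asymmetric. A distance-based model would not capture that asymmetry, nor links between places that are far apart.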

Big brother: the effects of surveillance on fundamental aspects of social vision


Paper by Kiley Seymour et al: “Despite the dramatic rise of surveillance in our societies, only limited research has examined its effects on humans. While most research has focused on voluntary behaviour, no study has examined the effects of surveillance on more fundamental and automatic aspects of human perceptual awareness and cognition. Here, we show that being watched on CCTV markedly impacts a hardwired and involuntary function of human sensory perception—the ability to consciously detect faces. Using the method of continuous flash suppression (CFS), we show that when people are surveilled (N = 24), they are quicker than controls (N = 30) to detect faces. An independent control experiment (N = 42) ruled out an explanation based on demand characteristics and social desirability biases. These findings show that being watched impacts not only consciously controlled behaviours but also unconscious, involuntary visual processing. Our results have implications concerning the impacts of surveillance on basic human cognition as well as public mental health…(More)”.

Data solidarity: Operationalising public value through a digital tool


Paper by Seliem El-Sayed, Ilona Kickbusch & Barbara Prainsack: “Most data governance frameworks are designed to protect the individuals from whom data originates. However, the impacts of digital practices extend to a broader population and are embedded in significant power asymmetries within and across nations. Further, inequities in digital societies impact everyone, not just those directly involved. Addressing these challenges requires an approach which moves beyond individual data control and is grounded in the values of equity and a just contribution of benefits and risks from data use. Solidarity-based data governance (in short: data solidarity), suggests prioritising data uses over data type and proposes that data uses that generate public value should be actively facilitated, those that generate significant risks and harms should be prohibited or strictly regulated, and those that generate private benefits with little or no public value should be ‘taxed’ so that profits generated by corporate data users are reinvested in the public domain. In the context of global health data governance, the public value generated by data use is crucial. This contribution clarifies the meaning, importance, and potential of public value within data solidarity and outlines methods for its operationalisation through the PLUTO tool, specifically designed to assess the public value of data uses…(More)”.

Kickstarting Collaborative, AI-Ready Datasets in the Life Sciences with Government-funded Projects


Article by Erika DeBenedictis, Ben Andrew & Pete Kelly: “In the age of Artificial Intelligence (AI), large high-quality datasets are needed to move the field of life science forward. However, the research community lacks strategies to incentivize collaboration on high-quality data acquisition and sharing. The government should fund collaborative roadmapping, certification, collection, and sharing of large, high-quality datasets in life science. In such a system, nonprofit research organizations engage scientific communities to identify key types of data that would be valuable for building predictive models, and define quality control (QC) and open science standards for collection of that data. Projects are designed to develop automated methods for data collection, certify data providers, and facilitate data collection in consultation with researchers throughout various scientific communities. Hosting of the resulting open data is subsidized as well as protected by security measures. This system would provide crucial incentives for the life science community to identify and amass large, high-quality open datasets that will immensely benefit researchers…(More)”.

The AI tool that can interpret any spreadsheet instantly


Article by Duncan C. McElfresh: “Say you run a hospital and you want to estimate which patients have the highest risk of deterioration so that your staff can prioritize their care. You create a spreadsheet in which there is a row for each patient, and columns for relevant attributes, such as age or blood-oxygen level. The final column records whether the person deteriorated during their stay. You can then fit a mathematical model to these data to estimate an incoming patient’s deterioration risk. This is a classic example of tabular machine learning, a technique that uses tables of data to make inferences. This usually involves developing — and training — a bespoke model for each task. Writing in Nature, Hollmann et al. report a model that can perform tabular machine learning on any data set without being trained specifically to do so.

Tabular machine learning shares a rich history with statistics and data science. Its methods are foundational to modern artificial intelligence (AI) systems, including large language models (LLMs), and its influence cannot be overstated. Indeed, many online experiences are shaped by tabular machine-learning models, which recommend products, generate advertisements and moderate social-media content. Essential industries such as healthcare and finance are also steadily, if cautiously, moving towards increasing their use of AI.

Despite the field’s maturity, Hollmann and colleagues’ advance could be revolutionary. The authors’ contribution is known as a foundation model, which is a general-purpose model that can be used in a range of settings. You might already have encountered foundation models, perhaps unknowingly, through AI tools, such as ChatGPT and Stable Diffusion. These models enable a single tool to offer varied capabilities, including text translation and image generation. So what does a foundation model for tabular machine learning look like?

Let’s return to the hospital example. With spreadsheet in hand, you choose a machine-learning model (such as a neural network) and train the model with your data, using an algorithm that adjusts the model’s parameters to optimize its predictive performance (Fig. 1a). Typically, you would train several such models before selecting one to use — a labour-intensive process that requires considerable time and expertise. And of course, this process must be repeated for each unique task.

Figure 1 | A foundation model for tabular machine learning. a, Conventional machine-learning models are trained on individual data sets using mathematical optimization algorithms. A different model needs to be developed and trained for each task, and for each data set. This practice takes years to learn and requires extensive time and computing resources. b, By contrast, a ‘foundation’ model could be used for any machine-learning task and is pre-trained on the types of data used to train conventional models. This type of model simply reads a data set and can immediately produce inferences about new data points. Hollmann et al. developed a foundation model for tabular machine learning, in which inferences are made on the basis of tables of data. Tabular machine learning is used for tasks as varied as social-media moderation and hospital decision-making, so the authors’ advance is expected to have a profound effect in many areas…(More)”
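
The conventional workflow in the hospital example (panel a of the figure) can be sketched in a few lines. This is a minimal illustration, not the method of Hollmann et al.: it assumes a hand-written logistic regression trained by gradient descent, and the patient table, the function names (`fit_logistic`, `predict_risk`) and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Train a bespoke logistic-regression model on one table."""
    n_rows, n_cols = len(rows), len(rows[0])
    # Standardize each column so one learning rate suits all features.
    cols = list(zip(*rows))
    means = [sum(c) / n_rows for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / n_rows) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    X = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    w, b = [0.0] * n_cols, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_cols, 0.0
        for x, y in zip(X, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j in range(n_cols):
                grad_w[j] += err * x[j]
            grad_b += err
        # Gradient step: adjust parameters to reduce the average log loss.
        w = [wi - lr * g / n_rows for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n_rows
    return {"w": w, "b": b, "means": means, "stds": stds}


def predict_risk(model, row):
    """Estimated probability of deterioration for one incoming patient."""
    x = [(v - m) / s for v, m, s in zip(row, model["means"], model["stds"])]
    return sigmoid(model["b"] + sum(wi * xi for wi, xi in zip(model["w"], x)))


# Hypothetical spreadsheet: (age, blood-oxygen %) and a deterioration flag.
patients = [(34, 98), (45, 97), (52, 96), (61, 95),
            (67, 93), (72, 91), (78, 89), (83, 88)]
deteriorated = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(patients, deteriorated)
high_risk = predict_risk(model, (85, 87))
low_risk = predict_risk(model, (30, 99))
```

Every new task or table would need its own `fit_logistic` call, and in practice several candidate models. That repeated training step is what a pre-trained tabular foundation model is meant to remove.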

Comparative perspectives on the regulation of large language models


Editorial to Special Issue by Cristina Poncibò and Martin Ebers: “Large language models (LLMs) represent one of the most significant technological advancements in recent decades, offering transformative capabilities in natural language processing and content generation. Their development has far-reaching implications across technological, economic and societal domains, simultaneously creating opportunities for innovation and posing profound challenges for governance and regulation. As LLMs become integral to various sectors, from education to healthcare to entertainment, regulators are scrambling to establish frameworks that ensure their safe and ethical use.

Our issue primarily examines the private ordering, regulatory responses and normative frameworks for LLMs from a comparative law perspective, with a particular focus on the European Union (EU), the United States (US) and China. An introductory part preliminarily explores the technical principles that underpin LLMs as well as their epistemological foundations. It also addresses key sector-specific legal challenges posed by LLMs, including their implications for criminal law, data protection and copyright law…(More)”.

Government reform starts with data, evidence


Article by Kshemendra Paul: “It’s time to strengthen the use of data, evidence and transparency to stop driving with mud on the windshield and to steer the government toward improving management of its programs and operations.

Existing Government Accountability Office and agency inspectors general reports identify thousands of specific evidence-based recommendations to improve efficiency, economy and effectiveness, and reduce fraud, waste and abuse. Many of these recommendations aim at program design and requirements, highlighting specific instances of overlap, redundancy and duplication. Others describe inadequate internal controls to balance program integrity with the experience of the customer, contractor or grantee. While progress is being reported in part due to stronger partnerships with IGs, much remains to be done. Indeed, GAO’s 2023 High Risk List, which it has produced going back to 1990, shows surprisingly slow progress of efforts to reduce risk to government programs and operations.

Here are a few examples:

  • GAO estimates recent annual fraud of between $233 billion to $521 billion, or about 3% to 7% of federal spending. On the other hand, identified fraud with high-risk Recovery Act spending was held under 1% using data, transparency and partnerships with Offices of Inspectors General.
  • GAO and IGs have collectively identified hundreds of billions in potential cost savings or improvements not yet addressed by federal agencies.
  • GAO has recently described shortcomings with the government’s efforts to build evidence. While federal policymakers need good information to inform their decisions, the Commission on Evidence-Based Policymaking previously said, “too little evidence is produced to meet this need.”

One of the main reasons for agency sluggishness is the lack of agency and governmentwide use of synchronized, authoritative and shared data to support how the government manages itself.

For example, the Energy Department IG found that, “[t]he department often lacks the data necessary to make critical decisions, evaluate and effectively manage risks, or gain visibility into program results.” It is past time for the government to commit itself to move away from its widespread use of data calls, the error-prone, costly and manual aggregation of data used to support policy analysis and decision-making. Efforts to embrace data-informed approaches to manage government programs and operations are stymied by lack of basic agency and governmentwide data hygiene. While bright pockets exist, management gaps, as DOE OIG stated, “create blind spots in the universe of data that, if captured, could be used to more efficiently identify, track and respond to risks…”

The proposed approach starts with current agency operating models, then drives into management process integration to tackle root causes of dysfunction from the bottom up. It recognizes that inefficiency, fraud and other challenges are diffused, deeply embedded and have non-obvious interrelationships within the federal complex…(More)”

Survey of attitudes in a Danish public towards reuse of health data


Paper by Lea Skovgaard et al: “Everyday clinical care generates vast amounts of digital data. A broad range of actors are interested in reusing these data for various purposes. Such reuse of health data could support medical research, healthcare planning, technological innovation, and lead to increased financial revenue. Yet, reuse also raises questions about what data subjects think about the use of health data for various different purposes. Based on a survey with 1071 respondents conducted in 2021 in Denmark, this article explores attitudes to health data reuse. Denmark is renowned for its advanced integration of data infrastructures, facilitating data reuse. This is therefore a relevant setting from which to explore public attitudes to reuse, both as authorities around the globe are currently working to facilitate data reuse opportunities, and in the light of the recent agreement on the establishment in 2024 of the European Health Data Space (EHDS) within the European Union (EU). Our study suggests that there are certain forms of health data reuse—namely transnational data sharing, commercial involvement, and use of data as national economic assets—which risk undermining public support for health data reuse. However, some of the purposes that the EHDS is supposed to facilitate are these three controversial purposes. Failure to address these public concerns could well challenge the long-term legitimacy and sustainability of the data infrastructures currently under construction…(More)”