Recalibrating assumptions on AI


Essay by Arthur Holland Michel: “Many assumptions about artificial intelligence (AI) have become entrenched despite the lack of evidence to support them. Basing policies on these assumptions is likely to increase the risk of negative impacts for certain demographic groups. These dominant assumptions include claims that AI is ‘intelligent’ and ‘ethical’, that more data means better AI, and that AI development is a ‘race’.

The risks of this approach to AI policymaking are often ignored, while the potential positive impacts of AI tend to be overblown. By illustrating how a more evidence-based, inclusive discourse can improve policy outcomes, this paper makes the case for recalibrating the conversation around AI policymaking…(More)”

Institutional review boards need new skills to review data sharing and management plans


Article by Vasiliki Rahimzadeh, Kimberley Serpico & Luke Gelinas: “New federal rules require researchers to submit plans for how to manage and share their scientific data, but institutional ethics boards may be underprepared to review them.

Data sharing is widely considered a conduit to scientific progress, the benefits of which should return to individuals and communities who invested in that science. This is the central premise underpinning changes recently announced by the US Office of Science and Technology Policy (OSTP) on sharing and managing data generated from federally funded research. Researchers will now be required to make publicly accessible any scholarly publications stemming from their federally funded research, as well as supporting data, according to the OSTP announcement. However, the attendant risks to individuals’ privacy-related interests and the increasing threat of community-based harms remain barriers to fostering a trustworthy ecosystem of biomedical data science.

Institutional review boards (IRBs) are responsible for ensuring protections for all human participants engaged in research, but they rarely include members with the specialized expertise needed to effectively minimize data privacy and security risks. Given the new data sharing policy changes, IRBs must be prepared to meet these review demands. They will need additional resources to conduct high-quality, effective reviews of data management and sharing (DMS) plans. Practical ways forward include expanding IRB membership, proactively consulting with researchers, and creating new research compliance resources. This Comment focuses on data management and sharing oversight by IRBs in the US, but the globalization of data science research underscores the need to enhance similar review capacities in data privacy, management, and security worldwide…(More)”.

How public money is shaping the future of AI


Report by Ethica: “The European Union aims to become the “home of trustworthy Artificial Intelligence” and has committed the largest pool of public funding yet to invest in AI over the next decade. However, the lack of accessible data and comprehensive reporting on the Framework Programmes’ results and impact hinders the EU’s capacity to achieve its objectives and undermines the credibility of its commitments.

This research, commissioned by the European AI & Society Fund, recommends publicly accessible data, effective evaluation of the real-world impacts of funding, and mechanisms for civil society participation in funding before further public funds are invested, if the EU is to achieve its goal of being the epicenter of trustworthy AI.

Among its findings, the research highlights shortcomings in the European Union’s investment in artificial intelligence (AI). The EU invested €10bn in AI via its Framework Programmes between 2014 and 2020, representing 13.4% of all available funding. However, the investment process is top-down, with little input from researchers or feedback from previous grantees or civil society organizations. Furthermore, despite the EU’s aim to fund market-focused innovation, research institutions and higher and secondary education establishments received 73% of the total funding between 2007 and 2020. Germany, France, and the UK were the largest recipients, receiving 37.4% of the total EU budget.

The report also explores the lack of commitment to ethical AI, with only 30.3% of funding calls related to AI mentioning trustworthiness, privacy, or ethics. Additionally, civil society organizations are not involved in the design of funding programs, and there is no evaluation of the economic or societal impact of the funded work. The report calls for political priorities to align with funding outcomes in specific, measurable ways, citing transport as the most funded sector in AI despite not being an EU strategic focus, while programs to promote SME and societal participation in scientific innovation have been dropped…(More)”.

No Ground Truth? No Problem: Improving Administrative Data Linking Using Active Learning and a Little Bit of Guile


Paper by Sarah Tahamont et al: “While linking records across large administrative datasets [“big data”] has the potential to revolutionize empirical social science research, many administrative data files do not have common identifiers and are thus not designed to be linked to others. To address this problem, researchers have developed probabilistic record linkage algorithms, which use statistical patterns in identifying characteristics to perform linking tasks. Naturally, the accuracy of a candidate linking algorithm can be substantially improved when the algorithm has access to “ground-truth” examples: matches that can be validated using institutional knowledge or auxiliary data. Unfortunately, the cost of obtaining these examples is typically high, often requiring a researcher to manually review pairs of records to make an informed judgment about whether they are a match. When a pool of ground-truth information is unavailable, researchers can use “active learning” algorithms for linking, which ask the user to provide ground-truth information for select candidate pairs. In this paper, we investigate the value of providing ground-truth examples via active learning for linking performance. We confirm the popular intuition that data linking can be dramatically improved with the availability of ground-truth examples. But critically, in many real-world applications, only a relatively small number of tactically selected ground-truth examples are needed to obtain most of the achievable gains. With a modest investment in ground truth, researchers can approximate the performance of a supervised learning algorithm that has access to a large database of ground-truth examples using a readily available off-the-shelf tool…(More)”.
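The core loop is easy to sketch: fit a match classifier on the labeled pairs, have a human label only the pairs the model is least sure about, and refit. The snippet below is a minimal, hypothetical illustration rather than the authors' code: synthetic comparison vectors stand in for real identifying characteristics, a hidden true_match array stands in for the reviewer's institutional knowledge, and plain uncertainty sampling with scikit-learn's LogisticRegression decides which pairs to hand-label next.

```python
# Toy active-learning linker; all data and thresholds are fabricated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic comparison vectors for 2,000 candidate record pairs, e.g.
# [name similarity, date-of-birth agreement, address similarity].
n_pairs = 2000
X = rng.uniform(0, 1, size=(n_pairs, 3))
# Hidden truth standing in for institutional knowledge about real matches.
true_match = (X.sum(axis=1) + rng.normal(0, 0.3, n_pairs)) > 1.5

# Seed with a handful of clear non-matches and clear matches.
order = np.argsort(X.sum(axis=1))
labeled = list(order[:5]) + list(order[-5:])
unlabeled = [i for i in range(n_pairs) if i not in labeled]

clf = LogisticRegression()
for _ in range(5):
    clf.fit(X[labeled], true_match[labeled])
    # Uncertainty sampling: hand-review only the pairs whose predicted
    # match probability is closest to 0.5.
    probs = clf.predict_proba(X[unlabeled])[:, 1]
    queried = [unlabeled[i] for i in np.argsort(np.abs(probs - 0.5))[:10]]
    labeled += queried
    unlabeled = [i for i in unlabeled if i not in queried]

print(f"pairs hand-labeled: {len(labeled)}")
print(f"accuracy on all candidate pairs: {clf.score(X, true_match):.3f}")
```

In this toy run the classifier sees labels for only 60 of the 2,000 candidate pairs, which mirrors the paper's point: a small, tactically selected ground-truth sample captures most of the achievable gains.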

The NIST Trustworthy and Responsible Artificial Intelligence Resource Center


About: “The NIST Trustworthy and Responsible Artificial Intelligence Resource Center (AIRC) is a platform to support people and organizations in government, industry, and academia, both in the U.S. and internationally, who are driving technical and scientific innovation in AI. It serves as a one-stop shop for foundational content, technical documents, and AI toolkits, such as a repository hub for standards, measurement methods and metrics, and data sets. It also provides a common forum for all AI actors to engage and collaborate in the development and deployment of trustworthy and responsible AI technologies that benefit all people in a fair and equitable manner.

The NIST AIRC was developed to support and operationalize the NIST AI Risk Management Framework (AI RMF 1.0) and its accompanying playbook. To match the complexity of AI technology, the AIRC will grow over time to provide an engaging, interactive space that enables stakeholders to share AI RMF case studies and profiles, educational materials, and technical guidance related to AI risk management.

The initial release of the AIRC (airc.nist.gov) provides access to foundational content, including the AI RMF 1.0, the playbook, and a trustworthy and responsible AI glossary. In the coming months, enhancements to the AIRC are anticipated to include structured access to relevant technical and policy documents; a standards hub that connects standards promoted around the globe; a metrics hub to assist in the test, evaluation, verification, and validation of AI; and software tools, resources, and guidance that promote trustworthy and responsible AI development and use. Visitors to the AIRC will be able to tailor the content they see to their requirements (organizational role, area of expertise, etc.).

Over time the Trustworthy and Responsible AI Resource Center will enable distribution of stakeholder produced content, case studies, and educational materials…(More)”.

Outsourcing Virtue


Essay by L. M. Sacasas: “To take a different class of example, we might think of the preoccupation with technological fixes to what may turn out to be irreducibly social and political problems. In a prescient essay from 2020 about the pandemic response, the science writer Ed Yong observed that “instead of solving social problems, the U.S. uses techno-fixes to bypass them, plastering the wounds instead of removing the source of injury—and that’s if people even accept the solution on offer.” There’s no need for good judgment, responsible governance, self-sacrifice, or mutual care if there’s an easy technological fix to ostensibly solve the problem. No need, in other words, to be good, so long as the right technological solution can be found.

Likewise, there’s no shortage of examples involving algorithmic tools intended to outsource human judgment. Consider the case of NarxCare, a predictive program developed by Appriss Health, as reported in Wired in 2021. NarxCare is “an ‘analytics tool and care management platform’ that purports to instantly and automatically identify a patient’s risk of misusing opioids.” The article details the case of a 32-year-old woman suffering from endometriosis whose pain medications were cut off, without explanation or recourse, because she triggered a high-risk score from the proprietary algorithm. The details of the story are both fascinating and disturbing, but here’s the pertinent part for my purposes:

Appriss is adamant that a NarxCare score is not meant to supplant a doctor’s diagnosis. But physicians ignore these numbers at their peril. Nearly every state now uses Appriss software to manage its prescription drug monitoring programs, and most legally require physicians and pharmacists to consult them when prescribing controlled substances, on penalty of losing their license.

This is an obviously complex and sensitive issue, but it is hard to escape the conclusion that the use of these algorithmic systems exacerbates the same demoralizing opaqueness, evasion of responsibility and cover-your-ass dynamics that have long characterized analog bureaucracies. It becomes difficult to assume responsibility for a particular decision made in a particular case. Or, to put it otherwise, it becomes too easy to claim “the algorithm made me do it,” and it becomes so, in part, because the existing bureaucratic dynamics all but require it…(More)”.

Valuing the U.S. Data Economy Using Machine Learning and Online Job Postings


Paper by J Bayoán Santiago Calderón and Dylan Rassier: “With the recent proliferation of data collection and uses in the digital economy, the understanding and statistical treatment of data stocks and flows is of interest among compilers and users of national economic accounts. In this paper, we measure the value of own-account data stocks and flows for the U.S. business sector by summing the production costs of data-related activities implicit in occupations. Our method augments the traditional sum-of-costs methodology for measuring other own-account intellectual property products in national economic accounts by proxying occupation-level time-use factors using a machine learning model and the text of online job advertisements (Blackburn 2021). In our experimental estimates, we find that annual current-dollar investment in own-account data assets for the U.S. business sector grew from $84 billion in 2002 to $186 billion in 2021, with an average annual growth rate of 4.2 percent. Cumulative current-dollar investment for the period 2002–2021 was $2.6 trillion. In addition to the annual current-dollar investment, we present historical-cost net stocks, real growth rates, and effects on value-added by the industrial sector…(More)”.
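The sum-of-costs arithmetic behind these estimates is simple to illustrate. The sketch below is our own toy, not the paper's model: a crude keyword share over job-posting text stands in for the machine learning model the authors use (following Blackburn 2021) to proxy the share of work time each occupation spends on data tasks, and every occupation, wage, and employment count is invented.

```python
# Toy sum-of-costs valuation; all occupations and figures are fabricated.

# (title, employment, mean annual wage, sample job-posting text used to
# proxy the share of time spent on data-related tasks)
occupations = [
    ("database administrator", 100_000, 95_000,
     "maintain databases build data pipelines manage data storage"),
    ("marketing analyst", 250_000, 70_000,
     "analyze campaign data prepare reports coordinate with sales"),
    ("retail clerk", 3_000_000, 30_000,
     "assist customers stock shelves operate register"),
]

DATA_TERMS = {"data", "database", "databases", "pipelines", "analyze"}

def data_time_share(posting_text: str) -> float:
    """Keyword fraction as a crude stand-in for the paper's ML-estimated
    occupation-level time-use factor."""
    words = posting_text.split()
    return sum(w in DATA_TERMS for w in words) / len(words)

# Own-account data investment = sum of data-attributable labor costs,
# i.e. employment x wage x estimated share of time on data work.
investment = sum(emp * wage * data_time_share(text)
                 for _, emp, wage, text in occupations)
print(f"estimated own-account data investment: ${investment / 1e9:.1f} billion")
```

A fuller estimate would also account for non-labor inputs, but the core identity, valuing data investment as the data-attributable share of production costs, is the same.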

Data Cooperatives as Catalysts for Collaboration, Data Sharing, and the (Trans)Formation of the Digital Commons


Paper by Michael Max Bühler et al: “Network effects, economies of scale, and lock-in effects increasingly lead to a concentration of digital resources and capabilities, hindering the free and equitable development of digital entrepreneurship (SDG9), new skills, and jobs (SDG8), especially in small communities (SDG11) and their small and medium-sized enterprises (“SMEs”). To ensure the affordability and accessibility of technologies, promote digital entrepreneurship and community well-being (SDG3), and protect digital rights, we propose data cooperatives [1,2] as a vehicle for secure, trusted, and sovereign data exchange [3,4]. In post-pandemic times, community/SME-led cooperatives can play a vital role by ensuring that the supply chains supporting digital commons are uninterrupted, resilient, and decentralized [5]. Digital commons and data sovereignty provide communities with affordable and easy access to information and the ability to collectively negotiate data-related decisions. Moreover, cooperative commons (a) provide access to the infrastructure that underpins the modern economy, (b) preserve property rights, and (c) ensure that privatization and monopolization do not further erode self-determination, especially in a world increasingly mediated by AI. Thus, governance plays a significant role in accelerating communities’/SMEs’ digital transformation and addressing their challenges. Cooperatives thrive on digital governance and standards, such as open, trusted Application Programming Interfaces (APIs), that increase the efficiency, technological capabilities, and capacities of participants and, most importantly, integrate, enable, and accelerate the digital transformation of SMEs in the overall process. This policy paper presents and discusses several transformative use cases for cooperative data governance. The use cases demonstrate how platform/data cooperatives and their novel value creation can be leveraged to take digital commons and value chains to a new level of collaboration while addressing the most pressing community issues. The proposed framework for a digital, federated, and sovereign reference architecture will create a blueprint for sustainable development in both the Global South and North…(More)”

Knowledge monopolies and the innovation divide: A governance perspective


Paper by Hani Safadi and Richard Thomas Watson: “The rise of digital platforms creates knowledge monopolies that threaten innovation. Their power derives from imposing data obligations and persistent coupling as conditions of platform participation, and from usurping the rights to data created by other participants to facilitate information asymmetries. Knowledge monopolies can use machine learning to develop competitive insights unavailable to any other platform participant. This information asymmetry stifles innovation, stokes the growth of the monopoly, and reinforces its ascendency. National or regional governance structures, such as laws and regulatory authorities, constrain economic monopolies deemed not in the public interest. We argue the need for legislation and an associated regulatory mechanism to curtail coercive data obligations, eliminate data rights exploitation, and prevent mergers and acquisitions that could create or extend knowledge monopolies…(More)”.

National Experimental Wellbeing Statistics (NEWS)


US Census: “The National Experimental Wellbeing Statistics (NEWS) project is a new experimental effort to develop improved estimates of income, poverty, and other measures of economic wellbeing. Using all available survey, administrative, and commercial data, we strive to provide the best possible estimates for the nation and its economy.

In this first release, we estimate improved income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research.  We address biases from (1) unit nonresponse through improved weights, (2) missing income information in both survey and administrative data through improved imputation, and (3) misreporting by combining or replacing survey responses with administrative information.  Reducing survey error using these techniques substantially affects key measures of well-being.  With this initial set of experimental estimates, we estimate median household income is 6.3 percent higher than in survey estimates, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher, and poverty is 3.3 percentage points lower than in survey estimates. We do not find a significant impact on median household income for householders under 65 or on child poverty. 

We will continue research on (1) estimating income at smaller geographies through increased use of American Community Survey data, (2) addressing other potential sources of bias, (3) releasing additional years of statistics, particularly more timely estimates, and (4) extending the income concepts measured. As we advance the methods in future releases, we expect to revise these estimates…(More)”.
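Two of these corrections, nonresponse weighting and misreporting replacement, are easy to see in a fabricated example (improved imputation, the second fix, is omitted here). The sketch below is ours, not the Census Bureau's methodology, and every number in it is invented: linked administrative records stand in for truth, older householders respond less often, and respondents underreport income.

```python
# Toy illustration of NEWS-style corrections; all numbers are fabricated.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Fabricated population: older householders have somewhat lower incomes.
over_65 = rng.random(n) < 0.2
admin_income = rng.lognormal(mean=np.where(over_65, 10.7, 11.0), sigma=0.7)

# Unit nonresponse: older householders respond less often, biasing the
# raw respondent pool.
p_respond = np.where(over_65, 0.5, 0.8)
responded = rng.random(n) < p_respond

# Misreporting: respondents underreport income in the survey, older
# respondents most of all.
report_rate = np.where(over_65[responded],
                       rng.uniform(0.6, 0.9, responded.sum()),
                       rng.uniform(0.85, 1.0, responded.sum()))
survey_income = admin_income[responded] * report_rate

naive_median = np.median(survey_income)

# Corrections: replace survey reports with linked administrative income
# and weight respondents by the inverse of their (here, known) response
# probability, then take the weighted median.
weights = 1.0 / p_respond[responded]
income = admin_income[responded]
order = np.argsort(income)
cum_w = np.cumsum(weights[order])
corrected_median = income[order][np.searchsorted(cum_w, cum_w[-1] / 2)]

print(f"naive survey median:   {naive_median:,.0f}")
print(f"corrected median:      {corrected_median:,.0f}")
```

As in the release's findings, the corrections matter most for the subpopulation (here, householders 65 and over) in which nonresponse and misreporting are concentrated.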