Paper by Poli Nemkova et al.: “The present-day Russia-Ukraine military conflict has exposed the pivotal role of social media in enabling the transparent and unbridled sharing of information directly from the frontlines. In conflict zones where freedom of expression is constrained and information warfare is pervasive, social media has emerged as an indispensable lifeline. Anonymous social media platforms, as publicly available sources for disseminating war-related information, have the potential to serve as effective instruments for monitoring and documenting Human Rights Violations (HRV). Our research focuses on the analysis of data from Telegram, the leading social media platform for reading independent news in post-Soviet regions. We gathered a dataset of posts sampled from 95 public Telegram channels that cover politics and war news, which we have utilized to identify potential occurrences of HRV. Employing an mBERT-based text classifier, we have conducted an analysis to detect any mentions of HRV in the Telegram data. Our final approach yielded an F2 score of 0.71 for HRV detection, representing an improvement of 0.38 over the multilingual BERT base model. We release two datasets of Telegram posts: (1) a large corpus of over 2.3 million posts and (2) a sentence-level dataset annotated for HRVs. Both datasets concern the context of the Russia-Ukraine war. We posit that our findings hold significant implications for NGOs, governments, and researchers by providing a means to detect and document possible human rights violations…(More)” See also Data for Peace and Humanitarian Response? The Case of the Ukraine-Russia War
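As a side note on the metric reported above: the F2 score weights recall twice as heavily as precision, a reasonable choice when missing a genuine violation is costlier than a false alarm. The sketch below (not the authors' released code; the checkpoint name, sentences, and labels are placeholders) shows how a fine-tuned mBERT sentence classifier might be scored:

```python
# Minimal sketch, assuming a fine-tuned binary mBERT classifier; not the
# authors' pipeline. Sentences and gold labels are illustrative placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import fbeta_score

MODEL = "bert-base-multilingual-cased"  # stand-in for the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

sentences = ["Example Telegram sentence ...", "Another sentence ..."]
gold = [1, 0]  # 1 = sentence mentions a potential HRV (illustrative)

with torch.no_grad():
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    preds = model(**enc).logits.argmax(dim=-1).tolist()

# F2 = (1 + 2^2) * P * R / (2^2 * P + R): recall counts twice as much as precision
print("F2:", fbeta_score(gold, preds, beta=2))
```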
Opportunities and Challenges in Reusing Public Genomics Data
Introduction to Special Issue by Mahmoud Ahmed and Deok Ryong Kim: “Genomics data is accumulating in public repositories at an ever-increasing rate. Large consortia and individual labs continue to probe animal and plant tissue and cell cultures, generating vast amounts of data using established and novel technologies. The human genome project kickstarted the era of systems biology (1, 2). Ambitious projects followed to characterize non-coding regions, variations across species, and between populations (3, 4, 5). The reduction in cost allowed individual labs to generate numerous smaller high-throughput datasets (6, 7, 8, 9). As a result, the scientific community should consider strategies to overcome the challenges and maximize the opportunities to use these resources for research and the public good. In this collection, we elicit opinions and perspectives from researchers in the field on the opportunities and challenges of reusing public genomics data. The articles in this research topic converge on the need for data sharing while acknowledging the challenges that come with it. Two articles define and highlight the distinction between data and metadata; the characteristics of each should be considered when designing optimal sharing strategies. One article focuses on the specific issues surrounding the sharing of genomic interval data, and another on balancing the protection of pediatric participants' rights with the benefits of sharing.
The definition of what counts as data is itself a moving target. As technology advances, data can be produced in more ways and from novel sources. Events of recent years have highlighted this fact. “The pandemic has underscored the urgent need to recognize health data as a global public good with mechanisms to facilitate rapid data sharing and governance,” write Schwalbe and colleagues (2020). The challenges facing these mechanisms could be technical, economic, legal, or political. Defining what data is and its type, therefore, is necessary to overcome these barriers because “the mechanisms to facilitate data sharing are often specific to data types.” Unlike genomics data, which has established platforms, sharing clinical data “remains in a nascent phase.” The article by Patrinos and colleagues (2022) considers the strong ethical imperative for protecting pediatric data while acknowledging the need not to overprotect it. The authors discuss a model of consent for pediatric research that can balance the need to protect participants with the need to generate health benefits.
Xue et al. (2023) focus on reusing genomic interval data. Identifying and retrieving the relevant data can be difficult, given the state of the repositories and the size of these datasets. Similarly, integrating interval data with reference genomes can be hard. The authors call for standardized formats for the data and the metadata to facilitate reuse.
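Genomic interval data is commonly exchanged in BED-like formats (tab-separated, 0-based, half-open intervals of chromosome, start, end). Assuming that convention, a minimal sketch of the parsing and overlap query that standardization makes straightforward (file name and intervals are illustrative):

```python
# Minimal sketch under the common BED convention; not tied to any specific
# repository or tool discussed in the article.
def read_bed(path):
    """Yield (chrom, start, end) tuples from a BED-like file."""
    with open(path) as fh:
        for line in fh:
            if line.startswith(("#", "track", "browser")) or not line.strip():
                continue  # skip headers and blank lines
            chrom, start, end = line.rstrip("\n").split("\t")[:3]
            yield chrom, int(start), int(end)

def overlaps(a, b):
    """True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

peaks = list(read_bed("peaks.bed"))  # hypothetical input file
gene = ("chr1", 10_000, 12_500)      # illustrative query region
hits = [p for p in peaks if overlaps(p, gene)]
```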
Sheffield and colleagues (2023) highlight the distinction between data and metadata. Metadata describes the characteristics of the sample, experiment, and analysis. This information differs from the primary data in size, source, and ways of use. Therefore, an optimal sharing strategy should account for these specific attributes of metadata. Challenges specific to sharing metadata include the need for standardized terms and formats that make it portable and easier to find.
In Ahmed et al. (2023), we go beyond the reuse issue to highlight two other aspects that might increase the utility of available public data: curation and integration…(More)”.
There’s a model for governing AI. Here it is.
Article by Jacinda Ardern: “…On March 15, 2019, a terrorist took the lives of 51 members of New Zealand’s Muslim community in Christchurch. The attacker livestreamed his actions for 17 minutes, and the images found their way onto social media feeds all around the planet. Facebook alone blocked or removed 1.5 million copies of the video in the first 24 hours; in that timeframe, YouTube measured one upload per second.
Afterward, New Zealand was faced with a choice: accept that such exploitation of technology was inevitable or resolve to stop it. We chose to take a stand.
We had to move quickly. The world was watching our response and that of social media platforms. Would we regulate in haste? Would the platforms recognize their responsibility to prevent this from happening again?
New Zealand wasn’t the only nation grappling with the connection between violent extremism and technology. We wanted to create a coalition and knew that France had started to work in this space — so I reached out, leader to leader. In my first conversation with President Emmanuel Macron, he agreed there was work to do and said he was keen to join us in crafting a call to action.
We asked industry, civil society and other governments to join us at the table to agree on a set of actions we could all commit to. We could not use existing structures and bureaucracies because they weren’t equipped to deal with this problem.
Within two months of the attack, we launched the Christchurch Call to Action, and today it has more than 120 members, including governments, online service providers and civil society organizations — united by our shared objective to eliminate terrorist and other violent extremist content online and uphold the principle of a free, open and secure internet.
The Christchurch Call is a large-scale collaboration, vastly different from most top-down approaches. Leaders meet annually to confirm priorities and identify areas of focus, allowing the project to act dynamically. And the Call Secretariat — made up of officials from France and New Zealand — convenes working groups and undertakes diplomatic efforts throughout the year. All members are invited to bring their expertise to solve urgent online problems.
While this multi-stakeholder approach isn’t always easy, it has created change. We have bolstered the power of governments and communities to respond to attacks like the one New Zealand experienced. We have created new crisis-response protocols — which enabled companies to stop the 2022 Buffalo attack livestream within two minutes and quickly remove footage from many platforms. Companies and countries have enacted new trust and safety measures to prevent livestreaming of terrorist and other violent extremist content. And we have strengthened the industry-founded Global Internet Forum to Counter Terrorism with dedicated funding, staff and a multi-stakeholder mission.
We’re also taking on some of the more intransigent problems. The Christchurch Call Initiative on Algorithmic Outcomes, a partnership with companies and researchers, was intended to provide better access to the kind of data needed to design online safety measures to prevent radicalization to violence. In practice, it has much wider ramifications, enabling us to reveal more about the ways in which AI and humans interact.
From its start, the Christchurch Call anticipated the challenges of AI and carved out space to address emerging technologies that threaten to foment violent extremism online, issues it is actively tackling today.
Perhaps the most useful thing the Christchurch Call can add to the AI governance debate is the model itself. It is possible to bring companies, government officials, academics and civil society together not only to build consensus but also to make progress. It’s possible to create tools that address the here and now and also position ourselves to face an unknown future. We need this to deal with AI…(More)”.
OECD Recommendation on Digital Identity
OECD Recommendation: “…Recommends that Adherents prioritise inclusion and minimise barriers to access to and the use of digital identity. To this effect, Adherents should:
1. Promote accessibility, affordability, usability, and equity across the digital identity lifecycle in order to increase access to a secure and trusted digital identity solution, including by vulnerable groups and minorities in accordance with their needs;
2. Take steps to ensure that access to essential services, including those in the public and private sectors, is not restricted or denied to natural persons who do not want to, or cannot, access or use a digital identity solution;
3. Facilitate inclusive and collaborative stakeholder engagement throughout the design, development, and implementation of digital identity systems, to promote transparency, accountability, and alignment with user needs and expectations;
4. Raise awareness of the benefits and secure uses of digital identity and the way in which the digital identity system protects users while acknowledging risks and demonstrating the mitigation of potential harms;
5. Take steps to ensure that support is provided through appropriate channel(s), for those who face challenges in accessing and using digital identity solutions, and identify opportunities to build the skills and capabilities of users;
6. Monitor, evaluate and publicly report on the effectiveness of the digital identity system, with a focus on inclusiveness and minimising the barriers to the access and use of digital identity…
Recommends that Adherents take a strategic approach to digital identity and define roles and responsibilities across the digital identity ecosystem…(More)”.
Digital Freedoms in French-Speaking African Countries
Report by AFD: “As digital penetration increases in countries across the African continent, citizens face growing risks and challenges. Indeed, beyond facilitating access to knowledge (such as the online encyclopedia Wikipedia), to leisure tools (such as YouTube), and to sociability (such as social networks), digital technology offers an unprecedented space for democratic expression.
However, these online civic spaces are under threat. Several governments have enacted vaguely defined laws that allow for arbitrary arrests.
Several countries have implemented repressive practices restricting freedom of expression and access to information. This is what is known as “digital authoritarianism”, which is on the rise in many countries.
This report takes stock of digital freedoms in 26 French-speaking African countries, and proposes concrete actions to improve citizen participation and democracy…(More)”
From the Economic Graph to Economic Insights: Building the Infrastructure for Delivering Labor Market Insights from LinkedIn Data
Blog by Patrick Driscoll and Akash Kaura: “LinkedIn’s vision is to create economic opportunity for every member of the global workforce. Since its inception in 2015, the Economic Graph Research and Insights (EGRI) team has worked to make this vision a reality by generating labor market insights such as:
- Real-time economic and workforce intelligence & insights. This takes the form of the monthly LinkedIn Workforce Report, newsletters on LinkedIn.com, working papers, and flagship reports about timely issues such as the green transition to address climate change.
- Sharing economic data with the government and multilateral partners. The Data for Impact program (DFI), whose partners include the World Bank, Inter-American Development Bank (IDB), International Monetary Fund (IMF), Organization for Economic Co-operation and Development (OECD), Destatis (Germany’s statistical authority), and the United Nations Development Program (UNDP), provides researchers the opportunity to leverage LinkedIn data to inform cutting-edge research, program design, and investment strategy.
- Sharing economic data and commentary with media such as CNBC, The Wall Street Journal, NBC News, Financial Times, etc. so their audiences can stay up to date on timely issues such as remote work, the gender gap, and climate change.
In this post, we’ll describe how the EGRI Data Foundations team (Team Asimov) leverages LinkedIn’s cutting-edge data infrastructure tools such as Unified Metrics Platform, Pinot, and Datahub to ensure we can deliver data and insights robustly, securely, and at scale to a myriad of partners. We will illustrate this through a case study of how we built the pipeline for our most well-known and oft-cited flagship metric: the LinkedIn Hiring Rate…(More)”.
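As a rough illustration of the metric itself (not LinkedIn's production pipeline, which the post attributes to the Unified Metrics Platform, Pinot, and DataHub), a hiring-rate figure has the shape of "members who started a new position in a month, normalized by total membership." Column names and the exact definition below are assumptions for illustration:

```python
# Back-of-the-envelope sketch of a hiring-rate metric; the real definition,
# schema, and infrastructure are LinkedIn-internal and differ from this toy.
import pandas as pd

positions = pd.DataFrame({
    "member_id": [1, 2, 3, 4],
    "start_month": pd.to_datetime(["2023-05-01", "2023-05-01",
                                   "2023-04-01", "2023-05-01"]),
})
total_members = 1_000  # illustrative denominator

month = pd.Timestamp("2023-05-01")
# Members who began a new position in the target month, deduplicated
new_hires = positions.loc[positions["start_month"] == month, "member_id"].nunique()
hiring_rate = new_hires / total_members
print(f"{month:%Y-%m} hiring rate: {hiring_rate:.2%}")
```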
AI and Global Governance: Modalities, Rationales, Tensions
Paper by Michael Veale, Kira Matus and Robert Gorwa: “Artificial intelligence (AI) is a salient but polarizing issue of recent times. Actors around the world are engaged in building a governance regime around it. What exactly the “it” is that is being governed, how, by who, and why—these are all less clear. In this review, we attempt to shine some light on those questions, considering literature on AI, the governance of computing, and regulation and governance more broadly. We take critical stock of the different modalities of the global governance of AI that have been emerging, such as ethical councils, industry governance, contracts and licensing, standards, international agreements, and domestic legislation with extraterritorial impact. Considering these, we examine selected rationales and tensions that underpin them, drawing attention to the interests and ideas driving these different modalities. As these regimes become clearer and more stable, we urge those engaging with or studying the global governance of AI to constantly ask the important question of all global governance regimes: Who benefits?…(More)”.
Artificial Intelligence in the COVID-19 Response
Report by Sean Mann, Carl Berdahl, Lawrence Baker, and Federico Girosi: “We conducted a scoping review to identify AI applications used in the clinical and public health response to COVID-19. Interviews with stakeholders early in the research process helped inform our research questions and guide our study design. We conducted a systematic search, screening, and full text review of both academic and gray literature…
- AI is still an emerging technology in health care, with growing but modest rates of adoption in real-world clinical and public health practice. The COVID-19 pandemic showcased the wide range of clinical and public health functions performed by AI as well as the limited evidence available on most AI products that have entered use.
- We identified 66 AI applications (full list in Appendix A) used to perform a wide range of diagnostic, prognostic, and treatment functions in the clinical response to COVID-19. This included applications used to analyze lung images, evaluate user-reported symptoms, monitor vital signs, predict infections, and aid in breathing tube placement. Some applications were used by health systems to help allocate scarce resources to patients.
- Many clinical applications were deployed early in the pandemic, and most were used in the United States, other high-income countries, or China. A few applications were used to care for hundreds of thousands or even millions of patients, although most were used to an unknown or limited extent.
- We identified 54 AI-based public health applications used in the pandemic response. These included AI-enabled cameras used to monitor health-related behavior and health messaging chatbots used to answer questions about COVID-19. Other applications were used to curate public health information, produce epidemiologic forecasts, or help prioritize communities for vaccine allocation and outreach efforts.
- We found studies supporting the use of 39 clinical applications and 8 public health applications, although few of these were independent evaluations, and we found no clinical trials evaluating any application’s impact on patient health. We found little evidence available on entire classes of applications, including some used to inform care decisions such as patient deterioration monitors.
- Further research is needed, particularly independent evaluations on application performance and health impacts in real-world care settings. New guidance may be needed to overcome the unique challenges to evaluating AI application impacts on patient- and population-level health outcomes….(More)” – See also: The #Data4Covid19 Review
From LogFrames to Logarithms – A Travel Log
Article by Karl Steinacker and Michael Kubach: “…Today, authorities all over the world are experimenting with predictive algorithms. That sounds technical and innocent, but as we dive deeper into the issue, we realise that the real meaning is rather specific: fraud detection systems in social welfare payment systems. In the meantime, the hitherto banned terminology has made its comeback: welfare and social safety nets have, in recent years, come back en vogue. But in the centuries-old Western tradition, welfare recipients must be monitored and, if necessary, sanctioned, while those who work and contribute must be assured that there is no waste. So it comes as no surprise that even today’s algorithms focus on the prime suspect: the individual fraudster, the undeserving poor.
Fraud detection systems promise that the taxpayer will no longer fall victim to fraud and that efficiency gains can be redirected to serve more people. Yet the true extent of welfare fraud is regularly exaggerated, while the costs of such systems are routinely underestimated; a comparison of estimated losses against the investment rarely takes place. It is the principle of detecting and punishing fraudsters that prevails. Other issues, such as how to distinguish honest mistakes from deliberate fraud, don’t rank high either. And the more time caseworkers spend entering and analysing data in front of a computer screen, the less time and inclination they have to talk to real people and to understand the context of their lives at the margins of society.
Thus, it can be said that hundreds of thousands of people are routinely being scored. Take Denmark: here, a system called Udbetaling Danmark was created in 2012 to streamline the payment of welfare benefits. Its fraud control algorithms can access the personal data of millions of citizens, not all of whom receive welfare payments. In contrast to the hundreds of thousands affected by this data mining, the number of cases referred to the police for further investigation is minute.
In the city of Rotterdam in the Netherlands every year, data of 30,000 welfare recipients is investigated in order to flag suspected welfare cheats. However, an analysis of its scoring system based on machine learning and algorithms showed systemic discrimination with regard to ethnicity, age, gender, and parenthood. It revealed evidence of other fundamental flaws making the system both inaccurate and unfair. What might appear to a caseworker as a vulnerability is treated by the machine as grounds for suspicion. Despite the scale of data used to calculate risk scores, the output of the system is not better than random guesses. However, the consequences of being flagged by the “suspicion machine” can be drastic, with fraud controllers empowered to turn the lives of suspects inside out.
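To make "not better than random guesses" concrete: an illustrative sketch (not Rotterdam's actual model, and with synthetic data by construction) in which a risk scorer trained on labels unrelated to its features yields a ROC-AUC near 0.5, yet its top-ranked scores would still single people out as suspects:

```python
# Illustrative only: features and "fraud" labels are independent by design,
# so no model can do better than chance -- yet flagging still happens.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5_000, 12))        # synthetic demographic/case features
y = rng.binomial(1, 0.05, size=5_000)   # "fraud" labels unrelated to X

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

print("ROC-AUC:", roc_auc_score(y_te, scores))  # ~0.5: indistinguishable from chance
flagged = np.argsort(scores)[-100:]  # the 100 top-scored people are still "suspects"
```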
As reported by the World Bank, the recent Covid-19 pandemic provided a great push to implement digital social welfare systems in the global South. In fact, for the World Bank, the so-called Digital Public Infrastructure (DPI) enabling “Digitizing Government to Person Payments (G2Px)” is as fundamental for social and economic development today as physical infrastructure was for previous generations. Hence, the World Bank finances, around the globe, systems modelled after the Indian Aadhaar system, under which more than a billion persons have been registered biometrically. Aadhaar has become, for all intents and purposes, a precondition to receiving subsidised food and other assistance for 800 million Indian citizens.
Important international aid organisations are not behaving differently from states. The World Food Programme alone holds data on more than 40 million people in its SCOPE database. Unfortunately, WFP, like other UN organisations, is not subject to data protection laws or the jurisdiction of courts. This makes the communities it works with particularly vulnerable.
In most places, the social will become the metric, with algorithms determining the operational conduit for delivering, controlling, and withholding assistance, especially welfare payments. In other places, their power may go even further, as part of trust systems, creditworthiness, and social credit. These social credit systems for individuals are highly controversial, as they require mass surveillance to track behaviour beyond financial solvency. The social credit score of a citizen might suffer not only from incomplete or inaccurate data, but also from assessments of political loyalty and conformist social behaviour…(More)”.
An Action Plan Towards a “New Deal on Data” in Africa
Blog by Charlie Martial Ngounou, Hannah Chafetz, Sampriti Saxena, Adrienne Schmoeker, Stefaan G. Verhulst, & Andrew J. Zahuranec: “To help accelerate responsible data use across the African data ecosystem, AfroLeadership, with the support of The GovLab, hosted two Open Data Action Labs in March and April 2023 focused on advancing open data policy across Africa. The Labs brought together domain experts from across the African data ecosystem to build upon the African Union’s Data Policy Framework and develop an instrument to help realize Agenda 2063.
The Labs included discussions about the current state of open data policy and what could be involved in a “New Deal on Data” across the African continent. Specifically, the Labs explored how open data across African countries and communities could become more:
- Purpose-led: how to strengthen the value proposition of and incentives for open data and data re-use, and become purpose-led?
- Practice-led: how to accelerate the implementation of open data and data re-use policies, moving from policy to practice?
- People-led: how to trigger engagement, collaboration and coordination with communities and stakeholders toward advancing data rights, community interests, and diversity of needs and capacities?

Following the Labs, the organizing team conducted a brainstorming session to synthesize the insights gathered and develop an action plan towards a “New Deal on Data” for Africa. Below we provide a summary of our action plan, which proposes four vehicles for making progress towards becoming purpose-, practice-, and people-led:
- A “New Deal” Observatory: An online resource that takes stock of the current state of open data policies, barriers to implementation, and use cases from the local to the continental level
- A Community-Led Platform: A solutions platform that helps advance data stewardship across African countries and communities
- “New Deal” Investment: Supporting the development of locally sourced solutions and nuanced technologies tailored to the African context
- Responsible Data Stewardship Framework: A framework that open data stewards can use to support their existing efforts when looking to encourage or implement grassroots policies…(More)”