Inclusive Cyber Policy Making


Toolkit by Global Digital Partnership: “Marginalised perspectives, particularly from women and LGBTQ+ communities, are largely absent in current cyber norm discussions. This is a serious issue, as marginalised groups often face elevated and specific threats in cyberspace.

Our bespoke toolkit provides policymakers and other stakeholders with a range of resources to address this lack of inclusion, including:

  • A how-to guide on designing an inclusive process to develop a cybernorm or implement existing agreed norms
  • An introduction to key terms and concepts relevant to inclusivity and cybernorms
  • Key questions for facilitating inclusive stakeholder mapping processes
  • A mapping of regional and global cybernorm processes…(More)”.

Cross-Border Data Policy Index


Report by the Global Data Alliance: “The ability to responsibly transfer data around the globe supports cross-border economic opportunity, cross-border technological and scientific progress, and cross-border digital transformation and inclusion, among other public policy objectives. To assess where policies have helped create an enabling environment for cross-border data and its associated benefits, the Global Data Alliance has developed the Cross-Border Data Policy Index.

The Cross-Border Data Policy Index offers a quantitative and qualitative assessment of the relative openness or restrictiveness of cross-border data policies across nearly 100 economies. Global economies are classified into four levels. At Level 1 are economies that impose relatively fewer limits on the cross-border access to knowledge, information, digital tools, and economic opportunity for their citizens and legal persons. Economies’ restrictiveness scores increase as they are found to impose greater limits on cross-border data, thereby eroding opportunities for digital transformation while also impeding other policy objectives relating to health, safety, security, and the environment…(More)”.
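The four-level classification described above can be sketched in a few lines. The cut-points below are purely hypothetical, since the Index's scoring methodology is not reproduced here; the sketch only illustrates how restrictiveness scores might map to levels:

```python
# Illustrative only: the Global Data Alliance does not publish its scoring
# code, and the thresholds here are hypothetical stand-ins.

def classify_economy(restrictiveness_score: float) -> int:
    """Map a restrictiveness score (0 = most open, 100 = most restrictive)
    to one of four levels, where Level 1 imposes the fewest limits."""
    thresholds = [25, 50, 75]  # hypothetical cut-points between levels
    for level, upper in enumerate(thresholds, start=1):
        if restrictiveness_score < upper:
            return level
    return 4

economies = {"A": 10.0, "B": 40.0, "C": 60.0, "D": 90.0}
levels = {name: classify_economy(score) for name, score in economies.items()}
print(levels)  # {'A': 1, 'B': 2, 'C': 3, 'D': 4}
```

A higher score lands an economy in a higher (more restrictive) level, matching the report's framing that restrictiveness scores increase with greater limits on cross-border data.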

A Comparative Perspective on AI Regulation


Blog by Itsiq Benizri, Arianna Evers, Shannon Togawa Mercer, Ali A. Jessani: “The question isn’t whether AI will be regulated, but how. Both the European Union and the United Kingdom have stepped up to the AI regulation plate with enthusiasm but have taken different approaches: The EU has put forth a broad and prescriptive proposal in the AI Act, which aims to regulate AI by adopting a risk-based approach that increases the compliance obligations depending on the specific use case. The U.K., in turn, has committed to abstaining from new legislation for the time being, relying instead on existing regulations and regulators with an AI-specific overlay. The United States, meanwhile, has pushed for national AI standards through the executive branch but also has adopted some AI-specific rules at the state level (both through comprehensive privacy legislation and for specific AI-related use cases). Between these three jurisdictions, there are multiple approaches to AI regulation that can help strike the balance between developing AI technology and ensuring that there is a framework in place to account for potential harms to consumers and others. Given the explosive popularity and development of AI in recent months, there is likely to be a strong push by companies, entrepreneurs, and tech leaders in the near future for additional clarity on AI. Regulators will have to answer these calls. Despite not knowing what AI regulation in the United States will look like in one year (let alone five), savvy AI users and developers should examine these early regulatory approaches to try and chart a thoughtful approach to AI…(More)”

Patients are Pooling Data to Make Diabetes Research More Representative


Blog by Tracy Kariuki: “Saira Khan-Gallo knows how overwhelming managing and living healthily with diabetes can be. As a person living with type 1 diabetes for over two decades, she understands how tracking glucose levels, blood pressure, blood cholesterol, insulin intake, and, and, and…could all feel like drowning in an infinite pool of numbers.

But that doesn’t need to be the case. This is why Tidepool, a non-profit tech organization composed of caregivers and other people living with diabetes such as Khan-Gallo, is transforming diabetes data management. Its data visualization platform enables users to make sense of the data and derive insights into their health status….

Through its Big Data Donation Project, Tidepool has been supporting the advancement of diabetes research by sharing anonymized data from people living with diabetes with researchers.

To date, more than 40,000 individuals have chosen to donate data uploaded from their diabetes devices like blood glucose meters, insulin pumps, and continuous glucose monitors, which is then shared by Tidepool with students, academics, researchers, and industry partners, making the database larger than many clinical trials. For instance, Oregon Health and Science University has used datasets collected from Tidepool to build an algorithm that predicts hypoglycemia, which is low blood sugar, with the goal of advancing closed-loop therapy for diabetes management…(More)”.
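To make the kind of data involved concrete, here is a minimal illustration, not the OHSU prediction algorithm: it simply flags continuous glucose monitor (CGM) readings at or below 70 mg/dL, a commonly used clinical threshold for hypoglycemia. Real predictive models instead forecast such events in advance from glucose trends, insulin dosing, and other features.

```python
# Naive retrospective flagging of hypoglycemic readings -- an illustration
# of CGM data handling, NOT the OHSU predictive algorithm described above.

HYPO_THRESHOLD_MG_DL = 70  # common clinical threshold for hypoglycemia

def flag_hypoglycemia(readings):
    """Return (timestamp, value) pairs at or below the hypoglycemia threshold."""
    return [(t, v) for t, v in readings if v <= HYPO_THRESHOLD_MG_DL]

cgm = [("08:00", 110), ("08:05", 92), ("08:10", 68), ("08:15", 75)]
print(flag_hypoglycemia(cgm))  # [('08:10', 68)]
```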

Datafication, Identity, and the Reorganization of the Category Individual


Paper by Juan Ortiz Freuler: “A combination of political, sociocultural, and technological shifts suggests a change in the way we understand human rights. Undercurrents fueling this process are digitization and datafication. Through this process of change, categories that might have been cornerstones of our past and present might very well become outdated. A key category that is under pressure is that of the individual. Since datafication is typically accompanied by technologies and processes aimed at segmenting and grouping, such groupings become increasingly relevant at the expense of the notion of the individual. This concept might become but another collection of varied characteristics, a unit of analysis that is considered at times too broad—and at other times too narrow—to be considered relevant or useful by the systems driving our key economic, social, and political processes.

This Essay provides a literature review and a set of key definitions linking the processes of digitization, datafication, and the concept of the individual to existing conceptions of individual rights. It then presents a framework to dissect and showcase the ways in which current technological developments are putting pressure on our existing conceptions of the individual and individual rights…(More)”.

What prevents us from reusing medical real-world data in research


Paper by Julia Gehrmann, Edit Herczog, Stefan Decker & Oya Beyan: “Recent studies show that Medical Data Science (MDS) carries great potential to improve healthcare. Thereby, considering data from several medical areas and of different types, i.e. using multimodal data, significantly increases the quality of the research results. On the other hand, the inclusion of more features in an MDS analysis means that more medical cases are required to represent the full range of possible feature combinations in a quantity that would be sufficient for a meaningful analysis. Historically, data acquisition in medical research applies prospective data collection, e.g. in clinical studies. However, prospectively collecting the amount of data needed for advanced multimodal data analyses is not feasible for two reasons. Firstly, such a data collection process would cost an enormous amount of money. Secondly, it would take decades to generate enough data for longitudinal analyses, while the results are needed now. A worthwhile alternative is using real-world data (RWD) from clinical systems of e.g. university hospitals. This data is immediately accessible in large quantities, providing full flexibility in the choice of the analyzed research questions. However, when compared to prospectively curated data, medical RWD usually lacks quality due to the specificities of medical RWD outlined in section 2. The reduced quality makes its preparation for analysis more challenging…(More)”.

Setting data free: The politics of open data for food and agriculture


Paper by M. Fairbairn, and Z. Kish: “Open data is increasingly being promoted as a route to achieve food security and agricultural development. This article critically examines the promotion of open agri-food data for development through a document-based case study of the Global Open Data for Agriculture and Nutrition (GODAN) initiative as well as through interviews with open data practitioners and participant observation at open data events. While the concept of openness is striking for its ideological flexibility, we argue that GODAN propagates an anti-political, neoliberal vision for how open data can enhance agricultural development. This approach centers values such as private innovation, increased production, efficiency, and individual empowerment, in contrast to more political and collectivist approaches to openness practiced by some agri-food social movements. We further argue that open agri-food data projects, in general, have a tendency to reproduce elements of “data colonialism,” extracting data with minimal consideration for the collective harms that may result, and embedding their own values within universalizing information infrastructures…(More)”.

A new way to look at data privacy


Article by Adam Zewe: “Imagine that a team of scientists has developed a machine-learning model that can predict whether a patient has cancer from lung scan images. They want to share this model with hospitals around the world so clinicians can start using it in diagnosis.

But there’s a problem. To teach their model how to predict cancer, they showed it millions of real lung scan images, a process called training. Those sensitive data, which are now encoded into the inner workings of the model, could potentially be extracted by a malicious agent. The scientists can prevent this by adding noise, or more generic randomness, to the model that makes it harder for an adversary to guess the original data. However, perturbation reduces a model’s accuracy, so the less noise one can add, the better.

MIT researchers have developed a technique that enables the user to potentially add the smallest amount of noise possible, while still ensuring the sensitive data are protected.

The researchers created a new privacy metric, which they call Probably Approximately Correct (PAC) Privacy, and built a framework based on this metric that can automatically determine the minimal amount of noise that needs to be added. Moreover, this framework does not need knowledge of the inner workings of a model or its training process, which makes it easier to use for different types of models and applications.

In several cases, the researchers show that the amount of noise required to protect sensitive data from adversaries is far less with PAC Privacy than with other approaches. This could help engineers create machine-learning models that provably hide training data, while maintaining accuracy in real-world settings…

A fundamental question in data privacy is: How much sensitive data could an adversary recover from a machine-learning model with noise added to it?

Differential Privacy, one popular privacy definition, says privacy is achieved if an adversary who observes the released model cannot infer whether an arbitrary individual’s data was used in the training process. But provably preventing an adversary from distinguishing data usage often requires large amounts of noise to obscure it. This noise reduces the model’s accuracy.
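The noise/accuracy trade-off behind differential privacy can be seen in its simplest standard form, the Laplace mechanism: a statistic is released with Laplace noise whose scale grows as the privacy parameter epsilon shrinks. This is a textbook illustration, not anything specific to the MIT work:

```python
# The Laplace mechanism: smaller epsilon means stronger privacy but
# larger noise on the released statistic.

import math
import random

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release true_value plus Laplace noise with scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling from the Laplace(0, scale) distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

random.seed(0)
for eps in (0.1, 1.0, 10.0):
    # The released value strays further from 100.0 as epsilon shrinks.
    print(eps, round(laplace_mechanism(100.0, sensitivity=1.0, epsilon=eps), 2))
```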

PAC Privacy looks at the problem a bit differently. It characterizes how hard it would be for an adversary to reconstruct any part of randomly sampled or generated sensitive data after noise has been added, rather than only focusing on the distinguishability problem…(More)”
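The black-box spirit of that approach can be loosely sketched as follows. This is a hypothetical toy, not the authors' actual algorithm: run the mechanism on many random subsamples of the data, measure how much its output varies, and size the added Gaussian noise to that empirical spread rather than to a worst-case bound.

```python
# Hypothetical sketch of treating a mechanism as a black box: calibrate
# noise to the empirically observed output spread. A stable mechanism
# (low spread) then needs little noise -- echoing the article's point.

import random
import statistics

def calibrate_noise(data, mechanism, trials=200, subsample_frac=0.5):
    """Estimate the empirical output spread of `mechanism` over subsamples."""
    outputs = [
        mechanism(random.sample(data, int(len(data) * subsample_frac)))
        for _ in range(trials)
    ]
    return statistics.stdev(outputs)

def release(data, mechanism, noise_scale):
    """Release the mechanism's output with Gaussian noise of the given scale."""
    return mechanism(data) + random.gauss(0.0, noise_scale)

random.seed(0)
data = [random.gauss(50, 5) for _ in range(1000)]
mean = lambda xs: sum(xs) / len(xs)
scale = calibrate_noise(data, mean)  # the mean is stable, so scale is small
print(round(scale, 3), round(release(data, mean, scale), 1))
```

Note that this sketch has no formal guarantee; PAC Privacy's contribution is precisely to turn this kind of empirical measurement into a provable bound on what an adversary can reconstruct.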

How do we know how smart AI systems are?


Article by Melanie Mitchell: “In 1967, Marvin Minsky, a founder of the field of artificial intelligence (AI), made a bold prediction: “Within a generation…the problem of creating ‘artificial intelligence’ will be substantially solved.” Assuming that a generation is about 30 years, Minsky was clearly overoptimistic. But now, nearly two generations later, how close are we to the original goal of human-level (or greater) intelligence in machines?

Some leading AI researchers would answer that we are quite close. Earlier this year, deep-learning pioneer and Turing Award winner Geoffrey Hinton told Technology Review, “I have suddenly switched my views on whether these things are going to be more intelligent than us. I think they’re very close to it now and they will be much more intelligent than us in the future.” His fellow Turing Award winner Yoshua Bengio voiced a similar opinion in a recent blog post: “The recent advances suggest that even the future where we know how to build superintelligent AIs (smarter than humans across the board) is closer than most people expected just a year ago.”

These are extraordinary claims that, as the saying goes, require extraordinary evidence. However, it turns out that assessing the intelligence—or more concretely, the general capabilities—of AI systems is fraught with pitfalls. Anyone who has interacted with ChatGPT or other large language models knows that these systems can appear quite intelligent. They converse with us in fluent natural language, and in many cases seem to reason, to make analogies, and to grasp the motivations behind our questions. Despite their well-known unhumanlike failings, it’s hard to escape the impression that behind all that confident and articulate language there must be genuine understanding…(More)”.

Building Responsive Investments in Gender Equality using Gender Data System Maturity Models


Tools and resources by Data2X and Open Data Watch: “…to help countries check the maturity of their gender data systems and set priorities for gender data investments. The new Building Responsive Investments in Data for Gender Equality (BRIDGE) tool is designed for use by gender data focal points in national statistical offices (NSOs) of low- and middle-income countries and by their partners within the national statistical system (NSS) to communicate gender data priorities to domestic sources of financing and international donors.

The BRIDGE results will help gender data stakeholders understand the current maturity level of their gender data system, diagnose strengths and weaknesses, and identify priority areas for improvement. They will also serve as an input to any roadmap or action plan developed in collaboration with key stakeholders within the NSS.

Below are links to and explanations of our ‘Gender Data System Maturity Model’ briefs (in long and short versions), our BRIDGE assessment and tools methodology, how-to guide, questionnaire, and scoring form that will provide an overall assessment of system maturity and insight into potential action plans to strengthen gender data systems…(More)”.
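A maturity-model roll-up of this kind can be sketched simply. The dimensions, scores, and level names below are hypothetical, since the BRIDGE scoring form itself is not reproduced here; the sketch only shows how per-dimension questionnaire scores might aggregate into an overall maturity level:

```python
# Hypothetical roll-up of questionnaire scores into a maturity level --
# NOT the actual BRIDGE scoring form, whose dimensions and weights differ.

MATURITY_LEVELS = ["Nascent", "Emerging", "Established", "Advanced"]

def overall_maturity(dimension_scores: dict) -> str:
    """Average per-dimension scores (0-100) and map the mean to a level."""
    avg = sum(dimension_scores.values()) / len(dimension_scores)
    index = min(int(avg // 25), len(MATURITY_LEVELS) - 1)
    return MATURITY_LEVELS[index]

scores = {"governance": 55, "coverage": 70, "dissemination": 40, "use": 60}
print(overall_maturity(scores))  # mean 56.25 -> "Established"
```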