Patients are Pooling Data to Make Diabetes Research More Representative


Blog by Tracy Kariuki: “Saira Khan-Gallo knows how overwhelming managing and living healthily with diabetes can be. As a person living with type 1 diabetes for over two decades, she understands how tracking glucose levels, blood pressure, blood cholesterol, insulin intake, and, and, and…could all feel like drowning in an infinite pool of numbers.

But that doesn’t need to be the case. This is why Tidepool, a non-profit tech organization composed of caregivers and other people living with diabetes such as Gallo, is transforming diabetes data management. Its data visualization platform enables users to make sense of the data and derive insights into their health status….

Through its Big Data Donation Project, Tidepool has been supporting the advancement of diabetes research by sharing anonymized data from people living with diabetes with researchers.

To date, more than 40,000 individuals have chosen to donate data uploaded from their diabetes devices, like blood glucose meters, insulin pumps, and continuous glucose monitors, which Tidepool then shares with students, academics, researchers, and industry partners, making the database larger than many clinical trials. For instance, Oregon Health and Science University has used datasets collected from Tidepool to build an algorithm that predicts hypoglycemia (low blood sugar), with the goal of advancing closed-loop therapy for diabetes management…(More)”.

Datafication, Identity, and the Reorganization of the Category Individual


Paper by Juan Ortiz Freuler: “A combination of political, sociocultural, and technological shifts suggests a change in the way we understand human rights. Undercurrents fueling this process are digitization and datafication. Through this process of change, categories that might have been cornerstones of our past and present might very well become outdated. A key category that is under pressure is that of the individual. Since datafication is typically accompanied by technologies and processes aimed at segmenting and grouping, such groupings become increasingly relevant at the expense of the notion of the individual. This concept might become but another collection of varied characteristics, a unit of analysis that is considered at times too broad—and at other times too narrow—to be considered relevant or useful by the systems driving our key economic, social, and political processes.

This Essay provides a literature review and a set of key definitions linking the processes of digitization, datafication, and the concept of the individual to existing conceptions of individual rights. It then presents a framework to dissect and showcase the ways in which current technological developments are putting pressure on our existing conceptions of the individual and individual rights…(More)”.

What prevents us from reusing medical real-world data in research


Paper by Julia Gehrmann, Edit Herczog, Stefan Decker & Oya Beyan: “Recent studies show that Medical Data Science (MDS) carries great potential to improve healthcare. Considering data from several medical areas and of different types, i.e. using multimodal data, significantly increases the quality of the research results. On the other hand, including more features in an MDS analysis means that more medical cases are required to represent the full range of possible feature combinations in a quantity sufficient for a meaningful analysis. Historically, data acquisition in medical research has relied on prospective data collection, e.g. in clinical studies. However, prospectively collecting the amount of data needed for advanced multimodal analyses is not feasible, for two reasons. Firstly, such a data collection process would cost an enormous amount of money. Secondly, it would take decades to generate enough data for longitudinal analyses, while the results are needed now. A worthwhile alternative is using real-world data (RWD) from the clinical systems of, e.g., university hospitals. This data is immediately accessible in large quantities, providing full flexibility in the choice of the analyzed research questions. However, compared to prospectively curated data, medical RWD usually lacks quality due to the specificities of medical RWD outlined in section 2. The reduced quality makes its preparation for analysis more challenging…(More)”.

Setting data free: The politics of open data for food and agriculture


Paper by M. Fairbairn, and Z. Kish: “Open data is increasingly being promoted as a route to achieve food security and agricultural development. This article critically examines the promotion of open agri-food data for development through a document-based case study of the Global Open Data for Agriculture and Nutrition (GODAN) initiative as well as through interviews with open data practitioners and participant observation at open data events. While the concept of openness is striking for its ideological flexibility, we argue that GODAN propagates an anti-political, neoliberal vision for how open data can enhance agricultural development. This approach centers values such as private innovation, increased production, efficiency, and individual empowerment, in contrast to more political and collectivist approaches to openness practiced by some agri-food social movements. We further argue that open agri-food data projects, in general, have a tendency to reproduce elements of “data colonialism,” extracting data with minimal consideration for the collective harms that may result, and embedding their own values within universalizing information infrastructures…(More)”.

A new way to look at data privacy


Article by Adam Zewe: “Imagine that a team of scientists has developed a machine-learning model that can predict whether a patient has cancer from lung scan images. They want to share this model with hospitals around the world so clinicians can start using it in diagnosis.

But there’s a problem. To teach their model how to predict cancer, they showed it millions of real lung scan images, a process called training. Those sensitive data, which are now encoded into the inner workings of the model, could potentially be extracted by a malicious agent. The scientists can prevent this by adding noise, or generic randomness, to the model, which makes it harder for an adversary to guess the original data. However, this perturbation reduces a model’s accuracy, so the less noise one needs to add, the better.

MIT researchers have developed a technique that enables the user to potentially add the smallest amount of noise possible, while still ensuring the sensitive data are protected.

The researchers created a new privacy metric, which they call Probably Approximately Correct (PAC) Privacy, and built a framework based on this metric that can automatically determine the minimal amount of noise that needs to be added. Moreover, this framework does not need knowledge of the inner workings of a model or its training process, which makes it easier to use for different types of models and applications.

In several cases, the researchers show that the amount of noise required to protect sensitive data from adversaries is far less with PAC Privacy than with other approaches. This could help engineers create machine-learning models that provably hide training data, while maintaining accuracy in real-world settings…

A fundamental question in data privacy is: How much sensitive data could an adversary recover from a machine-learning model with noise added to it?

Differential Privacy, one popular privacy definition, says that privacy is achieved if an adversary who observes the released model cannot infer whether an arbitrary individual’s data was used in the training process. But provably preventing an adversary from distinguishing data usage often requires large amounts of noise to obscure it. This noise reduces the model’s accuracy.
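The noise–accuracy tradeoff described here can be made concrete with a toy sketch. The code below is purely illustrative (it is not the MIT team’s method or any production privacy library): it implements the classic Laplace mechanism, which releases a query answer plus noise scaled to the query’s sensitivity divided by a privacy budget ε, so a smaller ε (stronger privacy) forces a larger noise scale and a less accurate answer.

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value plus Laplace noise of scale sensitivity/epsilon.

    Smaller epsilon means stronger privacy but a larger noise scale,
    and therefore a less accurate released value -- the tradeoff
    Differential Privacy imposes.
    """
    scale = sensitivity / epsilon
    # Inverse-transform sampling of Laplace(0, scale) from one uniform draw.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return true_value + (-scale * sign * math.log(1.0 - 2.0 * abs(u)))

# A counting query ("how many patients have condition X?") has
# sensitivity 1: one person's presence changes the count by at most 1.
random.seed(7)
strong_privacy = laplace_mechanism(120, sensitivity=1.0, epsilon=0.1)
weak_privacy = laplace_mechanism(120, sensitivity=1.0, epsilon=5.0)
```

On average, the ε = 0.1 release deviates far more from the true count of 120 than the ε = 5.0 release, which is exactly why a framework that can certify a smaller noise floor (as PAC Privacy aims to) translates directly into better model accuracy.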

PAC Privacy looks at the problem a bit differently. It characterizes how hard it would be for an adversary to reconstruct any part of randomly sampled or generated sensitive data after noise has been added, rather than only focusing on the distinguishability problem…(More)”

How do we know how smart AI systems are?


Article by Melanie Mitchell: “In 1967, Marvin Minsky, a founder of the field of artificial intelligence (AI), made a bold prediction: “Within a generation…the problem of creating ‘artificial intelligence’ will be substantially solved.” Assuming that a generation is about 30 years, Minsky was clearly overoptimistic. But now, nearly two generations later, how close are we to the original goal of human-level (or greater) intelligence in machines?

Some leading AI researchers would answer that we are quite close. Earlier this year, deep-learning pioneer and Turing Award winner Geoffrey Hinton told Technology Review, “I have suddenly switched my views on whether these things are going to be more intelligent than us. I think they’re very close to it now and they will be much more intelligent than us in the future.” His fellow Turing Award winner Yoshua Bengio voiced a similar opinion in a recent blog post: “The recent advances suggest that even the future where we know how to build superintelligent AIs (smarter than humans across the board) is closer than most people expected just a year ago.”

These are extraordinary claims that, as the saying goes, require extraordinary evidence. However, it turns out that assessing the intelligence—or more concretely, the general capabilities—of AI systems is fraught with pitfalls. Anyone who has interacted with ChatGPT or other large language models knows that these systems can appear quite intelligent. They converse with us in fluent natural language, and in many cases seem to reason, to make analogies, and to grasp the motivations behind our questions. Despite their well-known unhumanlike failings, it’s hard to escape the impression that behind all that confident and articulate language there must be genuine understanding…(More)”.

Building Responsive Investments in Gender Equality using Gender Data System Maturity Models


Tools and resources by Data2X and Open Data Watch: “…to help countries check the maturity of their gender data systems and set priorities for gender data investments. The new Building Responsive Investments in Data for Gender Equality (BRIDGE) tool is designed for use by gender data focal points in national statistical offices (NSOs) of low- and middle-income countries and by their partners within the national statistical system (NSS) to communicate gender data priorities to domestic sources of financing and international donors.

The BRIDGE results will help gender data stakeholders understand the current maturity level of their gender data system, diagnose strengths and weaknesses, and identify priority areas for improvement. They will also serve as an input to any roadmap or action plan developed in collaboration with key stakeholders within the NSS.

Below are links to and explanations of our ‘Gender Data System Maturity Model’ briefs (a long and short version), our BRIDGE assessment and tools methodology, how-to guide, questionnaire, and scoring form that will provide an overall assessment of system maturity and insight into potential action plans to strengthen gender data systems…(More)”.

De Gruyter Handbook of Citizens’ Assemblies


Book edited by Min Reuchamps, Julien Vrydagh and Yanina Welp: “Citizens’ Assemblies (CAs) are flourishing around the world. Quite often composed of randomly selected citizens, CAs arguably offer a possible answer to contemporary democratic challenges. Democracies worldwide are indeed confronted with a series of disruptive phenomena, such as a widespread perception of distrust, growing polarization, and low performance. Many actors seek to reinvigorate democracy with citizen participation and deliberation, and CAs are expected to have the potential to meet this twofold objective. But despite the deliberative and inclusive qualities of CAs, many questions remain open. The increasing popularity of CAs calls for a holistic reflection on and evaluation of their origins, current uses, and future directions.

The De Gruyter Handbook of Citizens’ Assemblies showcases the state of the art around the study of CAs and opens novel perspectives informed by multidisciplinary research and renewed thinking about deliberative participatory processes. It discusses the latest theoretical, empirical, and methodological scientific developments on CAs and offers a unique resource for scholars, decision-makers, practitioners, and curious citizens to better understand the qualities, purposes, promises but also pitfalls of CAs…(More)”.

Connecting After Chaos: Social Media and the Extended Aftermath of Disaster


Book by Stephen F. Ostertag: “Natural disasters and other such catastrophes typically attract large-scale media attention and public concern in their immediate aftermath. However, rebuilding efforts can take years or even decades, and communities are often left to repair physical and psychological damage on their own once public sympathy fades away. Connecting After Chaos tells the story of how people restored their lives and society in the months and years after disaster, focusing on how New Orleanians used social media to cope with trauma following Hurricane Katrina.

Stephen F. Ostertag draws on almost a decade of research to create a vivid portrait of life in “settling times,” a term he defines as a distinct social condition of prolonged insecurity and uncertainty after disasters. He portrays this precarious state through the story of how a group of strangers began blogging in the wake of Katrina, and how they used those blogs to put their lives and their city back together. In the face of institutional failure, weak authority figures, and an abundance of chaos, the people of New Orleans used social media to gain information, foster camaraderie, build support networks, advocate for and against proposed policies, and cope with trauma. In the efforts of these bloggers, Ostertag finds evidence of the capacity of this and other forms of cultural work to motivate, guide, and energize collective action aimed at weathering the constant instability of extended recovery periods. Connecting After Chaos is both a compelling story of a community in crisis and a broader argument for the power of social media and cultural cooperation to create order when chaos abounds…(More)”.

Meta Ran a Giant Experiment in Governance. Now It’s Turning to AI


Article by Aviv Ovadya: “Late last month, Meta quietly announced the results of an ambitious, near-global deliberative “democratic” process to inform decisions around the company’s responsibility for the metaverse it is creating. This was not an ordinary corporate exercise. It involved over 6,000 people who were chosen to be demographically representative across 32 countries and 19 languages. The participants spent many hours in conversation in small online group sessions and got to hear from non-Meta experts about the issues under discussion. Eighty-two percent of the participants said that they would recommend this format as a way for the company to make decisions in the future.

Meta has now publicly committed to running a similar process for generative AI, a move that aligns with the huge burst of interest in democratic innovation for governing or guiding AI systems. In doing so, Meta joins Google, DeepMind, OpenAI, Anthropic, and other organizations that are starting to explore approaches based on the kind of deliberative democracy that I and others have been advocating for. (Disclosure: I am on the application advisory committee for the OpenAI Democratic Inputs to AI grant.) Having seen the inside of Meta’s process, I am excited about this as a valuable proof of concept for transnational democratic governance. But for such a process to truly be democratic, participants would need greater power and agency, and the process itself would need to be more public and transparent.

I first got to know several of the employees responsible for setting up Meta’s Community Forums (as these processes came to be called) in the spring of 2019 during a more traditional external consultation with the company to determine its policy on “manipulated media.” I had been writing and speaking about the potential risks of what is now called generative AI and was asked (alongside other experts) to provide input on the kind of policies Meta should develop to address issues such as misinformation that could be exacerbated by the technology.

At around the same time, I first learned about representative deliberations—an approach to democratic decision-making that has spread like wildfire, with increasingly high-profile citizen assemblies and deliberative polls all over the world. The basic idea is that governments bring difficult policy questions back to the public to decide. Instead of a referendum or elections, a representative microcosm of the public is selected via lottery. That group is brought together for days or even weeks (with compensation) to learn from experts, stakeholders, and each other before coming to a final set of recommendations…(More)”.
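The lottery step described above can be sketched as a simple stratified random draw. This is a deliberately simplified toy, not how Meta’s Community Forums or any real sortition process actually selects participants (real processes balance many demographic quotas jointly, across age, gender, region, education, and more); the group labels and quotas below are hypothetical:

```python
import random
from collections import defaultdict

def stratified_lottery(pool, quotas, seed=None):
    """Draw a demographically representative panel by lottery.

    pool:   list of (person_id, group) tuples, where group is one
            demographic stratum (hypothetical labels for illustration).
    quotas: dict mapping each group to the number of panel seats it gets,
            e.g. in proportion to its share of the population.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for person, group in pool:
        by_group[group].append(person)
    panel = []
    for group, seats in quotas.items():
        # Each member of a stratum has an equal chance of a seat.
        panel.extend(rng.sample(by_group[group], seats))
    return panel
```

Setting each group’s quota proportional to its population share is what makes the resulting panel a “representative microcosm” rather than a self-selected crowd.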