On conspiracy theories of ignorance


Essay: “In “On the Sources of Knowledge and Ignorance”, Karl Popper identifies a kind of “epistemological optimism”—an optimism about “man’s power to discern truth and to acquire knowledge”—that has played a significant role in the history of philosophy. At the heart of this optimistic view, Popper argues, is the “doctrine that truth is manifest”:

“Truth may perhaps be veiled, and removing the veil may not be easy. But once the naked truth stands revealed before our eyes, we have the power to see it, to distinguish it from falsehood, and to know that it is truth.”

According to Popper, this doctrine inspired the birth of modern science, technology, and liberalism. If the truth is manifest, there is “no need for any man to appeal to authority in matters of truth because each man carried the sources of knowledge in himself”:

“Man can know: thus he can be free. This is the formula which explains the link between epistemological optimism and the ideas of liberalism.”

Although a liberal himself, Popper argues that the doctrine of manifest truth is false. “The simple truth,” he writes, “is that truth is often hard to come by, and that once found it may easily be lost again.” Moreover, he argues that the doctrine is pernicious. If we think the truth is manifest, we create “the need to explain falsehood”:

“Knowledge, the possession of truth, need not be explained. But how can we ever fall into error if truth is manifest? The answer is: through our own sinful refusal to see the manifest truth; or because our minds harbour prejudices inculcated by education and tradition, or other evil influences which have perverted our originally pure and innocent minds.”

In this way, the doctrine of manifest truth inevitably gives rise to “the conspiracy theory of ignorance”…

In previous work, I have criticised how the concept of “misinformation” is applied by researchers and policy-makers. Roughly, I think that narrow applications of the term (e.g., defined in terms of fake news) are legitimate but focus on content that is relatively rare and largely symptomatic of other problems, at least in Western democracies. In contrast, broad definitions inevitably get applied in biased and subjective ways, transforming misinformation research and policy-making into “partisan combat by another name”…(More)”

Conflicts over access to Americans’ personal data emerging across federal government


Article by Caitlin Andrews: “The Trump administration’s fast-moving efforts to limit the size of the U.S. federal bureaucracy, primarily through the recently minted Department of Government Efficiency, are raising privacy and data security concerns among current and former officials across the government, particularly as the administration scales back positions charged with privacy oversight. Efforts to limit the independence of a host of federal agencies through a new executive order — including the independence of the Federal Trade Commission and Securities and Exchange Commission — are also ringing alarm bells among civil society and some legal experts.

According to CNN, several staff within the Office of Personnel Management’s privacy and records-keeping department were fired last week. Staff who handle communications and respond to Freedom of Information Act requests were also let go. Though the entire privacy team was not fired, according to the OPM, details about what kind of oversight will remain within the department were limited. The report also states the staff’s termination date is 15 April.

It is one of several moves the Trump administration has made in recent days to reshape how entities access government agencies’ information and how oversight of that information is provided.

The New York Times reports on a wide range of incidents within the government where DOGE’s efforts to limit fraudulent government spending by accessing sensitive agency databases have run up against staffers who are concerned about the privacy of Americans’ personal information. In one incident, Social Security Administration acting Commissioner Michelle King was fired after resisting a request from DOGE to access the agency’s database. “The episode at the Social Security Administration … has played out repeatedly across the federal government,” the Times reported…(More)”.

Being an Effective Policy Analyst in the Age of Information Overload


Blog by Adam Thierer: “The biggest challenge of being an effective technology policy analyst, academic, or journalist these days is that the shelf life of your products is measured in weeks — and sometimes days — instead of months. Because of that, I’ve been adjusting my own strategies over time to remain effective.

The thoughts and advice I offer here are meant mostly for other technology policy analysts, whether you are a student or young professional just breaking into the field, or someone in the middle of your career looking to take it to the next level. But much of what I’ll say here is generally applicable across the field of policy analysis. It’s just a lot more relevant for people in the field of tech policy because of its fast-moving, ever-changing nature.

This essay will repeatedly reference two realities that have shaped my life both as an average citizen and as an academic and policy analyst. First, we used to live in a world of information scarcity, but we now live in a world of information abundance, and that trend is only accelerating. Second, life and work in a world of information overload is simultaneously a wonderful and awful thing, but one thing is for sure: there is absolutely no going back to the sleepy days of information scarcity.

If you care to be an effective policy analyst today, then you have to come to grips with these new realities. Here are a few tips…(More)”.

Cities, health, and the big data revolution


Blog by Harvard Public Health: “Cities influence our health in unexpected ways. From sidewalks to crosswalks, the built environment affects how much we move, impacting our risk for diseases like obesity and diabetes. A recent New York City study underscores that focusing solely on infrastructure, without understanding how people use it, can lead to ineffective interventions. Researchers analyzed over two million Google Street View images, combining them with health and demographic data to reveal these dynamics. Harvard Public Health spoke with Rumi Chunara, director of New York University’s Center for Health Data Science and lead author of the study.

Why study this topic?

We’re seeing an explosion of new data sources, like street-view imagery, being used to make decisions. But there’s often a disconnect—people using these tools don’t always have the public health knowledge to interpret the data correctly. We wanted to highlight the importance of combining data science and domain expertise to ensure interventions are accurate and impactful.

What did you find?

We discovered that the relationship between built environment features and health outcomes isn’t straightforward. It’s not just about having sidewalks; it’s about how often people are using them. Improving physical activity levels in a community could have a far greater impact on health outcomes than simply adding more infrastructure.

It also revealed the importance of understanding the local context. For instance, Google Street View data sometimes misclassifies sidewalks, particularly near highways or bridges, leading to inaccurate conclusions. Relying solely on this data, without accounting for these nuances, could result in less effective interventions…(More)”.

Randomize NIH grant giving


Article by Vinay Prasad: “A pause in NIH study sections has been met with fear and anxiety from researchers. At many universities, including mine, professors live on soft money. No grants? If you are an assistant professor, you can be asked to pack your desk. If you are a full professor, the university slowly cuts your pay until you see yourself out. Everyone talks about you afterwards, calling you a failed researcher. They laugh, a little too long, and then blink back tears as they wonder if they are next. Of course, your salary doubles in the new job and you are happier, but you are still bitter and gossiped about.

In order to apply for NIH grants, you have to write a lot of bullshit. You write specific aims and methods, collect bios from faculty, and more. There is a section where you talk about how great your department and team are—this is the pinnacle of the proverbial expression, ‘to polish a turd.’ You invite people to work on your grant if they have a lot of papers or grants or both, and they agree to be on your grant even though they don’t want to talk to you ever again.

You submit your grant and they hire someone to handle your section. They find three people to review it. Ideally, they pick people who have no idea what you are doing or why it is important, and are not as successful as you, so they can hate read your proposal. If, despite that, they give you a good score, you might be discussed at study section.

The study section assembles scientists to discuss your grant. Like kids who were picked last in kindergarten basketball, they focus on the minutiae and love to nitpick small things. If someone on the study section doesn’t like you, they can tank you. In contrast, if someone loves you, they can’t really single-handedly fund you.

You might wonder if study section leaders are the best scientists. Rest assured: they aren’t. They are typically mid-career, mediocre scientists. (This is not just a joke; data support this claim: see www.drvinayprasad.com.) They have rarely written extremely influential papers.

Finally, your proposal gets a percentile score. Here is the chance of funding by percentile. You might get a chance to revise your grant if you just fall short… Given that the current system is onerous and likely flawed, you would imagine that NIH leadership has repeatedly tested whether the current method is superior to, say, a modified lottery, aka having an initial screen and then randomly giving out the money.

Of course not. Self important people giving out someone else’s money rarely study their own processes. If study sections are no better than lottery, that would mean a lot of NIH study section officers would no longer need to work hard from home half the day, freeing up money for one more grant.

Let’s say we take $200 million and randomize it. Half of it is allocated to being given out in the traditional method, and the other half is allocated to a modified lottery. If an application is from a US University and passes a minimum screen, it is enrolled in the lottery.

Then we follow these two arms into the future. We measure publications, citations, h index, the average impact factor of journals in which the papers are published, and more. We even take a subset of the projects and blind reviewers to score the output. Can they tell which came from study section?…(More)”.

Will big data lift the veil of ignorance?


Blog by Lisa Herzog: “Imagine that you have a toothache, and a visit to the dentist reveals that a major operation is needed. You phone your health insurance. You listen to the voice of the chatbot, press the buttons to go through the menu. And then you hear: “We have evaluated your profile based on the data you have agreed to share with us. Your dental health behavior scores 6 out of 10. The suggested treatment plan therefore requires a co-payment of [insert some large sum of money here].”

This may sound like science fiction. But many other types of insurance, e.g. car insurance, already build on automated data sharing. If they were allowed, health insurers would certainly like to access our data as well – not only data from smart toothbrushes, but also credit card data, behavioral data (e.g. from step-counting apps), or genetic data. If they could use such data, they could move towards segmented insurance plans for specific target groups. As two commentators, to whose research I return below, recently wrote about health insurance: “Today, public plans and nondiscrimination clauses, not lack of information, are what stands between integration and segmentation.”

If, like me, you’re interested in the relation between knowledge and institutional design, insurance is a fascinating topic. The basic idea of insurance is centuries old – here is a brief summary (skip a few paragraphs if you know this stuff). Because we cannot know what might happen to us in the future, but we can know that, on an aggregate level, things will happen to people, it can make sense to enter an insurance contract, creating a pool that a group jointly contributes to. Those for whom the risks in question materialize get support from the pool. Those for whom they do not materialize may go through life without receiving any money, but they still know that they could get support if something happened to them. As such, insurance combines solidarity within a group with individual precaution…(More)”.

Data Stewardship as Environmental Stewardship


Article by Stefaan Verhulst and Sara Marcucci: “Why responsible data stewardship could help address today’s pressing environmental challenges resulting from artificial intelligence and other data-related technologies…

Even as the world grows increasingly reliant on data and artificial intelligence, concern over the environmental impact of data-related activities is increasing. Solutions remain elusive. The rise of generative AI, which rests on a foundation of massive data sets and computational power, risks exacerbating the problem.

In what follows, we propose that responsible data stewardship offers a potential pathway to reducing the environmental footprint of data activities. By promoting practices such as data reuse, minimizing digital waste, and optimizing storage efficiency, data stewardship can help mitigate environmental harm. Additionally, data stewardship supports broader environmental objectives by facilitating better decision-making through transparent, accessible, and shared data. We suggest that advancing data stewardship as a cornerstone of environmental responsibility could provide a compelling approach to addressing the dual challenge of advancing digital technologies while safeguarding the environment…(More)”

Data Governance Meets the EU AI Act


Article by Axel Schwanke: “…The EU AI Act emphasizes sustainable AI through robust data governance, promoting principles like data minimization, purpose limitation, and data quality to ensure responsible data collection and processing. It mandates measures such as data protection impact assessments and retention policies. Article 10 underscores the importance of effective data management in fostering ethical and sustainable AI development… This article states that high-risk AI systems must be developed using high-quality data sets for training, validation, and testing. These data sets should be managed properly, considering factors like data collection processes, data preparation, potential biases, and data gaps. The data sets should be as relevant, representative, error-free, and complete as possible, and should take into account the specific context in which the AI system will be used. In some cases, providers may process special categories of personal data to detect and correct biases, but they must follow strict conditions to protect individuals’ rights and freedoms…

However, achieving compliance presents several significant challenges:

  • Ensuring Dataset Quality and Relevance: Organizations must establish robust data and AI platforms to prepare and manage datasets that are error-free, representative, and contextually relevant for their intended use cases. This requires rigorous data preparation and validation processes.
  • Bias and Contextual Sensitivity: Continuous monitoring for biases in data is critical. Organizations must implement corrective actions to address gaps while ensuring compliance with privacy regulations, especially when processing personal data to detect and reduce bias.
  • End-to-End Traceability: A comprehensive data governance framework is essential to track and document data flow from its origin to its final use in AI models. This ensures transparency, accountability, and compliance with regulatory requirements.
  • Evolving Data Requirements: Dynamic applications and changing schemas, particularly in industries like real estate, necessitate ongoing updates to data preparation processes to maintain relevance and accuracy.
  • Secure Data Processing: Compliance demands strict adherence to secure processing practices for personal data, ensuring privacy and security while enabling bias detection and mitigation.

Example: Real Estate Data
Immowelt’s real estate price map, rated the top performer in a 2022 test of real estate price maps, exemplifies the challenges of achieving high-quality datasets. The prepared data powers numerous services and applications, including data analysis, price predictions, personalization, recommendations, and market research…(More)”

Why Digital Public Goods, including AI, Should Depend on Open Data


Article by Cable Green: “Acknowledging that some data should not be shared (for moral, ethical and/or privacy reasons) and some cannot be shared (for legal or other reasons), Creative Commons (CC) thinks there is value in incentivizing the creation, sharing, and use of open data to advance knowledge production. As open communities continue to imagine, design, and build digital public goods and public infrastructure services for education, science, and culture, these goods and services – whenever possible and appropriate – should produce, share, and/or build upon open data.

Open Data and Digital Public Goods (DPGs)

CC is a member of the Digital Public Goods Alliance (DPGA) and CC’s legal tools have been recognized as digital public goods (DPGs). DPGs are “open-source software, open standards, open data, open AI systems, and open content collections that adhere to privacy and other applicable best practices, do no harm, and are of high relevance for attainment of the United Nations 2030 Sustainable Development Goals (SDGs).” If we want to solve the world’s greatest challenges, governments and other funders will need to invest in, develop, openly license, share, and use DPGs.

Open data is important to DPGs because data is a key driver of economic vitality with demonstrated potential to serve the public good. In the public sector, data informs policy making and public services delivery by helping to channel scarce resources to those most in need; providing the means to hold governments accountable and foster social innovation. In short, data has the potential to improve people’s lives. When data is closed or otherwise unavailable, the public does not accrue these benefits.

CC was recently part of a DPGA sub-committee working to preserve the integrity of open data as part of the DPG Standard. This important update to the DPG Standard was introduced to ensure only open datasets and content collections with open licenses are eligible for recognition as DPGs. This new requirement means open data sets and content collections must meet the following criteria to be recognised as a digital public good.

  1. Comprehensive Open Licensing:
    1. The entire data set/content collection must be under an acceptable open licence. Mixed-licensed collections will no longer be accepted.
  2. Accessible and Discoverable:
    1. All data sets and content collection DPGs must be openly licensed and easily accessible from a distinct, single location, such as a unique URL.
  3. Permitted Access Restrictions:
    1. Certain access restrictions – such as logins, registrations, API keys, and throttling – are permitted as long as they do not discriminate against users or restrict usage based on geography or any other factors…(More)”.

The Case for Local and Regional Public Engagement in Governing Artificial Intelligence


Article by Stefaan Verhulst and Claudia Chwalisz: “As the Paris AI Action Summit approaches, the world’s attention will once again turn to the urgent questions surrounding how we govern artificial intelligence responsibly. Discussions will inevitably include calls for global coordination and participation, exemplified by several proposals for a Global Citizens’ Assembly on AI. While such initiatives aim to foster inclusivity, the reality is that meaningful deliberation and actionable outcomes often emerge most effectively at the local and regional levels.

Building on earlier reflections in “AI Globalism and AI Localism,” we argue that to govern AI for public benefit, we must prioritize building public engagement capacity closer to the communities where AI systems are deployed. Localized engagement not only ensures relevance to specific cultural, social, and economic contexts but also equips communities with the agency to shape both policy and product development in ways that reflect their needs and values.

While a Global Citizens’ Assembly sounds like a great idea on the surface, there is no public authority with teeth or enforcement mechanisms at that level of governance. The Paris Summit represents an opportunity to rethink existing AI governance frameworks, reorienting them toward an approach that is grounded in lived, local realities and mutually respectful processes of co-creation. Toward that end, we elaborate below on proposals for: local and regional AI assemblies; AI citizens’ assemblies for EU policy; capacity-building programs, and localized data governance models…(More)”.