Access Rules: Freeing Data from Big Tech for a Better Future


Book by Viktor Mayer-Schönberger and Thomas Ramge: “Information is power, and the time is now for digital liberation. Access Rules mounts a strong and hopeful argument for how informational tools currently in the hands of a few could instead become empowering machines for everyone. By forcing data-hoarding companies to open access to their data, we can reinvigorate both our economy and our society. Authors Viktor Mayer-Schönberger and Thomas Ramge contend that if we disrupt monopoly power and create a level playing field, digital innovations can emerge to benefit us all.

Over the past twenty years, Big Tech has managed to centralize the most relevant data on its servers, as data has become the most important raw material for innovation. However, dominant oligopolists like Facebook, Amazon, and Google, in contrast to their reputation as digital pioneers, are actually slowing down innovation and progress by withholding data for the benefit of their shareholders, at the expense of customers, the economy, and society. As Access Rules compellingly argues, ultimately it is up to us to force information giants, wherever they are located, to open their treasure troves of data to others. In order for us to limit global warming, contain a virus like COVID-19, or successfully fight poverty, everyone—including citizens and scientists, start-ups and established companies, as well as the public sector and NGOs—must have access to data. When everyone has access to the informational riches of the data age, the nature of digital power will change. Information technology will find its way back to its original purpose: empowering all of us to use information so we can thrive as individuals and as societies…(More)”.

Decoding human behavior with big data? Critical, constructive input from the decision sciences


Paper by Konstantinos V. Katsikopoulos and Marc C. Canellas: “Big data analytics employs algorithms to uncover people’s preferences and values, and support their decision making. A central assumption of big data analytics is that it can explain and predict human behavior. We investigate this assumption, aiming to enhance the knowledge base for developing algorithmic standards in big data analytics. First, we argue that big data analytics is by design atheoretical and does not provide process-based explanations of human behavior; thus, it is unfit to support deliberation that is transparent and explainable. Second, we review evidence from interdisciplinary decision science, showing that the accuracy of complex algorithms used in big data analytics for predicting human behavior is not consistently higher than that of simple rules of thumb. Rather, it is lower in situations such as predicting election outcomes, criminal profiling, and granting bail. Big data algorithms can be considered candidate models for explaining, predicting, and supporting human decision making when they match, in transparency and accuracy, simple, process-based, domain-grounded theories of human behavior. Big data analytics can be inspired by behavioral and cognitive theory…(More)”.
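
The paper's headline comparison, complex prediction algorithms versus simple rules of thumb, is easy to make concrete. The sketch below is illustrative only: the data are synthetic, the cues are hypothetical, and the one-cue rule is a generic stand-in for the fast-and-frugal heuristics the decision-science literature studies, not a model from the paper.

```python
# Illustrative only: synthetic data and hypothetical cues.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))  # two cues, e.g. prior record and age band
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)  # outcome mostly tracks cue 1

train, test = slice(0, 250), slice(250, None)

# "Big data" style model: logistic regression over all available cues
clf = LogisticRegression().fit(X[train], y[train])
acc_model = (clf.predict(X[test]) == y[test]).mean()

# Simple rule of thumb: decide on the single most valid cue alone
acc_rule = ((X[test, 0] > 0) == y[test]).mean()

print(f"logistic regression: {acc_model:.2f}  one-cue rule: {acc_rule:.2f}")
```

On noisy data of this shape the two accuracies typically land within a few points of each other, which is the qualitative pattern the authors report for tasks such as granting bail.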

Making forest data fair and open


Paper by Renato A. F. de Lima: “It is a truth universally acknowledged that those in possession of time and good fortune must be in want of information. Nowhere is this more so than for tropical forests, which include the richest and most productive ecosystems on Earth. Information on tropical forest carbon and biodiversity, and how these are changing, is immensely valuable, and many different stakeholders wish to use data on tropical and subtropical forests. These include scientists, governments, nongovernmental organizations and commercial interests, such as those extracting timber or selling carbon credits. Another crucial, often-ignored group is the local communities, for whom forest information may help to assert their rights and conserve or restore their forests.

A widespread view is that to lead to better public outcomes it is necessary and sufficient for forest data to be open and ‘Findable, Accessible, Interoperable, Reusable’ (FAIR). There is indeed a powerful case for this. Open data — those that anyone can use and share without restrictions — can encourage transparency and reproducibility, foster innovation and be used more widely, thus translating into a greater public good (for example, https://creativecommons.org). Open biological collections and genetic sequences such as GBIF or GenBank have enabled species discovery, and open Earth observation data helps people to understand and monitor deforestation (for example, Global Forest Watch). But the perspectives of those who actually make the forest measurements are much less recognized, meaning that open and FAIR data can be extremely unfair indeed. We argue here that forest data policies and practices must be fair in the correct, linguistic use of the term — just and equitable.

In a world in which forest data origination — measuring, monitoring and sustaining forest science — is secured by large, long-term capital investment (such as through space missions and some officially supported national forest inventories), making all data open makes perfect sense. But where data origination depends on insecure funding and precarious employment conditions, top-down calls to make these data open can be deeply problematic. Even when well-intentioned, such calls ignore the socioeconomic context of the places where the forest plots are located and how knowledge is created, entrenching the structural inequalities that characterize scientific research and collaboration among and within nations. A recent review found scant evidence for open data ever lessening such inequalities. Clearly, only a privileged part of the global community is currently able to exploit the potential of open forest data. Meanwhile, some local communities are de facto owners of their forests and associated knowledge, so making information open — for example, the location of valuable species — may carry risks to themselves and their forests….(More)”.

Inclusive policy making in a digital age: The case for crowdsourced deliberation


Blog by Theo Bass: “In 2016, the Finnish Government ran an ambitious experiment to test if and how citizens across the country could meaningfully contribute to the law-making process.

Many people in Finland use off-road snowmobiles to get around in the winter, raising issues like how to protect wildlife, keep pedestrians safe, and compensate property owners for use of their land for off-road traffic.

To hear from people across the country who would be most affected by new laws, the government set up an online platform to understand problems they faced and gather solutions. Citizens could post comments and suggestions, respond to one another, and vote on ideas they liked. Over 700 people took part, generating around 250 policy ideas.

The exercise caught the attention of academics Tanja Aitamurto and Hélène Landemore. In 2017, they wrote a paper coining the term crowdsourced deliberation — an ‘open, asynchronous, depersonalized, and distributed kind of online deliberation occurring among self-selected participants’ — to describe the interactions they saw on the platform.

Many other crowdsourced deliberation initiatives have emerged in recent years, although they haven’t always been given that name. From France to Taiwan, governments have experimented with opening policy making and enabling online conversations among diverse groups of thousands of people, leading to the adoption of new regulations or laws.

So what’s distinctive about this approach and why should policy makers consider it alongside others? In this post I’ll make a case for crowdsourced deliberation, comparing it to two other popular methods for inclusive policy making…(More)”.

Russia Is Leaking Data Like a Sieve


Matt Burgess at Wired: “Names, birthdays, passport numbers, job titles—the personal information goes on for pages and looks like any typical data breach. But this data set is very different. It allegedly contains the personal information of 1,600 Russian troops who served in Bucha, a Ukrainian city devastated during Russia’s war and the scene of multiple potential war crimes.

The data set is not the only one. Another allegedly contains the names and contact details of 620 Russian spies who are registered to work at the Moscow office of the FSB, the country’s main security agency. Neither set of information was published by hackers. Instead they were put online by Ukraine’s intelligence services, with all the names and details freely available to anyone online. “Every European should know their names,” Ukrainian officials wrote in a Facebook post as they published the data.

Since Russian troops crossed Ukraine’s borders at the end of February, colossal amounts of information about the Russian state and its activities have been made public. The data offers unparalleled glimpses into closed-off private institutions, and it may be a gold mine for investigators, from journalists to those tasked with investigating war crimes. Broadly, the data comes in two flavors: information published proactively by Ukrainian authorities or their allies, and information obtained by hacktivists. Hundreds of gigabytes of files and millions of emails have been made public.

“Both sides in this conflict are very good at information operations,” says Philip Ingram, a former colonel in British military intelligence. “The Russians are quite blatant about the lies that they’ll tell,” he adds. Since the war started, Russian disinformation has been consistently debunked. Ingram says Ukraine has to be more tactical with the information it publishes. “They have to make sure that what they’re putting out is credible and they’re not caught out telling lies in a way that would embarrass them or embarrass their international partners.”

Both the lists of alleged FSB officers and Russian troops were published online by Ukraine’s Central Intelligence Agency at the end of March and start of April, respectively. While WIRED has not been able to verify the accuracy of the data—and Ukrainian cybersecurity officials did not respond to a request for comment—Aric Toler, from investigative outlet Bellingcat, tweeted that the FSB details appear to have been combined from previous leaks and open source information. It is unclear how up-to-date the information is…(More)”.

Co-designing algorithms for governance: Ensuring responsible and accountable algorithmic management of refugee camp supplies


Paper by Rianne Dekker et al: “There is increasing criticism of the use of big data and algorithms in public governance. Studies have revealed that algorithms may reinforce existing biases and defy scrutiny both by the public officials who use them and by the citizens subject to algorithmic decisions and services. In response, scholars have called for more algorithmic transparency and regulation. These are useful but ex post solutions, in which the development of algorithms remains a rather autonomous process. This paper argues that co-design of algorithms with relevant stakeholders from government and society is another means to achieve responsible and accountable algorithms, one that is largely overlooked in the literature. We present a case study of the development of an algorithmic tool to estimate the populations of refugee camps in order to manage the delivery of emergency supplies. This case study demonstrates how, in different stages of the tool’s development—data selection and pre-processing, training of the algorithm, and post-processing and adoption—inclusion of knowledge from the field led to changes to the algorithm. Co-design supported the responsibility of the algorithm in the selection of big data sources and in preventing the reinforcement of biases. It contributed to the accountability of the algorithm by making the estimations transparent and explicable to its users, who were able to use the tool for fitting purposes and exercised their discretion in interpreting the results. It is yet unclear whether this eventually led to better servicing of refugee camps…(More)”.
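
To make the paper's three development stages tangible, here is a deliberately toy pipeline marking where field knowledge could enter at each stage. It is a hypothetical sketch under invented assumptions, not the authors' actual tool: the proxy sources, function names, and the median-based estimator are all made up.

```python
from statistics import median

def select_sources(records, field_filter):
    """Stage 1, data selection and pre-processing: field staff can veto
    sources they know to be biased or stale."""
    return [r for r in records if field_filter(r)]

def estimate_population(records):
    """Stage 2, the model: a trivial median stands in for the trained algorithm."""
    return median(r["count"] for r in records)

def present(estimate, uncertainty=0.15):
    """Stage 3, post-processing and adoption: report a range rather than a
    false-precision point value, leaving interpretation to the users."""
    return estimate * (1 - uncertainty), estimate * (1 + uncertainty)

# Hypothetical counts from three proxy data sources
records = [
    {"source": "satellite_imagery", "count": 1200},
    {"source": "registration_list", "count": 950},   # field staff: lags weeks behind
    {"source": "water_consumption", "count": 1100},
]

usable = select_sources(records, lambda r: r["source"] != "registration_list")
low, high = present(estimate_population(usable))
print(f"estimated camp population: {low:.0f} to {high:.0f}")
```

The point of the sketch is structural: each stage exposes a decision that stakeholders can inspect and change, which is where the paper locates responsibility and accountability.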

Intermediaries do matter: voluntary standards and the Right to Data Portability


Paper by Matteo Nebbiai: “This paper sheds light on an understudied aspect of the application of the General Data Protection Regulation (GDPR) Right to Data Portability (RtDP), introducing a framework to analyse empirically the voluntary data portability standards adopted by various data controllers. The first section explains how the RtDP wording creates some “grey areas” that allow data controllers a broad interpretation of the right. Secondly, the paper shows why the regulatory initiatives affecting the interpretation of these “grey areas” can be framed as “regulatory standard-setting (RSS) schemes”, which are voluntary standards of behaviour set by private, public, or non-governmental actors. The empirical section reveals that in the EU, between 2000 and 2020, the number of such schemes increased every year, and most of them were governed by private actors. Finally, the historical analysis highlights that the RtDP was introduced when many privately run RSS schemes were already operating, and no evidence suggests that the GDPR significantly affected their spread…(More)”.
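
One of those “grey areas” is what counts as a ‘structured, commonly used and machine-readable format’ for the exported data: the GDPR names no concrete schema, which is exactly the space voluntary RSS schemes fill. As a purely hypothetical illustration (the schema below is invented, not drawn from any actual standard), a scheme might converge on a JSON convention such as:

```python
import json

# Hypothetical RtDP export. Art. 20 GDPR requires a "structured, commonly
# used and machine-readable format" but prescribes no schema, so every
# field name below is an invented example of what a voluntary standard
# might settle on.
export = {
    "data_subject": "user-123",
    "controller": "example-service.example",
    "exported_at": "2020-05-01T12:00:00Z",
    "provided_data": {
        "profile": {"display_name": "A. Example", "joined": "2019-04-01"},
        "posts": [{"id": 1, "text": "hello world", "created": "2020-01-05"}],
    },
}

print(json.dumps(export, indent=2))  # portable: ready to hand to another controller
```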

Orientation Failure? Why Directionality Matters in Innovation Policy and Implementation


Blog by Mariam Tabatadze and Benjamin Kumpf: “…In the essay “The Moon and the Ghetto” from 1977, Richard Nelson brought renewed attention to the question of directionality of innovation. He asked why societies that are wealthy and technologically advanced are not able to deal effectively with social problems such as poverty or inequities in education. Nelson believed that politics are only a small part of the problem. The main challenge, according to him, was further advancing scientific and technological breakthroughs.

Since the late seventies, humanity has laid claim to many more significant technological and scientific achievements. However, challenges such as poverty, social inequalities and, of course, environmental degradation persist. This raises the question: is the main problem a lack of directionality?

The COVID-19 pandemic sparked renewed interest in mission-driven innovation in industrial and socio-economic policy (see below for a framing of missions and mission-oriented innovation). The focus is a continuation of a “normative turn” in national and supranational science, technology and innovation (STI) policies over the last 15 years.

The directionality of STI policies shifted from pursuing predominantly growth- and competitiveness-related objectives to addressing societal challenges. It brings together elements of innovation policy – focused on economic growth – and transition policy, which seeks beneficial change for society at large. This is important, as we are seeing growing evidence of the negative effects of innovation in countries across the globe, from exacerbated inequalities between places to greater inequalities between income groups…(More)”.

Facial Recognition Goes to War


Kashmir Hill at the New York Times: “In the weeks after Russia invaded Ukraine and images of the devastation wrought there flooded the news, Hoan Ton-That, the chief executive of the facial recognition company Clearview AI, began thinking about how he could get involved.

He believed his company’s technology could offer clarity in complex situations in the war.

“I remember seeing videos of captured Russian soldiers and Russia claiming they were actors,” Mr. Ton-That said. “I thought if Ukrainians could use Clearview, they could get more information to verify their identities.”

In early March, he reached out to people who might help him contact the Ukrainian government. One of Clearview’s advisory board members, Lee Wolosky, a lawyer who has worked for the Biden administration, was meeting with Ukrainian officials and offered to deliver a message.

Mr. Ton-That drafted a letter explaining that his app “can instantly identify someone just from a photo” and that the police and federal agencies in the United States used it to solve crimes. That feature has brought Clearview scrutiny over concerns about privacy and questions about racism and other biases within artificial-intelligence systems.

The tool, which can identify a suspect caught on surveillance video, could be valuable to a country under attack, Mr. Ton-That wrote. He said the tool could identify people who might be spies, as well as deceased people, by comparing their faces against Clearview’s database of 20 billion faces from the public web, including from “Russian social sites such as VKontakte.”

Mr. Ton-That decided to offer Clearview’s services to Ukraine for free, as reported earlier by Reuters. Now, less than a month later, New York-based Clearview has created more than 200 accounts for users at five Ukrainian government agencies, which have conducted more than 5,000 searches. Clearview has also translated its app into Ukrainian.

“It’s been an honor to help Ukraine,” said Mr. Ton-That, who provided emails from officials from three agencies in Ukraine, confirming that they had used the tool. It has identified dead soldiers and prisoners of war, as well as travelers in the country, confirming the names on their official IDs. The fear of spies and saboteurs in the country has led to heightened paranoia.

According to one email, Ukraine’s national police obtained two photos of dead Russian soldiers on March 21; the photos have been viewed by The New York Times. One dead man had identifying patches on his uniform, but the other did not, so the ministry ran his face through Clearview’s app…(More)”.
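
In generic terms, the matching step described in this piece is a nearest-neighbour search over face embeddings. The sketch below shows only the principle; it is not Clearview's actual pipeline, and the 128-dimensional vectors are random stand-ins for what a face-recognition network would produce from real images.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: close to 1.0 means near-identical direction in embedding space
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)

# Stand-in database: in a real system each vector would come from running a
# face image through a neural network; here they are random placeholders.
database = {f"person_{i}": rng.normal(size=128) for i in range(1000)}

# A "new photo" of someone already enrolled: their embedding plus noise
query = database["person_42"] + 0.1 * rng.normal(size=128)

# Identification is nearest-neighbour search by similarity
best_match = max(database, key=lambda name: cosine(database[name], query))
print(best_match)  # at this noise level, almost certainly "person_42"
```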

The rise of the data steward


Article by Sarah Wray: “As data use and collaboration become more advanced, there is a need for a new profession within the public and private sectors, says Stefaan Verhulst, Co-Founder and Chief Research and Development Officer at The GovLab at New York University. He calls this role the ‘data steward’ and is also seeking to expand existing definitions of the term.

While many cities, government organisations, and private sector companies have chief data officers and chief privacy officers, Verhulst says this new function is broader and necessary as more organisations begin to explore data collaborations which bring together data from various sources to solve problems for the public good.

Many cities, for instance, want to get more value and innovation from the open data they share, and are also increasingly partnering to benefit from private sector data on mobility, spending, and more.

Several examples highlight the challenges, though. There have been disputes about data-sharing and privacy, such as between Uber and the Los Angeles Department of Transportation, while other initiatives have failed to gain traction. Copenhagen’s City Data Exchange facilitated the exchange of public and private data but was disbanded after it struggled to get enough data providers and users on the platform and to become financially sustainable.

Verhulst says that beyond ensuring the security and integrity of data, new skills required by data stewards include the ability to secure partnerships, adequately vet data partners and set up data-sharing agreements, as well as the capacity to steward data-sharing initiatives internally and obtain legal and executive buy-in. Data stewards should also develop financial models for data-sharing to ensure partnerships are sustainable over time.

“That’s quite often ignored,” says Verhulst. “It’s assumed that these things will pay for themselves. Well surprise, surprise, there are costs.”

In addition, there’s an important role for retaining an active focus on insights from data and problems to be solved. Many early open data efforts have taken a ‘build it and they will come’ approach, and usage at scale hasn’t always materialised.

A dynamic regulatory environment is also driving demand for new skills, says Verhulst, noting that the proposed EU Data Act indicates a mandate “to knock on the doors of the private sector [for data] in emergency contexts”.

“The question is: how do you go about that?” Verhulst comments. “Many organisations are going to have to figure this out.”

The GovLab is now running the third cohort of its training for data stewards, and the first focused on the Eastern Hemisphere.

The Developing a Data Reuse Strategy for Public Problems course is part of The GovLab’s Open Data Policy Lab, which is supported by Microsoft…(More)”.