The language we use to describe data can also help us fix its problems


Luke Stark & Anna Lauren Hoffmann at Quartz: “Data is, apparently, everything.

It’s the “new oil” that fuels online business. It comes in floods or tsunamis. We access it via “streams” or “fire hoses.” We scrape it, mine it, bank it, and clean it. (Or, if you prefer your buzzphrases with a dash of ageism and implicit misogyny, big data is like “teenage sex,” while working with it is “the sexiest job” of the century.)

These data metaphors can seem like empty cliches, but at their core they’re efforts to come to grips with the continuing onslaught of connected devices and the huge amounts of data they generate.

In a recent article, we—an algorithmic-fairness researcher at Microsoft and a data-ethics scholar at the University of Washington—push this connection one step further. More than simply helping us wrap our collective heads around data-fueled technological change, we set out to learn what these metaphors can teach us about the real-life ethics of collecting and handling data today.

Instead of only drawing from the norms and commitments of computer science, information science, and statistics, what if we looked at the ethics of the professions evoked by our data metaphors instead?…(More)”.

Public Entrepreneurship: How to train 21st century leaders


Beth Noveck at apolitical: “So how do we develop these better ways of working in government? How do we create a more effective public service?

Governments, universities and philanthropies are beginning to invest in training those inside and outside of government in new kinds of public entrepreneurial skills. They are also innovating in how they teach.

Canada has created a new Digital Academy to teach digital literacy to all 250,000 public servants. Among other approaches, they have created a 15-minute podcast series called "bus rides" to enable public servants to learn on their commute.

The better programs, like Canada’s, combine online and face-to-face methods. This is what Israel does in its Digital Leaders program. This nine-month program alternates between web-based and live meetings, as well as connecting learners to a global, online network of digital innovators.

Many countries have started to teach human-centred design to public servants, instructing officials in how to design services with, not simply for, the public, as WeGov does in Brazil. In Chile, the UAI University has just begun teaching quantitative skills, offering three-day intensives in data science for public servants.

The GovLab also offers a nifty, free online program called Solving Public Problems with Data.

The public sector learning

To ensure that learning translates into practice, Australia’s BizLab Academy turns students into teachers by using alumni of their human-centred design training as mentors for new students.

The cities of Orlando and Sao Paulo go beyond training public servants. Orlando includes members of the public in its training program for city officials. Because they are learning to redesign services with citizens, the public participates in the training.

The Sao Paulo Abierta program uses citizens as trainers for the city’s public servants. Over 23,000 of them have studied with these lay trainers, who possess the innovation skills that are in short supply in government. In fact, public officials are prohibited from teaching in the program altogether.

[Image from the ten recommendations for training public entrepreneurs.]

Recognising that it is not enough to train only a lone innovator or data scientist in a unit, governments are scaling their programs across the public sector.

Argentina’s LabGob has already trained 30,000 people since 2016 in its Design Academy for Public Policy with plans to expand. For every class taken, a public servant earns points, which are a prerequisite for promotions and pay raises in the Argentinian civil service.

Rather than going broad, some training programs are going deep by teaching sector-specific innovation skills. The NHS Digital Academy, run in collaboration with Imperial College, is a series of six online and four live sessions designed to produce leaders in health innovation.

Innovating in a bureaucracy

In my own work at the GovLab at New York University, we are helping public entrepreneurs take their public interest projects from idea to implementation using coaching, rather than training.

Training classes may be wonderful, but they can leave people feeling abandoned when they return to their desks to face the challenge of innovating within a bureaucracy.

With hands-on mentoring from global leaders and peer-to-peer support, the GovLab Academy coaching programs try to ensure that public servants are getting the help they need to advance innovative projects.

Knowing what innovation skills to teach and how to teach them, however, should depend on asking people what they want. That’s why the Australia New Zealand School of Government is administering a survey asking these questions of public servants there….(More)”.

The Education Data Collaborative: A new kind of partnership.


About: “Whether we work within schools or as part of the broader ecosystem of parent-teacher associations, and philanthropic, nonprofit, and volunteer organizations, we need data to guide decisions about investing our time and resources.

This data is typically expensive to gather, often unvalidated (e.g. self-reported), and commonly available only to those who collect or report it. It can even be hard to ask for data when it’s not clear what’s available. At the same time, information – in the form of discrete research, report-card style PDFs, or static websites – is everywhere. The result is that many already resource-thin organizations, which could be collaborating around strategies to help kids advance, spend a lot of time in isolation collecting and searching for data.

In the past decade, we’ve seen solid progress in addressing part of the problem: the emergence of connected longitudinal data systems (LDS). These warehouses and linked databases contain data that can help us understand how students progress over time. No personally identifiable information (or PII) is shared, yet the data can reveal where interventions are most needed. Because these systems are typically designed for researchers and policy professionals, they are rarely accessible to the educators, parents, and partners – arts, sports, academic enrichment (e.g. STEM), mentoring, and family support programs – that play such important roles in helping young people learn and succeed…
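To make concrete how records can be linked across years without sharing PII, here is a minimal, hypothetical sketch (the field names, salt handling, and hashing scheme are illustrative assumptions, not a description of any actual LDS): each year’s records are de-identified by replacing the student identifier with a salted hash, and the longitudinal link is made on that pseudonym.

```python
import hashlib

# Hypothetical records from two school years; "student_id" is the PII that must not be shared.
year_1 = [{"student_id": "A123", "grade": 3, "reading_score": 212},
          {"student_id": "B456", "grade": 3, "reading_score": 198}]
year_2 = [{"student_id": "A123", "grade": 4, "reading_score": 225}]

# Assumption: the salt (or an equivalent keyed linkage step) is held by a trusted data steward.
SALT = "secret-salt-held-by-the-data-steward"

def pseudonym(student_id: str) -> str:
    """Derive a stable pseudonym so records link across years without exposing the raw ID."""
    return hashlib.sha256((SALT + student_id).encode("utf-8")).hexdigest()[:16]

def de_identify(records):
    """Replace the raw identifier with the pseudonym and drop it from the shared record."""
    return [
        {"pid": pseudonym(r["student_id"]),
         **{k: v for k, v in r.items() if k != "student_id"}}
        for r in records
    ]

# Link the de-identified records longitudinally on the pseudonym.
linked = {}
for record in de_identify(year_1) + de_identify(year_2):
    linked.setdefault(record["pid"], []).append(record)

for pid, history in linked.items():
    print(pid, sorted(history, key=lambda r: r["grade"]))
```

In a real system the linkage key would never leave the data steward, so partner organizations would see only the pseudonymous, linked histories rather than any identifying detail.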

“We need open tools for the ecosystem – parents, volunteers, non-profit organizations and the foundations and agencies that support them. These partners can realize significant benefit from the same kind of data that policy makers and education leaders hold in their LDS.


That’s why we’re launching the Education Data Collaborative. Working together, we can build tools that help us use data to improve the design, efficacy, and impact of programs and interventions and find new ways to work with public education systems to achieve great things for kids. …Data collaboratives, data trusts, and other kinds of multi-sector data partnerships are among the most important civic innovations to emerge in the past decade….(More)”

The war to free science


Brian Resnick and Julia Belluz at Vox: “The 27,500 scientists who work for the University of California generate 10 percent of all the academic research papers published in the United States.

Their university recently put them in a strange position: Sometime this year, these scientists will not be able to directly access much of the world’s published research that they’re not involved in.

That’s because in February, the UC system — one of the country’s largest academic institutions, encompassing Berkeley, Los Angeles, Davis, and several other campuses — dropped its nearly $11 million annual subscription to Elsevier, the world’s largest publisher of academic journals.

On the face of it, this seemed like an odd move. Why cut off students and researchers from academic research?

In fact, it was a principled stance that may herald a revolution in the way science is shared around the world.

The University of California decided it doesn’t want scientific knowledge locked behind paywalls, and thinks the cost of academic publishing has gotten out of control.

Elsevier owns around 3,000 academic journals, and its articles account for some 18 percent of all the world’s research output. “They’re a monopolist, and they act like a monopolist,” says Jeffrey MacKie-Mason, head of the campus libraries at UC Berkeley and co-chair of the team that negotiated with the publisher. Elsevier makes huge profits on its journals, generating billions of dollars a year for its parent company RELX.

This is a story about more than subscription fees. It’s about how a private industry has come to dominate the institutions of science, and how librarians, academics, and even pirates are trying to regain control.

The University of California is not the only institution fighting back. “There are thousands of Davids in this story,” says University of California Davis librarian MacKenzie Smith, who, like so many other librarians around the world, has been pushing for more open access to science. “But only a few big Goliaths.”…(More)”.

Data & Policy: A new venue to study and explore policy–data interaction


Opening editorial by Stefaan G. Verhulst, Zeynep Engin and Jon Crowcroft: “…Policy–data interactions or governance initiatives that use data have been the exception rather than the norm, isolated prototypes and trials rather than an indication of real, systemic change. There are various reasons for the generally slow uptake of data in policymaking, and several factors will have to change if the situation is to improve. ….

  • Despite the number of successful prototypes and small-scale initiatives, policy makers’ understanding of data’s potential and its value proposition generally remains limited (Lutes, 2015). There is also limited appreciation of the advances data science has made in the last few years. This is a major limiting factor; we cannot expect policy makers to use data if they do not recognize what data and data science can do.
  • The recent (and justifiable) backlash against how certain private companies handle consumer data has had something of a reverse halo effect: There is a growing lack of trust in the way data is collected, analyzed, and used, and this often leads to a certain reluctance (or simply risk-aversion) on the part of officials and others (Engin, 2018).
  • Despite several high-profile open data projects around the world, much (probably the majority) of data that could be helpful in governance remains either privately held or otherwise hidden in silos (Verhulst and Young, 2017b). There remains a shortage not only of data but, more specifically, of high-quality and relevant data.
  • With few exceptions, the technical capacities of officials remain limited, and this has obviously negative ramifications for the potential use of data in governance (Giest, 2017).
  • It’s not just a question of limited technical capacities. There is often a vast conceptual and values gap between the policy and technical communities (Thompson et al., 2015; Uzochukwu et al., 2016); sometimes it seems as if they speak different languages. Compounding this difference in world views is the fact that the two communities rarely interact.
  • Yet data about the use of data, and evidence of its impact, remain sparse. The impetus to use more data in policy making is stymied by limited scholarship and a weak evidential basis showing whether and how data can be helpful. Without such evidence, data advocates are limited in their ability to make the case for more data initiatives in governance.
  • Data are not only changing the way policy is developed, but they have also reopened the debate around theory- versus data-driven methods in generating scientific knowledge (Lee, 1973; Kitchin, 2014; Chivers, 2018; Dreyfuss, 2017), and have thus directly questioned the evidence base for the utilization and implementation of data within policy making. A number of associated challenges are being discussed, such as: (i) traceability and reproducibility of research outcomes (due to “black box processing”); (ii) the use of correlation instead of causation as the basis of analysis, and the biases and uncertainties present in large historical datasets that cause replication and, in some cases, amplification of human cognitive biases and imperfections; and (iii) the incorporation of existing human knowledge and domain expertise into the scientific knowledge generation processes—among many other topics (Castelvecchi, 2016; Miller and Goodchild, 2015; Obermeyer and Emanuel, 2016; Provost and Fawcett, 2013). (A short sketch after this list gives a minimal illustration of the correlation-versus-causation point.)
  • Finally, we believe that there should be a sound underpinning, a new theory, of what we call Policy–Data Interactions. To date, in reaction to the proliferation of data in the commercial world, theories of data management, privacy, and fairness have emerged. From the Human–Computer Interaction world, a manifesto of principles of Human–Data Interaction (Mortier et al., 2014) has found traction, which aims to reduce the asymmetry of power present in current design considerations of systems of data about people. However, we need a consistent, symmetric approach to the consideration of systems of policy and data and how they interact with one another.
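To make the correlation-versus-causation concern concrete, here is a minimal, hypothetical sketch (not drawn from the editorial or the works it cites): two independent random walks that share no causal link can still show a sizeable Pearson correlation simply because both trend over time, whereas correlating their increments removes the spurious effect.

```python
import numpy as np

# Two independent random walks: by construction, neither series influences the other.
rng = np.random.default_rng(seed=0)
x = np.cumsum(rng.normal(size=1000))
y = np.cumsum(rng.normal(size=1000))

# Correlating the raw (trending) levels often yields a coefficient far from zero,
# even though the underlying increments are unrelated.
corr_levels = np.corrcoef(x, y)[0, 1]

# Correlating the increments (first differences) removes the shared-trend artefact.
corr_diffs = np.corrcoef(np.diff(x), np.diff(y))[0, 1]

print(f"correlation of levels:      {corr_levels:+.3f}")  # frequently large in magnitude
print(f"correlation of differences: {corr_diffs:+.3f}")  # close to zero
```

The point is not that correlation is useless, but that trends and latent structure in large historical datasets can manufacture strong associations that carry no causal weight for policy.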

All these challenges are real, and they are sticky. We are under no illusions that they will be overcome easily or quickly….

During the past four conferences, we have hosted an incredibly diverse range of dialogues and examinations by key global thought leaders, opinion leaders, practitioners, and the scientific community (Data for Policy, 2015, 2016, 2017, 2019). What became increasingly obvious was the need for a dedicated venue to deepen and sustain the conversations and deliberations beyond the limitations of an annual conference. This leads us to today and the launch of Data & Policy, which aims to confront and mitigate the barriers to greater use of data in policy making and governance.

Data & Policy is a venue for peer-reviewed research and discussion about the potential for and impact of data science on policy. Our aim is to provide a nuanced and multistranded assessment of the potential and challenges involved in using data for policy and to bridge the “two cultures” of science and humanism—as CP Snow famously described in his lecture on “Two Cultures and the Scientific Revolution” (Snow, 1959). By doing so, we also seek to bridge the two other dichotomies that limit an examination of datafication and its interaction with policy from various angles: the divide between practice and scholarship; and between private and public…

So these are our principles: scholarly, pragmatic, open-minded, interdisciplinary, focused on actionable intelligence, and, most of all, innovative in how we will share insight and push at the boundaries of what we already know and what already exists. We are excited to launch Data & Policy with the support of Cambridge University Press and University College London, and we’re looking for partners to help us build it as a resource for the community. If you’re reading this manifesto, it means you have at least a passing interest in the subject; we hope you will be part of the conversation….(More)”.

Introducing ‘AI Commons’: A framework for collaboration to achieve global impact


Press Release: “Last week’s 3rd annual AI for Good Global Summit once again showcased the growing number of Artificial Intelligence (AI) projects with promise to advance the United Nations Sustainable Development Goals (SDGs).

Now, using the Summit’s momentum, AI innovators and humanitarian leaders are prepared to take the ‘AI for Good’ movement to the next level.

They are working together to launch an ‘AI Commons’ that aims to scale AI for Good projects and maximize their impact across the world.

The AI Commons will enable AI adopters to connect with AI specialists and data owners to align incentives for innovation and develop AI solutions to precisely defined problems.

“The concept of AI Commons has developed over three editions of the Summit and is now motivating implementation,” said ITU Secretary-General Houlin Zhao in closing remarks to the summit. “AI and data need to be a shared resource if we are serious about scaling AI for good. The community supporting the Summit is creating infrastructure to scale up their collaboration – to convert the principles underlying the Summit into global impact.”…

The AI Commons will provide an open framework for collaboration, a decentralized system to democratize problem solving with AI.

It aims to be a “knowledge space”, says Banifatemi, answering a key question: “How can problem solving with AI become common knowledge?”

“The goal is to be an open initiative, like a Linux effort, like an open-source network, where everyone can participate and we jointly share and we create an abundance of knowledge, knowledge of how we can solve problems with AI,” said Banifatemi.

AI development and application will build on the state of the art, enabling AI solutions to scale with the help of shared datasets, testing and simulation environments, AI models and associated software, and storage and computing resources….(More)”.

How not to conduct a consultation – and why asking the public is not always such a great idea


Agnes Batory & Sara Svensson at Policy and Politics: “Involving people in policy-making is generally a good thing. Policy-makers themselves often pay at least lip-service to the importance of giving citizens a say. In the academic literature, participatory governance has been, with some exaggeration, almost universally hailed as a panacea to all ills in Western democracies. In particular, it is advocated as a way to remedy the alienation of voters from politicians who seem to be oblivious to the concerns of the common man and woman, with an ensuing decline in public trust in government. Representation by political parties is ridden with problems, so the argument goes, and in any case it is overly focused on the act of voting in elections – a one-off event once every few years which limits citizens’ ability to control the policy agenda. On the other hand, various forms of public participation are expected to educate citizens, help develop a civic culture, and boost the legitimacy of decision-making. Consequently, practices to ensure that citizens can provide direct input into policy-making are to be welcomed on both pragmatic and normative grounds.  

I do not disagree with these generally positive expectations. However, the main objective of my recent article in Policy and Politics, co-authored with Sara Svensson, is to inject a dose of healthy scepticism into the debate or, more precisely, to show that there are circumstances in which public consultations will achieve anything but greater legitimacy and better policy-outcomes. We do this partly by discussing the more questionable assumptions in the participatory governance literature, and partly by examining a recent, glaring example of the misuse, and abuse, of popular input….(More)”.

Number of fact-checking outlets surges to 188 in more than 60 countries


Mark Stencel at Poynter: “The number of fact-checking outlets around the world has grown to 188 in more than 60 countries amid global concerns about the spread of misinformation, according to the latest tally by the Duke Reporters’ Lab.

Since the last annual fact-checking census in February 2018, we’ve added 39 more outlets that actively assess claims from politicians and social media, a 26% increase. The new total is also more than four times the 44 fact-checkers we counted when we launched our global database and map in 2014.

Globally, the largest growth came in Asia, which went from 22 to 35 outlets in the past year. Nine of the 27 fact-checking outlets that launched since the start of 2018 were in Asia, including six in India. Latin American fact-checking also saw a growth spurt in that same period, with two new outlets in Costa Rica, and others in Mexico, Panama and Venezuela.

The actual worldwide total is likely much higher than our current tally. That’s because more than a half-dozen of the fact-checkers we’ve added to the database since the start of 2018 began as election-related partnerships that involved the collaboration of multiple organizations. And some of those election partners are discussing ways to continue or reactivate that work — either together or on their own.

Over the past 12 months, five separate multimedia partnerships enlisted more than 60 different fact-checking organizations and other news companies to help debunk claims and verify information for voters in Mexico, Brazil, Sweden, Nigeria and the Philippines. And the Poynter Institute’s International Fact-Checking Network assembled a separate team of 19 media outlets from 13 countries to consolidate and share their reporting during the run-up to last month’s elections for the European Parliament. Our database includes each of these partnerships, along with several others — but not each of the individual partners. And because they were intentionally short-run projects, three of these big partnerships appear among the 74 inactive projects we also document in our database.

Politics isn’t the only driver for fact-checkers. Many outlets in our database are concentrating efforts on viral hoaxes and other forms of online misinformation — often in coordination with the big digital platforms on which that misinformation spreads.

We also continue to see new topic-specific fact-checkers such as Metafact in Australia and Health Feedback in France — both of which launched in 2018 to focus on claims about health and medicine for a worldwide audience….(More)”.

How Organizations with Data and Technology Skills Can Play a Critical Role in the 2020 Census


Blog Post by Kathryn L.S. Pettit and Olivia Arena: “The 2020 Census is less than a year away, and it’s facing new challenges that could result in an inaccurate count. The proposed inclusion of a citizenship question, the lack of comprehensive and unified messaging, and the new internet-response option could worsen the undercount of vulnerable and marginalized communities and deprive these groups of critical resources.

The US Census Bureau aims to count every US resident. But some groups are more likely to be missed than others. Communities of color, immigrants, young children, renters, people experiencing homelessness, and people living in rural areas have long been undercounted in the census. Because the census count is used to apportion federal funding and draw legislative districts for political seats, an inaccurate count means that these populations receive less than their fair share of resources and representation.

Local governments and community-based organizations have begun forming Complete Count Committees, coalitions of trusted community voices established to encourage census responses, to achieve a more accurate count in 2020. Local organizations with data and technology skills—like civic tech groups, libraries, technology training organizations, and data intermediaries—can harness their expertise to help these coalitions achieve a complete count.

As the coordinator of the National Neighborhood Indicators Partnership (NNIP), we are learning about 2020 Census mobilization in communities across the country. We have found that data and technology groups are natural partners in this work; they understand what is at risk in 2020, are embedded in communities as trusted data providers, and can amplify the importance of the census.

Threats to a complete count

The proposed citizenship question, currently being challenged in court, would likely suppress the count of immigrants and households in immigrant communities in the US. Though federal law prohibits the Census Bureau from disclosing individual-level data, even to other agencies, people may still be skeptical about the confidentiality of the data or generally distrust the government. Acknowledging these fears is important for organizations partnering in outreach to vulnerable communities.

Another potential hurdle is that, for the first time, the Census Bureau will encourage people to complete their census forms online (though answering by mail or phone will still be options). Though a high-tech census could be more cost-effective, the digital divide, compounded by the underfunding of the Census Bureau that limited initial testing of new methods and outreach, could worsen the undercount….(More)”.

Open government and citizen engagement: From theory to action


Camilo Romero Galeano at apolitical: “…According to the 2016 Corruption Perception Index, which analysed the behaviour of 178 countries, 69% of the countries evaluated again raised the alarm about what has been referred to as “the cancer of the public service”.

The scandals of misappropriation of public funds, illicit enrichment of public officials, the slippery labyrinths of procurement and all kinds of practices that challenge ethics in the public service are daily news around the world.

Colombia and the department of Nariño suffer from the same problems. Bad practices of traditional politics and chiefdoms have ended up destroying the trust that citizens once had in political institutions. Corruption and its devastating effects always end up undermining people’s dignity.

With this as the current state of affairs, and in our capacity as a subnational government, we have designed, hand in hand with the citizens of Nariño, a new government program. It is based on an approach to innovation called “New Government” that relies on three pillars: open government; social innovation; and collaborative economy.

The new program has been endorsed by more than 300,000 voters and subsequently concretised in our roadmap for the territory: “Nariño heart of the World”. The creation of this policy document brought together 31,700 participants and involved travelling around the 13 subregions that compose the 64 municipalities in Nariño.

In this way, citizen participation has become an essential tool in the fight against corruption.

Our open government strategy is called GANA — Gobierno Abierto de Nariño (in English, “Win — Open Government of Nariño”). The strategy takes a step forward in ensuring cabinet officials become transparent and publicly declare private assets. Citizens can now find out the financial conditions in which public officials begin and finish their administrative periods. Each one of us….(More)”