Who wants to know?: The Political Economy of Statistical Capacity in Latin America


IADB paper by Dargent, Eduardo; Lotta, Gabriela; Mejía-Guerra, José Antonio; Moncada, Gilberto: “Why is there such heterogeneity in the level of technical and institutional capacity in national statistical offices (NSOs)? Although there is broad consensus about the importance of statistical information as an essential input for decision making in the public and private sectors, this does not generally translate into a recognition of the importance of the institutions responsible for the production of data. In the context of the role of NSOs in government and society, this study seeks to explain the variation in regional statistical capacity by comparing historical processes and political economy factors in 10 Latin American countries. To do so, it proposes a new theoretical and methodological framework and offers recommendations to strengthen the institutionality of NSOs….(More)”.

Research Shows Political Acumen, Not Just Analytical Skills, is Key to Evidence-Informed Policymaking


Press Release: “Results for Development (R4D) has released a new study unpacking how evidence translators play a key and somewhat surprising role in ensuring policymakers have the evidence they need to make informed decisions. Translators — who can be evidence producers, policymakers, or intermediaries such as journalists, advocates and expert advisors — identify, filter, interpret, adapt, contextualize and communicate data and evidence for the purposes of policymaking.

The study, Translators’ Role in Evidence-Informed Policymaking, provides a better understanding of who translators are and how different factors influence translators’ ability to promote the use of evidence in policymaking. This research shows translation is an essential function and that, absent individuals or organizations taking up the translator role, evidence translation and evidence-informed policymaking often do not take place.

“We began this research assuming that translators’ technical skills and analytical prowess would prove to be among the most important factors in predicting when and how evidence made its way into public sector decision making,” Nathaniel Heller, executive vice president for integrated strategies at Results for Development, said. “Surprisingly, that turned out not to be the case, and other ‘soft’ skills play a far larger role in translators’ efficacy than we had imagined.”

Key findings include:

  • Translator credibility and reputation are crucial to the ability to gain access to policymakers and to promote the uptake of evidence.
  • Political savvy and stakeholder engagement are among the most critical skills for effective translators.
  • Conversely, analytical skills and the ability to adapt, transform and communicate evidence were identified as being less important stand-alone translator skills.
  • Evidence translation is most effective when initiated by those in power or when translators place those in power at the center of their efforts.

The study includes a definitional and theoretical framework as well as a set of research questions about key enabling and constraining factors that might affect evidence translators’ influence. It also focuses on two cases in Ghana and Argentina to validate and debunk some of the intellectual frameworks around policy translators that R4D and others in the field have already developed. The first case focuses on Ghana’s blue-ribbon commission formed by the country’s president in 2015, which was tasked with reviewing Ghana’s national health insurance scheme. The second case looks at Buenos Aires’ 2016 government-led review of the city’s right to information regime….(More)”.

Ontario is trying a wild experiment: Opening access to its residents’ health data


Dave Gershgorn at Quartz: “The world’s most powerful technology companies have a vision for the future of healthcare. You’ll still go to your doctor’s office, sit in a waiting room, and explain your problem to someone in a white coat. But instead of relying solely on their own experience and knowledge, your doctor will consult an algorithm that’s been trained on the symptoms, diagnoses, and outcomes of millions of other patients. Instead of a radiologist reading your x-ray, a computer will be able to detect minute differences and instantly identify a tumor or lesion. Or at least that’s the goal.

AI systems like these, currently under development by companies including Google and IBM, can’t read textbooks and journals, attend lectures, and do rounds—they need millions of real-life examples to understand all the different variations between one patient and another. In general, AI is only as good as the data it’s trained on, but medical data is exceedingly private—most developed countries have strict health data protection laws, such as HIPAA in the United States….

These approaches, which favor companies with considerable resources, are pretty much the only way to get large troves of health data in the US because the American health system is so disparate. Healthcare providers keep personal files on each of their patients, and can only transmit them to other accredited healthcare workers at the patient’s request. There’s no single place where all health data exists. It’s more secure, but less efficient for analysis and research.

Ontario, Canada, might have a solution, thanks to its single-payer healthcare system. All of Ontario’s health data exists in a few enormous caches under government control. (After all, the government needs to keep track of all the bills it’s paying.) Similar structures exist elsewhere in Canada, such as Quebec, but Toronto, which has become a major hub for AI research, wants to lead the charge in providing this data to businesses.

Until now, the only people allowed to study this data were government organizations or researchers who partnered with the government to study disease. But Ontario has now entrusted the MaRS Discovery District—a cross between a tech incubator and WeWork—to build a platform, dubbed Project Spark, for approved companies and researchers to access this data. The project, initiated by MaRS and Canada’s University Health Network, began exploring how to share this data after both organizations expressed interest to the government in giving broader health data access to researchers and companies looking to build healthcare-related tools.

Project Spark’s goal is to create an API, or a way for developers to request information from the government’s data cache. This could be used to create an app for doctors to access the full medical history of a new patient. Ontarians could access their health records at any time through similar software, and catalog health issues as they occur. Or researchers, like the ones trying to build AI to assist doctors, could request a different level of access that provides anonymized data on Ontarians who meet certain criteria. If you wanted to study every Ontarian who had Alzheimer’s disease over the last 40 years, that data would only be authorization and a few lines of code away.
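
The excerpt describes this access model only at a high level. As a rough illustration, a researcher-facing query against such an API might look like the sketch below; the endpoint, parameters and token handling are assumptions for illustration, since Project Spark’s actual interface is not documented here.

```python
# Hypothetical sketch of a researcher query against a platform like Project Spark.
# The base URL, endpoint, parameters and token are illustrative assumptions;
# the real platform's interface is not documented in the excerpt above.
import requests

BASE_URL = "https://api.example-health-platform.ca/v1"   # placeholder, not a real endpoint
ACCESS_TOKEN = "researcher-access-token"                  # issued only after approval and vetting

def fetch_anonymized_cohort(condition, years_back):
    """Request de-identified records for residents matching a condition."""
    response = requests.get(
        f"{BASE_URL}/cohorts",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        params={"condition": condition, "years_back": years_back, "anonymized": "true"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["records"]

# e.g. every de-identified record coded for Alzheimer's disease over the last 40 years
records = fetch_anonymized_cohort("alzheimers-disease", years_back=40)
print(f"Retrieved {len(records)} de-identified records")
```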

There are currently 100 companies lined up to get access to data, comprised of health records from Ontario’s 14 million residents. (MaRS won’t say who the companies are). …(More)”

Big Data and AI – A transformational shift for government: So, what next for research?


Irina Pencheva, Marc Esteve and Slava Jankin Mikhaylov in Public Policy and Administration: “Big Data and artificial intelligence will have a profound transformational impact on governments around the world. Thus, it is important for scholars to provide a useful analysis on the topic to public managers and policymakers. This study offers an in-depth review of the Policy and Administration literature on the role of Big Data and advanced analytics in the public sector. It provides an overview of the key themes in the research field, namely the application and benefits of Big Data throughout the policy process, and challenges to its adoption and the resulting implications for the public sector. It is argued that research on the subject is still nascent and more should be done to ensure that the theory adds real value to practitioners. A critical assessment of the strengths and limitations of the existing literature is developed, and a future research agenda to address these gaps and enrich our understanding of the topic is proposed…(More)”.

Data Protection and e-Privacy: From Spam and Cookies to Big Data, Machine Learning and Profiling


Chapter by Lilian Edwards in L Edwards (ed), Law, Policy and the Internet (Hart, 2018): “In this chapter, I examine in detail how data subjects are tracked, profiled and targeted by their activities online and, increasingly, in the “offline” world as well. Tracking is part of both commercial and state surveillance, but in this chapter I concentrate on the former. The European law relating to spam, cookies, online behavioural advertising (OBA), machine learning (ML) and the Internet of Things (IoT) is examined in detail, using both the GDPR and the forthcoming draft ePrivacy Regulation. The chapter concludes by examining both code and law solutions which might find a way forward to protect user privacy and still enable innovation, by looking to paradigms not based around consent, and less likely to rely on a “transparency fallacy”. Particular attention is drawn to the new work around Personal Data Containers (PDCs) and distributed ML analytics….(More)”.

Why Do We Care So Much About Privacy?


Louis Menand in The New Yorker: “…Possibly the discussion is using the wrong vocabulary. “Privacy” is an odd name for the good that is being threatened by commercial exploitation and state surveillance. Privacy implies “It’s nobody’s business,” and that is not really what Roe v. Wade is about, or what the E.U. regulations are about, or even what Katz and Carpenter are about. The real issue is the one that Pollak and Martin, in their suit against the District of Columbia in the Muzak case, said it was: liberty. This means the freedom to choose what to do with your body, or who can see your personal information, or who can monitor your movements and record your calls—who gets to surveil your life and on what grounds.

As we are learning, the danger of data collection by online companies is not that they will use it to try to sell you stuff. The danger is that that information can so easily fall into the hands of parties whose motives are much less benign. A government, for example. A typical reaction to worries about the police listening to your phone conversations is the one Gary Hart had when it was suggested that reporters might tail him to see if he was having affairs: “You’d be bored.” They were not, as it turned out. We all may underestimate our susceptibility to persecution. “We were just talking about hardwood floors!” we say. But authorities who feel emboldened by the promise of a Presidential pardon or by a Justice Department that looks the other way may feel less inhibited about invading the spaces of people who belong to groups that the government has singled out as unpatriotic or undesirable. And we now have a government that does that….(More)”.

Data Stewards: Data Leadership to Address 21st Century Challenges


Post by Stefaan Verhulst: “…Over the last two years, we have focused on the opportunities (and challenges) surrounding what we call “data collaboratives.” Data collaboratives are an emerging form of public-private partnership, in which information held by companies (or other entities) is shared with the public sector, civil society groups, research institutes and international organizations. …

For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized data stewardship function within corporations with a mandate to explore ways to harness the potential of their data towards positive public ends.

Today, each attempt to establish a cross-sector partnership built on the analysis of private-sector data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.

As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.

By establishing data stewardship as a corporate function, recognized and trusted within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked….

To take stock of current practice and to scope needs and opportunities, we held a small yet in-depth kick-off event at the offices of the Cloudera Foundation in San Francisco on May 8, 2018, attended by representatives from LinkedIn, Facebook, Uber, Mastercard, DigitalGlobe, Cognizant, Streetlight Data, the World Economic Forum, and NetHope — among others.

Four Key Takeaways

The discussions were varied and wide-ranging.

Several reflected on the risks involved — including the risks of NOT sharing or collaborating on privately held data that could improve people’s lives (and on some occasions save lives).

Others warned that the window of opportunity to increase the practice of data collaboratives may be closing — given new regulatory requirements and other barriers that may disincentivize corporations from engaging with third parties around their data.

Ultimately, four key takeaways emerged. These areas — at the nexus of opportunities and challenges — are worth considering further, because they help us better understand both the potential and limitations of data collaboratives….(More)”

Latin America is fighting corruption by opening up government data


Anoush Darabi in apolitical: “Hardly a country in Latin America has been untouched by corruption scandals; this was just one of the more bizarre episodes. In response, using a variety of open online platforms, both city and national governments are working to lift the lid on government activity, finding new ways to tackle corruption with technology….

In Buenos Aires, the government is dealing with the problem by making the details of all its public works projects completely transparent. Its online platform, BA Obras, maps projects across the city and lists detailed information on their cost, progress towards completion and the names of the contractors.

“We allocate an enormous amount of money,” said Alvaro Herrero, Under Secretary for Strategic Management and Institutional Quality for the government of Buenos Aires, who helped to build the tool. “We need to be accountable to citizens in terms of what are we doing with that money.”

The portal is designed to be accessible to the average user. Citizens can filter the map to focus on their neighbourhood, revealing information on existing projects with the click of a mouse.

“A journalist called our communications team a couple of weeks ago,” said Herrero. “He said: ‘I want all the information on all the infrastructure projects that the government has, and I want the documentation.’ Our guy’s answer was, ‘OK, I will send you all the information in ten seconds.’ All he had to do was send a link to the platform.”

Since launching in October 2017 with 80 public works projects, the platform now features over 850. It has had 75,000 unique views, the majority coming in the month after launching.

Making people aware and encouraging them to use it is key. “The main challenge is not the platform itself, but getting residents to use it,” said Herrero. “We’re still in that process.”

Brazil’s public spending checkers

Brazil is using big data analysis to scrutinise its spending via its Public Expenditure Observatory (ODP).

The ODP was founded in 2008 to help monitor spending across government departments systematically. In a country as large as Brazil, spending data is hard to pull together, and its sheer volume makes it difficult to analyse. The ODP pulls disparate information from government databases across the country into a central location, puts it into a consistent format and analyses it for inconsistencies. Alongside this analysis, the ODP also makes the data public.
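
As a rough illustration of that pipeline, the sketch below normalises records from different source systems into a single schema and flags purchases that are far out of line with the norm. The field names and the z-score rule are assumptions for illustration, not the ODP’s actual method.

```python
# Illustrative sketch only: normalise spending records from different source systems
# into one schema, then flag purchases far above the typical amount.
# Field names and the z-score rule are assumptions, not the ODP's actual method.
import statistics

def normalise(record, source):
    """Map a source-specific record into a common schema."""
    if source == "credit_card":
        return {"office": record["orgao"], "amount": float(record["valor"]),
                "category": "card_purchase"}
    if source == "procurement":
        return {"office": record["agency"], "amount": float(record["value"]),
                "category": "contract"}
    raise ValueError(f"unknown source: {source}")

def flag_suspicious(transactions, threshold=3.0):
    """Return transactions more than `threshold` standard deviations above the mean amount."""
    amounts = [t["amount"] for t in transactions]
    mean, stdev = statistics.mean(amounts), statistics.pstdev(amounts)
    return [t for t in transactions if stdev and (t["amount"] - mean) / stdev > threshold]

raw = [{"orgao": "Ministry A", "valor": "120.00"},
       {"orgao": "Ministry A", "valor": "95.00"},
       {"orgao": "Ministry B", "valor": "98000.00"}]
normalised = [normalise(r, "credit_card") for r in raw]
print(flag_suspicious(normalised, threshold=1.0))   # only the 98,000 purchase is flagged
```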

For example, in 2010 the ODP analysed expenses made on credit cards by federal government officers. They discovered that 11% of all transactions that year were suspicious, requiring further investigation. After the data was published, credit card expenditure dropped by 25%….(More)”.

Data Ethics Framework


Introduction by Matt Hancock MP, Secretary of State for Digital, Culture, Media and Sport to the UK’s Data Ethics Framework: “Making better use of data offers huge benefits, in helping us provide the best possible services to the people we serve.

However, all new opportunities present new challenges. The pace of technology is changing so fast that we need to make sure we are constantly adapting our codes and standards. Those of us in the public sector need to lead the way.

As we set out to develop our National Data Strategy, getting the ethics right, particularly in the delivery of public services, is critical. To do this, it is essential that we agree collective standards and ethical frameworks.

Ethics and innovation are not mutually exclusive. Thinking carefully about how we use our data can help us be better at innovating when we use it.

Our new Data Ethics Framework sets out clear principles for how data should be used in the public sector. It will help us maximise the value of data whilst also setting the highest standards for transparency and accountability when building or buying new data technology.

We have come a long way since we published the first version of the Data Science Ethical Framework. This new version focuses on the need for technology, policy and operational specialists to work together, so we can make the most of expertise from across disciplines.

We want to work with others to develop transparent standards for using new technology in the public sector, promoting innovation in a safe and ethical way.

This framework will build the confidence in public sector data use needed to underpin a strong digital economy. I am looking forward to working with all of you to put it into practice…. (More)”

The Data Ethics Framework principles

1. Start with clear user need and public benefit
2. Be aware of relevant legislation and codes of practice
3. Use data that is proportionate to the user need
4. Understand the limitations of the data
5. Ensure robust practices and work within your skillset
6. Make your work transparent and be accountable
7. Embed data use responsibly

The Data Ethics Workbook

I want your (anonymized) social media data


Anthony Sanford at The Conversation: “Social media sites’ responses to the Facebook-Cambridge Analytica scandal and new European privacy regulations have given users much more control over who can access their data, and for what purposes. To me, as a social media user, these are positive developments: It’s scary to think what these platforms could do with the troves of data available about me. But as a researcher, I worry about increased restrictions on data sharing.

I am among the many scholars who depend on data from social media to gain insights into people’s actions. In a rush to protect individuals’ privacy, I worry that an unintended casualty could be knowledge about human nature. My most recent work, for example, analyzes feelings people express on Twitter to explain why the stock market fluctuates so much over the course of a single day. There are applications well beyond finance. Other scholars have studied mass transit rider satisfaction, emergency alert systems’ function during natural disasters and how online interactions influence people’s desire to lead healthy lifestyles.

This poses a dilemma – not just for me personally, but for society as a whole. Most people don’t want social media platforms to share or sell their personal information, unless specifically authorized by the individual user. But as members of a collective society, it’s useful to understand the social forces at work influencing everyday life and long-term trends. Before the recent crises, Facebook and other companies had already been making it hard for legitimate researchers to use their data, including by making it more difficult and more expensive to download and access data for analysis. The renewed public pressure for privacy means it’s likely to get even tougher….

It’s true – and concerning – that some presumably unethical people have tried to use social media data for their own benefit. But the data are not the actual problem, and cutting researchers’ access to data is not the solution. Doing so would also deprive society of the benefits of social media analysis.

Fortunately, there is a way to resolve this dilemma. Anonymization of data can keep people’s individual privacy intact, while giving researchers access to collective data that can yield important insights.

There’s even a strong model for how to strike that balance efficiently: the U.S. Census Bureau. For decades, that government agency has collected extremely personal data from households all across the country: ages, employment status, income levels, Social Security numbers and political affiliations. The results it publishes are very rich, but also not traceable to any individual.

It often is technically possible to reverse anonymity protections on data, using multiple pieces of anonymized information to identify the person they all relate to. The Census Bureau takes steps to prevent this.

For instance, when members of the public access census data, the Census Bureau restricts information that is likely to identify specific individuals, such as reporting that there is just one person in a community with a particularly high or low income level.
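
A minimal sketch of that kind of small-cell suppression is below; the threshold of five is a common convention assumed here for illustration, not the Census Bureau’s actual disclosure-avoidance rule.

```python
# Illustrative sketch of small-cell suppression: before publishing a tabulation,
# mask any cell whose count is small enough to point at an individual.
# The threshold of 5 is an assumed convention, not the Bureau's actual rule.
from collections import Counter

def tabulate_with_suppression(rows, key, min_cell_size=5):
    """Count rows by `key`, masking any group smaller than `min_cell_size`."""
    counts = Counter(row[key] for row in rows)
    return {group: (n if n >= min_cell_size else "suppressed")
            for group, n in counts.items()}

# A single very-high-income household would be masked rather than reported:
rows = [{"income_band": "very high"}] + [{"income_band": "middle"}] * 120
print(tabulate_with_suppression(rows, key="income_band"))
# -> {'very high': 'suppressed', 'middle': 120}
```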

For researchers the process is somewhat different, but provides significant protections both in law and in practice. Scholars have to pass the Census Bureau’s vetting process to make sure they are legitimate, and must undergo training about what they can and cannot do with the data. The penalties for violating the rules include not only being barred from using census data in the future, but also civil fines and even criminal prosecution.

Even then, what researchers get comes without a name or Social Security number. Instead, the Census Bureau uses what it calls “protected identification keys,” a random number that replaces data that would allow researchers to identify individuals.

Each person’s data is labeled with his or her own identification key, allowing researchers to link information of different types. For instance, a researcher wanting to track how long it takes people to complete a college degree could follow individuals’ education levels over time, thanks to the identification keys.
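
The sketch below illustrates the general idea of such protected identification keys: direct identifiers are swapped for random keys, and the same key lets a researcher link a person’s records across datasets without ever seeing the identifier itself. The implementation details are assumptions for illustration, not the Bureau’s actual system.

```python
# Illustrative sketch of the "protected identification key" idea: replace direct
# identifiers with random keys, and use the key (never the identifier) to link a
# person's records across datasets. Details are assumptions, not the Bureau's system.
import secrets

def assign_keys(identifiers):
    """Map each real identifier (e.g. an SSN) to a random protected key."""
    return {identifier: secrets.token_hex(8) for identifier in identifiers}

def pseudonymize(records, key_map, id_field="ssn"):
    """Strip the direct identifier from each record, keeping only the protected key."""
    return [
        {"pid": key_map[r[id_field]], **{k: v for k, v in r.items() if k != id_field}}
        for r in records
    ]

key_map = assign_keys(["123-45-6789"])   # kept private, never released to researchers
enrolment_2010 = pseudonymize([{"ssn": "123-45-6789", "year": 2010, "enrolled": True}], key_map)
degrees_2016 = pseudonymize([{"ssn": "123-45-6789", "year": 2016, "degree": "BA"}], key_map)
# The same "pid" appears in both datasets, so a researcher can follow the person
# over time without ever seeing the underlying SSN.
```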

Social media platforms could implement a similar anonymization process instead of increasing hurdles – and cost – to access their data…(More)”.