Harnessing the Data Revolution to Achieve the Sustainable Development Goals


Erol Yayboke et al. at CSIS: “Functioning societies collect accurate data and utilize the evidence to inform policy. The use of evidence derived from data in policymaking requires the capability to collect and analyze accurate data, clear administrative channels through which timely evidence is made available to decisionmakers, and the political will to rely on—and ideally share—the evidence. The collection of accurate and timely data, especially in the developing world, is often logistically difficult, not politically expedient, and/or expensive.

Before launching its second round of global goals—the Sustainable Development Goals (SDGs)—the United Nations convened a High-Level Panel of Eminent Persons on the Post-2015 Development Agenda. As part of its final report, the Panel called for a “data revolution” and recommended the formation of an independent body to lead the charge. The report resulted in the creation of the Global Partnership for Sustainable Development Data (GPSDD)—an independent group of countries, companies, data communities, and NGOs—and the SDG Data Labs, a private initiative partnered with the GPSDD. In doing so, the United Nations and its partners signaled broad interest in data and evidence-based policymaking at a high level. In fact, the GPSDD calls for the “revolution in data” by addressing the “crisis of non-existent, inaccessible or unreliable data.” As this report shows, this is easier said than done.

This report defines the data revolution as an unprecedented increase in the volume and types of data—and the subsequent demand for them—thanks to the ongoing yet uneven proliferation of new technologies. This revolution is allowing governments, companies, researchers, and citizens to monitor progress and drive action, often with real-time, dynamic, disaggregated data. Much work will be needed to make sure the data revolution reaches developing countries facing difficult challenges (i.e., before the data revolution fully becomes the data revolution for sustainable development). It is important to think of the revolution as a multistep process, beginning with building basic knowledge and awareness of the value of data. This is followed by a more specific focus on public-private partnerships, opportunities, and constraints regarding the collection and utilization of data for evidence-based policy decisions….

This report provides the following recommendations to the international community to play a constructive role in the data revolution:

  • Don’t fixate on big data alone. Focus on the foundation necessary to facilitate leapfrogs around all types of data: small, big, and everywhere in between.
  • Increase funding for capacity building as part of an expansion of broader educational development priorities.
  • Highlight, share, and support enlightened government-driven approaches to data.
  • Increase funding for the data revolution and coordinate donor efforts.
  • Coordinate UN data revolution-related activities closely with an expanded GPSDD.
  • Secure consensus on data sharing, ownership, and privacy-related international standards….(More)”.

Owned: Property, Privacy, and the New Digital Serfdom


Book by Joshua A. T. Fairfield: “… explains the crisis of digital ownership – how and why we no longer control our smartphones or software-enabled devices, which are effectively owned by software and content companies. In two years we will not own our ‘smart’ televisions, which will also be used by advertisers to listen in to our living rooms. In the coming decade, if we do not take back our ownership rights, the same will be said of our self-driving cars and software-enabled homes. We risk becoming digital peasants, owned by software and advertising companies, not to mention overreaching governments. Owned should be read by anyone wanting to know more about the loss of our property rights, the implications for our privacy rights, and how we can regain control of both….(More)”.

The Use of Big Data Analytics by the IRS: Efficient Solutions or the End of Privacy as We Know It?


Kimberly A. Houser and Debra Sanders in the Vanderbilt Journal of Entertainment and Technology Law: “This Article examines the privacy issues resulting from the IRS’s big data analytics program as well as the potential violations of federal law. Although historically the IRS chose tax returns to audit based on internal mathematical mistakes or mismatches with third-party reports (such as W-2s), the IRS is now engaging in data mining of public and commercial data pools (including social media) and creating highly detailed profiles of taxpayers upon which to run data analytics. This Article argues that current IRS practices, mostly unknown to the general public, are violating fair information practices. This lack of transparency and accountability not only violates federal law regarding the government’s data collection activities and use of predictive algorithms, but may also result in discrimination. While the potential efficiencies that big data analytics provides may appear to be a panacea for the IRS’s budget woes, these activities, left unchecked, are a significant threat to privacy. Other concerns regarding the IRS’s entrée into big data are raised, including the potential for political targeting, data breaches, and the misuse of such information. This Article intends to bring attention to these privacy concerns and contribute to the academic and policy discussions about the risks presented by the IRS’s data collection, mining, and analytics activities….(More)”.

What does it mean to be differentially private?


Paul Francis at IAPP: “Back in June 2016, Apple announced it would use differential privacy to protect individual privacy for certain data that it collects. Though differential privacy had already been a hot research topic for over a decade, this announcement introduced it to the broader public. Before that announcement, Google had already been using differential privacy for collecting Chrome usage statistics. And within the last month, Uber announced that it, too, is using differential privacy.

If you’ve done a little homework on differential privacy, you may have learned that it provides provable guarantees of privacy and concluded that a database that is differentially private is, well, private — in other words, that it protects individual privacy. But that isn’t necessarily the case. When someone says, “a database is differentially private,” they don’t mean that the database is private. Rather, they mean, “the privacy of the database can be measured.”

Really, it is like saying that “a bridge is weight limited.” If you know the weight limit of a bridge, then yes, you can use the bridge safely. But the bridge isn’t safe under all conditions. You can exceed the weight limit and hurt yourself.

The weight limit of bridges is expressed in tons, kilograms or number of people. Simplifying here a bit, the amount of privacy afforded by a differentially private database is expressed as a number, by convention labeled ε (epsilon). Lower ε means more private.

All bridges have a weight limit. Everybody knows this, so it sounds dumb to say, “a bridge is weight limited.” And guess what? All databases are differentially private. Or, more precisely, all databases have an ε. A database with no privacy protections at all has an ε of infinity. It is pretty misleading to call such a database differentially private, but mathematically speaking, it is not incorrect to do so. A database that can’t be queried at all has an ε of zero. Private, but useless.
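To make the ε scale concrete, here is a minimal sketch of the classic Laplace mechanism (the toy data and function names below are invented for illustration, not drawn from the article): a counting query has sensitivity 1, meaning one person's presence changes the true answer by at most 1, so adding Laplace noise with scale 1/ε yields an ε-differentially-private answer.

```python
import math
import numpy as np

def private_count(records, predicate, epsilon: float) -> float:
    """Answer a counting query with epsilon-differential privacy.

    A count has sensitivity 1 (adding or removing one person changes
    the true answer by at most 1), so Laplace noise with scale
    1/epsilon suffices for an epsilon-DP answer.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Invented toy data: ages of eight individuals. True count of ages > 40 is 4.
ages = [34, 29, 41, 57, 23, 38, 62, 45]
for eps in (0.01, 0.1, math.log(3)):  # smaller epsilon -> more noise -> more privacy
    noisy = private_count(ages, lambda a: a > 40, eps)
    print(f"epsilon={eps:.2f}  noisy count: {noisy:.1f}")
```

At ε = 0.01 the noise scale is 100 and the answer is effectively useless; at ε = ln 3 ≈ 1.1 it typically lands within a few counts of the truth. As with the bridge, the limit itself says nothing about safety; it tells you how much load you are choosing to put on it.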

In their paper on differential privacy for statistics, Cynthia Dwork and Adam Smith write, “The choice of ε is essentially a social question. We tend to think of ε as, say, 0.01, 0.1, or in some cases, ln 2 or ln 3.” The natural logarithm of 3 (ln 3) is around 1.1….(More)”.

Artificial Intelligence for Citizen Services and Government


Paper by Hila Mehr: “From online services like Netflix and Facebook, to chatbots on our phones and in our homes like Siri and Alexa, we are beginning to interact with artificial intelligence (AI) on a near daily basis. AI is the programming or training of a computer to do tasks typically reserved for human intelligence, whether it is recommending which movie to watch next or answering technical questions. Soon, AI will permeate the ways we interact with our government, too. From small cities in the US to countries like Japan, government agencies are looking to AI to improve citizen services.

While the potential future use cases of AI in government remain bounded by government resources and the limits of both human creativity and trust in government, the most obvious and immediately beneficial opportunities are those where AI can reduce administrative burdens, help resolve resource allocation problems, and take on significantly complex tasks. Many AI case studies in citizen services today fall into five categories: answering questions, filling out and searching documents, routing requests, translation, and drafting documents. These applications could make government work more efficient while freeing up time for employees to build better relationships with citizens. With citizen satisfaction with digital government offerings leaving much to be desired, AI may be one way to bridge the gap while improving citizen engagement and service delivery.
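Of those categories, routing requests is the most straightforward to sketch in code. The toy example below (hypothetical department names and keywords, not drawn from the paper) shows the idea in its simplest keyword-matching form; real deployments would use trained language models rather than hand-written lists:

```python
# Hypothetical departments and trigger keywords, for illustration only.
ROUTES: dict[str, list[str]] = {
    "permits":    ["permit", "license", "zoning"],
    "sanitation": ["trash", "garbage", "recycling", "pickup"],
    "transit":    ["bus", "subway", "schedule", "fare"],
}

def route_request(message: str) -> str:
    """Route a citizen inquiry to a department by simple keyword match."""
    text = message.lower()
    for department, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return department
    return "general"  # no match: fall back to a human-staffed queue

print(route_request("When is recycling pickup on my street?"))  # -> sanitation
```

Even a rule-based router like this frees staff time for the relationship-building the paper emphasizes; the AI contribution is replacing the keyword lists with models that handle phrasings the rules would miss.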

Despite the clear opportunities, AI will not solve systemic problems in government, and could potentially exacerbate issues around service delivery, privacy, and ethics if not implemented thoughtfully and strategically. Agencies interested in implementing AI can learn from previous government transformation efforts, as well as private-sector implementations of AI. Government offices should consider these six strategies for applying AI to their work: make AI a part of a goals-based, citizen-centric program; get citizen input; build upon existing resources; be data-prepared and tread carefully with privacy; mitigate ethical risks and avoid AI decision making; and augment employees, do not replace them.

This paper explores the various types of AI applications, and current and future uses of AI in government delivery of citizen services, with a focus on citizen inquiries and information. It also offers strategies for governments as they consider implementing AI….(More)”

Mastercard’s Big Data For Good Initiative: Data Philanthropy On The Front Lines


Interview by Randy Bean of Shamina Singh: “Much has been written about big data initiatives and the efforts of market leaders to derive critical business insights faster. Less has been written about initiatives by some of these same firms to apply big data and analytics to a different set of issues, which are not solely focused on revenue growth or bottom-line profitability. While the focus of most writing has been on the use of data for competitive advantage, a small set of companies has been undertaking, with much less fanfare, a range of initiatives designed to ensure that data can be applied not just for corporate good, but also for social good.

One such firm is Mastercard, which describes itself as a technology company in the payments industry that connects buyers and sellers in 210 countries and territories across the globe. In 2013, Mastercard launched the Mastercard Center for Inclusive Growth, which operates as an independent subsidiary of Mastercard and is focused on the application of data to a range of issues for social benefit….

In testimony before the Senate Committee on Foreign Affairs on May 4, 2017, Mastercard Vice Chairman Walt Macnee, who serves as the Chairman of the Center for Inclusive Growth, addressed issues of private sector engagement. Macnee noted, “The private sector and public sector can each serve as a force for good independently; however when the public and private sectors work together, they unlock the potential to achieve even more.” Macnee further commented, “We will continue to leverage our technology, data, and know-how in an effort to solve many of the world’s most pressing problems. It is the right thing to do, and it is also good for business.”…

Central to the mission of the Mastercard Center is the notion of “data philanthropy.” This term encompasses notions of data collaboration and data sharing and is at the heart of the initiatives the Center is undertaking. The three cornerstones of the Center’s mandate are:

  • Sharing Data Insights – This is achieved through the concept of “data grants,” which entails granting access to proprietary insights in support of social initiatives in a way that fully protects consumer privacy.
  • Data Knowledge – The Mastercard Center undertakes collaborations with not-for-profit and governmental organizations on a range of initiatives. One such effort was in collaboration with the Obama White House’s Data-Driven Justice Initiative, by which data was used to help advance criminal justice reform. This initiative was then able, through the use of insights provided by Mastercard, to demonstrate the impact crime has on merchant locations and local job opportunities in Baltimore.
  • Leveraging Expertise – Similarly, the Mastercard Center has collaborated with private organizations such as DataKind, which undertakes data science initiatives for social good.

Just this past month, the Mastercard Center released initial findings from its Data Exploration: Neighborhood Crime and Local Business initiative. This effort was focused on ways in which Mastercard’s proprietary insights could be combined with public data on commercial robberies to help understand the potential relationships between criminal activity and business closings. A preliminary analysis showed a spike in commercial robberies followed by an increase in bar and nightclub closings. These analyses help community and business leaders understand factors that can impact business success.

Late last year, Ms. Singh issued A Call to Action on Data Philanthropy, in which she challenges her industry peers to look at ways in which they can make a difference — “I urge colleagues at other companies to review their data assets to see how they may be leveraged for the benefit of society.” She concludes, “the sheer abundance of data available today offers an unprecedented opportunity to transform the world for good.”….(More)

The Nudging Divide in the Digital Big Data Era


Julia M. Puaschunder in the International Robotics & Automation Journal: “Since the end of the 1970s, a wide range of psychological, economic, and sociological laboratory and field experiments has shown human beings deviating from rational choice, and standard neo-classical profit-maximization axioms failing to explain how humans actually behave. Behavioral economists proposed to nudge and wink citizens toward better choices, with many different applications. While the motivation behind nudging appears to be a noble endeavor to better peoples’ lives around the world in very many different applications, the nudging approach raises questions of social hierarchy and class division. The motivating force of the nudgital society may open a gate to exploitation of the populace and – based on privacy infringements – strip them involuntarily of their own decision power, in the shadow of legally permitted libertarian paternalism and under the cloak of the noble goal of welfare-improving global governance. Nudging enables nudgers to plunder the simple, uneducated citizen, who is neither aware of the nudging strategies nor able to see through the tactics used by the nudgers.

The nudgers are thereby legally protected by the democratically assigned positions they hold or by the outsourcing strategies they use, in which social media plays a crucial role. Social media forces are captured as unfolding a class-dividing nudgital society, in which the provider of social communication tools can reap surplus value from the information shared by social media users. The social media provider thereby becomes a capitalist-industrialist, who benefits from the information shared by social media users, or so-called consumer-workers, who share private information in their wish to interact with friends and communicate with the public. The social media capitalist-industrialist reaps surplus value from the social media consumer-workers’ information sharing, which stems from nudging social media users. For one, social media space can be sold to marketers, who can constantly penetrate the consumer-worker in a subliminal way with advertisements. Nudging also occurs as the big data compiled about the social media consumer-worker can be resold to marketers and technocrats to draw inferences about consumer choices, contemporary market trends, or individual personality cues used for governance control, such as border protection and tax compliance purposes.

The law of motion of the nudging society holds an unequal concentration of power among those who have access to compiled data and who abuse their position under the cloak of hidden persuasion and in the shadow of paternalism. In the nudgital society, information, education, and differing social classes determine who the nudgers are and who the nudged are. Humans end up in different silos or bubbles that differ in who has power and control and who is deceived and ruled. The owners of the means of governance are able to reap surplus value through hidden persuasion, protected by the legal vacuum around curbing libertarian paternalism, in the moral shadow of unnoticeable guidance and under the cloak of the presumption that some know what is more rational than others. All these features lead to an unprecedented contemporary class struggle between the nudgers (those who nudge) and the nudged (those who are nudged), who are divided by the implicit means of governance in the digital scenery. In this light, governing our common welfare through deceptive means and outsourced governance on social media appears critical. In combination with the underlying assumption of the nudgers knowing better what is right, just, and fair within society, the digital age and social media tools hold potentially unprecedented ethical challenges….(More)”

Let the People Know the Facts: Can Government Information Removed from the Internet Be Reclaimed?


Paper by Susan Nevelow Mart: “…examines the legal bases of the public’s right to access government information, reviews the types of information that have recently been removed from the Internet, and analyzes the rationales given for the removals. She suggests that the concerted use of the Freedom of Information Act by public interest groups and their constituents is a possible method of returning the information to the Internet….(More)”.

The hidden costs of open data


Sara Friedman at GCN: “As more local governments open their data for public use, the emphasis is often on “free” — using open source tools to freely share already-created government datasets, often with pro bono help from outside groups. But according to a new report, there are unforeseen costs when it comes to pushing government datasets out through public-facing platforms — especially when geospatial data is involved.

The research, led by University of Waterloo professor Peter A. Johnson and McGill University professor Renee Sieber, was conducted as part of a Geothink.ca partnership research grant and explores the direct and indirect costs of open data.

Costs related to data collection, publishing, data sharing, maintenance and updates are increasingly driving governments to third-party providers to help with hosting, standardization and analytical tools for data inspection, the researchers found. GIS implementation also has associated costs to train staff, develop standards, create valuations for geospatial data, connect data to various user communities and get feedback on challenges.

Due to these direct costs, some governments are more likely to avoid opening datasets that require complex assessment or anonymization techniques to address GIS concerns. Johnson and Sieber identified four areas where the benefits of open geospatial data can generate unexpected costs.

First, open data can create a “smoke and mirrors” situation where insufficient resources are put toward deploying open data for government use. Users then experience “transaction costs” when it comes to working in specialist data formats that need additional skills, training and software to use.

Second, the level of investment and quality of open data can lead to “material benefits and social privilege” for communities that devote resources to providing more comprehensive platforms.

While there are some open source data platforms, the majority of solutions are proprietary and charged on a pro-rata basis, which can present a challenge for cities with larger, poorer populations compared to smaller, wealthier cities. Issues also arise when governments try to combine their data sets, leading to increased costs to reconcile problems.

The third problem revolves around the private sector pushing for the release of data sets that can benefit their business objectives. Companies could push for the release of high-value sets, such as real-time transit data, to help with their product development goals. This can divert attention from low-value sets, such as those detailing municipal services or installations, which could have a bigger impact on residents “from a civil society perspective.”

If communities decide to release the low-value sets first, Johnson and Sieber think the focus can then be shifted to high-value sets that can help recoup the costs of developing the platforms.

Lastly, the report finds inadvertent consequences could result from tying open data resources to private-sector companies. Public-private open data partnerships could lead to infrastructure problems that prevent data from being widely shared, and help private companies in developing their bids for public services….

Johnson and Sieber encourage communities to ask the following questions before investing in open data:

  1. Who are the intended constituents for this open data?
  2. What is the purpose behind the structure for providing this data set?
  3. Does this data enable the intended users to meet their goals?
  4. How are privacy concerns addressed?
  5. Who sets the priorities for release and updates?…(More)”


‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy


“In this short essay, written for a symposium in the San Diego Law Review, Professor Daniel Solove examines the nothing to hide argument. When asked about government surveillance and data mining, many people respond by declaring: “I’ve got nothing to hide.” According to the nothing to hide argument, there is no threat to privacy unless the government uncovers unlawful activity, in which case a person has no legitimate justification to claim that it remain private. The nothing to hide argument and its variants are quite prevalent, and thus are worth addressing. In this essay, Solove critiques the nothing to hide argument and exposes its faulty underpinnings….(More)”