Big Data: A New Empiricism and its Epistemic and Socio-Political Consequences


Chapter by Gernot Rieder and Judith Simon in Berechenbarkeit der Welt?: “The paper investigates the rise of Big Data in contemporary society. It examines the most prominent epistemological claims made by Big Data proponents, calls attention to the potential socio-political consequences of blind data trust, and proposes a possible way forward. The paper’s main focus is on the interplay between an emerging new empiricism and an increasingly opaque algorithmic environment that challenges democratic demands for transparency and accountability. It concludes that a responsible culture of quantification requires epistemic vigilance as well as a greater awareness of the potential dangers and pitfalls of an ever more data-driven society….(More)”.

Harnessing the Data Revolution to Achieve the Sustainable Development Goals


Erol Yayboke et al at CSIS: “Functioning societies collect accurate data and utilize the evidence to inform policy. The use of evidence derived from data in policymaking requires the capability to collect and analyze accurate data, clear administrative channels through which timely evidence is made available to decisionmakers, and the political will to rely on—and ideally share—the evidence. The collection of accurate and timely data, especially in the developing world, is often logistically difficult, not politically expedient, and/or expensive.

Before launching its second round of global goals—the Sustainable Development Goals (SDGs)—the United Nations convened a High-Level Panel of Eminent Persons on the Post-2015 Development Agenda. As part of its final report, the Panel called for a “data revolution” and recommended the formation of an independent body to lead the charge.1The report resulted in the creation of the Global Partnership for Sustainable Development Data (GPSDD)—an independent group of countries, companies, data communities, and NGOs—and the SDG Data Labs, a private initiative partnered with the GPSDD. In doing so the United Nations and its partners signaled broad interest in data and evidence-based policymaking at a high level. In fact, the GPSDD calls for the “revolution in data” by addressing the “crisis of non-existent, inaccessible or unreliable data.”As this report shows, this is easier said than done.

This report defines the data revolution as an unprecedented increase in the volume and types of data—and the subsequent demand for them—thanks to the ongoing yet uneven proliferation of new technologies. This revolution is allowing governments, companies, researchers, and citizens to monitor progress and drive action, often with real-time, dynamic, disaggregated data. Much work will be needed to make sure the data revolution reaches developing countries facing difficult challenges (i.e., before the data revolution fully becomes the data revolution for sustainable development). It is important to think of the revolution as a multistep process, beginning with building basic knowledge and awareness of the value of data. This is followed by a more specific focus on public private partnerships, opportunities, and constraints regarding collection and utilization of data for evidence-based policy decisions….

This report provides the following recommendations to the international community to play a constructive role in the data revolution:

  • Don’t fixate on big data alone. Focus on the foundation necessary to facilitate leapfrogs around all types of data: small, big, and everywhere in between.
  • Increase funding for capacity building as part of an expansion of broader educational development priorities.
  • Highlight, share, and support enlightened government-driven approaches to data.
  • Increase funding for the data revolution and coordinate donor efforts.
  • Coordinate UN data revolution-related activities closely with an expanded GPSDD.
  • Secure consensus on data sharing, ownership, and privacy-related international standards….(More)”.

The Use of Big Data Analytics by the IRS: Efficient Solutions or the End of Privacy as We Know It?


Kimberly A. Houser and Debra Sanders in the Vanderbilt Journal of Entertainment and Technology Law: “This Article examines the privacy issues resulting from the IRS’s big data analytics program as well as the potential violations of federal law. Although historically, the IRS chose tax returns to audit based on internal mathematical mistakes or mismatches with third party reports (such as W-2s), the IRS is now engaging in data mining of public and commercial data pools (including social media) and creating highly detailed profiles of taxpayers upon which to run data analytics. This Article argues that current IRS practices, mostly unknown to the general public are violating fair information practices. This lack of transparency and accountability not only violates federal law regarding the government’s data collection activities and use of predictive algorithms, but may also result in discrimination. While the potential efficiencies that big data analytics provides may appear to be a panacea for the IRS’s budget woes, unchecked, these activities are a significant threat to privacy. Other concerns regarding the IRS’s entrée into big data are raised including the potential for political targeting, data breaches, and the misuse of such information. This Article intends to bring attention to these privacy concerns and contribute to the academic and policy discussions about the risks presented by the IRS’s data collection, mining and analytics activities….(More)”.

Automation Beyond the Physical: AI in the Public Sector


Ben Miller at Government Technology: “…The technology is, by nature, broadly applicable. If a thing involves data — “data” itself being a nebulous word — then it probably has room for AI. AI can help manage the data, analyze it and find patterns that humans might not have thought of. When it comes to big data, or data sets so big that they become difficult for humans to manually interact with, AI leverages the speedy nature of computing to find relationships that might otherwise be proverbial haystack needles.

One early area of government application is in customer service chatbots. As state and local governments started putting information on websites in the past couple of decades, they found that they could use those portals as a means of answering questions that constituents used to have to call an office to ask.

Ideally that results in a cyclical victory: Government offices didn’t have as many calls to answer, so they could devote more time and resources to other functions. And when somebody did call in, their call might be answered faster.

With chatbots, governments are betting they can answer even more of those questions. When he was the chief technology and innovation officer of North Carolina, Eric Ellis oversaw the setup of a system that did just that for IT help desk calls.

Turned out, more than 80 percent of the help desk’s calls were people who wanted to change their passwords. For something like that, where the process is largely the same each time, a bot can speed up the process with a little help from AI. Then, just like with the government Web portal, workers are freed up to respond to the more complicated calls faster….

Others are using AI to recognize and report objects in photographs and videos — guns, waterfowl, cracked concrete, pedestrians, semi-trucks, everything. Others are using AI to help translate between languages dynamically. Some want to use it to analyze the tone of emails. Some are using it to try to keep up with cybersecurity threats even as they morph and evolve. After all, if AI can learn to beat professional poker players, then why can’t it learn how digital black hats operate?

Castro sees another use for the technology, a more introspective one. The problem is this: The government workforce is a lot older than the private sector, and that can make it hard to create culture change. According to U.S. Census Bureau data, about 27 percent of public-sector workers are millennials, compared with 38 percent in the private sector.

“The traditional view [of government work] is you fill out a lot of forms, there are a lot of boring meetings. There’s a lot of bureaucracy in government,” Castro said. “AI has the opportunity to change a lot of that, things like filling out forms … going to routine meetings and stuff.”

As AI becomes more and more ubiquitous, people who work both inside and with government are coming up with an ever-expanding list of ways to use it. Here’s an inexhaustive list of specific use cases — some of which are already up and running and some of which are still just ideas….(More)”.

Debating big data: A literature review on realizing value from big data


Wendy Arianne Günther et al in The Journal of Strategic Information Systems: “Big data has been considered to be a breakthrough technological development over recent years. Notwithstanding, we have as yet limited understanding of how organizations translate its potential into actual social and economic value. We conduct an in-depth systematic review of IS literature on the topic and identify six debates central to how organizations realize value from big data, at different levels of analysis. Based on this review, we identify two socio-technical features of big data that influence value realization: portability and interconnectivity. We argue that, in practice, organizations need to continuously realign work practices, organizational models, and stakeholder interests in order to reap the benefits from big data. We synthesize the findings by means of an integrated model….(More)”.

Mastercard’s Big Data For Good Initiative: Data Philanthropy On The Front Lines


Interview by Randy Bean of Shamina Singh: Much has been written about big data initiatives and the efforts of market leaders to derive critical business insights faster. Less has been written about initiatives by some of these same firms to apply big data and analytics to a different set of issues, which are not solely focused on revenue growth or bottom line profitability. While the focus of most writing has been on the use of data for competitive advantage, a small set of companies has been undertaking, with much less fanfare, a range of initiatives designed to ensure that data can be applied not just for corporate good, but also for social good.

One such firm is Mastercard, which describes itself as a technology company in the payments industry, which connects buyers and sellers in 210 countries and territories across the globe. In 2013 Mastercard launched the Mastercard Center for Inclusive Growth, which operates as an independent subsidiary of Mastercard and is focused on the application of data to a range of issues for social benefit….

In testimony before the Senate Committee on Foreign Affairs on May 4, 2017, Mastercard Vice Chairman Walt Macnee, who serves as the Chairman of the Center for Inclusive Growth, addressed issues of private sector engagement. Macnee noted, “The private sector and public sector can each serve as a force for good independently; however when the public and private sectors work together, they unlock the potential to achieve even more.” Macnee further commented, “We will continue to leverage our technology, data, and know-how in an effort to solve many of the world’s most pressing problems. It is the right thing to do, and it is also good for business.”…

Central to the mission of the Mastercard Center is the notion of “data philanthropy”. This term encompasses notions of data collaboration and data sharing and is at the heart of the initiatives that the Center is undertaking. The three cornerstones on the Center’s mandate are:

  • Sharing Data Insights– This is achieved through the concept of “data grants”, which entails granting access to proprietary insights in support of social initiatives in a way that fully protects consumer privacy.
  • Data Knowledge – The Mastercard Center undertakes collaborations with not-for-profit and governmental organizations on a range of initiatives. One such effort was in collaboration with the Obama White House’s Data-Driven Justice Initiative, by which data was used to help advance criminal justice reform. This initiative was then able, through the use of insights provided by Mastercard, to demonstrate the impact crime has on merchant locations and local job opportunities in Baltimore.
  • Leveraging Expertise – Similarly, the Mastercard Center has collaborated with private organizations such as DataKind, which undertakes data science initiatives for social good.Just this past month, the Mastercard Center released initial findings from its Data Exploration: Neighborhood Crime and Local Business initiative. This effort was focused on ways in which Mastercard’s proprietary insights could be combined with public data on commercial robberies to help understand the potential relationships between criminal activity and business closings. A preliminary analysis showed a spike in commercial robberies followed by an increase in bar and nightclub closings. These analyses help community and business leaders understand factors that can impact business success.Late last year, Ms. Singh issued A Call to Action on Data Philanthropy, in which she challenges her industry peers to look at ways in which they can make a difference — “I urge colleagues at other companies to review their data assets to see how they may be leveraged for the benefit of society.” She concludes, “the sheer abundance of data available today offers an unprecedented opportunity to transform the world for good.”….(More)

Digital Decisions Tool


Center for Democracy and Technology (CDT): “Two years ago, CDT embarked on a project to explore what we call “digital decisions” – the use of algorithms, machine learning, big data, and automation to make decisions that impact individuals and shape society. Industry and government are applying algorithms and automation to problems big and small, from reminding us to leave for the airport to determining eligibility for social services and even detecting deadly diseases. This new era of digital decision-making has created a new challenge: ensuring that decisions made by computers reflect values like equality, democracy, and justice. We want to ensure that big data and automation are used in ways that create better outcomes for everyone, and not in ways that disadvantage minority groups.

The engineers and product managers who design these systems are the first line of defense against unfair, discriminatory, and harmful outcomes. To help mitigate harm at the design level, we have launched the first public version of our digital decisions tool. We created the tool to help developers understand and mitigate unintended bias and ethical pitfalls as they design automated decision-making systems.

About the digital decisions tool

This interactive tool translates principles for fair and ethical automated decision-making into a series of questions that can be addressed during the process of designing and deploying an algorithm. The questions address developers’ choices, such as what data to use to train an algorithm, what factors or features in the data to consider, and how to test the algorithm. They also ask about the systems and checks in place to assess risk and ensure fairness. These questions should provoke thoughtful consideration of the subjective choices that go into building an automated decision-making system and how those choices could result in disparate outcomes and unintended harms.

The tool is informed by extensive research by CDT and others about how algorithms and machine learning work, how they’re used, the potential risks of using them to make important decisions, and the principles that civil society has developed to ensure that digital decisions are fair, ethical, and respect civil rights. Some of this research is summarized on CDT’s Digital Decisions webpage….(More)”.

The Nudging Divide in the Digital Big Data Era


Julia M. Puaschunder in the International Robotics & Automation Journal: “Since the end of the 1970ies a wide range of psychological, economic and sociological laboratory and field experiments proved human beings deviating from rational choices and standard neo-classical profit maximization axioms to fail to explain how human actually behave. Behavioral economists proposed to nudge and wink citizens to make better choices for them with many different applications. While the motivation behind nudging appears as a noble endeavor to foster peoples’ lives around the world in very many different applications, the nudging approach raises questions of social hierarchy and class division. The motivating force of the nudgital society may open a gate of exploitation of the populace and – based on privacy infringements – stripping them involuntarily from their own decision power in the shadow of legally-permitted libertarian paternalism and under the cloak of the noble goal of welfare-improving global governance. Nudging enables nudgers to plunder the simple uneducated citizen, who is neither aware of the nudging strategies nor able to oversee the tactics used by the nudgers.

The nudgers are thereby legally protected by democratically assigned positions they hold or by outsourcing strategies used, in which social media plays a crucial rule. Social media forces are captured as unfolding a class dividing nudgital society, in which the provider of social communication tools can reap surplus value from the information shared of social media users. The social media provider thereby becomes a capitalist-industrialist, who benefits from the information shared by social media users, or so-called consumer-workers, who share private information in their wish to interact with friends and communicate to public. The social media capitalist-industrialist reaps surplus value from the social media consumer-workers’ information sharing, which stems from nudging social media users. For one, social media space can be sold to marketers who can constantly penetrate the consumer-worker in a subliminal way with advertisements. But also nudging occurs as the big data compiled about the social media consumer-worker can be resold to marketers and technocrats to draw inferences about consumer choices, contemporary market trends or individual personality cues used for governance control, such as, for instance, border protection and tax compliance purposes.

The law of motion of the nudging societies holds an unequal concentration of power of those who have access to compiled data and who abuse their position under the cloak of hidden persuasion and in the shadow of paternalism. In the nudgital society, information, education and differing social classes determine who the nudgers and who the nudged are. Humans end in different silos or bubbles that differ in who has power and control and who is deceived and being ruled. The owners of the means of governance are able to reap a surplus value in a hidden persuasion, protected by the legal vacuum to curb libertarian paternalism, in the moral shadow of the unnoticeable guidance and under the cloak of the presumption that some know what is more rational than others. All these features lead to an unprecedented contemporary class struggle between the nudgers (those who nudge) and the nudged (those who are nudged), who are divided by the implicit means of governance in the digital scenery. In this light, governing our common welfare through deceptive means and outsourced governance on social media appears critical. In combination with the underlying assumption of the nudgers knowing better what is right, just and fair within society, the digital age and social media tools hold potential unprecedented ethical challenges….(More)”

Why We Should Care About Bad Data


Blog by Stefaan G. Verhulst: “At a time of open and big data, data-led and evidence-based policy making has great potential to improve problem solving but will have limited, if not harmful, effects if the underlying components are riddled with bad data.

Why should we care about bad data? What do we mean by bad data? And what are the determining factors contributing to bad data that if understood and addressed could prevent or tackle bad data? These questions were the subject of my short presentation during a recent webinar on  Bad Data: The Hobgoblin of Effective Government, hosted by the American Society for Public Administration and moderated by Richard Greene (Partner, Barrett and Greene Inc.). Other panelists included Ben Ward (Manager, Information Technology Audits Unit, California State Auditor’s Office) and Katherine Barrett (Partner, Barrett and Greene Inc.). The webinar was a follow-up to the excellent Special Issue of Governing on Bad Data written by Richard and Katherine….(More)”

Rage against the machines: is AI-powered government worth it?


Maëlle Gavet at the WEF: “…the Australian government’s new “data-driven profiling” trial for drug testing welfare recipients, to US law enforcement’s use of facial recognition technology and the deployment of proprietary software in sentencing in many US courts … almost by stealth and with remarkably little outcry, technology is transforming the way we are policed, categorized as citizens and, perhaps one day soon, governed. We are only in the earliest stages of so-called algorithmic regulation — intelligent machines deploying big data, machine learning and artificial intelligence (AI) to regulate human behaviour and enforce laws — but it already has profound implications for the relationship between private citizens and the state….

Some may herald this as democracy rebooted. In my view it represents nothing less than a threat to democracy itself — and deep scepticism should prevail. There are five major problems with bringing algorithms into the policy arena:

  1. Self-reinforcing bias…
  2. Vulnerability to attack…
  3. Who’s calling the shots?…
  4. Are governments up to it?…
  5. Algorithms don’t do nuance….

All the problems notwithstanding, there’s little doubt that AI-powered government of some kind will happen. So, how can we avoid it becoming the stuff of bad science fiction? To begin with, we should leverage AI to explore positive alternatives instead of just applying it to support traditional solutions to society’s perceived problems. Rather than simply finding and sending criminals to jail faster in order to protect the public, how about using AI to figure out the effectiveness of other potential solutions? Offering young adult literacy, numeracy and other skills might well represent a far superior and more cost-effective solution to crime than more aggressive law enforcement. Moreover, AI should always be used at a population level, rather than at the individual level, in order to avoid stigmatizing people on the basis of their history, their genes and where they live. The same goes for the more subtle, yet even more pervasive data-driven targeting by prospective employers, health insurers, credit card companies and mortgage providers. While the commercial imperative for AI-powered categorization is clear, when it targets individuals it amounts to profiling with the inevitable consequence that entire sections of society are locked out of opportunity….(More)”.