Mastercard’s Big Data For Good Initiative: Data Philanthropy On The Front Lines


Interview by Randy Bean of Shamina Singh: Much has been written about big data initiatives and the efforts of market leaders to derive critical business insights faster. Less has been written about initiatives by some of these same firms to apply big data and analytics to a different set of issues, ones not solely focused on revenue growth or bottom-line profitability. While most writing has focused on the use of data for competitive advantage, a small set of companies has been undertaking, with much less fanfare, a range of initiatives designed to ensure that data can be applied not just for corporate good, but also for social good.

One such firm is Mastercard, which describes itself as a technology company in the payments industry that connects buyers and sellers in 210 countries and territories across the globe. In 2013 Mastercard launched the Mastercard Center for Inclusive Growth, which operates as an independent subsidiary of Mastercard focused on the application of data to a range of issues for social benefit….

In testimony before the Senate Foreign Relations Committee on May 4, 2017, Mastercard Vice Chairman Walt Macnee, who serves as the Chairman of the Center for Inclusive Growth, addressed issues of private-sector engagement. Macnee noted, “The private sector and public sector can each serve as a force for good independently; however, when the public and private sectors work together, they unlock the potential to achieve even more.” Macnee further commented, “We will continue to leverage our technology, data, and know-how in an effort to solve many of the world’s most pressing problems. It is the right thing to do, and it is also good for business.”…

Central to the mission of the Mastercard Center is the notion of “data philanthropy”. This term encompasses notions of data collaboration and data sharing and is at the heart of the initiatives the Center is undertaking. The three cornerstones of the Center’s mandate are:

  • Sharing Data Insights – This is achieved through the concept of “data grants”, which entails granting access to proprietary insights in support of social initiatives in a way that fully protects consumer privacy.
  • Data Knowledge – The Mastercard Center collaborates with not-for-profit and governmental organizations on a range of initiatives. One such effort was a collaboration with the Obama White House’s Data-Driven Justice Initiative, in which data was used to help advance criminal justice reform. Through insights provided by Mastercard, the initiative was able to demonstrate the impact crime has on merchant locations and local job opportunities in Baltimore.
  • Leveraging Expertise – Similarly, the Mastercard Center has collaborated with private organizations such as DataKind, which undertakes data science initiatives for social good.

Just this past month, the Mastercard Center released initial findings from its Data Exploration: Neighborhood Crime and Local Business initiative. This effort focused on ways in which Mastercard’s proprietary insights could be combined with public data on commercial robberies to help understand the potential relationships between criminal activity and business closings. A preliminary analysis showed that a spike in commercial robberies was followed by an increase in bar and nightclub closings. These analyses help community and business leaders understand factors that can impact business success.

Late last year, Ms. Singh issued A Call to Action on Data Philanthropy, in which she challenges her industry peers to look at ways in which they can make a difference — “I urge colleagues at other companies to review their data assets to see how they may be leveraged for the benefit of society.” She concludes, “the sheer abundance of data available today offers an unprecedented opportunity to transform the world for good.”….(More)

The Nudging Divide in the Digital Big Data Era


Julia M. Puaschunder in the International Robotics & Automation Journal: “Since the end of the 1970s, a wide range of psychological, economic, and sociological laboratory and field experiments has shown that human beings deviate from rational choice, and that standard neo-classical profit-maximization axioms fail to explain how humans actually behave. Behavioral economists proposed to nudge and wink citizens toward choices that are better for them, with many different applications. While the motivation behind nudging appears to be a noble endeavor to improve peoples’ lives around the world, the nudging approach raises questions of social hierarchy and class division. The motivating force of the nudgital society may open a gate to exploitation of the populace and – based on privacy infringements – strip people involuntarily of their own decision power, in the shadow of legally permitted libertarian paternalism and under the cloak of the noble goal of welfare-improving global governance. Nudging enables nudgers to plunder the simple, uneducated citizen, who is neither aware of the nudging strategies nor able to oversee the tactics used by the nudgers.

The nudgers are thereby legally protected by the democratically assigned positions they hold or by the outsourcing strategies they use, in which social media plays a crucial role. Social media forces are captured as unfolding a class-dividing nudgital society, in which the providers of social communication tools can reap surplus value from the information shared by social media users. The social media provider thereby becomes a capitalist-industrialist, who benefits from the information shared by social media users, or so-called consumer-workers, who share private information in their wish to interact with friends and communicate with the public. The social media capitalist-industrialist reaps surplus value from the consumer-workers’ information sharing, which stems from nudging social media users. For one, social media space can be sold to marketers, who can constantly penetrate the consumer-worker in a subliminal way with advertisements. Nudging also occurs when the big data compiled about the social media consumer-worker is resold to marketers and technocrats to draw inferences about consumer choices, contemporary market trends, or individual personality cues used for governance control, such as border protection and tax compliance purposes.

The law of motion of the nudging society holds an unequal concentration of power among those who have access to compiled data and who abuse their position under the cloak of hidden persuasion and in the shadow of paternalism. In the nudgital society, information, education, and differing social classes determine who the nudgers and who the nudged are. Humans end up in different silos or bubbles that differ in who has power and control and who is deceived and being ruled. The owners of the means of governance are able to reap surplus value through hidden persuasion, protected by a legal vacuum that fails to curb libertarian paternalism, in the moral shadow of unnoticeable guidance and under the cloak of the presumption that some know better than others what is rational. All these features lead to an unprecedented contemporary class struggle between the nudgers (those who nudge) and the nudged (those who are nudged), divided by the implicit means of governance in the digital scenery. In this light, governing our common welfare through deceptive means and outsourced governance on social media appears problematic. Combined with the underlying assumption that the nudgers know better what is right, just, and fair within society, the digital age and social media tools hold potentially unprecedented ethical challenges….(More)”

Let the People Know the Facts: Can Government Information Removed from the Internet Be Reclaimed?


Paper by Susan Nevelow Mart: “…examines the legal bases of the public’s right to access government information, reviews the types of information that have recently been removed from the Internet, and analyzes the rationales given for the removals. She suggests that the concerted use of the Freedom of Information Act by public interest groups and their constituents is a possible method of returning the information to the Internet….(More)”.

The hidden costs of open data


Sara Friedman at GCN: “As more local governments open their data for public use, the emphasis is often on “free” — using open source tools to freely share already-created government datasets, often with pro bono help from outside groups. But according to a new report, there are unforeseen costs when it comes to pushing government datasets out through public-facing platforms — especially when geospatial data is involved.

The research, led by University of Waterloo professor Peter A. Johnson and McGill University professor Renee Sieber, was based on work undertaken as part of a Geothink.ca partnership research grant exploring the direct and indirect costs of open data.

Costs related to data collection, publishing, data sharing, maintenance and updates are increasingly driving governments to third-party providers to help with hosting, standardization and analytical tools for data inspection, the researchers found. GIS implementation also has associated costs to train staff, develop standards, create valuations for geospatial data, connect data to various user communities and get feedback on challenges.

Due to these direct costs, some governments are more likely to avoid opening datasets that require complex assessment or anonymization techniques to address GIS concerns. Johnson and Sieber identified four areas where the benefits of open geospatial data can generate unexpected costs.

First, open data can create a “smoke and mirrors” situation where insufficient resources are put toward deploying open data for government use. Users then experience “transaction costs” when working with specialist data formats that require additional skills, training, and software.

Second, the level of investment and quality of open data can lead to “material benefits and social privilege” for communities that devote resources to providing more comprehensive platforms.

While there are some open source data platforms, the majority of solutions are proprietary and charged on a pro-rata basis, which can present a challenge for cities with larger, poorer populations compared to smaller, wealthier ones. Issues also arise when governments try to combine their datasets, leading to increased costs to reconcile problems.

The third problem revolves around the private sector pushing for the release of datasets that can benefit its business objectives. Companies could push for the release of high-value sets, such as real-time transit data, to help with their product development goals. This can divert attention from low-value sets, such as those detailing municipal services or installations, that could have a bigger impact on residents “from a civil society perspective.”

If communities decide to release the low-value sets first, Johnson and Sieber think the focus can then be shifted to high-value sets that can help recoup the costs of developing the platforms.

Lastly, the report finds inadvertent consequences could result from tying open data resources to private-sector companies. Public-private open data partnerships could lead to infrastructure problems that prevent data from being widely shared, and help private companies in developing their bids for public services….

Johnson and Sieber encourage communities to ask the following questions before investing in open data:

  1. Who are the intended constituents for this open data?
  2. What is the purpose behind the structure for providing this data set?
  3. Does this data enable the intended users to meet their goals?
  4. How are privacy concerns addressed?
  5. Who sets the priorities for release and updates?…(More)”


‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy


“In this short essay, written for a symposium in the San Diego Law Review, Professor Daniel Solove examines the nothing to hide argument. When asked about government surveillance and data mining, many people respond by declaring: “I’ve got nothing to hide.” According to the nothing to hide argument, there is no threat to privacy unless the government uncovers unlawful activity, in which case a person has no legitimate justification to claim that it remain private. The nothing to hide argument and its variants are quite prevalent, and thus are worth addressing. In this essay, Solove critiques the nothing to hide argument and exposes its faulty underpinnings….(More)”

The DeepMind debacle demands dialogue on data


Hetan Shah in Nature: “Without public approval, advances in how we use data will stall. That is why a regulator’s ruling against the operator of three London hospitals is about more than mishandling records from 1.6 million patients. It is a missed opportunity to have a conversation with the public about appropriate uses for their data….

What can be done to address this deficit? Beyond meeting legal standards, all relevant institutions must take care to show themselves trustworthy in the eyes of the public. The lapses of the Royal Free hospitals and DeepMind provide, by omission, valuable lessons.

The first is to be open about what data are transferred. The extent of data transfer between the Royal Free and DeepMind came to light through investigative journalism. In my opinion, had the project proceeded under open contracting, it would have been subject to public scrutiny, and to questions about whether a company owned by Google — often accused of data monopoly — was best suited to create a relatively simple app.

The second lesson is that data transfer should be proportionate to the task. Information-sharing agreements should specify clear limits. It is unclear why an app for kidney injury requires the identifiable records of every patient seen by three hospitals over a five-year period.

Finally, governance mechanisms must be strengthened. It is shocking to me that the Royal Free did not assess the privacy impact of its actions before handing over access to records. DeepMind does deserve credit for (belatedly) setting up an independent review panel for health-care projects, especially because the panel has a designated budget and has not required members to sign non-disclosure agreements. (The two groups also agreed a new contract late last year, after criticism.)

More is needed. The Information Commissioner asked the Royal Free to improve its processes but did not fine it or require it to rescind data. This rap on the knuckles is unlikely to deter future, potentially worse, misuses of data. People are aware of the potential for over-reach, from the US government’s demands for state voter records to the Chinese government’s alleged plans to create a ‘social credit’ system that would monitor private behaviour.

Innovations such as artificial intelligence, machine learning and the Internet of Things offer great opportunities, but will falter without a public consensus around the role of data. To develop this, all data collectors and crunchers must be open and transparent. Consider how public confidence in genetic modification was lost in Europe, and how that has set back progress.

Public dialogue can build trust through collaborative efforts. A 14-member Citizens’ Reference Panel on health technologies was convened in Ontario, Canada, in 2009. The Engage2020 programme incorporates societal input into the Horizon 2020 stream of European Union science funding….(More)”

Modernizing government’s approach to transportation and land use data: Challenges and opportunities


Adie Tomer and Ranjitha Shivaram at Brookings: “In the fields of transportation and land use planning, the public sector has long taken the leading role in the collection, analysis, and dissemination of data. Often, public data sets drawn from traveler diaries, surveys, and supply-side transportation maps were the only way to understand how people move around in the built environment – how they get to work, how they drop kids off at school, where they choose to work out or relax, and so on.

But change is afoot: today, there are not only new data providers, but also new types of data. Cellphones, GPS trackers, and other navigation devices offer real-time demand-side data. For instance, mobile phone data can point to where distracted driving is a problem and help implement measures to deter such behavior. Insurance data and geo-located police data can guide traffic safety improvements, especially in accident-prone zones. Geotagged photo data can illustrate the use of popular public spaces by locals and tourists alike, enabling greater return on investment from public spaces. Data from exercise apps like Fitbit and Runkeeper can help identify recreational hot spots that attract people and those that don’t.

However, integrating all this data into how we actually plan and build communities—including the transportation systems that move all of us and our goods—will not be easy. There are several core challenges. Limited staff capacity and restricted budgets in public agencies can slow adoption. Governmental procurement policies are stuck in an analog era. Privacy concerns introduce risk and uncertainty. Private data could be simply unavailable to public consumers. And even if governments could acquire all of the new data and analytics that interest them, their planning and investment models must be updated to fully utilize these new resources.

Using a mix of primary research and expert interviews, this report catalogs emerging data sets related to transportation and land use, and assesses the ease by which they can be integrated into how public agencies manage the built environment. It finds that there is reason for the hype; we have the ability to know more about how humans move around today than at any time in history. But, despite all the obvious opportunities, not addressing core challenges will limit public agencies’ ability to put all that data to use for the collective good….(More)”

Uber Releases Open Source Project for Differential Privacy


Katie Tezapsidis at Uber Security: “Data analysis helps Uber continuously improve the user experience by preventing fraud, increasing efficiency, and providing important safety features for riders and drivers. Data gives our teams timely feedback about what we’re doing right and what needs improvement.

Uber is committed to protecting user privacy and we apply this principle throughout our business, including our internal data analytics. While Uber already has technical and administrative controls in place to limit who can access specific databases, we are adding additional protections governing how that data is used — even in authorized cases.

We are excited to give a first glimpse of our recent work on these additional protections with the release of a new open source tool, which we’ll introduce below.

Background: Differential Privacy

Differential privacy is a formal definition of privacy and is widely recognized by industry experts as providing strong and robust privacy assurances for individuals. In short, differential privacy allows general statistical analysis without revealing information about a particular individual in the data. Results do not even reveal whether any individual appears in the data. For this reason, differential privacy provides an extra layer of protection against re-identification attacks as well as attacks using auxiliary data.
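
For reference, the standard formal definition from the differential privacy literature (not spelled out in Uber’s post) makes this precise: a randomized mechanism M is ε-differentially private if, for any two datasets D and D′ that differ in one individual’s records, and for every set S of possible outputs,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```

Smaller values of ε make the two output distributions harder to tell apart, and hence give stronger privacy.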

Differential privacy can provide high accuracy results for the class of queries Uber commonly uses to identify statistical trends. Consequently, differential privacy allows us to calculate aggregations (averages, sums, counts, etc.) of elements like groups of users or trips on the platform without exposing information that could be used to infer details about a specific user or trip.

Differential privacy is enforced by adding noise to a query’s result, but some queries are more sensitive to the data of a single individual than others. To account for this, the amount of noise added must be tuned to the sensitivity of the query, which is defined as the maximum change in the query’s output when an individual’s data is added to or removed from the database.
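
As an illustration of how this calibration works, here is a minimal sketch of the classic Laplace mechanism in Python (a generic textbook construction, not Uber’s internal implementation; the function name is ours):

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release true_value with epsilon-differential privacy.

    The Laplace mechanism draws noise from Laplace(0, sensitivity / epsilon):
    the higher the query's sensitivity, or the smaller (stricter) epsilon,
    the more noise is added to the result.
    """
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise
```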

As part of their job, a data analyst at Uber might need to know the average trip distance in a particular city. A large city, like San Francisco, might have hundreds of thousands of trips with an average distance of 3.5 miles. If any individual trip is removed from the data, the average remains close to 3.5 miles. This query therefore has low sensitivity, and thus requires less noise to enable each individual to remain anonymous within the crowd.

Conversely, the average trip distance in a smaller city with far fewer trips is more influenced by a single trip and may require more noise to provide the same degree of privacy. Differential privacy defines the precise amount of noise required given the sensitivity.
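
Continuing the sketch above (the trip counts and the 50-mile clipping bound are illustrative assumptions, not figures from the post): if trip distances are clipped to a bound B, replacing any one trip can move a mean over n trips by at most B / n, so the smaller city’s query carries a noise scale hundreds of times larger:

```python
def mean_sensitivity(n_trips: int, max_miles: float) -> float:
    # Replacing one clipped trip shifts the mean by at most max_miles / n_trips.
    return max_miles / n_trips

epsilon = 1.0
# Large city: ~500,000 trips -> sensitivity 0.0001 miles; noise is negligible.
dp_avg_large = laplace_mechanism(3.5, mean_sensitivity(500_000, 50.0), epsilon)
# Small city: ~1,000 trips -> sensitivity 0.05 miles; a 500x larger noise scale.
dp_avg_small = laplace_mechanism(4.2, mean_sensitivity(1_000, 50.0), epsilon)
```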

A major challenge for practical differential privacy is how to efficiently compute the sensitivity of a query. Existing methods lack sufficient support for the features used in Uber’s queries and many approaches require replacing the database with a custom runtime engine. Uber uses many different database engines and replacing these databases is infeasible. Moreover, custom runtimes cannot meet Uber’s demanding scalability and performance requirements.

Introducing Elastic Sensitivity

To address these challenges we adopted Elastic Sensitivity, a technique developed by security researchers at the University of California, Berkeley, for efficiently calculating the sensitivity of a query without requiring changes to the database. The full technical details of Elastic Sensitivity are described here.

Today, we are excited to share a tool developed in collaboration with these researchers to calculate Elastic Sensitivity for SQL queries. The tool is available now on GitHub. It is designed to integrate easily with existing data environments and support additional state-of-the-art differential privacy mechanisms, which we plan to share in the coming months….(More)”.

Intelligent sharing: unleashing the potential of health and care data in the UK to transform outcomes


Report by Future Care Capital: “….Data is often referred to as the ‘new oil’ – the 21st century raw material which, when hitched to algorithmic refinement, may be mined for insight and value – and ‘data flows’ are said to have exerted a greater impact upon global growth than traditional goods flows in recent years (Manyika et al., 2016). Small wonder, then, that governments around the world are endeavouring to strike a balance between individual privacy rights and protections on the one hand, and organisational permissions to facilitate the creation of social, economic and environmental value from broad-ranging data on the other: ‘data rights’ are now of critical importance courtesy of technological advancements. The tension between the two is particularly evident where health and care data in the UK is concerned. Individuals are broadly content with anonymised data from their medical records being used for public benefit but are, understandably, anxious about the implications of the most intimate aspects of their lives being hacked or else shared without their knowledge or consent….

The potential for health and care data to be transformative remains, and there is growing concern that opportunities to improve the use of health and care data in peoples’ interests are being missed….

we recommend additional support for digitisation efforts in social care settings. We call upon the Government to streamline processes associated with Information Governance (IG) modelling to help data sharing initiatives that traverse organisational boundaries. We also advocate for investment and additional legal safeguards to make more anonymised data sets available for research and innovation. Crucially, we recommend expediting the scope for individuals to contribute health and care data to sharing initiatives led by the public sector through promotion, education and pilot activities – so that data is deployed to transform public health and support the ‘pivot to prevention’.

In Chapter Two, we explore the rationale and scope for the UK to build upon emergent practice from around the world and become a global leader in ‘data philanthropy’ – to push at the boundaries of existing plans and programmes, and support the development of and access to unrivalled health and care data sets. We look at member-controlled ‘data cooperatives’ and what we’ve termed ‘data communities’ operated by trusted intermediaries. We also explore ‘data collaboratives’ which involve the private sector engaging in data philanthropy for public benefit. Here, we make recommendations about promoting a culture of data philanthropy through the demonstration of tangible benefits to participants and the wider public, and we call upon Government to assess the appetite and feasibility of establishing the world’s first National Health and Care Data Donor Bank….(More)”

The ethics issue: Should we abandon privacy online?


Special issue of the New Scientist: “Those who would give up essential Liberty to purchase a little temporary Safety,” Benjamin Franklin once said, “deserve neither Liberty nor Safety.” But if Franklin were alive today, where would he draw the line? Is the freedom to send an encrypted text message essential? How about the right to keep our browsing history private? What is the sweet spot between our need to be left alone and our desire to keep potential criminals from communicating in secret?

In an age where fear of terrorism is high in the public consciousness, governments are likely to err on the side of safety. Over the past decade, the authorities have been pushing for – and getting – greater powers of surveillance than they have ever had, all in the name of national security.

The downsides are not immediately obvious. After all, you might think you have nothing to hide. But most of us have perfectly legal secrets we’d rather someone else didn’t see. And although the chances of the authorities turning up to take you away in a black SUV on the basis of your WhatsApp messages are small in free societies, the chances of insurance companies raising your premiums are not….(More)”.