Mapping the Privacy-Utility Tradeoff in Mobile Phone Data for Development


Paper by Alejandro Noriega-Campero, Alex Rutherford, Oren Lederman, Yves A. de Montjoye, and Alex Pentland: “Today’s age of data holds high potential to enhance the way we pursue and monitor progress in the fields of development and humanitarian action. We study the relation between data utility and privacy risk in large-scale behavioral data, focusing on mobile phone metadata as a paradigmatic domain. To measure utility, we survey experts about the value of mobile phone metadata at various spatial and temporal granularity levels. To measure privacy, we propose a formal and intuitive measure of reidentification risk, the information ratio, and compute it at each granularity level. Our results confirm the existence of a stark tradeoff between data utility and reidentifiability, where the most valuable datasets are also most prone to reidentification. When data is specified at ZIP-code and hourly levels, outside knowledge of only 7% of a person’s data suffices for reidentification and retrieval of the remaining 93%. In contrast, in the least valuable dataset, specified at municipality and daily levels, reidentification requires on average outside knowledge of 51%, or 31 data points, of a person’s data to retrieve the remaining 49%. Overall, our findings show that coarsening data directly erodes its value, and highlight the need for using data-coarsening, not as a stand-alone mechanism, but in combination with data-sharing models that provide adjustable degrees of accountability and security….(More)”.
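To make the measure concrete, here is a minimal sketch (not the authors' code) of how such an information ratio could be estimated: coarsen each person's records to a chosen spatial and temporal granularity, then count how many of their points an adversary must already know before nobody else in the dataset matches them. Field names, data shapes and the random-sampling strategy are assumptions made for illustration.

```python
# Illustrative sketch only: estimates an "information ratio", i.e. the fraction of a
# person's records an adversary must already know before that person becomes unique
# in the (coarsened) dataset. Field names and sampling strategy are assumptions.
import random
from collections import defaultdict

def coarsen(records, spatial, temporal):
    """Map (user, location, time) records to a coarser (user, area, period) view."""
    return [(u, spatial(loc), temporal(t)) for (u, loc, t) in records]

def information_ratio(records, user, trials=100):
    """records: output of coarsen(). Returns the average fraction of `user`'s points
    needed until no other user in the dataset matches all of them."""
    by_user = defaultdict(set)
    for (u, area, period) in records:
        by_user[u].add((area, period))
    target = list(by_user[user])
    others = [pts for u, pts in by_user.items() if u != user]
    needed = []
    for _ in range(trials):
        random.shuffle(target)
        known = set()
        for k, point in enumerate(target, start=1):
            known.add(point)
            if not any(known <= pts for pts in others):  # user is now unique
                needed.append(k / len(target))
                break
    return sum(needed) / len(needed) if needed else 1.0

# Hypothetical usage: records coarsened to ZIP code and hour of day
# coarse = coarsen(raw_records, spatial=zip_code_of, temporal=lambda t: t.replace(minute=0))
# print(information_ratio(coarse, user="u42"))
```

On data coarsened to ZIP code and hour this ratio should come out far lower (easier reidentification) than on data coarsened to municipality and day, which is exactly the tradeoff the paper maps.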

A roadmap for restoring trust in Big Data


Mark Lawler et al. in The Lancet: “The fallout from the Cambridge Analytica–Facebook scandal marks a significant inflection point in the public’s trust concerning Big Data. The health-science community must use this crisis in confidence to redouble its commitment to talk openly and transparently about benefits and risks, and to act decisively to deliver robust, effective governance frameworks under which personal health data can be responsibly used. Activities such as the Innovative Medicines Initiative’s Big Data for Better Outcomes emphasise how a more granular, data-driven understanding of human diseases, including cancer, could underpin innovative therapeutic intervention.
Health Data Research UK is developing national research expertise and infrastructure to maximise the value of health data science for the National Health Service and, ultimately, British citizens.
Comprehensive data analytics are crucial to national programmes such as the US Cancer Moonshot, the UK’s 100 000 Genomes project, and other national genomics programmes. Cancer Core Europe, a research partnership between seven leading European oncology centres, has personal data sharing at its core. The Global Alliance for Genomics and Health recently highlighted the need for a global cancer knowledge network to drive evidence-based solutions for a disease that kills more than 8·7 million citizens annually worldwide. These activities risk being fatally undermined by the recent data-harvesting controversy.
We need to restore the public’s trust in data science and emphasise its positive contribution in addressing global health and societal challenges. An opportunity to affirm the value of data science in Europe was afforded by Digital Day 2018, which took place on April 10, 2018, in Brussels, and where European Health Ministers signed a declaration of support to link existing or future genomic databanks across the EU, through the Million European Genomes Alliance.
So how do we address evolving challenges in analysis, sharing, and storage of information, ensure transparency and confidentiality, and restore public trust? We must articulate a clear Social Contract, where citizens (as data donors) are at the heart of decision-making. We need to demonstrate integrity, honesty, and transparency as to what happens to data and what level of control people can, or cannot, expect. We must embed ethical rigour in all our data-driven processes. The Framework for Responsible Sharing of Genomic and Health-Related Data represents a practical global approach, promoting effective and ethical sharing and use of research or patient data, while safeguarding individual privacy through secure and accountable data transfer…(More)”.

Americans Want to Share Their Medical Data. So Why Can’t They?


Eleni Manis at RealClearHealth: “Americans are willing to share personal data — even sensitive medical data — to advance the common good. A recent Stanford University study found that 93 percent of medical trial participants in the United States are willing to share their medical data with university scientists and 82 percent are willing to share with scientists at for-profit companies. In contrast, less than a third are concerned that their data might be stolen or used for marketing purposes.

However, the majority of regulations surrounding medical data focus on individuals’ ability to restrict the use of their medical data, with scant attention paid to supporting the ability to share personal data for the common good. Policymakers can begin to right this balance by establishing a national medical data donor registry that lets individuals contribute their medical data to support research after their deaths. Doing so would help medical researchers pursue cures and improve health care outcomes for all Americans.

Increased medical data sharing facilitates advances in medical science in three key ways. First, de-identified participant-level data can be used to understand the results of trials, enabling researchers to better explicate the relationship between treatments and outcomes. Second, researchers can use shared data to verify studies and identify cases of data fraud and research misconduct in the medical community. For example, one researcher recently discovered a prolific Japanese anesthesiologist had falsified data for almost two decades. Third, shared data can be combined and supplemented to support new studies and discoveries.

Despite these benefits, researchers, research funders, and regulators have struggled to establish a norm for sharing clinical research data. In some cases, regulatory obstacles are to blame. HIPAA — the federal law regulating medical data — blocks some sharing on grounds of patient privacy, while federal and state regulations governing data sharing are inconsistent. Researchers themselves have a proprietary interest in data they produce, while academic researchers seeking to maximize publications may guard data jealously.

Though funding bodies are aware of this tension, they are unable to resolve it on their own. The National Institutes of Health, for example, requires a data sharing plan for big-ticket funding but recognizes that proprietary interests may make sharing impossible….(More)”.

Reclaiming the Smart City: Personal Data, Trust and the New Commons


Report by Theo Bass, Emma Sutherland and Tom Symons: “Cities are becoming a major focal point in the personal data economy. In city governments, there is a clamour for data-informed approaches to everything from waste management and public transport through to policing and emergency response.

This is a triumph for advocates of the better use of data in how we run cities. After years of making the case, there is now a general acceptance that social, economic and environmental pressures can be better responded to by harnessing data.

But as that argument is won, a fresh debate is bubbling up under the surface of the glossy prospectus of the smart city: who decides what we do with all this data, and how do we ensure that its generation and use does not result in discrimination, exclusion and the erosion of privacy for citizens?

This report brings together a range of case studies featuring cities which have pioneered innovative practices and policies around the responsible use of data about people. Our methods combined desk research and over 20 interviews with city administrators in a number of cities across the world.

Recommendations

Based on our case studies, we also compile a range of lessons that policymakers can use to build an alternative version of the smart city – one which promotes ethical data collection practices and responsible innovation with new technologies:

  1. Build consensus around clear ethical principles, and translate them into practical policies.
  2. Train public sector staff in how to assess the benefits and risks of smart technologies.
  3. Look outside the council for expertise and partnerships, including with other city governments.
  4. Find and articulate the benefits of privacy and digital ethics to multiple stakeholders.
  5. Become a test-bed for new services that give people more privacy and control.
  6. Make time and resources available for genuine public engagement on the use of surveillance technologies.
  7. Build digital literacy and make complex or opaque systems more understandable and accountable.
  8. Find opportunities to involve citizens in the process of data collection and analysis from start to finish….(More)”.

Big Data Is Getting Bigger. So Are the Privacy and Ethical Questions.


Goldie Blumenstyk at The Chronicle of Higher Education: “…The next step in using “big data” for student success is upon us. It’s a little cool. And also kind of creepy.

This new approach goes beyond the tactics now used by hundreds of colleges, which depend on data collected from sources like classroom teaching platforms and student-information systems. It not only makes a technological leap; it also raises issues around ethics and privacy.

Here’s how it works: Whenever you log on to a wireless network with your cellphone or computer, you leave a digital footprint. Move from one building to another while staying on the same network, and that network knows how long you stayed and where you went. That data is collected continuously and automatically from the network’s various nodes.
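As a rough illustration of the kind of processing involved, the sketch below (not Degree Analytics' actual pipeline; the log format, field names and access-point-to-building mapping are assumptions) turns raw Wi-Fi association records into per-building dwell times:

```python
# A minimal sketch: convert Wi-Fi association logs (device, access point, timestamp)
# into per-building dwell times. All field names and mappings are hypothetical.
from collections import defaultdict
from datetime import datetime

AP_TO_BUILDING = {"ap-lib-01": "Library", "ap-lounge-02": "Student Lounge"}  # hypothetical

def dwell_times(log_rows):
    """log_rows: iterable of (device_id, ap_name, iso_timestamp), sorted by time."""
    totals = defaultdict(lambda: defaultdict(float))   # device -> building -> seconds
    last_seen = {}                                     # device -> (building, datetime)
    for device, ap, ts in log_rows:
        building = AP_TO_BUILDING.get(ap, "Unknown")
        t = datetime.fromisoformat(ts)
        if device in last_seen:
            prev_building, prev_t = last_seen[device]
            # attribute the elapsed time to the building the device was last seen in
            totals[device][prev_building] += (t - prev_t).total_seconds()
        last_seen[device] = (building, t)
    return totals
```

Aggregated over a semester, output of this kind is what lets a campus see which spaces are heavily used and which sit empty.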

Now, with the help of a company called Degree Analytics, a few colleges are beginning to use location data collected from students’ cellphones and laptops as they move around campus. Some colleges are using it to improve the kind of advice they might send to students, like a text-message reminder to go to class if they’ve been absent.

Others see it as a tool for making decisions on how to use their facilities. St. Edward’s University, in Austin, Tex., used the data to better understand how students were using its computer-equipped spaces. It found that a renovated lounge, with relatively few computers but with Wi-Fi access and several comfy couches, was one of the most popular such sites on campus. Now the university knows it may not need to buy as many computers as it once thought.

As Gary Garofalo, a co-founder and chief revenue officer of Degree Analytics, told me, “the network data has very intriguing advantages” over the forms of data that colleges now collect.

Some of those advantages are obvious: If you’ve got automatic information on every person walking around with a cellphone, your dataset is more complete than if you need to extract it from a learning-management system or from the swipe-card readers some colleges use to track students’ activities. Many colleges now collect such data to determine students’ engagement with their coursework and campus activities.

Of course, the 24-7 reporting of the data is also what makes this approach seem kind of creepy….

I’m not the first to ask questions like this. A couple of years ago, a group of educators organized by Martin Kurzweil of Ithaka S+R and Mitchell Stevens of Stanford University issued a series of guidelines for colleges and companies to consider as they began to embrace data analytics. Among other principles, the guidelines highlighted the importance of being transparent about how the information is used, and ensuring that institutions’ leaders really understand what companies are doing with the data they collect. Experts at New America weighed in too.

I asked Kurzweil what he makes of the use of Wi-Fi information. Location tracking tends toward the “dicey” side of the spectrum, he says, though perhaps not as far out as using students’ social-media habits, health information, or what they check out from the library. The fundamental question, he says, is “how are they managing it?”… So is this the future? Benz, at least, certainly hopes so. Inspired by the Wi-Fi-based StudentLife research project at Dartmouth College and the experiences Purdue University is having with students’ use of its Forecast app, he’s in talks now with a research university about a project that would generate other insights that might be gleaned from students’ Wi-Fi-usage patterns….(More)”.

‘Mayor for a Day’ – Is Gamified Urban Management the Way Forward?


Paper by Gianluca Sgueo: “…aims at describing the use, exploring the potential – but also at understanding the limits – of ‘gamification’ strategies in urban management. Commonly defined as the introduction of game-design elements into non-game contexts, with the former aimed at making the latter more fun, gamification is recognised among the technological paradigms that are shaping the evolution of public administrations.

The paper is divided into three sections.

SECTION I discusses the definition (and appropriateness of) gamification in urban management, and locates it conceptually at the crossroads between nudging, democratic innovations, and crowdsourcing.

SECTION II analyses the potentials of gamified urban management. Four benefits are assessed: first, gamified urban management seems to encourage adaptation of policy-making to structural/societal changes; second, it offers a chance to local administrators to (re-)gain trust from citizens, and thus be perceived as legitimate; third, it adapts policy-making to budgetary challenges; fourth, it helps to efficiently tackle complex regulatory issues.

SECTION III of this paper considers the risks related to the use of gamification in urban management. The first consists of the obstacles faced by participatory rights within gamified policies; the second is defined as the ‘paradox of incentives’; the third relates to privacy issues. In the concluding section, this paper advances some proposals (or, alternatively, highlights valuable theoretical and empirical research efforts) aimed at solving some of the most pressing threats posed by gamified urban management.

The main features of the case studies described in SECTIONS II and III are summarised in a table at the end of the paper….(More)”.

Algorithms are taking over – and woe betide anyone they class as a ‘deadbeat’


Zoe Williams at The Guardian: “The radical geographer and equality evangelist Danny Dorling tried to explain to me once why an algorithm could be bad for social justice.

Imagine if email inboxes became intelligent: your messages would be prioritised on arrival, so if the recipient knew you and often replied to you, you’d go to the top; I said that was fine. That’s how it works already. If they knew you and never replied, you’d go to the bottom, he continued. I said that was fair – it would teach me to stop annoying that person.

If you were a stranger, but typically other people replied to you very quickly – let’s say you were Barack Obama – you’d sail right to the top. That seemed reasonable. And if you were a stranger who others usually ignored, you’d fall off the face of the earth.

“Well, maybe they should get an allotment and stop emailing people,” I said.

“Imagine how angry those people would be,” Dorling said. “They already feel invisible and they [would] become invisible by design.”…
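The thought experiment is easy to make concrete. Here is a minimal sketch of the ranking rule Dorling describes; it is purely hypothetical, implies no real mail product, and the data shapes are invented for illustration:

```python
# Hypothetical sketch of the "intelligent inbox" thought experiment: rank senders by
# how often recipients have historically replied to them, so habitually ignored
# senders sink out of sight. Data shapes are invented for illustration.
def reply_rate(sender, history):
    """history: dict of sender -> (emails_sent, replies_received), across all recipients."""
    sent, replied = history.get(sender, (0, 0))
    return replied / sent if sent else 0.0   # strangers nobody answers score zero

def rank_inbox(messages, history):
    """messages: list of (sender, subject). Highest reply-rate senders float to the top."""
    return sorted(messages, key=lambda m: reply_rate(m[0], history), reverse=True)
```

Nothing in the rule targets anyone in particular, yet anyone with a low historical reply rate is pushed to the bottom of every new inbox as well: invisible by design.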

All our debates about the use of big data have centred on privacy, and all seem a bit distant: I care, in principle, whether or not Ocado knows what I bought on Amazon. But in my truest heart, I don’t really care whether or not my Frube vendor knows that I also like dystopian fiction of the 1970s.

I do, however, care that a program exists that will determine my eligibility for a loan by how often I call my mother. I care if landlords are using tools to rank their tenants by compliant behaviour, to create a giant, shared platform of desirable tenants, who never complain about black mould and greet each rent increase with a basket of muffins. I care if the police in Durham are using Experian credit scores to influence their custodial decisions, an example – as you may have guessed by its specificity – that is already real. I care that the same credit-rating company has devised a Mosaic score, which splits households into comically bigoted stereotypes: if your name is Liam and you are an “avid texter”, that puts you in “disconnected youth”, while if you’re Asha you’re in “crowded kaleidoscope”. It’s not a privacy issue so much as a profiling one, although, as anyone who has ever been the repeated victim of police stop-and-search could have told me years ago, these are frequently the same thing.

Privacy isn’t the right to keep secrets: it’s the right to be an individual, not a type; the right to make a choice that’s entirely your own; the right to be private….(More)”.

The Case for Accountability: How it Enables Effective Data Protection and Trust in the Digital Society


Centre for Information Policy Leadership: “Accountability now has broad international support and has been adopted in many laws, including in the EU General Data Protection Regulation (GDPR), regulatory policies and organisational practices. It is essential that there is consensus and clarity on the precise meaning and application of organisational accountability among all stakeholders, including organisations implementing accountability and data protection authorities (DPAs) overseeing accountability.

Without such consensus, organisations will not know what DPAs expect of them and DPAs will not know how to assess organisations’ accountability-based privacy programs with any degree of consistency and predictability. Thus, drawing from the global experience with accountability to date and from the Centre for Information Policy Leadership’s (CIPL) own extensive prior work on accountability, this paper seeks to explain the following issues:

  • The concept of organisational accountability and how it is reflected in the GDPR;
  • The essential elements of accountability and how the requirements of the GDPR (and of other normative frameworks) map to these elements;
  • Global acceptance and adoption of accountability;
  • How organisations can implement accountability (including by and between controllers and processors) through comprehensive internal privacy programs that implement external rules or the organisation’s own data protection policies and goals, or through verified or certified accountability mechanisms, such as Binding Corporate Rules (BCR), APEC Cross-Border Privacy Rules (CBPR), APEC Privacy Recognition for Processors (PRP), other seals and certifications, including future GDPR certifications and codes of conduct; and
  • The benefits that accountability can deliver to each stakeholder group.

In addition, the paper argues that accountability exists along a spectrum, ranging from basic accountability requirements imposed by law (such as under the GDPR) to stronger and more granular accountability measures that may not be required by law but that organisations may nevertheless want to implement because they convey substantial benefits….(More)”.

The Data Transfer Project


About: “The Data Transfer Project was formed in 2017 to create an open-source, service-to-service data portability platform so that all individuals across the web could easily move their data between online service providers whenever they want.

The contributors to the Data Transfer Project believe portability and interoperability are central to innovation. Making it easier for individuals to choose among services facilitates competition, empowers individuals to try new services and enables them to choose the offering that best suits their needs.

Current contributors include Facebook, Google, Microsoft and Twitter.

Individuals have many reasons to transfer data, but we want to highlight a few examples that demonstrate the additional value of service-to-service portability.

  • A user discovers a new photo printing service offering beautiful and innovative photo book formats, but their photos are stored in their social media account. With the Data Transfer Project, they could visit a website or app offered by the photo printing service and initiate a transfer directly from their social media platform to the photo book service.
  • A user doesn’t agree with the privacy policy of their music service. They want to stop using it immediately, but don’t want to lose the playlists they have created. Using this open-source software, they could use the export functionality of the original Provider to save a copy of their playlists to the cloud. This enables them to import the lists to a new Provider, or multiple Providers, once they decide on a new service.
  • A large company is getting requests from customers who would like to import data from a legacy Provider that is going out of business. The legacy Provider has limited options for letting customers move their data. The large company writes an Adapter for the legacy Provider’s Application Program Interfaces (APIs) that permits users to transfer data to their service, also benefiting other Providers that handle the same data type.
  • A user in a low bandwidth area has been working with an architect on drawings and graphics for a new house. At the end of the project, they both want to transfer all the files from a shared storage system to the user’s cloud storage drive. They go to the cloud storage Data Transfer Project User Interface (UI) and move hundreds of large files directly, without straining their bandwidth.
  • An industry association for supermarkets wants to allow customers to transfer their loyalty card data from one member grocer to another, so they can get coupons based on buying habits between stores. The Association would do this by hosting an industry-specific Host Platform of DTP.

The innovation in each of these examples lies behind the scenes: Data Transfer Project makes it easy for Providers to allow their customers to interact with their data in ways their customers would expect. In most cases, the direct-data transfer experience will be branded and managed by the receiving Provider, and the customer wouldn’t need to see DTP branding or infrastructure at all….
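Conceptually, each Provider contributes a pair of adapters, one that exports a given data type through its own public API and one that imports it, and the platform wires them together. The sketch below illustrates that adapter pattern in spirit only; it is not the Data Transfer Project's actual interface (the real project is open source and defines its own adapter abstractions), and all names and signatures here are hypothetical.

```python
# Conceptual sketch of a service-to-service adapter pair; NOT the Data Transfer
# Project's real API. All class and method names are hypothetical.
from abc import ABC, abstractmethod

class PhotoExporter(ABC):
    @abstractmethod
    def export_album(self, auth_token: str, album_id: str) -> list[bytes]:
        """Pull photos from the source provider via its public API."""

class PhotoImporter(ABC):
    @abstractmethod
    def import_album(self, auth_token: str, title: str, photos: list[bytes]) -> None:
        """Push photos into the destination provider via its public API."""

def transfer(exporter: PhotoExporter, importer: PhotoImporter,
             src_token: str, dst_token: str, album_id: str, title: str) -> None:
    """Service-to-service copy: data flows provider to provider, not via the user's device."""
    photos = exporter.export_album(src_token, album_id)
    importer.import_album(dst_token, title, photos)
```

In a model like this, supporting a new provider or data type means writing one exporter or importer, after which transfers to and from every other participating service follow the same path.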

To get a more in-depth understanding of the project, its fundamentals and the details involved, please download “Data Transfer Project Overview and Fundamentals”….(More)”.

How Mobile Network Operators Can Help Achieve the Sustainable Development Goals Profitably


Press Release: “Today, the Digital Impact Alliance (DIAL) released its second paper in a series focused on the promise of data for development (D4D). The paper, Leveraging Data for Development to Achieve Your Triple Bottom Line: Mobile Network Operators with Advanced Data for Good Capabilities See Stronger Impact to Profits, People and the Planet, will be presented at GSMA’s Mobile 360 Africa in Kigali.

“The mobile industry has already taken a driving seat in helping reach the Sustainable Development Goals by 2030 and this research reinforces the role mobile network operators in lower-income economies can play to leverage their network data for development and build a new data business safely and securely,” said Kate Wilson, CEO of the Digital Impact Alliance. “Mobile network operators (MNOs) hold unique data on customers’ locations and behaviors that can help development efforts. They have been reluctant to share data because there are inherent business risks and to do so has been expensive and time-consuming. DIAL’s research illustrates a path forward for MNOs, showing which data is useful to achieve the SDGs and why acting now is critical to building a long-term data business.”

DIAL worked with Altai Consulting on both primary and secondary research to inform this latest paper.  Primary research included one-on-one in-depth interviews with more than 50 executives across the data for development value chain, including government officials, civil society leaders, mobile network operators and other private sector representatives from both developed and emerging markets. These interviews help inform how operators can best tap into the shared value creation opportunities data for development provides.

Key findings from the in-depth interviews include:

  • There are several critical barriers that have prevented scaled use of mobile data for social good – including 1) unclear market opportunities, 2) not enough collaboration among MNOs, governments and non-profit stakeholders and 3) regulatory and privacy concerns;
  • While it may be an ideal time for MNOs to increase their involvement in D4D efforts given the unique data they have that can inform development, market shifts suggest the window of opportunity to implement large-scale D4D initiatives will likely not remain open for much longer;
  • Mobile Network Operators with advanced data for good capabilities will have the most success in establishing sustainable D4D efforts and, as a result, achieving triple bottom line mandates; and
  • Mobile Network Operators should focus on providing value-added insights and services rather than raw data and drive pricing and product innovation to meet the sector’s needs.

“Private sector data availability to drive public sector decision-making is a critical enabler for meeting SDG targets,” said Syed Raza, Senior Director of the Data for Development Team at the Digital Impact Alliance.  “Our data for development paper series aims to elevate the efforts of our industry colleagues with the information, insights and tools they need to help drive ethical innovation in this space….(More)”.