‘Data is a fingerprint’: why you aren’t as anonymous as you think online


Olivia Solon at The Guardian: “In August 2016, the Australian government released an “anonymised” data set comprising the medical billing records, including every prescription and surgery, of 2.9 million people.

Names and other identifying features were removed from the records in an effort to protect individuals’ privacy, but a research team from the University of Melbourne soon discovered that it was simple to re-identify people, and learn about their entire medical history without their consent, by comparing the dataset to other publicly available information, such as reports of celebrities having babies or athletes having surgeries.

The government pulled the data from its website, but not before it had been downloaded 1,500 times.

This privacy nightmare is one of many examples of seemingly innocuous, “de-identified” pieces of information being reverse-engineered to expose people’s identities. And it’s only getting worse as people spend more of their lives online, sprinkling digital breadcrumbs that can be traced back to them to violate their privacy in ways they never expected.

Nameless New York taxi logs were compared with paparazzi shots at locations around the city to reveal that Bradley Cooper and Jessica Alba were bad tippers. In 2017 German researchers were able to identify people based on their “anonymous” web browsing patterns. This week University College London researchers showed how they could identify an individual Twitter user based on the metadata associated with their tweets, while the fitness tracking app Polar revealed the homes and in some cases names of soldiers and spies.

“It’s convenient to pretend it’s hard to re-identify people, but it’s easy. The kinds of things we did are the kinds of things that any first-year data science student could do,” said Vanessa Teague, one of the University of Melbourne researchers who revealed the flaws in the open health data.

One of the earliest examples of this type of privacy violation occurred in 1996 when the Massachusetts Group Insurance Commission released “anonymised” data showing the hospital visits of state employees. As with the Australian data, the state removed obvious identifiers like name, address and social security number. The then governor, William Weld, assured the public that patients’ privacy was protected….(More)”.
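The linkage technique behind these re-identifications is simple enough to sketch. A minimal illustration in Python, using invented toy records (the names, ZIP codes and diagnoses below are made up; real attacks join on exactly these kinds of quasi-identifiers, such as ZIP code, birth date and sex):

```python
# Toy linkage attack: join a "de-identified" medical dataset with a
# public dataset on shared quasi-identifiers. All records are invented.

deidentified_records = [
    {"zip": "02138", "birth_date": "1945-07-31", "sex": "M", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1962-03-12", "sex": "F", "diagnosis": "asthma"},
]

# A public source (e.g. a voter roll) carrying names alongside
# the same quasi-identifiers.
public_records = [
    {"name": "J. Smith", "zip": "02138", "birth_date": "1945-07-31", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def reidentify(deidentified, public):
    """Attach names to 'anonymous' records whenever the quasi-identifiers
    match exactly one person in the public dataset."""
    matches = []
    for med in deidentified:
        key = tuple(med[q] for q in QUASI_IDENTIFIERS)
        candidates = [p for p in public
                      if tuple(p[q] for q in QUASI_IDENTIFIERS) == key]
        if len(candidates) == 1:  # unique match -> re-identification
            matches.append({"name": candidates[0]["name"], **med})
    return matches

print(reidentify(deidentified_records, public_records))
```

The defence, k-anonymity, works by ensuring no combination of quasi-identifiers is unique — which is exactly the property the Australian and Massachusetts releases lacked.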

Is Open Data Working for Women in Africa?


Web Foundation: “Open data has the potential to change politics, economies and societies for the better by giving people more opportunities to engage in the decisions that affect their lives. But to reach the full potential of open data, it must be available to and used by all. Yet, across the globe — and in Africa in particular — there is a significant data gap.

This report — Is open data working for women in Africa? — maps the current state of open data for women across Africa, drawing on country-specific research in Nigeria, Cameroon, Uganda and South Africa, along with additional data from a survey of experts in 12 countries across the continent.

Our findings show that, despite the potential for open data to empower people, it has so far changed little for women living in Africa.

Key findings

  • There is a closed data culture in Africa — Most countries lack an open culture and have legislation and processes that are not gender-responsive. Institutional resistance to disclosing data means few countries have open data policies and initiatives at the national level. In addition, gender equality legislation and policies are incomplete and failing to reduce gender inequalities. And overall, Africa lacks the cross-organisational collaboration needed to strengthen the open data movement.
  • There are barriers preventing women from using the data that is available — Cultural and social realities create additional challenges for women to engage with data and participate in the technology sector. 1GB of mobile data in Africa costs, on average, 10% of average monthly income. This high cost keeps women, who generally earn less than men, offline. Moreover, time poverty, the gender pay gap and unpaid labour create economic obstacles for women to engage with digital technology.
  • Key datasets to support the advocacy objectives of women’s groups are missing — Data on budget, health and crime are largely absent as open data. Nearly all datasets in sub-Saharan Africa (373 out of 375) are closed, and sex-disaggregated data, when available online, is often not published as open data. There are few open data policies to support the opening up of key datasets, and even when they do exist, they largely remain in draft form. With little investment in open data initiatives, good data management practices or the implementation of Right To Information (RTI) reforms, improvement is unlikely.
  • There is no strong base of research on women’s access and use of open data — There is lack of funding, little collaboration and few open data champions. Women’s groups, digital rights groups and gender experts rarely collaborate on open data and gender issues. To overcome this barrier, multi-stakeholder collaborations are essential to develop effective solutions….(More)”.

What if people were paid for their data?


The Economist: ““Data slavery”. Jennifer Lyn Morone, an American artist, thinks this is the state in which most people now live. To get free online services, she laments, they hand over intimate information to technology firms. “Personal data are much more valuable than you think,” she says. To highlight this sorry state of affairs, Ms Morone has resorted to what she calls “extreme capitalism”: she registered herself as a company in Delaware in an effort to exploit her personal data for financial gain. She created dossiers containing different subsets of data, which she displayed in a London gallery in 2016 and offered for sale, starting at £100 ($135). The entire collection, including her health data and social-security number, can be had for £7,000.

Only a few buyers have taken her up on this offer and she finds “the whole thing really absurd”…. Given the current state of digital affairs, in which the collection and exploitation of personal data is dominated by big tech firms, Ms Morone’s approach, in which individuals offer their data for sale, seems unlikely to catch on. But what if people really controlled their data—and the tech giants were required to pay for access? What would such a data economy look like?…

Labour, like data, is a resource that is hard to pin down. Workers were not properly compensated for labour for most of human history. Even once people were free to sell their labour, it took decades for wages to reach liveable levels on average. History won’t repeat itself, but chances are that it will rhyme, Mr Weyl predicts in “Radical Markets”, a provocative new book he has co-written with Eric Posner of the University of Chicago. He argues that in the age of artificial intelligence, it makes sense to treat data as a form of labour.

To understand why, it helps to keep in mind that “artificial intelligence” is something of a misnomer. Messrs Weyl and Posner call it “collective intelligence”: most AI algorithms need to be trained using reams of human-generated examples, in a process called machine learning. Unless they know what the right answers (provided by humans) are meant to be, algorithms cannot translate languages, understand speech or recognise objects in images. Data provided by humans can thus be seen as a form of labour which powers AI. As the data economy grows up, such data work will take many forms. Much of it will be passive, as people engage in all kinds of activities—liking social-media posts, listening to music, recommending restaurants—that generate the data needed to power new services. But some people’s data work will be more active, as they make decisions (such as labelling images or steering a car through a busy city) that can be used as the basis for training AI systems….
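The point that human-labelled examples do the real work can be made concrete with a minimal toy classifier. The reviews and labels below are invented, and the word-count “model” is a deliberate simplification of real machine learning, but the dependence is the same: without the human-supplied labels, the algorithm knows nothing.

```python
# Minimal sketch: an algorithm only "knows" the right answers because
# humans supplied labelled examples. Training data below is invented.
from collections import Counter

# Human "data work": people labelling examples (e.g. rating reviews).
labelled_examples = [
    ("great film loved it", "positive"),
    ("loved the acting great fun", "positive"),
    ("terrible plot hated it", "negative"),
    ("hated every boring minute", "negative"),
]

def train(examples):
    """Count how often each word appears under each human-given label."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def classify(model, text):
    """Pick the label whose training words overlap most with the input."""
    scores = {label: sum(ctr[w] for w in text.split())
              for label, ctr in model.items()}
    return max(scores, key=scores.get)

model = train(labelled_examples)
print(classify(model, "loved it great"))  # every prediction rests on human labels
```

Scale the four examples up to millions and the “passive data work” the authors describe comes into view.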

But much still needs to happen for personal data to be widely considered as labour, and paid for as such. First, the right legal framework will be needed to encourage the emergence of a new data economy. The European Union’s new General Data Protection Regulation, which came into effect in May, already gives people extensive rights to check, download and even delete personal data held by companies. Second, the technology to keep track of data flows needs to become much more capable. Research to calculate the value of particular data to an AI service is in its infancy.
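One approach researchers have explored for that valuation problem is leave-one-out scoring: a contributor’s data is “worth” the accuracy the model loses when their data is withheld. A toy sketch, with invented contributors and a deliberately simple nearest-neighbour model standing in for a real AI service:

```python
# Toy leave-one-out data valuation: a contributor's data is "worth"
# the test accuracy the model loses when their data is removed.
# Contributors, points and the 1-NN "model" are all invented toys.

# Each contributor supplies one labelled point: (feature, label).
contributions = {
    "alice": (1.0, 0),   # alice is the only source of label-0 data
    "bob":   (8.0, 1),
    "carol": (9.0, 1),   # carol's data is redundant with bob's
}
test_set = [(2.0, 0), (7.0, 1)]

def nearest_neighbour_accuracy(train_points, test_points):
    """Accuracy of a 1-nearest-neighbour model trained on train_points."""
    correct = 0
    for x, y in test_points:
        nearest = min(train_points, key=lambda p: abs(p[0] - x))
        correct += (nearest[1] == y)
    return correct / len(test_points)

full_accuracy = nearest_neighbour_accuracy(list(contributions.values()), test_set)
for person in contributions:
    rest = [p for name, p in contributions.items() if name != person]
    value = full_accuracy - nearest_neighbour_accuracy(rest, test_set)
    print(person, value)   # alice 0.5, bob 0.0, carol 0.0
```

Even this toy shows why the problem is hard: identical effort can be worth very different amounts, since redundant data earns nothing under such a scheme.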

Third, and most important, people will have to develop a “class consciousness” as data workers. Most people say they want their personal information to be protected, but then trade it away for nearly nothing, something known as the “privacy paradox”. Yet things may be changing: more than 90% of Americans think being in control of who can get data on them is important, according to the Pew Research Centre, a think-tank….(More)”.

Motivating Bureaucrats through Social Recognition


Evidence from Simultaneous Field Experiments by Varun Gauri, Julian C. Jamison, Nina Mazar, Owen Ozier, Shomikho Raha and Karima Saleh: “Bureaucratic performance is a crucial determinant of economic growth. Little is known about how to improve it in resource-constrained settings.

This study describes a field trial of a social recognition intervention to improve record keeping in clinics in two Nigerian states, replicating the intervention—implemented by a single organization—on bureaucrats performing identical tasks in both states.

Social recognition improved performance in one state but had no effect in the other, highlighting both the potential and the limitations of behavioral interventions. Differences in observables did not explain cross-state differences in impacts, however, illustrating the limitations of observable-based approaches to external validity….(More)”.

Blockchain’s governance paradox


Izabella Kaminska at the Financial Times: “Distributed ledger technologies “are starting to look an awful lot like some of the more conventional technical solutions that we have,” says Vili Lehdonvirta, an associate professor and senior research fellow at the Oxford Internet Institute, in a recent talk at the Alan Turing Institute.

At the heart of the issue (as always) is who dictates and enforces the rules of the system if and when things go wrong, according to Lehdonvirta. He echoes a point we’ve long made, namely, that what really matters in these systems is how they deal with exceptions rather than norms.

The industry’s continuous shifting of nomenclature hints at the inherent challenges and revisionism at hand. As blockchains become DLTs, shared databases and permissioned consensus networks, what the techies working on these systems fail to publicly highlight is that much of the time, “advance” means returning to tried and tested paradigms, or reintroducing trusted or governance-focused nodes.

Admittedly, the “back to square one” solution isn’t unique to blockchain. We see the same pattern playing out across the network/platform industry. For example, Airbnb was built on the notion that peers could organise accommodation for each other bilaterally without any dependence on a centralised manager. As time went on, however, trust issues across the platform — everything from fraud, misrepresentation, bad consumer experience, abuse, vandalism or damage — forced the once proudly employee-light company to load up on staff who could troubleshoot many of these problems. In so doing, Airbnb — much like Ebay before it — transformed itself from a tech company into an adjudicator, value custodian and rules-and-standards authority.

And by and large, that’s not been an unwelcome transformation, from the consumer’s perspective. Indeed, what libertarian tech anarchists often fail to understand is that the public is not opposed to the idea of putting their trust in institutions, especially when they’re operated by real people who can be held accountable for things going wrong.

What they seemingly understand and technologists don’t is this: Trusting other parties to protect, enforce and adjudicate the rules of operation enhances division of labour and thus efficiency. I no longer have to waste hours of time trying to figure out if the counterparty I’ve dealt with on Ebay is trustworthy or not. Ebay governs the platform in such a way that I can be confident failed trades will always be compensated, and that Ebay’s own judgement about compensation entitlement will always be fair. After all, its continuing reputation as an efficient exchange platform depends on it.

But back to blockchain.

As Lehdonvirta observes, the vision of blockchain is of a system which can enforce contracts, prevent double spending, and cap the money supply pool without ceding power to anyone:

No rent-seeking, no abuses of power, no politics — blockchain technologies can be used to create “math-based money” and “unstoppable” contracts that are enforced with the impartiality of a machine instead of the imperfect and capricious human bureaucracy of a state or a bank. This is why so many people are so excited about blockchain: its supposed ability to change economic organization in a way that transforms dominant relationships of power.

The problem which blockchain claims to have solved, in other words, is a rule-enforcement one, not a technological one….(More)”.

Mapping Puerto Rico’s Hurricane Migration With Mobile Phone Data


Martin Echenique and Luis Melgar at CityLab: “It is well known that the U.S. Census Bureau keeps track of state-to-state migration flows. But that’s not the case with Puerto Rico. Most of the publicly known numbers related to the post-Maria diaspora from the island to the continental U.S. were based on estimates, and neither state nor federal institutions kept track of how many Puerto Ricans have left (or returned) after the storm ravaged the entire territory last September.

But Teralytics, a New York-based tech company with offices in Zurich and Singapore, has developed a map that reflects exactly how, when, and where Puerto Ricans have moved between August 2017 and February 2018. They did it by tracking data that was harvested from a sample of nearly 500,000 smartphones in partnership with one major undisclosed U.S. cell phone carrier….
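An analysis of this kind can be sketched in miniature: infer each device’s home region from where it pings most often in a given month, then count changes of home region between months. The devices, months and regions below are invented; Teralytics’ actual pipeline is not public and would involve far more cleaning and aggregation.

```python
# Miniature sketch of inferring migration flows from phone pings:
# a device's "home" in a month is the region where it pinged most often;
# a move is a change of home region between months. Invented toy data.
from collections import Counter

# (device_id, month, region) observations
pings = [
    ("d1", "2017-08", "San Juan"), ("d1", "2017-08", "San Juan"),
    ("d1", "2017-10", "Orlando"),  ("d1", "2017-10", "Orlando"),
    ("d2", "2017-08", "Ponce"),    ("d2", "2017-10", "Ponce"),
]

def home_regions(pings, month):
    """Most-pinged region per device in the given month."""
    counts = {}
    for device, m, region in pings:
        if m == month:
            counts.setdefault(device, Counter())[region] += 1
    return {d: c.most_common(1)[0][0] for d, c in counts.items()}

def flows(pings, before, after):
    """Count device moves between home regions across two months."""
    b, a = home_regions(pings, before), home_regions(pings, after)
    moves = Counter((b[d], a[d]) for d in b if d in a and b[d] != a[d])
    return dict(moves)

print(flows(pings, "2017-08", "2017-10"))  # {('San Juan', 'Orlando'): 1}
```

The same origin-destination counting, applied to half a million devices, is what lets a map show when and where an affected population is moving.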

The usefulness of this kind of geo-referenced data is clear in disaster relief efforts, especially when it comes to developing accurate emergency planning and determining when and where the affected population is moving.

“Generally speaking, people have their phones with them the entire time. This tells you where people are, where they’re going to, coming from, and movement patterns,” said Steven Bellovin, a computer science professor at Columbia University and former chief technologist for the U.S. Federal Trade Commission. “It could be very useful for disaster-relief efforts.”…(More)”.

When Technology Gets Ahead of Society


Tarun Khanna at Harvard Business Review: “Drones, originally developed for military purposes, weren’t approved for commercial use in the United States until 2013. When that happened, it was immediately clear that they could be hugely useful to a whole host of industries—and almost as quickly, it became clear that regulation would be a problem. The new technology raised multiple safety and security issues, there was no consensus on who should write rules to mitigate those concerns, and the knowledge needed to develop the rules didn’t yet exist in many cases. In addition, the little flying robots made a lot of people nervous.

Such regulatory, logistical, and social barriers to adopting novel products and services are very common. In fact, technology routinely outstrips society’s ability to deal with it. That’s partly because tech entrepreneurs are often insouciant about the legal and social issues their innovations birth. Although electric cars are subsidized by the federal government, Tesla has run afoul of state and local regulations because it bypasses conventional dealers to sell directly to consumers. Facebook is only now facing up to major regulatory concerns about its use of data, despite being massively successful with users and advertisers.

It’s clear that even as innovations bring unprecedented comfort and convenience, they also threaten old ways of regulating industries, running a business, and making a living. This has always been true. Thus early cars weren’t allowed to go faster than horses, and some 19th-century textile workers used sledgehammers to attack the industrial machinery they feared would displace them. New technology can even upend social norms: Consider how dating apps have transformed the way people meet.

Entrepreneurs, of course, don’t really care that the problems they’re running into are part of a historical pattern. They want to know how they can manage—and shorten—the period between the advent of a technology and the emergence of the rules and new behaviors that allow society to embrace its possibilities.

Interestingly, the same institutional murkiness that pervades nascent industries such as drones and driverless cars is something I’ve also seen in developing countries. And strange though this may sound, I believe that tech entrepreneurs can learn a lot from businesspeople who have succeeded in the world’s emerging markets.

Entrepreneurs in Brazil or Nigeria know that it’s pointless to wait for the government to provide the institutional and market infrastructure their businesses need, because that will simply take too long. They themselves must build support structures to compensate for what Krishna Palepu and I have referred to in earlier writings as “institutional voids.” They must create the conditions that will allow them to create successful products or services.

Tech-forward entrepreneurs in developed economies may want to believe that it’s not their job to guide policy makers and the public—but the truth is that nobody else can play that role. They may favor hardball tactics, getting ahead by evading rules, co-opting regulators, or threatening to move overseas. But in the long term, they’d be wiser to use soft power, working with a range of partners to co-create the social and institutional fabric that will support their growth—as entrepreneurs in emerging markets have done….(More)”.

AI Nationalism


Blog by Ian Hogarth: “The central prediction I want to make and defend in this post is that continued rapid progress in machine learning will drive the emergence of a new kind of geopolitics; I have been calling it AI Nationalism. Machine learning is an omni-use technology that will come to touch all sectors and parts of society.

The transformation of both the economy and the military by machine learning will create instability at the national and international level forcing governments to act. AI policy will become the single most important area of government policy. An accelerated arms race will emerge between key countries and we will see increased protectionist state action to support national champions, block takeovers by foreign firms and attract talent. I use the example of Google, DeepMind and the UK as a specific example of this issue.

This arms race will potentially speed up the pace of AI development and shorten the timescale for getting to AGI. Although there will be many common aspects to this techno-nationalist agenda, there will also be important state-specific policies. There is a difference between predicting that something will happen and believing this is a good thing. Nationalism is a dangerous path, particularly when the international order and international norms will be in flux as a result. In the concluding section I discuss how a period of AI Nationalism might transition to one of global cooperation where AI is treated as a global public good….(More)”.

Data Protection and e-Privacy: From Spam and Cookies to Big Data, Machine Learning and Profiling


Chapter by Lilian Edwards in L Edwards ed Law, Policy and the Internet (Hart, 2018): “In this chapter, I examine in detail how data subjects are tracked, profiled and targeted by their activities online and, increasingly, in the “offline” world as well. Tracking is part of both commercial and state surveillance, but in this chapter I concentrate on the former. The European law relating to spam, cookies, online behavioural advertising (OBA), machine learning (ML) and the Internet of Things (IoT) is examined in detail, using both the GDPR and the forthcoming draft ePrivacy Regulation. The chapter concludes by examining both code and law solutions which might find a way forward to protect user privacy and still enable innovation, by looking to paradigms not based around consent, and less likely to rely on a “transparency fallacy”. Particular attention is drawn to the new work around Personal Data Containers (PDCs) and distributed ML analytics….(More)”.

Why Do We Care So Much About Privacy?


Louis Menand in The New Yorker: “…Possibly the discussion is using the wrong vocabulary. “Privacy” is an odd name for the good that is being threatened by commercial exploitation and state surveillance. Privacy implies “It’s nobody’s business,” and that is not really what Roe v. Wade is about, or what the E.U. regulations are about, or even what Katz and Carpenter are about. The real issue is the one that Pollak and Martin, in their suit against the District of Columbia in the Muzak case, said it was: liberty. This means the freedom to choose what to do with your body, or who can see your personal information, or who can monitor your movements and record your calls—who gets to surveil your life and on what grounds.

As we are learning, the danger of data collection by online companies is not that they will use it to try to sell you stuff. The danger is that that information can so easily fall into the hands of parties whose motives are much less benign. A government, for example. A typical reaction to worries about the police listening to your phone conversations is the one Gary Hart had when it was suggested that reporters might tail him to see if he was having affairs: “You’d be bored.” They were not, as it turned out. We all may underestimate our susceptibility to persecution. “We were just talking about hardwood floors!” we say. But authorities who feel emboldened by the promise of a Presidential pardon or by a Justice Department that looks the other way may feel less inhibited about invading the spaces of people who belong to groups that the government has singled out as unpatriotic or undesirable. And we now have a government that does that….(More)”.