Origin Privacy: Protecting Privacy in the Big-Data Era

Paper by Helen Nissenbaum, Sebastian Benthall, Anupam Datta, Michael Carl Tschantz, and Piotr Mardziel: “Machine learning over big data poses challenges for our conceptualization of privacy. Such techniques can discover surprising and counterintuitive associations that take innocent-looking data and turn it into important inferences about a person. For example, the buying of carbon monoxide monitors has been linked to paying credit card bills, while buying chrome-skull car accessories predicts not doing so. Also, Target may have used the buying of scent-free hand lotion and vitamins as a sign that the buyer is pregnant. If we take pregnancy status to be private and assume that we should prohibit the sharing of information that can reveal that fact, then we have created an unworkable notion of privacy, one in which sharing any scrap of data may violate privacy.

Prior technical specifications of privacy depend on the classification of certain types of information as private or sensitive; privacy policies in these frameworks limit access to data that allow inference of this sensitive information. As the above examples show, today’s data-rich world creates a new kind of problem: it is difficult if not impossible to guarantee that information does not allow inference of sensitive topics. This makes information flow rules based on information topic unstable.

We address the problem of providing a workable definition of private data that takes into account emerging threats to privacy from large-scale data collection systems. We build on Contextual Integrity and its claim that privacy is appropriate information flow, or flow according to socially or legally specified rules.

As in other adaptations of Contextual Integrity (CI) to computer science, the parameterization of social norms in CI is translated into a logical specification. In this work, we depart from CI by considering rules that restrict information flow based on its origin and provenance, instead of on its type, topic, or subject.

We call this concept of privacy as adherence to origin-based rules Origin Privacy. Origin Privacy rules can be found in some existing data protection laws. This motivates the computational implementation of origin-based rules for the simple purpose of compliance engineering. We also formally model origin privacy to determine what security properties it guarantees relative to the concerns that motivate it….(More)”.
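To make the contrast between topic-based and origin-based rules concrete, here is a minimal sketch in Python. It is not the paper's formal model: the `Datum` class, the origin labels, the recipient names, and the `ALLOWED_FLOWS` table are all hypothetical, chosen only to show a flow decision that consults where data came from and never what it might reveal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    value: str
    origin: str  # where the data was first collected (hypothetical label)
    topic: str   # what the data is about; deliberately unused by the rule


# Hypothetical origin-based policy: each origin maps to the recipients
# that are allowed to receive data collected there.
ALLOWED_FLOWS = {
    "medical_record": {"physician"},
    "retail_purchase": {"physician", "advertiser"},
}


def flow_permitted(datum: Datum, recipient: str) -> bool:
    """Decide a flow from the datum's origin alone, ignoring its topic."""
    return recipient in ALLOWED_FLOWS.get(datum.origin, set())


purchase = Datum("scent-free hand lotion", origin="retail_purchase", topic="shopping")
chart = Datum("pregnancy test result", origin="medical_record", topic="pregnancy")

print(flow_permitted(purchase, "advertiser"))  # retail data may flow to advertisers
print(flow_permitted(chart, "advertiser"))     # medical-record data may not
```

In this sketch an unknown origin permits no flows at all. The rule never has to answer the unstable question of whether a purchase could be used to infer pregnancy; it only checks provenance.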

Biometric Mirror

University of Melbourne: “Biometric Mirror exposes the possibilities of artificial intelligence and facial analysis in public space. The aim is to investigate the attitudes that emerge as people are presented with different perspectives on their own, anonymised biometric data distinguished from a single photograph of their face. It sheds light on the specific data that people oppose and approve, the sentiments it evokes, and the underlying reasoning. Biometric Mirror also presents an opportunity to reflect on whether the plausible future of artificial intelligence is a future we want to see take shape.

Big data and artificial intelligence are some of today’s most popular buzzwords. Both are promised to help deliver insights that were previously too complex for computer systems to calculate. With examples ranging from personalised recommendation systems to automatic facial analyses, user-generated data is now analysed by algorithms to identify patterns and predict outcomes. And the common view is that these developments will have a positive impact on society.

Within the realm of artificial intelligence (AI), facial analysis is gaining popularity. Today, CCTV cameras and advertising screens increasingly link with analysis systems that are able to detect emotions, age, gender and demographic information of people passing by. It has proven to increase advertising effectiveness in retail environments, since campaigns can now be tailored to specific audience profiles and situations. But facial analysis models are also being developed to predict your aggression level, sexual preference, life expectancy and likeliness of being a terrorist (or an academic) by simply monitoring surveillance camera footage or analysing a single photograph. Some of these developments have gained widespread media coverage for their innovative nature, but often the ethical and social impact is only a side thought.

Current technological developments approach ethical boundaries of the artificial intelligence age. Facial recognition and analysis in public space raise concerns as people are photographed without prior consent, and their photos disappear into a commercial operator’s infrastructure. It remains unclear how the data is processed, how the data is tailored for specific purposes and how the data is retained or disposed of. People also do not have the opportunity to review or amend their facial recognition data. Perhaps most worryingly, artificial intelligence systems may make decisions or deliver feedback based on the data, regardless of its accuracy or completeness. While facial recognition and analysis may be harmless for tailored advertising in retail environments or to unlock your phone, it quickly pushes ethical boundaries when the general purpose is to more closely monitor society… (More)”.

One of New York City’s most urgent design challenges is invisible

Diana Budds at Curbed: “Algorithms are invisible, but they already play a large role in shaping New York City’s built environment, schooling, public resources, and criminal justice system. Earlier this year, the City Council and Mayor Bill de Blasio formed the Automated Decision Systems Task Force, the first of its kind in the country, to analyze how NYC deploys automated systems to ensure fairness, equity, and accountability are upheld.

This week, 20 experts in the field of civil rights and artificial intelligence co-signed a letter to the task force to help influence its official report, which is scheduled to be published in December 2019.

The letter’s recommendations include creating a publicly accessible list of all the automated decision systems in use; consulting with experts before adopting an automated decision system; creating a permanent government body to oversee the procurement and regulation of automated decision systems; and upholding civil liberties in all matters related to automation. This could lay the groundwork for future legislation around automation in the city….Read the full letter here.”

An Overview of National AI Strategies

Medium Article by Tim Dutton: “The race to become the global leader in artificial intelligence (AI) has officially begun. In the past fifteen months, Canada, China, Denmark, the EU Commission, Finland, France, India, Italy, Japan, Mexico, the Nordic-Baltic region, Singapore, South Korea, Sweden, Taiwan, the UAE, and the UK have all released strategies to promote the use and development of AI. No two strategies are alike, with each focusing on different aspects of AI policy: scientific research, talent development, skills and education, public and private sector adoption, ethics and inclusion, standards and regulations, and data and digital infrastructure.

This article summarizes the key policies and goals of each strategy, as well as related policies and initiatives that have been announced since the release of the initial strategies. It also includes countries that have announced their intention to develop a strategy or have related AI policies in place….(More)”.

Odd Numbers: Algorithms alone can’t meaningfully hold other algorithms accountable

Frank Pasquale at Real Life Magazine: “Algorithms increasingly govern our social world, transforming data into scores or rankings that decide who gets credit, jobs, dates, policing, and much more. The field of “algorithmic accountability” has arisen to highlight the problems with such methods of classifying people, and it has great promise: Cutting-edge work in critical algorithm studies applies social theory to current events; law and policy experts seem to publish new articles daily on how artificial intelligence shapes our lives; and a growing community of researchers has developed a field known as “Fairness, Accountability, and Transparency in Machine Learning.”

The social scientists, attorneys, and computer scientists promoting algorithmic accountability aspire to advance knowledge and promote justice. But what should such “accountability” more specifically consist of? Who will define it? At a two-day, interdisciplinary roundtable on AI ethics I recently attended, such questions featured prominently, and humanists, policy experts, and lawyers engaged in a free-wheeling discussion about topics ranging from robot arms races to computationally planned economies. But at the end of the event, an emissary from a group funded by Elon Musk and Peter Thiel among others pronounced our work useless. “You have no common methodology,” he informed us (apparently unaware that that’s the point of an interdisciplinary meeting). “We have a great deal of money to fund real research on AI ethics and policy”— which he thought of as dry, economistic modeling of competition and cooperation via technology — “but this is not the right group.” He then gratuitously lashed out at academics in attendance as “rent seekers,” largely because we had the temerity to advance distinctive disciplinary perspectives rather than fall in line with his research agenda.

Most corporate contacts and philanthrocapitalists are more polite, but their sense of what is realistic and what is utopian, what is worth studying and what is mere ideology, is strongly shaping algorithmic accountability research in both social science and computer science. This influence in the realm of ideas has powerful effects beyond it. Energy that could be put into better public transit systems is instead diverted to perfect the coding of self-driving cars. Anti-surveillance activism transmogrifies into proposals to improve facial recognition systems to better recognize all faces. To help payday-loan seekers, developers might design data-segmentation protocols to show them what personal information they should reveal to get a lower interest rate. But the idea that such self-monitoring and data curation can be a trap, disciplining the user in ever finer-grained ways, remains less explored. Trying to make these games fairer, the research elides the possibility of rejecting them altogether….(More)”.

Countries Can Learn from France’s Plan for Public Interest Data and AI

Nick Wallace at the Center for Data Innovation: “French President Emmanuel Macron recently endorsed a national AI strategy that includes plans for the French state to make public and private sector datasets available for reuse by others in applications of artificial intelligence (AI) that serve the public interest, such as for healthcare or environmental protection. Although this strategy fails to set out how the French government should promote widespread use of AI throughout the economy, it will nevertheless give a boost to AI in some areas, particularly public services. Furthermore, the plan for promoting the wider reuse of datasets, particularly in areas where the government already calls most of the shots, is a practical idea that other countries should consider as they develop their own comprehensive AI strategies.

The French strategy, drafted by mathematician and Member of Parliament Cédric Villani, calls for legislation to mandate repurposing both public and private sector data, including personal data, to enable public-interest uses of AI by government or others, depending on the sensitivity of the data. For example, public health services could use data generated by Internet of Things (IoT) devices to help doctors better treat and diagnose patients. Researchers could use data captured by motorway CCTV to train driverless cars. Energy distributors could manage peaks and troughs in demand using data from smart meters.

Repurposed data held by private companies could be made publicly available, shared with other companies, or processed securely by the public sector, depending on the extent to which sharing the data presents privacy risks or undermines competition. The report suggests that the government would not require companies to share data publicly when doing so would impact legitimate business interests, nor would it require that any personal data be made public. Instead, Dr. Villani argues that, if wider data sharing would do unreasonable damage to a company’s commercial interests, it may be appropriate to only give public authorities access to the data. But where the stakes are lower, companies could be required to share the data more widely, to maximize reuse. Villani rightly argues that it is virtually impossible to come up with generalizable rules for how data should be shared that would work across all sectors. Instead, he argues for a sector-specific approach to determining how and when data should be shared.

After making the case for state-mandated repurposing of data, the report goes on to highlight four key sectors as priorities: health, transport, the environment, and defense. Since these all have clear implications for the public interest, France can create national laws authorizing extensive repurposing of personal data without violating the General Data Protection Regulation (GDPR), which allows national laws that permit the repurposing of personal data where it serves the public interest. The French strategy is the first clear effort by an EU member state to proactively use this clause in aid of national efforts to bolster AI….(More)”.

How Taiwan’s online democracy may show future of humans and machines

Shuyang Lin at the Sydney Morning Herald: “Taiwanese citizens have spent the past 30 years prototyping future democracy since the lifting of martial law in 1987. Public participation in Taiwan has been developed in several formats, from face-to-face to deliberation over the internet. This trajectory coincides with the advancement of technology, and as new tools arrived, democracy evolved.

The launch of vTaiwan (v for virtual, vote, voice and verb), an experiment that prototypes an open consultation process for the civil society, showed that by using technology creatively humanity can facilitate deep and fair conversations, form collective consensus, and deliver solutions we can all live with.

It is a prototype that helps us envision what future democracy could look like….

Decision-making is not an easy task, especially when it has to do with a larger group of people. Group decision-making could take several protocols, such as mandate, to decide and take questions; advise, to listen before decisions; consent, to decide if no one objects; and consensus, to decide if everyone agrees. So there is a pressing need for us to be able to collaborate in a large-scale decision-making process to update outdated standards and regulations.
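The difference between the last two protocols is easy to blur, so here is a minimal sketch in Python. The `Vote` values and both function names are hypothetical and are not taken from vTaiwan; the sketch only encodes the two definitions given above, with the added assumption that at least one vote must be cast.

```python
from enum import Enum


class Vote(Enum):
    AGREE = "agree"
    ABSTAIN = "abstain"
    OBJECT = "object"


def consent_reached(votes):
    """Consent: the proposal passes if no one objects (abstentions are fine)."""
    votes = list(votes)
    return bool(votes) and all(v is not Vote.OBJECT for v in votes)


def consensus_reached(votes):
    """Consensus: the proposal passes only if everyone actively agrees."""
    votes = list(votes)
    return bool(votes) and all(v is Vote.AGREE for v in votes)


ballots = [Vote.AGREE, Vote.ABSTAIN, Vote.AGREE]
print(consent_reached(ballots))    # no objections, so consent holds
print(consensus_reached(ballots))  # one abstention, so consensus does not
```

Consent is the weaker test: every consensus is also a consent, but a single abstention is enough to break consensus.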

The future of human knowledge is on the web. Technology can help us to learn, communicate, and make better decisions faster and at larger scale. The internet could be the facilitator and AI could be the catalyst. It is extremely important to be aware that decision-making is not a one-off interaction. The most important direction of decision-making technology development is to have it allow humans to be engaged in the process anytime and also have an invitation to request and submit changes.

Humans have started working with computers, and we will continue to work with them. They will help us in the decision-making process and some will even make decisions for us; the actors in collaboration don’t necessarily need to be just humans. While it is up to us to decide what and when to opt in or opt out, we should work together with computers in a transparent, collaborative and inclusive space.

Where shall we go as a society? What do we want from technology? As Audrey Tang, Digital Minister without Portfolio of Taiwan, puts it: “Deliberation — listening to each other deeply, thinking together and working out something that we can all live with — is magical.”…(More)”.

#TrendingLaws: How can Machine Learning and Network Analysis help us identify the “influencers” of Constitutions?

Unicef: “New research by scientists from UNICEF’s Office of Innovation — published today in the journal Nature Human Behaviour — applies methods from network science and machine learning to constitutional law. UNICEF Innovation Data Scientists Alex Rutherford and Manuel Garcia-Herranz collaborated with computer scientists and political scientists at MIT, George Washington University, and UC Merced to apply data analysis to the world’s constitutions over the last 300 years. This work sheds new light on how to better understand why countries’ laws change and incorporate social rights…

Data science techniques allow us to use methods like network science and machine learning to uncover patterns and insights that are hard for humans to see. Just as we can map influential users on Twitter — and patterns of relations between places to predict how diseases will spread — we can identify which countries have influenced each other in the past and what the relations between legal provisions are.

Why The Science of Constitutions?

One way UNICEF fulfills its mission is through advocacy with national governments — to enshrine rights for minorities, notably children, formally in law. Perhaps the most renowned example of this is the International Convention on the Rights of the Child (ICRC).

Constitutions, such as Mexico’s 1917 constitution — the first to limit the employment of children — are critical to formalizing rights for vulnerable populations. National constitutions describe the role of a country’s institutions, its character in the eyes of the world, as well as the rights of its citizens.

From a scientific standpoint, the work is an important first step in showing that network analysis and machine learning techniques can be used to better understand the dynamics of caring for and protecting the rights of children — critical to the work we do in a complex and interconnected world. It shows the significant and positive policy implications of using data science to uphold children’s rights.

What the Research Shows:

Through this research, we uncovered:

  • A network of relationships between countries and their constitutions.
  • A natural progression of laws — where fundamental rights are a necessary precursor to more specific rights for minorities.
  • The effect of key historical events in changing legal norms….(More)”.

The economic value of data: discussion paper

HM Treasury (UK): “Technological change has radically increased both the volume of data in the economy, and our ability to process it. This change presents an opportunity to transform our economy and society for the better.

Data-driven innovation holds the keys to addressing some of the most significant challenges confronting modern Britain, whether that is tackling congestion and improving air quality in our cities, developing ground-breaking diagnosis systems to support our NHS, or making our businesses more productive.

The UK’s strengths in cutting-edge research and the intangible economy make it well-placed to be a world leader, and estimates suggest that data-driven technologies will contribute over £60 billion per year to the UK economy by 2020. Recent events have raised public questions and concerns about the way that data, and particularly personal data, can be collected, processed, and shared with third party organisations.

These are concerns that this government takes seriously. The Data Protection Act 2018 updates the UK’s world-leading data protection framework to make it fit for the future, giving individuals strong new rights over how their data is used. Alongside maintaining a secure, trusted data environment, the government has an important role to play in laying the foundations for a flourishing data-driven economy.

This means pursuing policies that improve the flow of data through our economy, and ensure that those companies who want to innovate have appropriate access to high-quality and well-maintained data.

This discussion paper describes the economic opportunity presented by data-driven innovation, and highlights some of the key challenges that government will need to address, such as: providing clarity around ownership and control of data; maintaining a strong, trusted data protection framework; making effective use of public sector data; driving interoperability and standards; and enabling safe, legal and appropriate data sharing.

Over the last few years, the government has taken significant steps to strengthen the UK’s position as a world leader in data-driven innovation, including by agreeing the Artificial Intelligence Sector Deal, establishing the Geospatial Commission, and making substantial investments in digital skills. The government will build on those strong foundations over the coming months, including by commissioning an Expert Panel on Competition in Digital Markets. This Expert Panel will support the government’s wider review of competition law by considering how competition policy can better enable innovation and support consumers in the digital economy.

There are still big questions to be answered. This document marks the beginning of a wider set of conversations that government will be holding over the coming year, as we develop a new National Data Strategy….(More)”.

To Better Predict Traffic, Look to the Electric Grid

Linda Poon at CityLab: “The way we consume power after midnight can reveal how bad the morning rush hour will be….

Commuters check Google Maps for traffic updates the same way they check the weather app for rain predictions. And for good reasons: By pooling information from millions of drivers already on the road, Google can paint an impressively accurate real-time portrait of congestion. Meanwhile, historical numbers can roughly predict when your morning commutes may be particularly bad.

But “the information we extract from traffic data has been exhausted,” said Zhen (Sean) Qian, who directs the Mobility Data Analytics Center at Carnegie Mellon University. He thinks that to more accurately predict how gridlock varies from day to day, there’s a whole other set of data that cities haven’t mined yet: electricity use.

“Essentially we all use the urban system—the electricity, water, the sewage system and gas—and when people use them and how heavily they do is correlated to the way they use the transportation system,” he said. How we use electricity at night, it turns out, can reveal when we leave for work the next day. “So we might be able to get new information that helps explain travel time one or two hours in advance by having a better understanding of human activity.”

In a recent study in the journal Transportation Research Part C, Qian and his student Pinchao Zhang used 2014 data to demonstrate how electricity usage patterns can predict when peak congestion begins on various segments of a major highway in Austin, Texas—the 14th most congested city in the U.S. They crunched 79 days’ worth of electricity usage data for 322 households (stripped of all private information, including location), feeding it into a machine learning algorithm that then categorized the households into 10 groups according to the time and amount of electricity use between midnight and 6 a.m. By extrapolating the most critical traffic-related information about each group for each day, the model then predicted what the commute may look like that morning.

When compared with 2014 traffic data, they found that 8 out of the 10 patterns had an impact on highway traffic. Households that show a spike of electricity use from midnight to 2 a.m., for example, may be night owls who sleep in, leave late, and likely won’t contribute to the early morning congestion. In contrast, households that report low electricity use from midnight to 5 a.m., followed by a rise after 5:30 a.m., could be early risers who will be on the road during rush hour. If the researchers’ model detects more households falling into the former group, it might predict that peak congestion will start closer to, say, 7:45 a.m. rather than the usual 7:30….(More)”.
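The grouping step can be sketched in a few lines of Python. The article does not name the algorithm the researchers used, so k-means is swapped in here purely as an assumption, and the six hourly readings per household are synthetic numbers invented for illustration, not data from the study. All function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two usage profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Hour-by-hour average of a list of usage profiles."""
    return [sum(hour) / len(profiles) for hour in zip(*profiles)]


def init_centroids(profiles, k):
    """Deterministic start: first profile, then repeatedly the farthest one."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(
            profiles,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(profiles, k, iterations=20):
    """Plain k-means: assign each profile to its nearest centroid, then re-average."""
    centroids = init_centroids(profiles, k)
    labels = [0] * len(profiles)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_profile(members)
    return labels, centroids


# Synthetic hourly usage (kWh) from midnight to 6 a.m., one row per household.
night_owls = [
    [1.2, 1.1, 0.3, 0.2, 0.2, 0.2],
    [1.0, 1.3, 0.4, 0.2, 0.2, 0.3],
    [1.1, 1.0, 0.3, 0.3, 0.2, 0.2],
]
early_risers = [
    [0.2, 0.2, 0.2, 0.2, 0.3, 1.0],
    [0.3, 0.2, 0.2, 0.2, 0.4, 1.2],
    [0.2, 0.3, 0.2, 0.2, 0.3, 1.1],
]
profiles = night_owls + early_risers

labels, centroids = kmeans(profiles, k=2)
late_share = labels.count(labels[0]) / len(labels)  # share in the night-owl cluster
print(labels, late_share)
```

The study used 10 groups over 322 households; two groups and six households are used here only to keep the example readable. A forecasting step would then relate each day's share of households per group to the observed start of peak congestion, which this sketch does not attempt.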