Countries Can Learn from France’s Plan for Public Interest Data and AI


Nick Wallace at the Center for Data Innovation: “French President Emmanuel Macron recently endorsed a national AI strategy that includes plans for the French state to make public and private sector datasets available for reuse by others in applications of artificial intelligence (AI) that serve the public interest, such as for healthcare or environmental protection. Although this strategy fails to set out how the French government should promote widespread use of AI throughout the economy, it will nevertheless give a boost to AI in some areas, particularly public services. Furthermore, the plan for promoting the wider reuse of datasets, particularly in areas where the government already calls most of the shots, is a practical idea that other countries should consider as they develop their own comprehensive AI strategies.

The French strategy, drafted by mathematician and Member of Parliament Cédric Villani, calls for legislation to mandate repurposing both public and private sector data, including personal data, to enable public-interest uses of AI by government or others, depending on the sensitivity of the data. For example, public health services could use data generated by Internet of Things (IoT) devices to help doctors better treat and diagnose patients. Researchers could use data captured by motorway CCTV to train driverless cars. Energy distributors could manage peaks and troughs in demand using data from smart meters.

Repurposed data held by private companies could be made publicly available, shared with other companies, or processed securely by the public sector, depending on the extent to which sharing the data presents privacy risks or undermines competition. The report suggests that the government would not require companies to share data publicly when doing so would impact legitimate business interests, nor would it require that any personal data be made public. Instead, Dr. Villani argues that, if wider data sharing would do unreasonable damage to a company’s commercial interests, it may be appropriate to only give public authorities access to the data. But where the stakes are lower, companies could be required to share the data more widely, to maximize reuse. Villani rightly argues that it is virtually impossible to come up with generalizable rules for how data should be shared that would work across all sectors. Instead, he argues for a sector-specific approach to determining how and when data should be shared.

After making the case for state-mandated repurposing of data, the report goes on to highlight four key sectors as priorities: health, transport, the environment, and defense. Since these all have clear implications for the public interest, France can create national laws authorizing extensive repurposing of personal data without violating the General Data Protection Regulation (GDPR), which allows national laws that permit the repurposing of personal data where it serves the public interest. The French strategy is the first clear effort by an EU member state to proactively use this clause in aid of national efforts to bolster AI….(More)”.

How Taiwan’s online democracy may show future of humans and machines


Shuyang Lin at the Sydney Morning Herald: “Taiwanese citizens have spent the 30 years since the lifting of martial law in 1987 prototyping the future of democracy. Public participation in Taiwan has developed in several formats, from face-to-face meetings to deliberation over the internet. This trajectory coincides with the advancement of technology, and as new tools arrived, democracy evolved.

The launch of vTaiwan (v for virtual, vote, voice and verb), an experiment that prototypes an open consultation process for civil society, showed that by using technology creatively, humanity can facilitate deep and fair conversations, form collective consensus, and deliver solutions we can all live with.

It is a prototype that helps us envision what future democracy could look like….

Decision-making is not an easy task, especially when it involves a large group of people. Group decision-making can follow several protocols: mandate, where one person decides and takes questions; advise, where the decider listens before deciding; consent, where a proposal passes if no one objects; and consensus, where a proposal passes only if everyone agrees. So there is a pressing need to be able to collaborate in large-scale decision-making processes to update outdated standards and regulations.
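
These protocols are simple enough to state as code. Below is a minimal sketch in Python, assuming votes of yes, no, or abstain have already been collected; the Vote type and function names are illustrative and are not part of vTaiwan's actual tooling.

```python
from enum import Enum

class Vote(Enum):
    YES = "yes"
    NO = "no"
    ABSTAIN = "abstain"

def consent(votes: list[Vote]) -> bool:
    # Consent: the proposal passes as long as no one objects.
    return all(v != Vote.NO for v in votes)

def consensus(votes: list[Vote]) -> bool:
    # Consensus: the proposal passes only if everyone actively agrees.
    return all(v == Vote.YES for v in votes)

# Mandate and advise are about who decides rather than how votes are
# tallied: under mandate a designated decider's choice settles the
# question; under advise the decider must hear the group, then choose.

votes = [Vote.YES, Vote.YES, Vote.ABSTAIN]
print(consent(votes))    # True: nobody objected
print(consensus(votes))  # False: one participant abstained
```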

The future of human knowledge is on the web. Technology can help us learn, communicate, and make better decisions faster and at larger scale. The internet could be the facilitator and AI could be the catalyst. It is extremely important to be aware that decision-making is not a one-off interaction. The most important direction for decision-making technology is to let humans stay engaged in the process at any time, and to invite them to request and submit changes.

Humans have started working with computers, and we will continue to work with them. They will help us in the decision-making process and some will even make decisions for us; the actors in collaboration don’t necessarily need to be just humans. While it is up to us to decide what and when to opt in or opt out, we should work together with computers in a transparent, collaborative and inclusive space.

Where shall we go as a society? What do we want from technology? As Audrey Tang, Digital Minister without Portfolio of Taiwan, puts it: “Deliberation — listening to each other deeply, thinking together and working out something that we can all live with — is magical.”…(More)”.

#TrendingLaws: How can Machine Learning and Network Analysis help us identify the “influencers” of Constitutions?


Unicef: “New research by scientists from UNICEF’s Office of Innovation — published today in the journal Nature Human Behaviour — applies methods from network science and machine learning to constitutional law.  UNICEF Innovation Data Scientists Alex Rutherford and Manuel Garcia-Herranz collaborated with computer scientists and political scientists at MIT, George Washington University, and UC Merced to apply data analysis to the world’s constitutions over the last 300 years. This work sheds new light on how to better understand why countries’ laws change and incorporate social rights…

Data science techniques allow us to use methods like network science and machine learning to uncover patterns and insights that are hard for humans to see. Just as we can map influential users on Twitter — and patterns of relations between places to predict how diseases will spread — we can identify which countries have influenced each other in the past and what are the relations between legal provisions.
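
As a rough illustration of the method, the sketch below builds a toy influence network with networkx and ranks constitutions with PageRank, the same family of techniques used to find influential Twitter users. The edges are invented for illustration and are not taken from the study's data.

```python
import networkx as nx

# Toy influence network: an edge A -> B means provisions from
# constitution A later appear in constitution B. These edges are
# illustrative placeholders, not findings from the paper.
G = nx.DiGraph()
G.add_edges_from([
    ("USA 1789", "Mexico 1917"),
    ("France 1791", "Mexico 1917"),
    ("Mexico 1917", "Weimar Germany 1919"),
    ("Weimar Germany 1919", "India 1950"),
    ("USA 1789", "India 1950"),
])

# Reversing the graph makes influence flow *into* the influencers,
# so PageRank scores how widely each constitution is drawn upon.
influence = nx.pagerank(G.reverse())
for name, score in sorted(influence.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```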

Why The Science of Constitutions?

One way UNICEF fulfills its mission is through advocacy with national governments — to enshrine rights for minorities, notably children, formally in law. Perhaps the most renowned example of this is the Convention on the Rights of the Child (CRC).

Constitutions, such as Mexico’s 1917 constitution — the first to limit the employment of children — are critical to formalizing rights for vulnerable populations. National constitutions describe the role of a country’s institutions, its character in the eyes of the world, as well as the rights of its citizens.

From a scientific standpoint, the work is an important first step in showing that network analysis and machine learning techniques can be used to better understand the dynamics of caring for and protecting the rights of children — critical to the work we do in a complex and interconnected world. It shows the significant and positive policy implications of using data science to uphold children’s rights.

What the Research Shows:

Through this research, we uncovered:

  • A network of relationships between countries and their constitutions.
  • A natural progression of laws — where fundamental rights are a necessary precursor to more specific rights for minorities.
  • The effect of key historical events in changing legal norms….(More)”.

The economic value of data: discussion paper


HM Treasury (UK): “Technological change has radically increased both the volume of data in the economy, and our ability to process it. This change presents an opportunity to transform our economy and society for the better.

Data-driven innovation holds the keys to addressing some of the most significant challenges confronting modern Britain, whether that is tackling congestion and improving air quality in our cities, developing ground-breaking diagnosis systems to support our NHS, or making our businesses more productive.

The UK’s strengths in cutting-edge research and the intangible economy make it well-placed to be a world leader, and estimates suggest that data-driven technologies will contribute over £60 billion per year to the UK economy by 2020.

Recent events have raised public questions and concerns about the way that data, and particularly personal data, can be collected, processed, and shared with third party organisations.

These are concerns that this government takes seriously. The Data Protection Act 2018 updates the UK’s world-leading data protection framework to make it fit for the future, giving individuals strong new rights over how their data is used. Alongside maintaining a secure, trusted data environment, the government has an important role to play in laying the foundations for a flourishing data-driven economy.

This means pursuing policies that improve the flow of data through our economy, and ensure that those companies who want to innovate have appropriate access to high-quality and well-maintained data.

This discussion paper describes the economic opportunity presented by data-driven innovation, and highlights some of the key challenges that government will need to address, such as: providing clarity around ownership and control of data; maintaining a strong, trusted data protection framework; making effective use of public sector data; driving interoperability and standards; and enabling safe, legal and appropriate data sharing.

Over the last few years, the government has taken significant steps to strengthen the UK’s position as a world leader in data-driven innovation, including by agreeing the Artificial Intelligence Sector Deal, establishing the Geospatial Commission, and making substantial investments in digital skills. The government will build on those strong foundations over the coming months, including by commissioning an Expert Panel on Competition in Digital Markets. This Expert Panel will support the government’s wider review of competition law by considering how competition policy can better enable innovation and support consumers in the digital economy.

There are still big questions to be answered. This document marks the beginning of a wider set of conversations that government will be holding over the coming year, as we develop a new National Data Strategy….(More)”.

To Better Predict Traffic, Look to the Electric Grid


Linda Poon at CityLab: “The way we consume power after midnight can reveal how bad the morning rush hour will be….

Commuters check Google Maps for traffic updates the same way they check the weather app for rain predictions. And for good reasons: By pooling information from millions of drivers already on the road, Google can paint an impressively accurate real-time portrait of congestion. Meanwhile, historical numbers can roughly predict when your morning commutes may be particularly bad.

But “the information we extract from traffic data has been exhausted,” said Zhen (Sean) Qian, who directs the Mobility Data Analytics Center at Carnegie Mellon University. He thinks that to more accurately predict how gridlock varies from day to day, there’s a whole other set of data that cities haven’t mined yet: electricity use.

“Essentially we all use the urban system—the electricity, water, the sewage system and gas—and when people use them and how heavily they do is correlated to the way they use the transportation system,” he said. How we use electricity at night, it turns out, can reveal when we leave for work the next day. “So we might be able to get new information that helps explain travel time one or two hours in advance by having a better understanding of human activity.”

In a recent study in the journal Transportation Research Part C, Qian and his student Pinchao Zhang used 2014 data to demonstrate how electricity usage patterns can predict when peak congestion begins on various segments of a major highway in Austin, Texas—the 14th most congested city in the U.S. They crunched 79 days’ worth of electricity usage data for 322 households (stripped of all private information, including location), feeding it into a machine learning algorithm that then categorized the households into 10 groups according to the time and amount of electricity use between midnight and 6 a.m. By extrapolating the most critical traffic-related information about each group for each day, the model then predicted what the commute may look like that morning.

When compared with 2014 traffic data, they found that 8 out of the 10 patterns had an impact on highway traffic. Households that show a spike of electricity use from midnight to 2 a.m., for example, may be night owls who sleep in, leave late, and likely won’t contribute to the early morning congestion. In contrast, households that report low electricity use from midnight to 5 a.m., followed by a rise after 5:30 a.m., could be early risers who will be on the road during rush hour. If the researchers’ model detects more households falling into the former group, it might predict that peak congestion will start closer to, say, 7:45 a.m. rather than the usual 7:30….(More)”.
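
The overall shape of that pipeline can be sketched in a few lines. The code below is an approximation under stated assumptions: the usage data are random stand-ins shaped like the study's 79 days and 322 households, the clustering mirrors the paper's 10 groups, and the regression target is a synthetic placeholder rather than the authors' actual congestion measure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Stand-in data: per day and household, electricity use in twelve
# half-hour bins between midnight and 6 a.m.
n_days, n_households = 79, 322
usage = rng.random((n_days, n_households, 12))

# Step 1: cluster households into 10 groups by their average
# overnight usage profile, mirroring the study's grouping.
profiles = usage.mean(axis=0)  # shape: (households, 12)
groups = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(profiles)

# Step 2: summarize each group's usage per day, then regress the
# morning's peak-congestion onset (synthetic here, in minutes after
# 7 a.m.) on those daily group-level signals.
daily_features = np.stack([
    [usage[d, groups == g].mean() for g in range(10)] for d in range(n_days)
])
peak_onset = rng.normal(30, 10, n_days)  # placeholder target

model = LinearRegression().fit(daily_features[:60], peak_onset[:60])
print(model.predict(daily_features[60:65]))  # predicted onsets, held-out days
```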

Our misguided love affair with techno-politics


The Economist: “What might happen if technology, which smothers us with its bounty as consumers, made the same inroads into politics? Might data-driven recommendations suggest “policies we may like” just as Amazon recommends books? Would we swipe right to pick candidates in elections or answers in referendums? Could businesses expand into every cranny of political and social life, replete with ® and ™ at each turn? What would this mean for political discourse and individual freedom?

This dystopian yet all-too-imaginable world has been conjured up by Giuseppe Porcaro in his novel “Disco Sour”. The story takes place in the near future, after a terrible war and breakdown of nations, when the (fictional) illegitimate son of Roman Polanski creates an app called Plebiscitum that works like Tinder for politics.

Mr Porcaro—who comes armed with a doctorate in political geography—uses the plot to consider questions of politics in the networked age. The Economist’s Open Future initiative asked him to reply to five questions in around 100 words each. An excerpt from the book appears thereafter.

*     *     *

The Economist: In your novel, an entrepreneur attempts to replace elections with an app that asks people to vote on individual policies. Is that science fiction or prediction? And were you influenced by Italy’s Five Star Movement?

Giuseppe Porcaro: The idea of imagining a Tinder-style app replacing elections came up because I see connections between the evolution of dating habits and 21st-century politics. A new sort of “tinderpolitics” kicks in when instant gratification substitutes for substantive participation. Think about tweet trolling, for example.

Italy’s Five Star Movement was certainly another inspiration, as it has been a pioneer in using an online platform to successfully create a new sort of political mass movement. Another was the Australian political party Flux, which aims to replace the world’s elected legislatures with a new system known as issue-based direct democracy.

The Economist: Is it too cynical to suggest that a more direct relationship between citizens and policymaking would lead to a more reactionary political landscape? Or does the ideal of liberal democracy depend on an ideal citizenry that simply doesn’t exist?  

Mr Porcaro: It would be cynical to blame citizens for getting too close to decision-making. That would go against the very essence of the “liberal democracy ideal”. However, I am critical of the pervasive idea that technology can provide quick fixes to bridge the gap between citizens and the government. By reducing democracy to computational thinking, extreme individualisation and instant participation, we forget that democracy is not simply the result of an election or the mathematical sum of individual votes. Citizens risk entering a vicious circle in which reactionary politics become easier to push through.

The Economist: Modern representative democracy was in some ways a response to the industrial revolution. If AI and automation radically alter the world we live in, will we have to update the way democracy works too—and if so, how? 

Mr Porcaro: Democracy has already morphed several times. Nineteenth-century liberal democracy was shaken by universal suffrage and adapted to the Fordist mode of production with the mass party. May 1968 challenged that model. Today, the massive availability of data and the increasing power of decision-making algorithms will change political institutions yet again.

The policy “production” process might be utterly redesigned. Data collected by devices we use on a daily basis (such as vehicles, domestic appliances and wearable sensors) will provide evidence about the drivers of personal voting choices, or the accountability of government decisions. …(More)

This surprising, everyday tool might hold the key to changing human behavior


Annabelle Timsit at Quartz: “To be a person in the modern world is to worry about your relationship with your phone. According to critics, smartphones are making us ill-mannered and sore-necked, dragging parents’ attention away from their kids, and destroying an entire generation.

But phones don’t have to be bad. With 4.68 billion people forecast to become mobile phone users by 2019, nonprofits and social science researchers are exploring new ways to turn our love of screens into a force for good. One increasingly popular option: Using texting to help change human behavior.

Texting: A unique tool

The short message service (SMS) was invented in the late 1980s, and the first text message was sent in 1992. (Engineer Neil Papworth sent “merry Christmas” to then-Vodafone director Richard Jarvis.) In the decades since, texting has emerged as the preferred communication method for many, particularly younger generations. While that kind of habit-forming can be problematic—47% of US smartphone users say they “couldn’t live without” the device—our attachment to our phones also makes text-based programs a good way to encourage people to make better choices.

“Texting, because it’s anchored in mobile phones, has the ability to be with you all the time, and that gives us an enormous flexibility on precision,” says Todd Rose, director of the Mind, Brain, & Education Program at the Harvard Graduate School of Education. “When people lead busy lives, they need timely, targeted, actionable information.”
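
On the delivery side, a minimal sketch of one such timely, targeted nudge is shown below, using Twilio's Python client as an example transport; the credentials, phone numbers, and message are placeholders, and none of the programs mentioned here are known to use this exact stack.

```python
from twilio.rest import Client

# Placeholder credentials; substitute real account values.
client = Client("ACCOUNT_SID", "AUTH_TOKEN")

def send_nudge(to_number: str, body: str) -> None:
    """Send one timely, targeted, actionable reminder by SMS."""
    client.messages.create(to=to_number, from_="+15550100000", body=body)

send_nudge("+15550100001",
           "Reminder: your child's checkup is tomorrow at 10 a.m.")
```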

And who is busier than a parent? Text-based programs can help current or would-be moms and dads with everything from medication pickup to childhood development. Text4Baby, for example, messages pregnant women and young moms with health information and reminders about upcoming doctor visits. Vroom, an app for building babies’ brains, sends parents research-based prompts to help them build positive relationships with their children (for example, by suggesting they ask toddlers to describe how they’re feeling based on the weather). Muse, an AI-powered app, uses machine learning and big data to try to help parents raise creative, motivated, emotionally intelligent kids. As Jenny Anderson writes in Quartz: “There is ample evidence that we can modify parents’ behavior through technological nudges.”

Research suggests text-based programs may also be helpful in supporting young children’s academic and cognitive development. …Texts aren’t just being used to help out parents. Non-governmental organizations (NGOs) have also used them to encourage civic participation in kids and young adults. Open Progress, for example, has an all-volunteer community called “text troop” that messages young adults across the US, reminding them to register to vote and helping them find their polling location.

Text-based programs are also useful in the field of nutrition, where private companies and public-health organizations have embraced them as a way to give advice on healthy eating and weight loss. The National Cancer Institute runs a text-based program called SmokefreeTXT that sends US adults between three and five messages per day for up to eight weeks, to help them quit smoking.

Texting programs can be a good way to nudge people toward improving their mental health, too. Crisis Text Line, for example, was the first national 24/7 crisis-intervention hotline to conduct counseling conversations entirely over text…(More).

Regulatory Technology – Replacing Law with Computer Code


LSE Legal Studies Working Paper by Eva Micheler and Anna Whaley: “Recently both the Bank of England and the Financial Conduct Authority have carried out experiments using new digital technology for regulatory purposes. The idea is to replace rules written in natural legal language with computer code and to use artificial intelligence for regulatory purposes.
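
To make the idea concrete, here is a minimal sketch of a regulatory rule expressed as executable code rather than natural legal language. The rule is invented for illustration and is far simpler than anything the Bank of England or the FCA would actually encode.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    amount_gbp: float
    counterparty_country: str

# Hypothetical rule: transactions above 10,000 GBP, or of any size
# involving a listed high-risk jurisdiction, must be reported.
HIGH_RISK = {"Examplestan"}
REPORTING_THRESHOLD_GBP = 10_000

def must_report(tx: Transaction) -> bool:
    return (tx.amount_gbp > REPORTING_THRESHOLD_GBP
            or tx.counterparty_country in HIGH_RISK)

print(must_report(Transaction(12_500, "France")))    # True: over threshold
print(must_report(Transaction(500, "Examplestan")))  # True: high-risk country
```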

This new way of designing public law is in line with the government’s vision for the UK to become a global leader in digital technology. It is also reflected in the FCA’s business plan.

The article reviews the technology and the advantages and disadvantages of combining the technology with regulatory law. It then informs the discussion from a broader public law perspective. It analyses regulatory technology through criteria developed in the mainstream regulatory discourse. It contributes to that discourse by anticipating problems that will arise as the technology evolves. In addition, the hope is to assist the government in avoiding mistakes that have occurred in the past and creating a better system from the start…(More)”.

We Need Transparency in Algorithms, But Too Much Can Backfire


Kartik Hosanagar and Vivian Jair at Harvard Business Review: “In 2013, Stanford professor Clifford Nass faced a student revolt. Nass’s students claimed that those in one section of his technology interface course received higher grades on the final exam than counterparts in another. Unfortunately, they were right: two different teaching assistants had graded the two different sections’ exams, and one had been more lenient than the other. Students with similar answers had ended up with different grades.

Nass, a computer scientist, recognized the unfairness and created a technical fix: a simple statistical model to adjust scores, where students got a certain percentage boost on their final mark when graded by a TA known to give grades that percentage lower than average. In the spirit of openness, Nass sent out emails to the class with a full explanation of his algorithm. Further complaints poured in, some even angrier than before. Where had he gone wrong?…
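
Nass's fix can be read as a few lines of code. The sketch below is one plausible interpretation, assuming all sections took the same exam; the exact formula (boosting harsh sections up to the overall mean and leaving lenient ones untouched) is an assumption, not Nass's published model.

```python
def adjust_scores(scores_by_ta: dict[str, list[float]]) -> dict[str, list[float]]:
    """Boost each section's scores by the percentage its TA graded
    below the overall average (one reading of Nass's adjustment)."""
    all_scores = [s for section in scores_by_ta.values() for s in section]
    overall_mean = sum(all_scores) / len(all_scores)
    adjusted = {}
    for ta, section in scores_by_ta.items():
        ta_mean = sum(section) / len(section)
        boost = max(1.0, overall_mean / ta_mean)  # only harsh TAs trigger a boost
        adjusted[ta] = [round(s * boost, 1) for s in section]
    return adjusted

sections = {"harsh_ta": [62.0, 70.0, 58.0], "lenient_ta": [78.0, 85.0, 74.0]}
print(adjust_scores(sections))
```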

Kizilcec had in fact tested three levels of transparency: low and medium but also high, where the students got not only a paragraph explaining the grading process but also their raw peer-graded scores and how these were each precisely adjusted by the algorithm to get to a final grade. And this is where the results got more interesting. In the experiment, while medium transparency increased trust significantly, high transparency eroded it completely, to the point where trust levels were either equal to or lower than among students experiencing low transparency.

Making Modern AI Transparent: A Fool’s Errand?

What are businesses to take home from this experiment? It suggests that technical transparency – revealing the source code, inputs, and outputs of the algorithm – can build trust in many situations. But most algorithms in the world today are created and managed by for-profit companies, and many businesses regard their algorithms as highly valuable forms of intellectual property that must remain in a “black box.” Some lawmakers have proposed a compromise, suggesting that the source code be revealed to regulators or auditors in the event of a serious problem, and this adjudicator will assure consumers that the process is fair.

This approach merely shifts the burden of belief from the algorithm itself to the regulators. This may be a palatable solution in many arenas: for example, few of us fully understand financial markets, so we trust the SEC to take on oversight. But in a world where decisions large and small, personal and societal, are being handed over to algorithms, this becomes less acceptable.

Another problem with technical transparency is that it makes algorithms vulnerable to gaming. If an instructor releases the complete source code for an algorithm grading student essays, it becomes easy for students to exploit loopholes in the code: maybe, for example, the algorithm seeks evidence that the students have done research by looking for phrases such as “according to published research.” A student might then deliberately use this language at the start of every paragraph in her essay.
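
That failure mode is easy to demonstrate. Below is a deliberately naive, hypothetical grader that rewards cue phrases, together with an essay that games it once the cue list is public.

```python
import re

# Cue phrases a naive grader might treat as evidence of research.
RESEARCH_CUES = ["according to published research", "studies show"]

def naive_research_score(essay: str) -> int:
    # One point per occurrence of any cue phrase, case-insensitive.
    text = essay.lower()
    return sum(len(re.findall(re.escape(cue), text)) for cue in RESEARCH_CUES)

honest = "According to published research, commute times rose last year."
gamed = " ".join("According to published research, filler sentence."
                 for _ in range(5))

print(naive_research_score(honest))  # 1
print(naive_research_score(gamed))   # 5: trivially inflated
```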

But the biggest problem is that modern AI is making source code – transparent or not – less relevant compared with other factors in algorithmic functioning. Specifically, machine learning algorithms – and deep learning algorithms in particular – are usually built on just a few hundred lines of code. The algorithm’s logic is mostly learned from training data and is rarely reflected in its source code. Which is to say, some of today’s best-performing algorithms are often the most opaque. High transparency might involve getting our heads around reams and reams of data – and then still only being able to guess at what lessons the algorithm has learned from it.
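
The point is easy to see in miniature. In the sketch below (on random, illustrative data), the "program" is three lines, yet every decision rule it prints was learned from the data rather than written by a programmer.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X = rng.random((200, 3))      # illustrative random features
y = rng.integers(0, 2, 200)   # illustrative random labels

# Three lines of "source code" -- but the thresholds printed below
# come entirely from the training data, not from the code itself.
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(model, feature_names=["f0", "f1", "f2"]))
```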

This is where Kizilcec’s work becomes relevant – a way to embrace rather than despair over deep learning’s impenetrability. His work shows that users will not trust black box models, but they don’t need – or even want – extremely high levels of transparency. That means responsible companies need not fret over what percentage of source code to reveal, or how to help users “read” massive datasets. Instead, they should work to provide basic insights on the factors driving algorithmic decisions….(More)”

Doing good data science


Mike Loukides, Hilary Mason and DJ Patil at O’Reilly: “(This post is the first in a series on data ethics) The hard thing about being an ethical data scientist isn’t understanding ethics. It’s the junction between ethical ideas and practice. It’s doing good data science.

There has been a lot of healthy discussion about data ethics lately. We want to be clear: that discussion is good, and necessary. But it’s also not the biggest problem we face. We already have good standards for data ethics. The ACM’s code of ethics, which dates back to 1993, is clear, concise, and surprisingly forward-thinking; 25 years later, it’s a great start for anyone thinking about ethics. The American Statistical Association has a good set of ethical guidelines for working with data. So, we’re not working in a vacuum.

And, while there are always exceptions, we believe that most people want to be fair. Data scientists and software developers don’t want to harm the people using their products. There are exceptions, of course; we call them criminals and con artists. Defining “fairness” is difficult, and perhaps impossible, given the many crosscutting layers of “fairness” that we might be concerned with. But we don’t have to solve that problem in advance, and it’s not going to be solved in a simple statement of ethical principles, anyway.

The problem we face is different: how do we put ethical principles into practice? We’re not talking about an abstract commitment to being fair. Ethical principles are worse than useless if we don’t allow them to change our practice, if they don’t have any effect on what we do day-to-day. For data scientists, whether you’re doing classical data analysis or leading-edge AI, that’s a big challenge. We need to understand how to build the software systems that implement fairness. That’s what we mean by doing good data science.

Any code of data ethics will tell you that you shouldn’t collect data from experimental subjects without informed consent. But that code won’t tell you how to implement “informed consent.” Informed consent is easy when you’re interviewing a few dozen people in person for a psychology experiment. Informed consent means something different when someone clicks on an item in an online catalog (hello, Amazon), and ads for that item start following them around ad infinitum. Do you use a pop-up to ask for permission to use their choice in targeted advertising? How many customers would you lose? Informed consent means something yet again when you’re asking someone to fill out a profile for a social site, and you might (or might not) use that data for any number of experimental purposes. Do you pop up a consent form in impenetrable legalese that basically says “we will use your data, but we don’t know for what”? Do you phrase this agreement as an opt-out, and hide it somewhere on the site where nobody will find it?…

To put ethical principles into practice, we need space to be ethical. We need the ability to have conversations about what ethics means, what it will cost, and what solutions to implement. As technologists, we frequently share best practices at conferences, write blog posts, and develop open source technologies—but we rarely discuss problems such as how to obtain informed consent.

There are several facets to this space that we need to think about.

First, we need corporate cultures in which discussions about fairness, about the proper use of data, and about the harm that can be done by inappropriate use of data can be considered. In turn, this means that we can’t rush products out the door without thinking about how they’re used. We can’t allow “internet time” to mean ignoring the consequences. Indeed, computer security has shown us the consequences of ignoring the consequences: many companies that have never taken the time to implement good security practices and safeguards are now paying with damage to their reputations and their finances. We need to do the same when thinking about issues like fairness, accountability, and unintended consequences….(More)”.