We should extend EU bank data sharing to all sectors


Carlos Torres Vila in the Financial Times: “Data is now driving the global economy — just look at the list of the world’s most valuable companies. They collect and exploit the information that users generate through billions of online interactions taking place every day. 


But companies are hoarding data too, preventing others, including the users to whom the data relates, from accessing and using it. This is true of traditional groups such as banks, telcos and utilities, as well as the large digital enterprises that rely on “proprietary” data. 
Global and national regulators must address this problem by forcing companies to give users an easy way to share their own data, if they so choose. This is the logical consequence of personal data belonging to users. There is also the potential for enormous socio-economic benefits if we can create consent-based free data flows. 
We need data-sharing across companies in all sectors in a real-time, standardised way — not at a speed and in a format dictated by the companies that stockpile user data. These new rules should apply to all electronic data generated by users, whether provided directly or observed during their online interactions with any provider, across geographic borders and in any sector. This could include everything from geolocation history and electricity consumption to recent web searches, pension information or even most recently played songs. 

This won’t be easy to achieve in practice, but the good news is that we already have a framework that could be the model for a broader solution. The UK’s Open Banking system provides a tantalising glimpse of what may be possible. In Europe, the directive known as the Second Payment Services Directive (PSD2) allows banking customers to share data about their transactions with multiple providers via secure, structured IT interfaces. We are already seeing this unlock new business models and drive competition in digital financial services. But these rules do not go far enough — they only apply to payments history, and that isn’t enough to push forward a data-driven economic revolution across other sectors of the economy. 
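
To make the idea of a “secure, structured IT interface” concrete, here is a minimal sketch of a PSD2-style account-information call. The endpoint, token and response fields are hypothetical (loosely modelled on the UK Open Banking specification), and a real integration would sit behind an OAuth2 consent flow and strong customer authentication.

```python
import requests

# Hypothetical bank API base URL and consent token; illustrative only.
API_BASE = "https://api.example-bank.com/open-banking/v3.1"
ACCESS_TOKEN = "token-issued-after-customer-consent"

def fetch_transactions(account_id: str) -> list:
    """Pull a customer's transaction history over a consented, structured interface."""
    resp = requests.get(
        f"{API_BASE}/accounts/{account_id}/transactions",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    # Field names loosely follow the UK Open Banking read API schema.
    return resp.json()["Data"]["Transaction"]
```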

We need a global framework with common rules across regions and sectors. This has already happened in financial services: after the 2008 financial crisis, the G20 strengthened global banking standards and created the Financial Stability Board. The rules, while not perfect, have delivered uniformity which has strengthened the system. 

We need a similar global push for common rules on the use of data. While it will be difficult to achieve consensus on data, and undoubtedly more difficult still to implement and enforce it, I believe that now is the time to decide what we want. The involvement of the G20 in setting up global standards will be essential to realising the potential that data has to deliver a better world for all of us. There will be complaints about the cost of implementation. I know first hand how expensive it can be to simultaneously open up and protect sensitive core systems. 

The alternative is siloed data that holds back innovation. There will also be justified concerns that easier data sharing could lead to new user risks. Security must be a non-negotiable principle in designing intercompany interfaces and protecting access to sensitive data. But Open Banking shows that these challenges are resolvable. …(More)”.

France Bans Judge Analytics, 5 Years In Prison For Rule Breakers


Artificial Lawyer: “In a startling intervention that seeks to limit the emerging litigation analytics and prediction sector, the French Government has banned the publication of statistical information about judges’ decisions – with a five-year prison sentence set as the maximum punishment for anyone who breaks the new law.

Owners of legal tech companies focused on litigation analytics are the most likely to suffer from this new measure.

The new law, encoded in Article 33 of the Justice Reform Act, is aimed at preventing anyone – but especially legal tech companies focused on litigation prediction and analytics – from publicly revealing the pattern of judges’ behaviour in relation to court decisions.

A key passage of the new law states:

‘The identity data of magistrates and members of the judiciary cannot be reused with the purpose or effect of evaluating, analysing, comparing or predicting their actual or alleged professional practices.’

As far as Artificial Lawyer understands, this is the very first example of such a ban anywhere in the world.

Insiders in France told Artificial Lawyer that the new law is a direct result of an earlier effort to make all case law easily accessible to the general public, which was seen at the time as improving access to justice and a big step forward for transparency in the justice sector.

However, judges in France had not reckoned on NLP and machine learning companies taking the public data and using it to model how certain judges behave in relation to particular types of legal matter or argument, or how they compare to other judges.

In short, they didn’t like how the pattern of their decisions – now relatively easy to model – was potentially open for all to see.
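
For a sense of how little machinery such profiling requires, here is a hedged sketch: a per-judge outcome tabulation over an invented case-law export (all column names and rows are hypothetical). Aggregations of exactly this kind, when judges are identifiable, are what the new law prohibits.

```python
import pandas as pd

# Invented case-law export: one row per decision, judge identified by name.
decisions = pd.DataFrame({
    "judge":   ["A", "A", "B", "B", "B"],
    "matter":  ["asylum", "asylum", "asylum", "contract", "asylum"],
    "outcome": ["granted", "denied", "denied", "granted", "denied"],
})

# Per-judge grant rate by matter type: the pattern of decisions that may
# no longer be published for identifiable magistrates.
rates = (
    decisions.assign(granted=decisions["outcome"].eq("granted"))
             .groupby(["judge", "matter"])["granted"]
             .mean()
)
print(rates)
```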

Unlike in the US and the UK, where judges appear to have accepted the fait accompli of legal AI companies analysing their decisions in extreme detail and then creating models as to how they may behave in the future, French judges have decided to stamp it out….(More)”.

Journalism Initiative Crowdsources Feedback on Failed Foreign Aid Projects


Abigail Higgins at SSIR: “It isn’t unusual that a girl raped in northeastern Kenya would be ignored by law enforcement. But for Mary, whose name has been changed to protect her identity, it should have been different—NGOs had established a hotline to report sexual violence just a few years earlier to help girls like her get justice. Even though the hotline was backed by major aid institutions like Mercy Corps and the British government, calls to it regularly went unanswered.

“That was the story that really affected me. It touched me in terms of how aid failures could impact someone,” says Anthony Langat, a Nairobi-based reporter who investigated the hotline as part of a citizen journalism initiative called What Went Wrong? that examines failed foreign aid projects.

Over six months in 2018, What Went Wrong? collected 142 reports of failed aid projects in Kenya, each submitted over the phone or via social media by the very people the project was supposed to benefit. It’s a move intended to help upend the way foreign aid is disbursed and debated. Although aid organizations spend significant time evaluating whether or not aid works, beneficiaries are often excluded from that process.

“There’s a serious power imbalance,” says Peter DiCampo, the photojournalist behind the initiative. “The people receiving foreign aid generally do not have much say. They don’t get to choose which intervention they want, which one would feel most beneficial for them. Our goal is to help these conversations happen … to put power into the hands of the people receiving foreign aid.”

What Went Wrong? documented eight failed projects in an investigative series published by Devex in March. In Kibera, one of Kenya’s largest slums, public restrooms meant to improve sanitation failed to connect to water and sewage infrastructure and were later repurposed as churches. In another story, the World Bank and local thugs struggled for control over the slum’s electrical grid….(More)”

Here’s a prediction: In the future, predictions will only get worse


Allison Schrager at Quartz: “Forecasts rely on data from the past, and while we now have better data than ever—and better techniques and technology with which to measure them—when it comes to forecasting, in many ways, data has never been more useless. And as data become more integral to our lives and the technology we rely upon, we must take a harder look at the past before we assume it tells us anything about the future.

To some extent, the weaknesses of data have always existed. Data are, by definition, information about what has happened in the past. Because populations and technology are constantly changing, they alter how we respond to incentives, policy, opportunities available to us, and even social cues. This undermines the accuracy of everything we try to forecast: elections, financial markets, even how long it will take to get to the airport.

But there is reason to believe we are experiencing more change than before. The economy is undergoing a major structural change by becoming more globally integrated, which increases some risks while reducing others, and technology has changed how we transact and communicate. I’ve written before about how it’s now impossible for the movie industry to forecast hit films. Review-aggregation site Rotten Tomatoes undermines traditional marketing plans, and the rise of the Chinese market means filmmakers must account for different tastes. Meanwhile, streaming has changed how movies are consumed and who watches them. All these changes mean data from 10, or even five, years ago tell producers almost nothing about movie-going today.

We are in the age of big data, which offers the promise of more accurate predictions. But it seems that in some of the most critical aspects of our lives, data has never been more wrong….(More)”.

How Can We Overcome the Challenge of Biased and Incomplete Data?


Knowledge@Wharton: “Data analytics and artificial intelligence are transforming our lives. Be it in health care, in banking and financial services, or in times of humanitarian crises — data determine the way decisions are made. But often, the way data is collected and measured can result in biased and incomplete information, and this can significantly impact outcomes.  

In a conversation with Knowledge@Wharton at the SWIFT Institute Conference on the Impact of Artificial Intelligence and Machine Learning in the Financial Services Industry, Alexandra Olteanu, a post-doctoral researcher at Microsoft Research, U.S. and Canada, discussed the ethical and people considerations in data collection and artificial intelligence and how we can work towards removing the biases….

….Knowledge@Wharton: Bias is a big issue when you’re dealing with humanitarian crises, because it can influence who gets help and who doesn’t. When you translate that into the business world, especially in financial services, what implications do you see for algorithmic bias? What might be some of the consequences?

Olteanu: A good example comes from a new law in New York state under which insurance companies can now use social media to set the level of your premiums. But they could in fact end up using incomplete information. For instance, you might be buying your vegetables from the supermarket or a farmer’s market, but these retailers might not be tracking you on social media. So nobody knows that you are eating vegetables. On the other hand, a bakery that you visit might post something when you buy from there. Based on this, the insurance companies may conclude that you only eat cookies all the time. This shows how even incomplete data can affect you….(More)”.
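
A toy sketch of the sampling problem Olteanu describes: if only some retailers post purchases, the observed record systematically misrepresents actual behaviour. All data here are invented for illustration.

```python
# What someone actually bought versus what the insurer can observe.
actual_purchases = ["vegetables", "vegetables", "bread", "cookies", "vegetables"]
posts_to_social = {"vegetables": False, "bread": True, "cookies": True}

# Only purchases from retailers that post to social media are visible.
observed = [item for item in actual_purchases if posts_to_social[item]]

def cookie_share(purchases):
    """Fraction of purchases that are cookies."""
    return sum(p == "cookies" for p in purchases) / len(purchases)

print(f"true cookie share:     {cookie_share(actual_purchases):.0%}")  # 20%
print(f"observed cookie share: {cookie_share(observed):.0%}")          # 50%
```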

107 Years Later, The Titanic Sinking Helps Train Problem-Solving AI


Kiona N. Smith at Forbes: “What could the 107-year-old tragedy of the Titanic possibly have to do with modern problems like sustainable agriculture, human trafficking, or health insurance premiums? Data turns out to be the common thread. The modern world, for better or worse, increasingly turns to algorithms to look for patterns in the data and make predictions based on those patterns. And the basic methods are the same whether the question they’re trying to answer is “Would this person survive the Titanic sinking?” or “What are the most likely routes for human trafficking?”

An Enduring Problem

Predicting survival at sea based on the Titanic dataset is a standard practice problem for aspiring data scientists and programmers. Here’s the basic challenge: feed your algorithm a portion of the Titanic passenger list, which includes some basic variables describing each passenger and their fate. From that data, the algorithm (if you’ve programmed it well) should be able to draw some conclusions about which variables made a person more likely to live or die on that cold April night in 1912. To test its success, you then give the algorithm the rest of the passenger list (minus the outcomes) and see how well it predicts their fates.
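
A minimal sketch of that workflow, assuming the standard Kaggle Titanic CSV (columns such as Survived, Pclass, Sex, Age and Fare) is saved locally as titanic.csv; the model choice and feature set are illustrative, not a prescription.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the passenger list and encode the basic variables.
df = pd.read_csv("titanic.csv")
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
df["Age"] = df["Age"].fillna(df["Age"].median())

X = df[["Pclass", "Sex", "Age", "Fare"]]
y = df["Survived"]

# Train on one portion of the list, then predict the held-out rest,
# exactly the challenge described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f"held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```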

Online communities like Kaggle.com have held competitions to see who can develop the algorithm that predicts survival most accurately, and it’s also a common problem presented to university classes. The passenger list is big enough to be useful, but small enough to be manageable for beginners. There’s a simple set of outcomes — life or death — and around a dozen variables to work with, so the problem is simple enough for beginners to tackle but just complex enough to be interesting. And because the Titanic’s story is so famous, even more than a century later, the problem still resonates.

“It’s interesting to see that even in such a simple problem as the Titanic, there are nuggets,” said Sagie Davidovich, Co-Founder & CEO of SparkBeyond, who used the Titanic problem as an early test for SparkBeyond’s AI platform and still uses it as a way to demonstrate the technology to prospective customers….(More)”.

Can tracking people through phone-call data improve lives?


Amy Maxmen in Nature: “After an earthquake tore through Haiti in 2010, killing more than 100,000 people, aid agencies spread across the country to work out where the survivors had fled. But Linus Bengtsson, a graduate student studying global health at the Karolinska Institute in Stockholm, thought he could answer the question from afar. Many Haitians would be using their mobile phones, he reasoned, and those calls would pass through phone towers, which could allow researchers to approximate people’s locations. Bengtsson persuaded Digicel, the biggest phone company in Haiti, to share data from millions of call records from before and after the quake. Digicel replaced the names and phone numbers of callers with random numbers to protect their privacy.
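
A hedged sketch of the two steps described above: pseudonymise the caller, then count distinct subscribers per tower area over time. The record layout and salted-hash scheme are invented for illustration; Digicel’s actual pipeline is not public.

```python
import hashlib
from collections import defaultdict

# Invented call-detail records: (phone number, tower id, date).
cdrs = [
    ("+509-555-0001", "PAP-tower-3", "2010-01-10"),
    ("+509-555-0001", "LEO-tower-1", "2010-01-15"),
    ("+509-555-0002", "PAP-tower-7", "2010-01-10"),
    ("+509-555-0002", "PAP-tower-7", "2010-01-15"),
]

def pseudonymize(number: str, salt: str = "operator-secret") -> str:
    """Replace a phone number with a stable, random-looking token."""
    return hashlib.sha256((salt + number).encode()).hexdigest()[:12]

# Distinct subscribers seen per tower area per day: a crude proxy for
# where the population is, before and after a disaster.
seen = defaultdict(set)
for number, tower, day in cdrs:
    area = tower.split("-")[0]  # e.g. "PAP" for Port-au-Prince
    seen[(area, day)].add(pseudonymize(number))

for (area, day), users in sorted(seen.items()):
    print(area, day, len(users))
```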

Bengtsson’s idea worked. The analysis wasn’t completed or verified quickly enough to help people in Haiti at the time, but in 2012, he and his collaborators reported that the population of Haiti’s capital, Port-au-Prince, dipped by almost one-quarter soon after the quake, and slowly rose over the next 11 months [1]. That result aligned with an intensive, on-the-ground survey conducted by the United Nations.

Humanitarians and researchers were thrilled. Telecommunications companies scrutinize call-detail records to learn about customers’ locations and phone habits and improve their services. Researchers suddenly realized that this sort of information might help them to improve lives. Even basic population statistics are murky in low-income countries where expensive household surveys are infrequent, and where many people don’t have smartphones, credit cards and other technologies that leave behind a digital trail, making remote-tracking methods used in richer countries too patchy to be useful.

Since the earthquake, scientists working under the rubric of ‘data for good’ have analysed calls from tens of millions of phone owners in Pakistan, Bangladesh, Kenya and at least two dozen other low- and middle-income nations. Humanitarian groups say that they’ve used the results to deliver aid. And researchers have combined call records with other information to try to predict how infectious diseases travel, and to pinpoint locations of poverty, social isolation, violence and more (see ‘Phone calls for good’)….(More)”.

Africa must reap the benefits of its own data


Tshilidzi Marwala at Business Insider: “Twenty-two years ago when I was a doctoral student in artificial intelligence (AI) at the University of Cambridge, I had to create all the AI algorithms I needed to understand the complex phenomena related to this field.

For starters, AI is computer software that performs intelligent tasks that would normally require human beings, while an algorithm is a set of rules that instructs a computer to execute specific tasks. In that era, the ability to create AI algorithms was more important than the ability to acquire and use data.

Google has created an open-source library called TensorFlow, which contains implementations of many widely used AI algorithms. Google wants people to develop applications (apps) using its software, with the payoff being that Google will collect data on the individuals who use apps developed with TensorFlow.
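
A minimal sketch of what building on TensorFlow looks like, using its high-level Keras API; the tiny model and the random placeholder data are purely illustrative.

```python
import numpy as np
import tensorflow as tf

# A small binary classifier assembled from TensorFlow's ready-made layers.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Random placeholder data, just to show the train/predict workflow.
X = np.random.rand(200, 4).astype("float32")
y = (X.sum(axis=1) > 2.0).astype("float32")
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:3], verbose=0))
```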

Today, an AI algorithm is not a competitive advantage but data is. The World Economic Forum calls data the new “oxygen”, while Chinese AI specialist Kai-Fu Lee calls it the new “oil”.

Africa’s population is increasing faster than that of any other region in the world. The continent has a population of 1.3-billion people and a total nominal GDP of $2.3-trillion. This increase in the population is in effect an increase in data, and if data is the new oil, it is akin to an increase in oil reserves.

Even oil-rich countries such as Saudi Arabia do not experience an increase in their oil reserves. How do we as Africans take advantage of this huge amount of data?

There are two categories of data in Africa: heritage and personal. Heritage data resides in society, whereas personal data resides in individuals. Heritage data includes data gathered from our languages, emotions and accents. Personal data includes health, facial and fingerprint data.

Facebook, Amazon, Apple, Netflix and Google are data companies. They sell data to advertisers, banks and political parties, among others. For example, the controversial company Cambridge Analytica harvested Facebook data to influence voters, which potentially contributed to Donald Trump’s victory in the 2016 US presidential election.

The company Google collects language data to build an application called Google Translate that translates from one language to another. This app claims to cover African languages such as Zulu, Yoruba and Swahili. Google Translate is less effective in handling African languages than it is in handling European and Asian languages.

Now, how do we capitalise on our language heritage to create economic value? We need to build our own language database and create our own versions of Google Translate.

An important area is the creation of an African emotion database. Different cultures exhibit emotions differently, and reading emotions accurately is very important in areas such as the safety of cars and aeroplanes. If we can build a system that can read pilots’ emotions, this would enable us to establish if a pilot is in a good state of mind to operate an aircraft, which would increase safety.

To capitalise on the African emotion database, we should create a data bank that captures the emotions of African people in various parts of the continent, and then use this database to create AI apps that read people’s emotions. Mercedes-Benz has already implemented “Attention Assist”, which alerts drivers to fatigue.

Another important area is the creation of an African health database. AI algorithms are able to diagnose diseases better than human doctors. However, these algorithms depend on the availability of data. To capitalise on this, we need to collect such data and use it to build algorithms that will be able to augment medical care….(More)”.

Airbnb and New York City Reach a Truce on Home-Sharing Data


Paris Martineau at Wired: “For much of the past decade, Airbnb and New York City have been embroiled in a high-profile feud. Airbnb wants legitimacy in its biggest market. City officials want to limit home-sharing platforms, which they argue exacerbate the city’s housing crisis and pose safety risks by allowing people to transform homes into illegal hotels.

Despite years of lawsuits, countersuits, lobbying campaigns, and failed attempts at legislation, progress on resolving the dispute has been incremental at best. The same could be said for many cities around the nation, as local government officials struggle to come to grips with the increasing popularity of short-term rental platforms like Airbnb, HomeAway, and VRBO in high-tourism areas.

In New York last week, there were two notable breaks in the logjam. On May 14, Airbnb agreed to give city officials partially anonymized host and reservation data for more than 17,000 listings. Two days later, a judge ordered Airbnb to turn over more detailed and nonanonymized information on dozens of hosts and hundreds of guests who have listed or stayed in more than a dozen buildings in Manhattan, Brooklyn, and Queens in the past seven years.

In both cases, the information will be used by investigators with the Mayor’s Office of Special Enforcement to identify hosts and property owners who may have broken the city’s notoriously strict short-term rental laws by converting residences into de facto hotels listed on Airbnb.

City officials originally subpoenaed Airbnb for the data—not anonymized—on the more than 17,000 listings in February. Mayor Bill de Blasio called the move an effort to force the company to “come clean about what they’re actually doing in this city.” The agreement outlining the data sharing was signed as a compromise on May 14, according to court records.

In addition to the 17,000 listings identified by the city, Airbnb will also share data on every listing rented through its platform between January 1, 2018, and February 18, 2019, that could have potentially violated New York’s short-term rental laws. The city prohibits rentals of an entire apartment or home for less than 30 days without the owner present in the unit, making many stays traditionally associated with services like Airbnb, HomeAway, and VRBO illegal. Only up to two guests are permitted in the short-term rental of an apartment or room, and they must be given “free and unobstructed access to every room and to each exit within the apartment,” meaning hosts can’t get around the ban on whole-apartment rentals by renting out three separate private rooms at once….(More)”.
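
The statutory test described above reduces to a handful of checks. Here is a hedged sketch of how a screening pass might flag a reservation, with the thresholds taken from the paragraph above and the record schema invented for illustration.

```python
def potentially_violates_ny_str_law(stay: dict) -> bool:
    """Flag stays that may breach New York's short-term rental rules:
    whole-unit rentals under 30 days require the owner present, and
    short-term stays are capped at two guests."""
    if stay["nights"] >= 30:
        return False  # not a short-term rental
    if stay["entire_home"] and not stay["owner_present"]:
        return True   # whole-apartment rental without the owner
    if stay["guests"] > 2:
        return True   # over the two-guest limit
    return False

# Example: a five-night whole-apartment stay with no owner present is flagged.
print(potentially_violates_ny_str_law(
    {"nights": 5, "entire_home": True, "owner_present": False, "guests": 2}
))  # True
```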

Companies That Rely On Census Data Worry Citizenship Question Will Hurt


Hansi Lo Wang at NPR: “Some criticism of the citizenship question the Trump administration wants to add to the 2020 census is coming from a group that tends to stay away from politically heated issues — business leaders.

From longtime corporations like Levi Strauss & Co. to upstarts like Warby Parker, some companies say that including the question — “Is this person a citizen of the United States?” — could harm not only next year’s national head count, but also their bottom line.

How governments use census data is a common refrain in the lead-up to a constitutionally mandated head count of every person living in the U.S. The new population counts, gathered once a decade, are used to determine how congressional seats and Electoral College votes are distributed among the states. They also guide how hundreds of billions in federal tax dollars are spread around the country to fund public services.

What is often less visible is how the census data undergird decisions made by large and small businesses across the country. The demographic information the census collects — including the age, sex, race, ethnicity and housing status of all U.S. residents — informs business owners about who their existing and future customers are, which new products and services those markets may want and where to build new locations.

Weeks before the Supreme Court heard oral arguments over the citizenship question last month, more than two dozen companies and business groups filed a friend-of-the-court brief against the question. Its potential impact on the accuracy of census data, especially about immigrants and people of color, is drawing concern from both Lyft and Uber, as well as Levi Strauss, Warby Parker and Univision.

“We don’t view this as a political situation at all,” says Christine Pierce, the senior vice president of data science at Nielsen — a major data analytics company in the business world that filed its own brief with the high court. “We see this as one that is around sound research and good science.”…(More)”.