machine learning

Google is using AI to predict floods in India and warn users

Curated on September 26, 2018 by Stefaan Verhulst

James Vincent at The Verge: “For years Google has warned users about natural disasters by incorporating alerts from government agencies like FEMA into apps like Maps and Search. Now, the company is making predictions of its own. As part of a partnership with the Central Water Commission of India, Google will now alert users in the country about impending floods. The service is only currently available in the Patna region, with the first alert going out earlier this month.

As Google’s engineering VP Yossi Matias outlines in a blog post, these predictions are being made using a combination of machine learning, rainfall records, and flood simulations.

“A variety of elements — from historical events, to river level readings, to the terrain and elevation of a specific area — feed into our models,” writes Matias. “With this information, we’ve created river flood forecasting models that can more accurately predict not only when and where a flood might occur, but the severity of the event as well.”

The US tech giant announced its partnership with the Central Water Commission back in June. The two organizations agreed to share technical expertise and data to work on the predictions, with the Commission calling the collaboration a “milestone in flood management and in mitigating the flood losses.” Such warnings are particularly important in India, where 20 percent of the world’s flood-related fatalities are estimated to occur….(More)”.

Swarm AI Outperforms in Stanford Medical Study

Curated on September 16, 2018 (updated December 11, 2018) by Stefaan Verhulst

Press Release: “Stanford University School of Medicine and Unanimous AI presented a new study today showing that a small group of doctors, connected by intelligence algorithms that enable them to work together as a “hive mind,” could achieve higher diagnostic accuracy than the individual doctors or machine learning algorithms alone. The technology used is called Swarm AI and it empowers networked human groups to combine their individual insights in real-time, using AI algorithms to converge on optimal solutions.

As presented at the 2018 SIIM Conference on Machine Intelligence in Medical Imaging, the study tasked a group of experienced radiologists with diagnosing the presence of pneumonia in chest X-rays. This is one of the most widely performed imaging procedures in the US, with more than 1 million adults hospitalized with pneumonia each year. But, despite this prevalence, accurately diagnosing X-rays is highly challenging with significant variability across radiologists. This makes it both an optimal task for applying new AI technologies, and an important problem to solve for the medical community.

When generating diagnoses using Swarm AI technology, the average error rate was reduced by 33% compared to traditional diagnoses by individual practitioners. This is an exciting result, showing the potential of AI technologies to amplify the accuracy of human practitioners while maintaining their direct participation in the diagnostic process.

Swarm AI technology was also compared to the state-of-the-art in automated diagnosis using software algorithms that do not employ human practitioners. Currently, the best system in the world for the automated diagnosing of pneumonia from chest X-rays is the CheXNet system from Stanford University, which made headlines in 2017 by significantly outperforming individual practitioners using deep-learning derived algorithms.

The Swarm AI system, which combines real-time human insights with AI technology, was 22% more accurate in binary classification than the software-only CheXNet system. In other words, by connecting a group of radiologists into a medical “hive mind”, the hybrid human-machine system was able to outperform individual human doctors as well as the state-of-the-art in deep-learning derived algorithms….(More)”.

Don’t forget people in the use of big data for development

Curated on September 15, 2018 (updated November 13, 2018) by Stefaan Verhulst

Joshua Blumenstock at Nature: “Today, 95% of the global population has mobile-phone coverage, and the number of people who own a phone is rising fast (see ‘Dialling up’)¹. Phones generate troves of personal data on billions of people, including those who live on a few dollars a day. So aid organizations, researchers and private companies are looking at ways in which this ‘data revolution’ could transform international development.

Some businesses are starting to make their data and tools available to those trying to solve humanitarian problems. The Earth-imaging company Planet in San Francisco, California, for example, makes its high-resolution satellite pictures freely available after natural disasters so that researchers and aid organizations can coordinate relief efforts. Meanwhile, organizations such as the World Bank and the United Nations are recruiting teams of data scientists to apply their skills in statistics and machine learning to challenges in international development.

But in the rush to find technological solutions to complex global problems there’s a danger of researchers and others being distracted by the technology and losing track of the key hardships and constraints that are unique to each local context. Designing data-enabled applications that work in the real world will require a slower approach that pays much more attention to the people behind the numbers…(More)”.

Reflecting the Past, Shaping the Future: Making AI Work for International Development

Curated on September 5, 2018 by Stefaan Verhulst

USAID Report: “We are in the midst of an unprecedented surge of interest in machine learning (ML) and artificial intelligence (AI) technologies. These tools, which allow computers to make data-derived predictions and automate decisions, have become part of daily life for billions of people. Ubiquitous digital services such as interactive maps, tailored advertisements, and voice-activated personal assistants are likely only the beginning. Some AI advocates even claim that AI’s impact will be as profound as “electricity or fire”, and that it will revolutionize nearly every field of human activity. This enthusiasm has reached international development as well. Emerging ML/AI applications promise to reshape healthcare, agriculture, and democracy in the developing world. ML and AI show tremendous potential for helping to achieve sustainable development objectives globally. They can improve efficiency by automating labor-intensive tasks, or offer new insights by finding patterns in large, complex datasets. A recent report suggests that AI advances could double economic growth rates and increase labor productivity 40% by 2035. At the same time, the very nature of these tools — their ability to codify and reproduce patterns they detect — introduces significant concerns alongside promise.

In developed countries, ML tools have sometimes been found to automate racial profiling, to foster surveillance, and to perpetuate racial stereotypes. Algorithms may be used, either intentionally or unintentionally, in ways that result in disparate or unfair outcomes between minority and majority populations. Complex models can make it difficult to establish accountability or seek redress when models make mistakes. These shortcomings are not restricted to developed countries. They can manifest in any setting, especially in places with histories of ethnic conflict or inequality. As the development community adopts tools enabled by ML and AI, we need a clear-eyed understanding of how to ensure their application is effective, inclusive, and fair. This requires knowing when ML and AI offer a suitable solution to the challenge at hand. It also requires appreciating that these technologies can do harm — and committing to addressing and mitigating these harms.

ML and AI applications may sometimes seem like science fiction, and the technical intricacies of ML and AI can be off-putting for those who haven’t been formally trained in the field. However, there is a critical role for development actors to play as we begin to lean on these tools more and more in our work. Even without technical training in ML, development professionals have the ability — and the responsibility — to meaningfully influence how these technologies impact people.

You don’t need to be an ML or AI expert to shape the development and use of these tools. All of us can learn to ask the hard questions that will keep solutions working for, and not against, the development challenges we care about. Development practitioners already have deep expertise in their respective sectors or regions. They bring necessary experience in engaging local stakeholders, working with complex social systems, and identifying structural inequities that undermine inclusive progress. Unless this expert perspective informs the construction and adoption of ML/AI technologies, ML and AI will fail to reach their transformative potential in development.

This document aims to inform and empower those who may have limited technical experience as they navigate an emerging ML/AI landscape in developing countries. Donors, implementers, and other development partners should expect to come away with a basic grasp of common ML techniques and the problems ML is uniquely well-suited to solve. We will also explore some of the ways in which ML/AI may fail or be ill-suited for deployment in developing-country contexts. Awareness of these risks, and acknowledgement of our role in perpetuating or minimizing them, will help us work together to protect against harmful outcomes and ensure that AI and ML are contributing to a fair, equitable, and empowering future…(More)”.

Origin Privacy: Protecting Privacy in the Big-Data Era

Curated on August 30, 2018 by Stefaan Verhulst

Paper by Helen Nissenbaum, Sebastian Benthall, Anupam Datta, Michael Carl Tschantz, and Piotr Mardziel: “Machine learning over big data poses challenges for our conceptualization of privacy. Such techniques can discover surprising and counterintuitive associations that take innocent-looking data and turn it into important inferences about a person. For example, the buying of carbon monoxide monitors has been linked to paying credit card bills, while buying chrome-skull car accessories predicts not doing so. Also, Target may have used the buying of scent-free hand lotion and vitamins as a sign that the buyer is pregnant. If we take pregnancy status to be private and assume that we should prohibit the sharing of information that can reveal that fact, then we have created an unworkable notion of privacy, one in which sharing any scrap of data may violate privacy.

Prior technical specifications of privacy depend on the classification of certain types of information as private or sensitive; privacy policies in these frameworks limit access to data that allow inference of this sensitive information. As the above examples show, today’s data-rich world creates a new kind of problem: it is difficult if not impossible to guarantee that information does not allow inference of sensitive topics. This makes information flow rules based on information topic unstable.

We address the problem of providing a workable definition of private data that takes into account emerging threats to privacy from large-scale data collection systems. We build on Contextual Integrity and its claim that privacy is appropriate information flow, or flow according to socially or legally specified rules.

As in other adaptations of Contextual Integrity (CI) to computer science, the parameterization of social norms in CI is translated into a logical specification. In this work, we depart from CI by considering rules that restrict information flow based on its origin and provenance, instead of on its type, topic, or subject.

We call this concept of privacy as adherence to origin-based rules Origin Privacy. Origin Privacy rules can be found in some existing data protection laws. This motivates the computational implementation of origin-based rules for the simple purpose of compliance engineering. We also formally model origin privacy to determine what security properties it guarantees relative to the concerns that motivate it….(More)”.
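To make the contrast with topic-based rules concrete, here is a minimal Python sketch of an origin-based flow check. It is an illustration only, not the paper's formal model: the `Datum` class, the `ALLOWED_RECIPIENTS` policy table, and the origin and recipient names are all hypothetical. The one idea it tries to capture is that a derived value inherits the origins of everything it was computed from, so a rule attached to an origin still applies to inferences drawn from that data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    value: str
    origins: frozenset  # where the underlying data was first collected


# Hypothetical policy: origin -> recipients allowed to receive data from it.
ALLOWED_RECIPIENTS = {
    "pharmacy_purchase": {"pharmacist"},
    "public_post": {"pharmacist", "advertiser"},
}


def derive(value, *sources):
    """Build a derived datum that carries the origins of all its sources."""
    origins = frozenset().union(*(s.origins for s in sources))
    return Datum(value, origins)


def may_flow(datum, recipient):
    """Allow a flow only if every origin of the datum permits the recipient."""
    return all(
        recipient in ALLOWED_RECIPIENTS.get(origin, set())
        for origin in datum.origins
    )


purchase = Datum("unscented lotion", frozenset({"pharmacy_purchase"}))
post = Datum("enjoys hiking", frozenset({"public_post"}))
inference = derive("possibly pregnant", purchase, post)

print(may_flow(post, "advertiser"))       # rule depends only on origin
print(may_flow(inference, "advertiser"))  # blocked via the purchase origin
```

Note that the check never asks what the inferred value is about; it only asks where its inputs came from, which is the shift the abstract describes.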

Odd Numbers: Algorithms alone can’t meaningfully hold other algorithms accountable

Curated on August 18, 2018 by Stefaan Verhulst

Frank Pasquale at Real Life Magazine: “Algorithms increasingly govern our social world, transforming data into scores or rankings that decide who gets credit, jobs, dates, policing, and much more. The field of “algorithmic accountability” has arisen to highlight the problems with such methods of classifying people, and it has great promise: Cutting-edge work in critical algorithm studies applies social theory to current events; law and policy experts seem to publish new articles daily on how artificial intelligence shapes our lives, and a growing community of researchers has developed a field known as “Fairness, Accuracy, and Transparency in Machine Learning.”

The social scientists, attorneys, and computer scientists promoting algorithmic accountability aspire to advance knowledge and promote justice. But what should such “accountability” more specifically consist of? Who will define it? At a two-day, interdisciplinary roundtable on AI ethics I recently attended, such questions featured prominently, and humanists, policy experts, and lawyers engaged in a free-wheeling discussion about topics ranging from robot arms races to computationally planned economies. But at the end of the event, an emissary from a group funded by Elon Musk and Peter Thiel among others pronounced our work useless. “You have no common methodology,” he informed us (apparently unaware that that’s the point of an interdisciplinary meeting). “We have a great deal of money to fund real research on AI ethics and policy”— which he thought of as dry, economistic modeling of competition and cooperation via technology — “but this is not the right group.” He then gratuitously lashed out at academics in attendance as “rent seekers,” largely because we had the temerity to advance distinctive disciplinary perspectives rather than fall in line with his research agenda.

Most corporate contacts and philanthrocapitalists are more polite, but their sense of what is realistic and what is utopian, what is worth studying and what is mere ideology, is strongly shaping algorithmic accountability research in both social science and computer science. This influence in the realm of ideas has powerful effects beyond it. Energy that could be put into better public transit systems is instead diverted to perfect the coding of self-driving cars. Anti-surveillance activism transmogrifies into proposals to improve facial recognition systems to better recognize all faces. To help payday-loan seekers, developers might design data-segmentation protocols to show them what personal information they should reveal to get a lower interest rate. But the idea that such self-monitoring and data curation can be a trap, disciplining the user in ever finer-grained ways, remains less explored. Trying to make these games fairer, the research elides the possibility of rejecting them altogether….(More)”.

#TrendingLaws: How can Machine Learning and Network Analysis help us identify the “influencers” of Constitutions?

Curated on August 7, 2018 (updated December 11, 2018) by Stefaan Verhulst

Unicef: “New research by scientists from UNICEF’s Office of Innovation — published today in the journal Nature Human Behaviour — applies methods from network science and machine learning to constitutional law. UNICEF Innovation Data Scientists Alex Rutherford and Manuel Garcia-Herranz collaborated with computer scientists and political scientists at MIT, George Washington University, and UC Merced to apply data analysis to the world’s constitutions over the last 300 years. This work sheds new light on how to better understand why countries’ laws change and incorporate social rights…

Data science techniques allow us to use methods like network science and machine learning to uncover patterns and insights that are hard for humans to see. Just as we can map influential users on Twitter — and patterns of relations between places to predict how diseases will spread — we can identify which countries have influenced each other in the past and what are the relations between legal provisions.

Why The Science of Constitutions?

One way UNICEF fulfills its mission is through advocacy with national governments — to enshrine rights for minorities, notably children, formally in law. Perhaps the most renowned example of this is the International Convention on the Rights of the Child (ICRC).

Constitutions, such as Mexico’s 1917 constitution — the first to limit the employment of children — are critical to formalizing rights for vulnerable populations. National constitutions describe the role of a country’s institutions, its character in the eyes of the world, as well as the rights of its citizens.

From a scientific standpoint, the work is an important first step in showing that network analysis and machine learning techniques can be used to better understand the dynamics of caring for and protecting the rights of children — critical to the work we do in a complex and interconnected world. It shows the significant and positive policy implications of using data science to uphold children’s rights.

What the Research Shows:

Through this research, we uncovered:

A network of relationships between countries and their constitutions.
A natural progression of laws — where fundamental rights are a necessary precursor to more specific rights for minorities.
The effect of key historical events in changing legal norms….(More)”.

To Better Predict Traffic, Look to the Electric Grid

Curated on August 2, 2018 by Stefaan Verhulst

Linda Poon at CityLab: “The way we consume power after midnight can reveal how bad the morning rush hour will be….

Commuters check Google Maps for traffic updates the same way they check the weather app for rain predictions. And for good reasons: By pooling information from millions of drivers already on the road, Google can paint an impressively accurate real-time portrait of congestion. Meanwhile, historical numbers can roughly predict when your morning commutes may be particularly bad.

But “the information we extract from traffic data has been exhausted,” said Zhen (Sean) Qian, who directs the Mobility Data Analytics Center at Carnegie Mellon University. He thinks that to more accurately predict how gridlock varies from day to day, there’s a whole other set of data that cities haven’t mined yet: electricity use.

“Essentially we all use the urban system—the electricity, water, the sewage system and gas—and when people use them and how heavily they do is correlated to the way they use the transportation system,” he said. How we use electricity at night, it turns out, can reveal when we leave for work the next day. “So we might be able to get new information that helps explain travel time one or two hours in advance by having a better understanding of human activity.”

When compared with 2014 traffic data, they found that 8 out of the 10 patterns had an impact on highway traffic. Households that show a spike of electricity use from midnight to 2 a.m., for example, may be night owls who sleep in, leave late, and likely won’t contribute to the early morning congestion. In contrast, households that report low electricity use from midnight to 5 a.m., followed by a rise after 5:30 a.m., could be early risers who will be on the road during rush hour. If the researchers’ model detects more households falling into the former group, it might predict that peak congestion will start closer to, say, 7:45 a.m. rather than the usual 7:30….(More)”.
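The household patterns described above can be illustrated with a toy rule-based classifier in Python. This is a hand-written sketch, not the researchers' model: the thresholds (`SPIKE_RATIO`, `LOW_KWH`, `RISE_RATIO`), the hourly reading format, and the simple majority rule for shifting the peak are all assumptions made for illustration. Only the 7:30 and 7:45 times come from the excerpt.

```python
from datetime import time
from statistics import mean

# Hypothetical thresholds, chosen only to make the example work.
SPIKE_RATIO = 2.0   # 0-2 a.m. use must exceed this multiple of later use
LOW_KWH = 0.3       # "low" overnight use, in kWh per hour
RISE_RATIO = 2.0    # morning use must exceed this multiple of overnight use

USUAL_PEAK = time(7, 30)
LATE_PEAK = time(7, 45)


def classify_household(readings):
    """Label a household from 7 hourly kWh readings (hours 0:00 to 6:00)."""
    late_night = mean(readings[0:2])   # midnight to 2 a.m.
    rest = mean(readings[2:7])         # 2 a.m. to 7 a.m.
    overnight = mean(readings[0:5])    # midnight to 5 a.m.
    morning = max(readings[5:7])       # 5 a.m. to 7 a.m.
    if late_night > SPIKE_RATIO * rest:
        return "night_owl"
    if overnight < LOW_KWH and morning > RISE_RATIO * overnight:
        return "early_riser"
    return "other"


def predicted_peak_start(households):
    """Shift the peak later when night owls outnumber early risers."""
    labels = [classify_household(r) for r in households]
    owls = labels.count("night_owl")
    early = labels.count("early_riser")
    return LATE_PEAK if owls > early else USUAL_PEAK


owl = [1.2, 1.1, 0.3, 0.2, 0.2, 0.2, 0.3]
riser = [0.2, 0.2, 0.2, 0.2, 0.2, 0.6, 0.9]
print(predicted_peak_start([owl, owl, riser]))
```

A real forecasting model would presumably learn such patterns from data rather than hard-code them; the sketch only shows the direction of the inference, from overnight usage shape to expected departure time.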

This surprising, everyday tool might hold the key to changing human behavior

Curated on August 1, 2018 (updated October 9, 2018) by Stefaan Verhulst

Annabelle Timsit at Quartz: “To be a person in the modern world is to worry about your relationship with your phone. According to critics, smartphones are making us ill-mannered and sore-necked, dragging parents’ attention away from their kids, and destroying an entire generation.

But phones don’t have to be bad. With 4.68 billion people forecast to become mobile phone users by 2019, nonprofits and social science researchers are exploring new ways to turn our love of screens into a force for good. One increasingly popular option: Using texting to help change human behavior.

Texting: A unique tool

The short message service (SMS) was invented in the late 1980s, and the first text message was sent in 1992. (Engineer Neil Papworth sent “merry Christmas” to then-Vodafone director Richard Jarvis.) In the decades since, texting has emerged as the preferred communication method for many, and in particular younger generations. While that kind of habit-forming can be problematic—47% of US smartphone users say they “couldn’t live without” the device—our attachment to our phones also makes text-based programs a good way to encourage people to make better choices.

“Texting, because it’s anchored in mobile phones, has the ability to be with you all the time, and that gives us an enormous flexibility on precision,” says Todd Rose, director of the Mind, Brain, & Education Program at the Harvard Graduate School of Education. “When people lead busy lives, they need timely, targeted, actionable information.”

And who is busier than a parent? Text-based programs can help current or would-be moms and dads with everything from medication pickup to childhood development. Text4Baby, for example, messages pregnant women and young moms with health information and reminders about upcoming doctor visits. Vroom, an app for building babies’ brains, sends parents research-based prompts to help them build positive relationships with their children (for example, by suggesting they ask toddlers to describe how they’re feeling based on the weather). Muse, an AI-powered app, uses machine learning and big data to try and help parents raise creative, motivated, emotionally intelligent kids. As Jenny Anderson writes in Quartz: “There is ample evidence that we can modify parents’ behavior through technological nudges.”

Research suggests text-based programs may also be helpful in supporting young children’s academic and cognitive development. …Texts aren’t just being used to help out parents. Non-governmental organizations (NGOs) have also used them to encourage civic participation in kids and young adults. Open Progress, for example, has an all-volunteer community called “text troop” that messages young adults across the US, reminding them to register to vote and helping them find their polling location.

Text-based programs are also useful in the field of nutrition, where private companies and public-health organizations have embraced them as a way to give advice on healthy eating and weight loss. The National Cancer Institute runs a text-based program called SmokefreeTXT that sends US adults between three and five messages per day for up to eight weeks, to help them quit smoking.

Texting programs can be a good way to nudge people toward improving their mental health, too. Crisis Text Line, for example, was the first national 24/7 crisis-intervention hotline to conduct counseling conversations entirely over text…(More)”.

We Need Transparency in Algorithms, But Too Much Can Backfire

Curated on July 30, 2018 by Stefaan Verhulst

Kartik Hosanagar and Vivian Jair at Harvard Business Review: “In 2013, Stanford professor Clifford Nass faced a student revolt. Nass’s students claimed that those in one section of his technology interface course received higher grades on the final exam than counterparts in another. Unfortunately, they were right: two different teaching assistants had graded the two different sections’ exams, and one had been more lenient than the other. Students with similar answers had ended up with different grades.

Nass, a computer scientist, recognized the unfairness and created a technical fix: a simple statistical model to adjust scores, where students got a certain percentage boost on their final mark when graded by a TA known to give grades that percentage lower than average. In the spirit of openness, Nass sent out emails to the class with a full explanation of his algorithm. Further complaints poured in, some even angrier than before. Where had he gone wrong?…
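The adjustment described above can be sketched in a few lines of Python. This is one plausible reading of the article's description, not Nass's actual model: the function name `boost_harsh_graders` and the sample data are hypothetical, and leaving scores from lenient graders unchanged is an assumption, since the excerpt only mentions a boost.

```python
from statistics import mean


def boost_harsh_graders(raw_scores, grader_of):
    """Boost each score by the percentage its grader runs below average.

    raw_scores: student -> raw score
    grader_of:  student -> name of the TA who graded that student
    """
    overall = mean(raw_scores.values())

    # Group raw scores by grader and compute each grader's mean.
    by_grader = {}
    for student, score in raw_scores.items():
        by_grader.setdefault(grader_of[student], []).append(score)
    grader_mean = {g: mean(s) for g, s in by_grader.items()}

    adjusted = {}
    for student, score in raw_scores.items():
        # Fraction by which this grader's average falls below the class average.
        shortfall = (overall - grader_mean[grader_of[student]]) / overall
        # Assumption: only harsh graders trigger an adjustment.
        boost = max(shortfall, 0.0)
        adjusted[student] = score * (1 + boost)
    return adjusted


raw = {"ana": 72.0, "ben": 72.0, "cy": 88.0, "di": 88.0}
graders = {"ana": "ta1", "ben": "ta1", "cy": "ta2", "di": "ta2"}
print(boost_harsh_graders(raw, graders))
```

Here the class average is 80 and `ta1` averages 72, which is 10% below it, so `ta1`'s students receive a 10% boost while `ta2`'s scores stay as they are.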

Kizilcec had in fact tested three levels of transparency: low and medium but also high, where the students got not only a paragraph explaining the grading process but also their raw peer-graded scores and how these were each precisely adjusted by the algorithm to get to a final grade. And this is where the results got more interesting. In the experiment, while medium transparency increased trust significantly, high transparency eroded it completely, to the point where trust levels were either equal to or lower than among students experiencing low transparency.

Making Modern AI Transparent: A Fool’s Errand?

What are businesses to take home from this experiment? It suggests that technical transparency – revealing the source code, inputs, and outputs of the algorithm – can build trust in many situations. But most algorithms in the world today are created and managed by for-profit companies, and many businesses regard their algorithms as highly valuable forms of intellectual property that must remain in a “black box.” Some lawmakers have proposed a compromise, suggesting that the source code be revealed to regulators or auditors in the event of a serious problem, and this adjudicator will assure consumers that the process is fair.

This approach merely shifts the burden of belief from the algorithm itself to the regulators. This may be a palatable solution in many arenas: for example, few of us fully understand financial markets, so we trust the SEC to take on oversight. But in a world where decisions large and small, personal and societal, are being handed over to algorithms, this becomes less acceptable.

Another problem with technical transparency is that it makes algorithms vulnerable to gaming. If an instructor releases the complete source code for an algorithm grading student essays, it becomes easy for students to exploit loopholes in the code: maybe, for example, the algorithm seeks evidence that the students have done research by looking for phrases such as “according to published research.” A student might then deliberately use this language at the start of every paragraph in her essay.
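The gaming scenario is easy to reproduce with a toy grader. The Python sketch below is a deliberately naive, hypothetical scoring rule built around the phrase quoted in the article; it does not represent any real essay-grading system.

```python
SIGNAL_PHRASE = "according to published research"


def research_score(essay):
    """Count paragraphs that contain the signal phrase (case-insensitive)."""
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    return sum(SIGNAL_PHRASE in p.lower() for p in paragraphs)


honest = "Flooding is rising.\n\nA 2018 survey of river gauges supports this."

# A student who has read the source code prepends the phrase to every paragraph.
gamed = "\n\n".join(
    f"According to published research, {p}" for p in honest.split("\n\n")
)

print(research_score(honest), research_score(gamed))
```

The gamed essay contains no additional research, yet it scores higher on every paragraph, which is the loophole that full source-code transparency would expose.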

But the biggest problem is that modern AI is making source code – transparent or not – less relevant compared with other factors in algorithmic functioning. Specifically, machine learning algorithms – and deep learning algorithms in particular – are usually built on just a few hundred lines of code. The algorithm’s logic is mostly learned from training data and is rarely reflected in its source code. Which is to say, some of today’s best-performing algorithms are often the most opaque. High transparency might involve getting our heads around reams and reams of data – and then still only being able to guess at what lessons the algorithm has learned from it.

This is where Kizilcec’s work becomes relevant – a way to embrace rather than despair over deep learning’s impenetrability. His work shows that users will not trust black box models, but they don’t need – or even want – extremely high levels of transparency. That means responsible companies need not fret over what percentage of source code to reveal, or how to help users “read” massive datasets. Instead, they should work to provide basic insights on the factors driving algorithmic decisions….(More)”