Content Volatility of Scientific Topics in Wikipedia: A Cautionary Tale


Paper by Wilson AM and Likens GE at PLOS: “Wikipedia has quickly become one of the most frequently accessed encyclopedic references, despite the ease with which content can be changed and the potential for ‘edit wars’ surrounding controversial topics. Little is known about how this potential for controversy affects the accuracy and stability of information on scientific topics, especially those with associated political controversy. Here we present an analysis of the Wikipedia edit histories for seven scientific articles and show that topics we consider politically but not scientifically “controversial” (such as evolution and global warming) experience more frequent edits with more words changed per day than pages we consider “noncontroversial” (such as the standard model in physics or heliocentrism). For example, over the period we analyzed, the global warming page was edited on average (geometric mean ±SD) 1.9±2.7 times resulting in 110.9±10.3 words changed per day, while the standard model in physics was only edited 0.2±1.4 times resulting in 9.4±5.0 words changed per day. The high rate of change observed in these pages makes it difficult for experts to monitor accuracy and contribute time-consuming corrections, to the possible detriment of scientific accuracy. As our society turns to Wikipedia as a primary source of scientific information, it is vital we read it critically and with the understanding that the content is dynamic and vulnerable to vandalism and other shenanigans….(More)”
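The abstract reports edit rates as a geometric mean ±SD. As a minimal sketch of how such a summary could be computed, assuming the SD is a geometric (multiplicative) standard deviation taken over strictly positive daily values, the snippet below works on made-up numbers. The function name `geometric_mean_sd` and the sample data are hypothetical, and the paper's own handling of days with zero edits is not stated in the excerpt.

```python
import math

def geometric_mean_sd(values):
    """Return (geometric mean, geometric SD) for strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric statistics need at least one value, all > 0")
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    # Population variance of the logs (assumption: an n-1 sample variance
    # would be an equally plausible choice).
    var_log = sum((x - mean_log) ** 2 for x in logs) / len(logs)
    return math.exp(mean_log), math.exp(math.sqrt(var_log))

# Hypothetical words-changed-per-day values, NOT data from the paper.
daily_words_changed = [10, 100, 1000]
gm, gsd = geometric_mean_sd(daily_words_changed)
print(f"geometric mean = {gm:.1f}, geometric SD factor = {gsd:.2f}")
```

Under this reading the SD is a factor rather than an additive spread: roughly two thirds of log-normally distributed values would fall between the mean divided by the factor and the mean multiplied by it.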

5 Tips for Designing a Data for Good Initiative


Mitul Desai at Mastercard Center for Inclusive Growth: “The transformative impact of data on development projects, captured in the hashtag #DATARevolution, offers the social and private sectors alike a rallying point to enlist data in the service of high-impact development initiatives.

To help organizations design initiatives that are authentic to their identity and capabilities, we’re sharing what’s necessary to navigate the deeply interconnected organizational, technical and ethical aspects of creating a Data for Good initiative.

1) Define the need

At the center of a Data for Good initiative are the individual beneficiaries you are seeking to serve. This is the foundation on which the “Good” of Data for Good rests.

Understanding the data and expertise needed to better serve such individuals will bring into focus the areas where your organization can contribute and the partners you might engage. As we’ve covered in past posts, collaboration between agents who bring different layers of expertise to Data for Good projects is a powerful formula for change….

2) Understand what data can make a difference

Think about what kind of data can tell a story that’s relevant to your mission. Claudia Perlich of Dstillery says: “The question is first and foremost, what decision do I have to make and which data can tell me something about that decision.” This great introduction to what different kinds of data are relevant in different settings can give you concrete examples.

3) Get the right tools for the job

By one estimate, some 90% of business-relevant data are unstructured or semi-structured (think texts, tweets, images, audio) as opposed to structured data like numbers that easily fit into the lines of a spreadsheet. Perlich notes that while it’s more challenging to mine these unstructured data, they can yield especially powerful insights with the right tools, which thankfully aren’t that hard to identify…..

4) Build a case that moves your organization

“While our programs are designed to serve organizations no matter what their capacity, we do find that an organization’s clarity around mission and commitment to using data to drive decision-making are two factors that can make or break a project,” says Jake Porway, founder and executive director of DataKind, a New York-based data science nonprofit that helps organizations develop Data for Good initiatives…..

5) Make technology serve people-centric ethics

The two most critical ethical factors to consider are informed consent and privacy—both require engaging the community you wish to serve as individual actors….

“Employ data-privacy walls, mask the data from the point of collection and encrypt the data you store. Ensure that appropriate technical and organizational safeguards are in place to verify that the data can’t be used to identify individuals or target demographics in a way that could harm them,” recommends Quid’s Pedraza. To understand the technology of data encryption and masking, check out this post. (More)”
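The advice to mask data at the point of collection can be illustrated with a small sketch. It assumes masking means replacing direct identifiers with keyed hashes (pseudonyms) before a record is stored. The field names, the `mask_record` helper and the sample record are hypothetical, and encryption of the stored data is a separate step not shown here.

```python
import hashlib
import hmac
import secrets

def mask_identifier(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (a pseudonym)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_record(record: dict, key: bytes, sensitive_fields: set) -> dict:
    """Return a copy of the record with the sensitive fields pseudonymized."""
    return {
        field: mask_identifier(str(value), key) if field in sensitive_fields else value
        for field, value in record.items()
    }

# The key must be stored separately from the data; here it is generated ad hoc.
key = secrets.token_bytes(32)
# Hypothetical record, for illustration only.
record = {"name": "Jane Doe", "phone": "555-0100", "district": "North"}
masked = mask_record(record, key, {"name", "phone"})
print(masked)
```

A keyed hash keeps records linkable (the same person maps to the same pseudonym) without storing the raw identifier. It is not full anonymization: remaining fields such as a district can still allow re-identification when combined with other data.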

President Obama Signs Executive Order Making Presidential Innovation Fellows Program Permanent


White House Press Release: “My hope is this continues to encourage a culture of public service among our innovators, and tech entrepreneurs, so that we can keep building a government that’s as modern, as innovative, and as engaging as our incredible tech sector is. To all the Fellows who’ve served so far – thank you. I encourage all Americans with bold ideas to apply. And I can’t wait to see what those future classes will accomplish on behalf of the American people.” – President Barack Obama

Today, President Obama signed an executive order that makes the Presidential Innovation Fellows Program a permanent part of the Federal government going forward. The program brings executives, entrepreneurs, technologists, and other innovators into government, and teams them up with Federal employees to improve programs that serve more than 150 million Americans.

The Presidential Innovation Fellows Program is built on four key principles:

  • Recruit the best our nation has to offer: Fellows include entrepreneurs, startup founders, and innovators with experience at large technology companies and startups, each of whom leverages their proven skills and technical expertise to create huge value for the public.
  • Partner with innovators inside government: Working as teams, the Presidential Innovation Fellows and their partners across the government create products and services that are responsive, user-friendly, and help to improve the way the Federal government interacts with the American people.
  • Deploy proven private sector strategies: Fellows leverage best practices from the private sector to deliver better, more effective programs and policies across the Federal government.
  • Focus on some of the Nation’s biggest and most pressing challenges: Projects focus on topics such as improving access to education, fueling job creation and the economy, and expanding the public’s ability to access their personal health data.

Additional Details on Today’s Announcements

The Executive Order formally establishes the Presidential Innovation Fellows Program within the General Services Administration (GSA), where it will continue to serve departments and agencies throughout the Executive Branch. The Presidential Innovation Fellows Program will be administered by a Director and guided by a newly-established Advisory Board. The Director will outline steps for the selection, hiring, and deployment of Fellows within government….

Fellows have partnered with leaders at more than 25 government agencies, delivering impressive results in months, not years, driving extraordinary work and innovative solutions in areas such as health care; open data and data science; crowd-sourcing initiatives; education; veterans affairs; jobs and the economy; and disaster response and recovery. Examples of projects include:

Open Data

When government acts as a platform, entrepreneurs, startups, and the private sector can build value-added services and tools on top of federal datasets supported by federal policies. Taking this approach, Fellows and agency stakeholders have supported the creation of new products and services focused on education, health, the environment, and social justice. As a result of their efforts and the agencies they have worked with:….

Jobs and the Economy

Fellows continue to work on solutions that will give the government better access to innovative tools and services. This is also helping small and medium-sized companies create jobs and compete for Federal government contracts….

Digital Government

The Presidential Innovation Fellows Program is a part of the Administration’s strategy to create lasting change across the Federal Government by improving how it uses technology. The Fellows played a part in launching 18F within the General Services Administration (GSA) and the U.S. Digital Services (USDS) team within the Office of Management and Budget….

Supporting Our Veterans

  • …Built a one-stop shop for finding employment opportunities. The Veterans Employment Center was developed by a team of Fellows working with the Department of Veterans Affairs in connection with the First Lady’s Joining Forces Initiative and the Department of Labor. This is the first interagency website connecting Veterans, transitioning Servicemembers, and their spouses to meaningful employment opportunities. The portal has resulted in cost savings of over $27 million to the Department of Veterans Affairs.

Education

  • …More than 1,900 superintendents pledged to more effectively leverage education technology in their schools. Fellows working at the Department of Education helped develop the idea of Future Ready, which later informed the creation of the Future Ready District Pledge. The Future Ready District Pledge is designed to set out a roadmap to achieve successful personalized digital learning for every student and to commit districts to move as quickly as possible towards our shared vision of preparing students for success. Following the President’s announcement of this effort in 2014, more than 1,900 superintendents have signed this pledge, representing 14 million students.

Health and Patient Care

  • More than 150 million Americans are able to access their health records online. Multiple rounds of Fellows have worked with the Department of Health and Human Services (HHS) and the Department of Veterans Affairs (VA) to expand the reach of the Blue Button Initiative. As a result, patients are able to access their electronic health records to make more informed decisions about their own health care. The Blue Button Initiative has received more than 600 commitments from organizations to advance health information access efforts across the country and has expanded into other efforts that support health care system interoperability….

Disaster Response and Recovery

  • Communities are piloting crowdsourcing tools to assess damage after disasters. Fellows developed the GeoQ platform with FEMA and the National Geospatial-Intelligence Agency that crowdsources photos of disaster-affected areas to assess damage over large regions.  This information helps the Federal government better allocate critical response and recovery efforts following a disaster and allows local governments to use geospatial information in their communities…. (More)

The Last Mile: Creating Social and Economic Value from Behavioral Insights


New book by Dilip Soman: “Most organizations spend much of their effort on the start of the value creation process: namely, creating a strategy, developing new products or services, and analyzing the market. They pay a lot less attention to the end: the crucial “last mile” where consumers come to their website, store, or sales representatives and make a choice.

In The Last Mile, Dilip Soman shows how to use insights from behavioral science in order to close that gap. Beginning with an introduction to the last mile problem and the concept of choice architecture, the book takes a deep dive into the psychology of choice, money, and time. It explains how to construct behavioral experiments and understand the data on preferences that they provide. Finally, it provides a range of practical tools with which to overcome common last mile difficulties.

The Last Mile helps lay readers not only to understand behavioral science, but to apply its lessons to their own organizations’ last mile problems, whether they work in business, government, or the nonprofit sector. Appealing to anyone who was fascinated by Dan Ariely’s Predictably Irrational, Richard Thaler and Cass Sunstein’s Nudge, or Daniel Kahneman’s Thinking, Fast and Slow but was not sure how those insights could be practically used, The Last Mile is full of solid, practical advice on how to put the lessons of behavioral science to work….(More)”

Big data algorithms can discriminate, and it’s not clear what to do about it


At the Conversation: “This program had absolutely nothing to do with race…but multi-variable equations.”

That’s what Brett Goldstein, a former policeman for the Chicago Police Department (CPD) and current Urban Science Fellow at the University of Chicago’s School for Public Policy, said about a predictive policing algorithm he deployed at the CPD in 2010. His algorithm tells police where to look for criminals based on where people have been arrested previously. It’s a “heat map” of Chicago, and the CPD claims it helps them allocate resources more effectively.

Chicago police also recently collaborated with Miles Wernick, a professor of electrical engineering at Illinois Institute of Technology, to algorithmically generate a “heat list” of 400 individuals it claims have the highest chance of committing a violent crime. In response to criticism, Wernick said the algorithm does not use “any racial, neighborhood, or other such information” and that the approach is “unbiased” and “quantitative.” By deferring decisions to poorly understood algorithms, industry professionals effectively shed accountability for any negative effects of their code.

But do these algorithms discriminate, treating low-income and black neighborhoods and their inhabitants unfairly? It’s the kind of question many researchers are starting to ask as more and more industries use algorithms to make decisions. It’s true that an algorithm itself is quantitative – it boils down to a sequence of arithmetic steps for solving a problem. The danger is that these algorithms, which are trained on data produced by people, may reflect the biases in that data, perpetuating structural racism and negative biases about minority groups.

There are a lot of challenges to figuring out whether an algorithm embodies bias. First and foremost, many practitioners and “computer experts” still don’t publicly admit that algorithms can easily discriminate. More and more evidence supports that not only is this possible, but it’s happening already. The law is unclear on the legality of biased algorithms, and even algorithms researchers don’t precisely understand what it means for an algorithm to discriminate….

While researchers clearly understand the theoretical dangers of algorithmic discrimination, it’s difficult to cleanly measure the scope of the issue in practice. No company or public institution is willing to publicize its data and algorithms for fear of being labeled racist or sexist, or maybe worse, having a great algorithm stolen by a competitor.

Even when the Chicago Police Department was hit with a Freedom of Information Act request, they did not release their algorithms or heat list, claiming a credible threat to police officers and the people on the list. This makes it difficult for researchers to identify problems and potentially provide solutions.

Legal hurdles

Existing discrimination law in the United States isn’t helping. At best, it’s unclear on how it applies to algorithms; at worst, it’s a mess. Solon Barocas, a postdoc at Princeton, and Andrew Selbst, a law clerk for the Third Circuit US Court of Appeals, argued together that US hiring law fails to address claims about discriminatory algorithms in hiring.

The crux of the argument is called the “business necessity” defense, in which the employer argues that a practice that has a discriminatory effect is justified by being directly related to job performance….(More)”

Mining Administrative Data to Spur Urban Revitalization


New paper by Ben Green presented at the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: “After decades of urban investment dominated by sprawl and outward growth, municipal governments in the United States are responsible for the upkeep of urban neighborhoods that have not received sufficient resources or maintenance in many years. One of city governments’ biggest challenges is to revitalize decaying neighborhoods given only limited resources. In this paper, we apply data science techniques to administrative data to help the City of Memphis, Tennessee improve distressed neighborhoods. We develop new methods to efficiently identify homes in need of rehabilitation and to predict the impacts of potential investments on neighborhoods. Our analyses allow Memphis to design neighborhood-improvement strategies that generate greater impacts on communities. Since our work uses data that most US cities already collect, our models and methods are highly portable and inexpensive to implement. We also discuss the challenges we encountered while analyzing government data and deploying our tools, and highlight important steps to improve future data-driven efforts in urban policy….(More)”

Push, Pull, and Spill: A Transdisciplinary Case Study in Municipal Open Government


New paper by Jan Whittington et al: “Cities hold considerable information, including details about the daily lives of residents and employees, maps of critical infrastructure, and records of the officials’ internal deliberations. Cities are beginning to realize that this data has economic and other value: If done wisely, the responsible release of city information can also release greater efficiency and innovation in the public and private sector. New services are cropping up that leverage open city data to great effect.

Meanwhile, activist groups and individual residents are placing increasing pressure on state and local government to be more transparent and accountable, even as others sound an alarm over the privacy issues that inevitably attend greater data promiscuity. This takes the form of political pressure to release more information, as well as increased requests for information under the many public records acts across the country.

The result of these forces is that cities are beginning to open their data as never before. It turns out there is surprisingly little research to date into the important and growing area of municipal open data. This article is among the first sustained, cross-disciplinary assessments of an open municipal government system. We are a team of researchers in law, computer science, information science, and urban studies. We have worked hand-in-hand with the City of Seattle, Washington for the better part of a year to understand its current procedures from each disciplinary perspective. Based on this empirical work, we generate a set of recommendations to help the city manage risk latent in opening its data….(More)”

Algorithms and Bias


Q. and A. With Cynthia Dwork in the New York Times: “Algorithms have become one of the most powerful arbiters in our lives. They make decisions about the news we read, the jobs we get, the people we meet, the schools we attend and the ads we see.

Yet there is growing evidence that algorithms and other types of software can discriminate. The people who write them incorporate their biases, and algorithms often learn from human behavior, so they reflect the biases we hold. For instance, research has shown that ad-targeting algorithms have shown ads for high-paying jobs to men but not women, and ads for high-interest loans to people in low-income neighborhoods.

Cynthia Dwork, a computer scientist at Microsoft Research in Silicon Valley, is one of the leading thinkers on these issues. In an Upshot interview, which has been edited, she discussed how algorithms learn to discriminate, who’s responsible when they do, and the trade-offs between fairness and privacy.

Q: Some people have argued that algorithms eliminate discrimination because they make decisions based on data, free of human bias. Others say algorithms reflect and perpetuate human biases. What do you think?

A: Algorithms do not automatically eliminate bias. Suppose a university, with admission and rejection records dating back for decades and faced with growing numbers of applicants, decides to use a machine learning algorithm that, using the historical records, identifies candidates who are more likely to be admitted. Historical biases in the training data will be learned by the algorithm, and past discrimination will lead to future discrimination.
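The admissions example in this answer can be made concrete with a toy sketch. It assumes a deliberately simple "model" that memorizes historical admission rates per group and score band. The records, group labels and function names below are hypothetical and are not drawn from the interview or from any real dataset.

```python
from collections import defaultdict

# Hypothetical historical records: (group, score_band, admitted).
# Both groups have the same score band, but group B was admitted less often.
history = [
    ("A", "high", True), ("A", "high", True), ("A", "high", True), ("A", "high", False),
    ("B", "high", True), ("B", "high", False), ("B", "high", False), ("B", "high", False),
]

def fit_admit_rates(records):
    """Learn the historical admission rate for each (group, score_band) pair."""
    counts = defaultdict(lambda: [0, 0])  # key -> [admitted, total]
    for group, band, admitted in records:
        counts[(group, band)][0] += int(admitted)
        counts[(group, band)][1] += 1
    return {key: admitted / total for key, (admitted, total) in counts.items()}

def predict_admit(rates, group, band, threshold=0.5):
    """Predict admission when the learned historical rate meets the threshold."""
    return rates.get((group, band), 0.0) >= threshold

rates = fit_admit_rates(history)
for group in ("A", "B"):
    print(group, rates[(group, "high")], predict_admit(rates, group, "high"))
```

Two applicants with the same score band receive different predictions purely because the historical records treated their groups differently, which is the mechanism the answer describes.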

Q: Are there examples of that happening?

A: A famous example of a system that has wrestled with bias is the resident matching program that matches graduating medical students with residency programs at hospitals. The matching could be slanted to maximize the happiness of the residency programs, or to maximize the happiness of the medical students. Prior to 1997, the match was mostly about the happiness of the programs.

This changed in 1997 in response to “a crisis of confidence concerning whether the matching algorithm was unreasonably favorable to employers at the expense of applicants, and whether applicants could ‘game the system,’ ” according to a paper by Alvin Roth and Elliott Peranson published in The American Economic Review.
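How a matching can be slanted toward one side can be sketched with the textbook deferred acceptance procedure, in which the side that proposes gets its best stable outcome. This is a simplified one-to-one version with hypothetical students and programs, an assumption made for illustration. The real residency match also handles program capacities, couples and other complications that are not modeled here.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance; returns {proposer: receiver}."""
    # rank[r][p] is receiver r's ranking of proposer p (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}  # receiver -> proposer it is currently holding
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if p not in rank[r]:
            free.append(p)  # r does not list p at all
        elif current is None:
            held[r] = p
        elif rank[r][p] < rank[r][current]:
            held[r] = p  # r trades up and releases the proposer it held
            free.append(current)
        else:
            free.append(p)
    return {p: r for r, p in held.items()}

# Hypothetical preferences: each side's first choice is the other side's last.
students = {"s1": ["p1", "p2"], "s2": ["p2", "p1"]}
programs = {"p1": ["s2", "s1"], "p2": ["s1", "s2"]}

student_optimal = deferred_acceptance(students, programs)
program_optimal = deferred_acceptance(programs, students)
print(student_optimal)
print(program_optimal)
```

With these preferences, every student gets a first choice when students propose, and every program gets its first choice when programs propose. Both outcomes are stable, so the choice of proposing side is a design decision about whose preferences are favored.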

Q: You have studied both privacy and algorithm design, and co-wrote a paper, “Fairness Through Awareness,” that came to some surprising conclusions about discriminatory algorithms and people’s privacy. Could you summarize those?

A: “Fairness Through Awareness” makes the observation that sometimes, in order to be fair, it is important to make use of sensitive information while carrying out the classification task. This may be a little counterintuitive: The instinct might be to hide information that could be the basis of discrimination….

Q: The law protects certain groups from discrimination. Is it possible to teach an algorithm to do the same?

A: This is a relatively new problem area in computer science, and there are grounds for optimism — for example, resources from the Fairness, Accountability and Transparency in Machine Learning workshop, which considers the role that machines play in consequential decisions in areas like employment, health care and policing. This is an exciting and valuable area for research. …(More)”

Citizen Science used in studying Seasonal Variation in India


Rohin Daswani at the Commons Lab, Woodrow Wilson International Center for Scholars: “Climate change has started affecting many countries around the world. While every country is susceptible to the risks of global warming, some countries, such as India, are especially vulnerable.

India’s sheer dependence on rainfall to irrigate its vast agricultural lands and to feed its economy makes it highly vulnerable to climate change. A report from the UN Intergovernmental Panel on Climate Change (IPCC) predicts global temperature will increase between 0.3 and 4.8 degrees Celsius and sea levels will rise 82cm (32 in) by the late 21st century. But what effect will the changing rainfall pattern have on the seasonal variation?

One way to study seasonal variation in India is to analyze the changing patterns of flowering and fruiting of common trees like the Mango and Amaltas trees. SeasonWatch, a program of the National Centre for Biological Sciences (NCBS), the biological wing of the Tata Institute of Fundamental Research, does exactly that. It is an India-wide program that studies the changing seasons by monitoring the seasonal cycles of flowering, fruiting and leaf flush of common trees. And how does it do that? It does it by utilizing the idea of Citizen Science. Anybody, be it children or adults, interested in trees and the effects of climate change can participate. All they have to do is register, select a tree near them and monitor it every week. The data is uploaded to a central website and is analyzed for changing patterns of plant life, and the effects of climate change on plant life cycle. The data is also open source so anyone can get access to it if they wish to. With all this information one could answer questions which were previously impossible to answer such as:

  • How does the flowering of Neem change across India?
  • Is fruiting of Tamarind different in different parts of the country depending on rainfall in the previous year?
  • Is year to year variation in flowering and fruiting time of Mango related to Winter temperatures?

Using Citizen Science and crowdsourcing, programs such as SeasonWatch have expanded the scope and work of conservation biology in various ecosystems across India….(More)”

ENGAGE: Building and Harnessing Networks for Social Impact


Faizal Karmali and Claudia Juech at the Rockefeller Foundation: “Have you heard of ‘X’ organization? They’re doing interesting work that you should know about. You might even want to work together.”

Words like these abound between individuals at conferences, at industry events, in email, and, all too often, trapped in the minds of those who see the potential in connecting the dots. Bridging individuals, organizations, or ideas is fulfilling because these connections often result in value for everyone, sometimes immediately, but often over the long term. While many of us can think of that extraordinary network connector in our personal or professional circles, if asked to identify an organization that plays a similar role at scale, across multiple sectors, we may be hard-pressed to name more than a few—let alone understand how they do it well….

In an effort to capture and codify the growing breadth of knowledge and experience around leveraging networks for social impact, the Monitor Institute, a part of Deloitte Consulting, with support from The Rockefeller Foundation, has produced ENGAGE: How Funders Can Support and Leverage Networks for Social Impact, an online guide which offers a series of frameworks, tools, insights, and stories to help funders explore the critical questions around using networks as part of their grantmaking strategy, particularly as a means to accelerating impact….

ENGAGE draws on the experience and knowledge of over 40 leaders and practitioners in the field who are using networks to create change; digs into the deep pool of writing on the topic; and mines the significant experience in working with networks that is resident in both Monitor Institute and The Rockefeller Foundation. The result is an aggregation and synthesis of some of the leading thinking in both the theory and practice of engaging with networks as a grantmaker.

Compelling examples of how the Foundation leverages the power of networks can be seen in the creation of formal network institutions like the Global Impact Investing Network (GIIN) and the Joint Learning Network for Universal Health Coverage, but also through more targeted and time-bound network engagement activities, such as enabling greater connectivity among grantees and unleashing the power of technology to surface innovation from loosely curated crowds.

Building and harnessing networks is more an art than a science. It is our hope that ENGAGE will enable grantmakers and other network practitioners to be more deliberate and thoughtful about how and when a network can help accelerate their work…. (More)