Foundation Transparency: Game Over?


Brad Smith at Glass Pockets (Foundation Center): “The tranquil world of America’s foundations is about to be shaken, but if you read the Center for Effective Philanthropy’s (CEP) new study — Sharing What Matters, Foundation Transparency — you would never know it.

Don’t get me wrong. That study, like everything CEP produces, is carefully researched, insightful and thoroughly professional. But it misses the single biggest change in foundation transparency in decades: the imminent release by the Internal Revenue Service of foundation 990-PF (and 990) tax returns as machine-readable open data.

Clara Miller, President of the Heron Foundation, writes eloquently in her manifesto, Building a Foundation for the 21st Century: “…the private foundation model was designed to be protective and separate, much like a terrarium.”

Terrariums, of course, are highly “curated” environments over which their creators have complete control. The CEP study proves that point: much of it consists of interviews with foundation leaders and reviews of their websites, as if transparency were an optional endeavor in which foundations may choose whether, and to what degree, to participate.

To be fair, CEP also interviewed the grantees of various foundations (sometimes referred to as “partners”), which helps convey the reality that foundations have stakeholders beyond their four walls. However, the terrarium metaphor is about to become far more relevant as the release of 990 tax returns as open data will literally make it possible for anyone to look right through those glass walls to the curated foundation world within.

What Is Open Data?

It is safe to say that most foundation leaders and a fair majority of their staff do not understand what open data really is. Open data is free, yes, but more importantly it is digital and machine-readable. This means it can be consumed in enormous volumes at lightning speed, directly by computers.

Once consumed, open data can be tagged, sorted, indexed and searched using statistical methods to make obvious comparisons while discovering previously undetected correlations. Anyone with a computer, some coding skills and a hard drive or cloud storage can access open data. In today’s world, a lot of people meet those requirements, and they are free to do whatever they please with your information once it is, as open data enthusiasts like to say, “in the wild.”

What is the Internal Revenue Service Releasing?

Thanks to the Aspen Institute’s leadership of a joint effort – funded by foundations and including Foundation Center, GuideStar, the National Center for Charitable Statistics, the Johns Hopkins Center for Civil Society Studies, and others – the IRS has started to make some 1,000,000 Form 990s and 40,000 Form 990-PFs available as machine-readable open data.

Previously, all Form 990s had been released as image (TIFF) files, essentially pictures, making it both time-consuming and expensive to extract useful data from them. Credit where credit is due: a kick in the butt in the form of a lawsuit from open data crusader Carl Malamud helped speed the process along.

The current test phase includes only those tax returns that were digitally filed by nonprofits and community foundations (990s) and private foundations (990-PFs). Over time, the IRS will phase in a mandatory digital filing requirement for all Form 990s, and the intent is to release them all as open data. In other words, that which is born digital will be opened up to the public in digital form. Because of variations in the 990 forms, getting the information from them into a database will still require some technical expertise, but will be far more feasible and faster than ever before.
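
A minimal sketch of what “machine-readable” buys in practice: pulling a few fields out of a single e-filed return, assuming the filing arrives as an XML document. The element names below are illustrative placeholders, since the actual 990 schemas vary by form type and version.

```python
# Hedged sketch: extract a few fields from one e-filed 990 return (XML).
# Element names are illustrative placeholders; real schemas vary by version.
# The "{*}" namespace wildcard requires Python 3.8+.
import xml.etree.ElementTree as ET

FIELDS = {
    "ein": ".//{*}EIN",
    "organization_name": ".//{*}BusinessNameLine1Txt",
    "total_assets_eoy": ".//{*}TotalAssetsEOYAmt",
}

def extract_fields(xml_path):
    """Return the fields above from one filing, or None where missing."""
    root = ET.parse(xml_path).getroot()
    record = {}
    for name, path in FIELDS.items():
        node = root.find(path)
        record[name] = node.text if node is not None else None
    return record

if __name__ == "__main__":
    print(extract_fields("sample_990.xml"))  # hypothetical local filing
```

Looped over a directory of filings, an extractor of this shape is what turns a million returns into rows in a database.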

The Good

The work of organizations like Foundation Center, which have built expensive infrastructure to turn years of 990 tax returns into information that can be used by nonprofits looking for funding, researchers trying to understand the role of foundations, and foundations themselves seeking to benchmark against their peers, will be transformed.

Work will shift away from the mechanics of capturing and processing the data to higher-level analysis and visualization to stimulate the generation and sharing of new insights and knowledge. This will fuel greater collaboration between peer organizations, innovation, the merging of previously disparate bodies of data, better philanthropy, and a stronger social sector… (more)

 

Legal Aid With a Digital Twist


Tina Rosenberg in the New York Times: “Matthew Stubenberg was a law student at the University of Maryland in 2010 when he spent part of a day doing expungements. It was a standard law school clinic where students learn by helping clients — in this case, he helped them to fill out and file petitions to erase parts of their criminal records. (Last week I wrote about the lifelong effects of these records, even if there is no conviction, and the expungement process that makes them go away.)

Although Maryland has a public database called Case Search, using that data to fill out the forms was tedious. “We spent all this time moving data from Case Search onto our forms,” Stubenberg said. “We spent maybe 30 seconds on the legal piece. Why could this not be easier? This was a problem that could be fixed by a computer.”

Stubenberg knew how to code. After law school, he set out to build software that automatically did that tedious work. By September 2014 he had a prototype for MDExpungement, which went live in January 2015. (The website is not pretty — Stubenberg is a programmer, not a designer.)

With MDExpungement, entering a case number brings it up on Case Search. The software then determines whether the case is expungeable. If so, the program automatically transfers the information from Case Search to the expungement form. All that’s left is to print, sign and file it with the court.
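
The mechanics Stubenberg describes reduce to three steps: fetch the case record, test it against eligibility rules, and map its fields onto the petition form. The sketch below illustrates that shape; the rules, field names, and sample case are simplified stand-ins, not MDExpungement’s actual code or Maryland’s actual criteria.

```python
# Illustrative sketch of the fetch -> check -> fill workflow described above.
# Eligibility rules and form fields are simplified placeholders, not
# MDExpungement's code or Maryland's real statutory criteria.
from dataclasses import dataclass

@dataclass
class CaseRecord:
    case_number: str
    defendant_name: str
    disposition: str   # e.g. "Acquitted", "Nolle Prosequi", "Guilty"
    charge: str

def is_expungeable(case: CaseRecord) -> bool:
    # Stand-in for the statutory eligibility check.
    return case.disposition in {"Acquitted", "Dismissed", "Nolle Prosequi"}

def fill_petition(case: CaseRecord) -> dict:
    # Map the fetched record onto the petition form's fields.
    return {
        "Petitioner Name": case.defendant_name,
        "Case Number": case.case_number,
        "Charge": case.charge,
        "Basis for Expungement": case.disposition,
    }

# A hypothetical record, as if pulled from Case Search by case number.
record = CaseRecord("0B12345678", "Jane Doe", "Nolle Prosequi", "Theft: less than $100")
if is_expungeable(record):
    print(fill_petition(record))  # left to do: print, sign, and file
```

The hard part in practice is the first two steps: reliably pulling data out of Case Search and encoding the statute’s many edge cases.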

In October 2015 a change in Maryland law made more cases eligible for expungement. Between then and March 2016, people filed 7,600 petitions to have their criminal records removed in Baltimore City District Court. More than two-thirds of them came from MDExpungement.

“With the ever-increasing amount of expungements we’re all doing, the app has just made it a lot easier,” said Mary-Denise Davis, a public defender in Baltimore. “I put in a case number and it fills the form out for me. Like magic.”

The rise of online legal forms may not be a gripping subject, but it matters. Tens of millions of Americans need legal help for civil problems — they need a divorce, child support or visitation, protection from abuse or a stay of eviction. They must hold off debt collectors or foreclosure, or get government benefits….(more)

Could a tweet or a text increase college enrollment or student achievement?


At The Conversation: “Can a few text messages, a timely email or a letter increase college enrollment and student achievement? Such “nudges,” designed carefully using behavioral economics, can be effective.

But when do they work – and when not?

Barriers to success

Consider students who have just graduated high school intending to enroll in college. Even among those who have been accepted to college, 15 percent of low-income students do not enroll by the next fall. For the large share who intend to enroll in community colleges, this number can be as high as 40 percent….

Can a few text messages or a timely email overcome these barriers? My research uses behavioral economics to design low-cost, scalable interventions aimed at improving education outcomes. Behavioral economics suggests several important features to make a nudge effective: simplify complex information, make tasks easier to complete and ensure that support is timely.

So, what makes for an effective nudge?

Improving college enrollment

In 2012, researchers Ben Castleman and Lindsay Page sent 10 text messages to nearly 2,000 college-intending students the summer after high school graduation. These messages provided just-in-time reminders on key financial aid, housing and enrollment deadlines from early July to mid-August.

Instead of set meetings with counselors, students could reply to messages and receive on-demand support from college guidance counselors to complete key tasks.

In another intervention – the Expanding College Opportunities Project (ECO) – researchers Caroline Hoxby and Sarah Turner worked to help high-achieving, low-income students enroll in colleges on par with their achievement. The intervention reached students as a packet in the mail.

The mailer simplified information by providing a list of colleges tailored to each student’s location along with information about net costs, graduation rates, and application deadlines. Moreover, the mailer included easy-to-claim application fee waivers. All these features reduced both the complexity and the cost of applying to a wider range of colleges.

In both cases, researchers found that the nudges significantly improved college outcomes. College enrollment went up by 15 percent in the intervention designed to reduce summer melt for community college students. The ECO project increased the likelihood of admission to a selective college by 78 percent.

When there is no impact

While these interventions are promising, there are important caveats.

For instance, our preliminary findings from ongoing research show that information alone may not be enough. We sent emails and letters to more than one hundred thousand college applicants about financial aid and education-related tax benefits. However, we didn’t provide any additional support to help families through the process of claiming these benefits.

In other words, we didn’t provide any support to complete the tasks – no fee waivers, no connection to guidance counselors – just the email and the letter. Without this support to answer questions or help families complete forms to claim the benefits, we found no impact, even when students opened the emails.

More generally, “nudges” often lead to modest impacts and should be considered only a part of the solution. But there’s a dearth of low-cost, scalable interventions in education, and behavioral economics can help.

Identifying the crucial decision points – when applications are due, forms need to be filled out or school choices are made – and supplying just-in-time support to families is key….(More).”

Robot Regulators Could Eliminate Human Error


From the San Francisco Chronicle and Regblog: “Long a fixture of science fiction, artificial intelligence is now part of our daily lives, even if we do not realize it. Through the use of sophisticated machine learning algorithms, for example, computers now work to filter out spam messages automatically from our email. Algorithms also identify us by our photos on Facebook, match us with new friends on online dating sites, and suggest movies to watch on Netflix.

These uses of artificial intelligence hardly seem very troublesome. But should we worry if government agencies start to use machine learning?

Complaints abound even today about the uncaring “bureaucratic machinery” of government. Yet seeing how machine learning is starting to replace jobs in the private sector, we can easily fathom a literal machinery of government in which decisions once made by human public servants are increasingly made by machines.

Technologists warn of an impending “singularity,” when artificial intelligence surpasses human intelligence. Entrepreneur Elon Musk cautions that artificial intelligence poses one of our “biggest existential threats.” Renowned physicist Stephen Hawking eerily forecasts that artificial intelligence might even “spell the end of the human race.”

Are we ready for a world of regulation by robot? Such a world is closer than we think—and it could actually be worth welcoming.

Already government agencies rely on machine learning for a variety of routine functions. The Postal Service uses learning algorithms to sort mail, and cities such as Los Angeles use them to time their traffic lights. But while uses like these seem relatively benign, consider that machine learning could also be used to make more consequential decisions. Disability claims might one day be processed automatically with the aid of artificial intelligence. Licenses could be awarded to airplane pilots based on what kinds of safety risks complex algorithms predict each applicant poses.

Learning algorithms are already being explored by the Environmental Protection Agency to help make regulatory decisions about what toxic chemicals to control. Faced with tens of thousands of new chemicals that could potentially be harmful to human health, federal regulators have supported the development of a program to prioritize which of the many chemicals in production should undergo more in-depth testing. By some estimates, machine learning could save the EPA up to $980,000 per toxic chemical positively identified.
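
A prioritization program of this kind typically amounts to training a classifier on chemicals that have already been through testing, then ranking the untested ones by predicted risk. The sketch below uses synthetic data and a generic scikit-learn model purely to illustrate the idea; it is not the EPA’s actual approach.

```python
# Illustrative sketch of prioritization by predicted risk, on synthetic data.
# Assumes a feature matrix of chemical descriptors and labels from chemicals
# that have already been through in-depth testing.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_tested = rng.normal(size=(500, 20))       # descriptors for tested chemicals
y_tested = rng.integers(0, 2, size=500)     # 1 = found harmful, 0 = not
X_untested = rng.normal(size=(10_000, 20))  # chemicals awaiting review

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_tested, y_tested)

# Rank untested chemicals by predicted probability of harm; the top of the
# list is what would be sent for in-depth testing first.
risk = model.predict_proba(X_untested)[:, 1]
priority_order = np.argsort(risk)[::-1]
print(priority_order[:10], risk[priority_order[:10]])
```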

It’s not hard then to imagine a day in which even more regulatory decisions are automated. Researchers have shown that machine learning can lead to better outcomes when determining whether parolees ought to be released or domestic violence orders should be imposed. Could the imposition of regulatory fines one day be determined by a computer instead of a human inspector or judge? Quite possibly so, and this would be a good thing if machine learning could improve accuracy, eliminate bias and prejudice, and reduce human error, all while saving money.

But can we trust a government that bungled the initial rollout of Healthcare.gov to deploy artificial intelligence responsibly? In some circumstances we should….(More)”

Big Risks, Big Opportunities: the Intersection of Big Data and Civil Rights


Latest White House report on Big Data charts pathways for fairness and opportunity but also cautions against re-encoding bias and discrimination into algorithmic systems: “Advertisements tailored to reflect previous purchasing decisions; targeted job postings based on your degree and social networks; reams of data informing predictions around college admissions and financial aid. Need a loan? There’s an app for that.

As technology advances and our economic, social, and civic lives become increasingly digital, we are faced with ethical questions of great consequence. Big data and associated technologies create enormous new opportunities to revisit assumptions and instead make data-driven decisions. Properly harnessed, big data can be a tool for overcoming longstanding bias and rooting out discrimination.

The era of big data is also full of risk. The algorithmic systems that turn data into information are not infallible—they rely on the imperfect inputs, logic, probability, and people who design them. Predictors of success can become barriers to entry; careful marketing can be rooted in stereotype. Without deliberate care, these innovations can easily hardwire discrimination, reinforce bias, and mask opportunity.

Because technological innovation presents both great opportunity and great risk, the White House has released several reports on “big data” intended to prompt conversation and advance these important issues. The topics of previous reports on data analytics included privacy, prices in the marketplace, and consumer protection laws. Today, we are announcing the latest report on big data, one centered on algorithmic systems, opportunity, and civil rights.

The first big data report warned of “the potential of encoding discrimination in automated decisions”—that is, discrimination may “be the inadvertent outcome of the way big data technologies are structured and used.” A commitment to understanding these risks and harnessing technology for good prompted us to specifically examine the intersection between big data and civil rights.

Using case studies on credit lending, employment, higher education, and criminal justice, the report we are releasing today illustrates how big data techniques can be used to detect bias and prevent discrimination. It also demonstrates the risks involved, particularly how technologies can deliberately or inadvertently perpetuate, exacerbate, or mask discrimination.

The purpose of the report is not to offer remedies to the issues it raises, but rather to identify these issues and prompt conversation, research—and action—among technologists, academics, policy makers, and citizens, alike.

The report includes a number of recommendations for advancing work in this nascent field of data and ethics. These include investing in research, broadening and diversifying technical leadership, cross-training, and expanded literacy on data discrimination, bolstering accountability, and creating standards for use within both the government and the private sector. It also calls on computer and data science programs and professionals to promote fairness and opportunity as part of an overall commitment to the responsible and ethical use of data.

Big data is here to stay; the question is how it will be used: to advance civil rights and opportunity, or to undermine them….(More)”

Citizen scientists aid Ecuador earthquake relief


Mark Zastrow at Nature: “After a magnitude-7.8 earthquake struck Ecuador’s Pacific coast on 16 April, a new ally joined the international relief effort: a citizen-science network called Zooniverse.

On 25 April, Zooniverse launched a website that asks volunteers to analyse rapidly snapped satellite imagery of the disaster, which led to more than 650 reported deaths and 16,000 injuries. The aim is to help relief workers on the ground to find the most heavily damaged regions and identify which roads are passable.

Several crisis-mapping programmes with thousands of volunteers already exist — but it can take days to train satellites on the damaged region and to transmit data to humanitarian organizations, and results have not always proven useful. The Ecuador quake marked the first live public test for an effort dubbed the Planetary Response Network (PRN), which promises to be both more nimble than previous efforts, and to use more rigorous machine-learning algorithms to evaluate the quality of crowd-sourced analyses.

The network relies on imagery from the satellite company Planet Labs in San Francisco, California, which uses an array of shoebox-sized satellites to map the planet. In order to speed up the crowd-sourced process, it uses the Zooniverse platform to distribute the tasks of spotting features in satellite images. Machine-learning algorithms employed by a team at the University of Oxford, UK, then classify the reliability of each volunteer’s analysis and weight their contributions accordingly.
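
One way to picture “weighting contributions by reliability” is a weighted vote, where each volunteer’s label counts in proportion to an estimated accuracy. The toy sketch below assumes those reliability scores are already known; in practice they would be learned, for example from gold-standard images or agreement among volunteers, and the Oxford team’s actual method is certainly more sophisticated.

```python
# Toy sketch of reliability-weighted aggregation of volunteer labels.
# Reliability scores here are given; in a real system they would be learned.
from collections import defaultdict

reliability = {"vol_a": 0.9, "vol_b": 0.6, "vol_c": 0.55}

# (volunteer, image, label) triples, e.g. "damaged" vs "intact"
classifications = [
    ("vol_a", "img_1", "damaged"),
    ("vol_b", "img_1", "intact"),
    ("vol_c", "img_1", "damaged"),
]

def aggregate(classifications, reliability):
    scores = defaultdict(lambda: defaultdict(float))
    for volunteer, image, label in classifications:
        # Unknown volunteers get a neutral default weight.
        scores[image][label] += reliability.get(volunteer, 0.5)
    # Pick the label with the highest total weight for each image.
    return {img: max(labels, key=labels.get) for img, labels in scores.items()}

print(aggregate(classifications, reliability))  # {'img_1': 'damaged'}
```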

Rapid-fire data

Within 2 hours of the Ecuador test project going live with a first set of 1,300 images, each photo had been checked at least 20 times. “It was one of the fastest responses I’ve seen,” says Brooke Simmons, an astronomer at the University of California, San Diego, who leads the image processing. Steven Reece, who heads the Oxford team’s machine-learning effort, says that results — a “heat map” of damage with possible road blockages — were ready in another two hours.

In all, more than 2,800 Zooniverse users contributed to analysing roughly 25,000 square kilometres of imagery centred around the coastal cities of Pedernales and Bahia de Caraquez. That is where the London-based relief organization Rescue Global — which requested the analysis the day after the earthquake — currently has relief teams on the ground, including search dogs and medical units….(More)”

Using Data to Help People in Distress Get Help Faster


Nicole Wallace in The Chronicle of Philanthropy: “Answering text messages to a crisis hotline is different from handling customer-service calls: You don’t want counselors to answer folks in the order their messages were received. You want them to take the people in greatest distress first.

Crisis Text Line, a charity that provides counseling by text message, uses sophisticated data analysis to predict how serious the conversations are likely to be and ranks them by severity. Using an algorithm to automate triage ensures that people in crisis get help fast — with an unexpected side benefit for other texters contacting the hotline: shorter wait times.

When the nonprofit started in 2013, deciding which messages to take first was much more old-school. Counselors had to read all the messages in the queue and make a gut-level decision on which person was most in need of help.

“It was slow,” says Bob Filbin, the organization’s chief data scientist.

To solve the problem, Mr. Filbin and his colleagues used past messages to the hotline to create an algorithm that analyzes the language used in incoming messages and ranks them in order of predicted severity.
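
In outline, that kind of triage is a text-classification problem: train a model on past messages labeled by how the conversation turned out, score each incoming text, and serve the highest-scoring texter first. The sketch below shows the shape of such a pipeline on toy data; it is not Crisis Text Line’s actual model.

```python
# Toy sketch of severity-ranked triage; training data and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

past_messages = [
    "i dont want to be here anymore",   # conversation later involved self-harm risk
    "so stressed about my exams",
    "had a huge fight with my mom",
]
severe = [1, 0, 0]  # 1 = turned out to be a high-severity conversation

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(past_messages, severe)

incoming = ["nobody would miss me", "my roommate keeps eating my food"]
scores = model.predict_proba(incoming)[:, 1]

# Serve the highest predicted-severity texter first.
queue = sorted(zip(incoming, scores), key=lambda pair: pair[1], reverse=True)
print(queue)
```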

And it’s working. Since the algorithm went live on the platform, messages it marked as severe — code orange — led to conversations that were six times more likely to include thoughts of suicide or self-harm than exchanges started by other texts that weren’t marked code orange, and nine times more likely to have resulted in the counselor contacting emergency services to intervene in a suicide attempt.

Counselors don’t even see the queue of waiting texts anymore. They just click a button marked “Help Another Texter,” and the system connects them to the person whose message has been marked most urgent….(More)”

E-Government Strategy, ICT and Innovation for Citizen Engagement


Brief by Dennis Anderson, Robert Wu, Dr. June-Suh Cho, and Katja Schroeder: “This book discusses three levels of e-government and national strategies to reach a citizen-centric participatory e-government, and examines how disruptive technologies help shape the future of e-government. The authors examine how e-government can facilitate a symbiotic relationship between the government and its citizens. ICTs aid this relationship and promote transparencies so that citizens can place greater trust in the activities of their government. If a government can manage resources more effectively by better understanding the needs of its citizens, it can create a sustainable environment for citizens. Having a national strategy on ICT in government and e-government can significantly reduce government waste, corruption, and inefficiency. Businesses, CIOs and CTOs in the public sector interested in meeting sustainability requirements will find this book useful. …(More)”

Foundation Openness: A Critical Component of Foundation Effectiveness


Lindsay Louie at PhilanthroFiles: “We created the Fund for Shared Insight—a funder collaborative with diverse support from 30 different funders—to increase foundation openness. We believe that if foundations are more open—which we define as how they share about their goals and strategies; make decisions and measure progress; listen and engage in dialogue with others; act on what they hear; and share what they themselves have learned—they will be more effective.

We were so pleased to support Exponent Philanthropy’s video series featuring philanthropists being more open about their work: Philanthropy Lessons. To date, Exponent Philanthropy has released 5 of the 9 videos in the series.

Future video releases include:

  • Who Knows More? (expected 4/27/16)
  • Being Transparent (expected 4/27/16)
  • Value Beyond Dollars (expected 5/25/16)
  • Getting Out of the Office (expected 6/22/16)

We would love to see many more foundations make videos like these; engage in conversation with each other about these philanthropy lessons online and in person; share their experiences live at regional grantmaker association meetings or national conferences like those Exponent Philanthropy hosts; and find other ways to be more open.

Why is this so important?

Recent research from the Center for Effective Philanthropy (report on CEP’s website here; full disclosure: we funded this research) found that foundation CEOs see grantees, nonprofits that are considering applying for a grant, and other foundations working on similar issues as the top three audiences who benefit from a foundation being open about its work. Further, 86% of foundation CEOs who responded to the survey said they believe transparency is necessary for building strong relationships with grantees.

It was great to learn from this research that many foundations are open about their criteria for nonprofits seeking funding, their programmatic goals, and their strategies; and share about who makes decisions about the grantee selection process. Yet the research also found that foundations are not as open about sharing what they are achieving, how they assess their work, and their experiences with what has and hasn’t worked—and that foundation CEOs believe it would be beneficial for foundations to share more in these specific areas….(More)”

What Should We Do About Big Data Leaks?


Paul Ford at the New Republic: “I have a great fondness for government data, and the government has a great fondness for making more of it. Federal elections financial data, for example, with every contribution identified, connected to a name and address. Or the results of the census. I don’t know if you’ve ever had the experience of downloading census data but it’s pretty exciting. You can hold America on your hard drive! Meditate on the miracles of zip codes, the way the country is held together and addressable by arbitrary sets of digits.

You can download whole books, in PDF format, about the foreign policy of the Reagan Administration as it related to Russia. Negotiations over which door the Soviet ambassador would use to enter a building. Gigabytes and gigabytes of pure joy for the ephemeralist. The government is the greatest creator of ephemera ever.

Consider the Financial Crisis Inquiry Commission, or FCIC, created in 2009 to figure out exactly how the global economic pooch was screwed. The FCIC has made so much data, and has done an admirable job (caveats noted below) of arranging it. So much stuff. There are reams of treasure on a single FCIC web site, hosted at Stanford Law School: Hundreds of MP3 files, for example, with interviews with Jamie Dimon of JPMorgan Chase and Lloyd Blankfein of Goldman Sachs. I am desperate to find time to write some code that automatically extracts random audio snippets from each and puts them on top of a slow ambient drone with plenty of reverb, so that I can relax to the dulcet tones of the financial industry explaining away its failings. (There’s a Paul Krugman interview that I assume is more critical.)

The recordings are just the beginning. They’ve released so many documents, and with the documents, a finding aid that you can download in handy PDF format, which will tell you where to, well, find things, pointing to thousands of documents. That aid alone is 1,439 pages.

Look, it is excellent that this exists, in public, on the web. But it also presents a very contemporary problem: What is transparency in the age of massive database drops? The data is available, but locked in MP3s and PDFs and other documents; it’s not searchable in the way a web page is searchable, not easy to comment on or share.
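
The gap here is between data that exists and data you can query. Once the text has been pulled out of the PDFs (assumed below to already sit in plain-text files in a hypothetical folder), even a tiny inverted index makes a dump searchable.

```python
# Toy sketch: index a folder of already-extracted plain-text documents and
# run simple AND queries. The folder name is hypothetical.
import os
import re
from collections import defaultdict

def build_index(folder):
    index = defaultdict(set)  # word -> set of filenames containing it
    for name in os.listdir(folder):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(folder, name), encoding="utf-8") as f:
            for word in re.findall(r"[a-z']+", f.read().lower()):
                index[word].add(name)
    return index

def search(index, query):
    hits = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*hits) if hits else set()

index = build_index("fcic_extracted_text")  # hypothetical folder
print(search(index, "collateralized debt obligations"))
```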

Consider the WikiLeaks release of State Department cables. They were exhausting, there were so many of them, they were in all caps. Or the trove of data Edward Snowden gathered on a USB drive, or Chelsea Manning on CD. And the Ashley Madison leak, spread across database files and logs of credit card receipts. The massive and sprawling Sony leak, complete with whole email inboxes. And with the just-released Panama Papers, we see two exciting new developments: First, the consortium of media organizations that managed the leak actually came together and collectively, well, branded the papers, down to a hashtag (#panamapapers), informational website, etc. Second, the size of the leak itself—2.5 terabytes!—became a talking point, even though an exact description of what was contained within those terabytes was harder to come by. This, said the consortium of journalists that notably did not include The New York Times, The Washington Post, etc., is the big one. Stay tuned. And we are. But the fact remains: These artifacts are not accessible to any but the most assiduous amateur conspiracist; they’re the domain of professionals with the time and money to deal with them. Who else could be bothered?

If you watched the movie Spotlight, you saw journalists at work, pawing through reams of documents, going through, essentially, phone books. I am an inveterate downloader of such things. I love what they represent. And I’m also comfortable with many-gigabyte corpora spread across web sites. I know how to fetch data, how to consolidate it, and how to search it. I share this skill set with many data journalists, and these capacities have, in some ways, become the sole province of the media. Organs of journalism are among the only remaining cultural institutions that can fund investigations of this size and tease the data apart, identifying linkages and thus constructing informational webs that can, with great effort, be turned into narratives, yielding something like what we call “a story” or “the truth.” 

Spotlight was set around 2001, and it features a lot of people looking at things on paper. The problem has changed greatly since then: The data is everywhere. The media has been forced into a new cultural role, that of the arbiter of the giant and semi-legal database. ProPublica, a nonprofit that does a great deal of data gathering and data journalism and then shares its findings with other media outlets, is one example; it funded a project called DocumentCloud with other media organizations that simplifies the process of searching through giant piles of PDFs (e.g., court records, or the results of Freedom of Information Act requests).

At some level the sheer boredom and drudgery of managing these large data leaks make them immune to casual interest; even the Ashley Madison leak, which I downloaded, was basically an opaque pile of data and really quite boring unless you had some motive to poke around.

If this is the age of the citizen journalist, or at least the citizen opinion columnist, it’s also the age of the data journalist, with the news media acting as product managers of data leaks, making the information usable, browsable, attractive. There is an uneasy partnership between leakers and the media, just as there is an uneasy partnership between the press and the government, which would like some credit for its efforts, thank you very much, and wouldn’t mind if you gave it some points for transparency while you’re at it.

Pause for a second. There’s a glut of data, but most of it comes to us in ugly formats. What would happen if the things released in the interest of transparency were released in actual transparent formats?…(More)”