Code for America: Announcing the 2013 Accelerator Class


Press Release: “Code for America opened applications for the 2013 Accelerator knowing that the competition would be fierce. This year we received over 190 applications from amazing candidates. Today, we’re pleased to announce the five teams chosen to participate in the 2013 Accelerator.

The teams are articulate, knowledgeable, and passionate about their businesses. They come from all over the country — Texas, North Carolina, Florida, and California — and we’re excited to get started with them. Teams include:

ArchiveSocial enables organizations to embrace social media by minimizing risk and eliminating compliance barriers. Specifically, it solves the challenge of retaining Gov 2.0 communications for compliance with FOIA and other public records laws. It currently automates business-grade record keeping of communications on networks such as Facebook, Twitter, and YouTube. Moving forward, ArchiveSocial will help further enforce social media policy and protect the organizational brand.

The Family Assessment Form (FAF) Web is a tool designed by social workers, researchers, and technology experts to help family support practitioners improve family functioning, service planning for families, and organizational performance. The FAF is ideal for use in organizations performing home visitation services for families that address comprehensive concerns about family well-being and child welfare. FAF Web enables all stakeholders to access essential data remotely from any internet-enabled device.

OpenCounter helps entrepreneurs to register their businesses with the local government. It does so through an online check-out experience that adapts to the applicant’s answers and asks for pertinent information only once. OpenCounter estimates licensing time and costs so entrepreneurs can understand what it will take to get their business off the ground. It’s the TurboTax of business permitting.

SmartProcure is an online information service that provides access to local, state, and federal government procurement data, with two public-interest goals: 1. Enable government agencies to make more efficient procurement decisions and save taxpayer dollars. 2. Empower businesses to sell more effectively and competitively to government agencies. The proprietary system provides access to data from more than 50 million purchase orders issued by 1,700 government agencies.

StreetCred Software helps police agencies manage their arrest warrants, eliminate warrant backlogs, and radically improve efficiency while increasing officer safety. It helps agencies understand their fugitive population, measure effectiveness, and make improvements. StreetCred Software, Inc., was founded by two Texas police officers. One is an 18-year veteran investigator and fugitive hunter, the other a technology industry veteran who became a cop in 2010.”

Data Science for Social Good


Data Science for Social Good: “By analyzing data from police reports to website clicks to sensor signals, governments are starting to spot problems in real-time and design programs to maximize impact. More nonprofits are measuring whether or not they’re helping people, and experimenting to find interventions that work.
None of this is inevitable, however.
We’re just realizing the potential of using data for social impact and face several hurdles to its widespread adoption:

  • Most governments and nonprofits simply don’t know what’s possible yet. They have data – but often not enough and maybe not the right kind.
  • There are too few data scientists out there – and too many spending their days optimizing ads instead of bettering lives.

To make an impact, we need to show social good organizations the power of data and analytics. We need to work on analytics projects that have high social impact. And we need to expose data scientists to the problems that really matter.

The fellowship

That’s exactly why we’re doing the Eric and Wendy Schmidt Data Science for Social Good summer fellowship at the University of Chicago.
We want to bring three dozen aspiring data scientists to Chicago, and have them work on data science projects with social impact.
Working closely with governments and nonprofits, fellows will take on real-world problems in education, health, energy, transportation, and more.
Over the next three months, they’ll apply their coding, machine learning, and quantitative skills, collaborate in a fast-paced atmosphere, and learn from mentors in industry, academia, and the Obama campaign.
The program is led by a strong interdisciplinary team from the Computation Institute and the Harris School of Public Policy at the University of Chicago.”

Information Consumerism – The Price of Hypocrisy


Evgeny Morozov in Frankfurter Allgemeine: “What we need is a sharper, starker picture of the information apocalypse that awaits us in a world where personal data is traded like coffee or any other commodity. Take the oft-repeated argument about the benefits of trading one’s data in exchange for some tangible commercial benefit. Say, for example, you install a sensor in your car to prove to your insurance company that you are driving much safer than the average driver that figures in their model for pricing insurance policies. Great: if you are better than the average, you get to pay less. But the problem with averages is that half of the population is always worse than the benchmark. Inevitably – regardless of whether they want to monitor themselves or not – that other half will be forced to pay more, for as the more successful of us take on self-tracking, most social institutions would (quite logically) assume that those who refuse to self-track have something to hide. Under this model, the implications of my decision to trade my personal data are no longer solely in the realm of markets and economics – they are also in the realm of ethics. If my decision to share my personal data for a quick buck makes someone else worse off and deprives them of opportunities, then I have an extra ethical factor to consider – economics alone doesn’t suffice.
All of this is to say that there are profound political and moral consequences to information consumerism – and they are comparable to energy consumerism in scope and importance. Making these consequences more pronounced and vivid is where intellectuals and political parties ought to focus their efforts. We should do our best to suspend the seeming economic normalcy of information sharing. An attitude of “just business!” will no longer suffice. Information sharing might have a vibrant market around it but it has no ethical framework to back it up. More than three decades ago, Michel Foucault was prescient to see that neoliberalism would turn us all into “entrepreneurs of the self” but let’s not forget that entrepreneurship is not without its downsides: like most economic activities, it can generate negative externalities, from pollution to noise. Entrepreneurship focused on information sharing is no exception….”

5 Big Data Projects That Could Impact Your Life


Mashable: “We reached out to a few organizations using information, both hand- and algorithm-collected, to create helpful tools for their communities. This is only a small sample of what’s out there — plenty more pop up each day, and as more information becomes public, the trend will only grow….
1. Transit Time NYC
Transit Time NYC, an interactive map developed by WNYC, lets New Yorkers click a spot in any of the city’s five boroughs for an estimate of subway or train travel times. To create it, WNYC lead developer Steve Melendez broke the city into 2,930 hexagons, then pulled data from open source itinerary platform OpenTripPlanner — the Wikipedia of mapping software — and coupled it with the MTA’s publicly downloadable subway schedule….
2. Twitter’s ‘Topography of Tweets’
In a blog post, Twitter unveiled a new data visualization map that displays billions of geotagged tweets in a 3D landscape format. The purpose is to display, topographically, which parts of certain cities most people are tweeting from…
3. Homicide Watch D.C.
Homicide Watch D.C. is a community-driven data site that aims to cover every murder in the District of Columbia. It’s sorted by “suspect” and “victim” profiles, where it breaks down each person’s name, age, gender and race, as well as original articles reported by Homicide Watch staff…
4. Falling Fruit
Can you find a hidden apple tree along your daily bike commute? Falling Fruit can.
The website highlights overlooked or hidden edibles in urban areas across the world. By collecting public information from the U.S. Department of Agriculture, municipal tree inventories, foraging maps and street tree databases, the site has created a network of 615 types of edibles in more than 570,000 locations. The purpose is to remind urban dwellers that agriculture does exist within city boundaries — it’s just more difficult to find….
5. AIDSvu
AIDSVu is an interactive map that illustrates the prevalence of HIV in the United States. The data is pulled from the U.S. Centers for Disease Control and Prevention’s national HIV surveillance reports, which are collected at both state and county levels each year…”
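The hexagon-grid travel-time approach behind Transit Time NYC (item 1 above) can be sketched in simplified form. Everything here is illustrative: the grid layout, the 500 m/min speed constant, and the straight-line `travel_minutes` function are stand-ins for the real queries WNYC makes to OpenTripPlanner against the MTA schedule.

```python
import math

def hex_centroids(rows, cols, size):
    """Centroids of a pointy-top hexagonal grid, offset-row layout.

    The real map used 2,930 hexagons covering the five boroughs;
    this toy grid just demonstrates the tiling arithmetic.
    """
    cells = []
    for r in range(rows):
        for c in range(cols):
            x = size * math.sqrt(3) * (c + 0.5 * (r % 2))
            y = size * 1.5 * r
            cells.append((x, y))
    return cells

def travel_minutes(origin, dest, speed_m_per_min=500.0):
    """Placeholder for a transit-routing query (e.g. to OpenTripPlanner)."""
    return math.dist(origin, dest) / speed_m_per_min

def travel_time_map(origin, cells):
    """Estimated minutes from a clicked origin to every hexagon centroid."""
    return {cell: travel_minutes(origin, cell) for cell in cells}

cells = hex_centroids(rows=4, cols=4, size=800)   # 16 hexes, not 2,930
times = travel_time_map(origin=cells[0], cells=cells)
```

In the real system, each centroid-to-centroid estimate would come from a scheduled-itinerary query rather than a straight-line speed model, but the grid-then-query structure is the same.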

Metadata Liberation Movement


Holman Jenkins in the Wall Street Journal: “The biggest problem, then, with metadata surveillance may simply be that the wrong agencies are in charge of it. One particular reason why this matters is that the potential of metadata surveillance might actually be quite large but is being squandered by secret agencies whose narrow interest is only looking for terrorists….
“Big data” is only as good as the algorithms used to find out things worth finding out. The efficacy and refinement of big-data techniques are advanced by repetition, by giving more chances to find something worth knowing. Bringing metadata out of its black box wouldn’t only be a way to improve public trust in what government is doing. It would be a way to get more real value for society out of techniques that are being squandered on a fairly minor threat.
Bringing metadata out of the black box would open up new worlds of possibility—from anticipating traffic jams to locating missing persons after a disaster. It would also create an opportunity to make big data more consistent with the constitutional prohibition of unwarranted search and seizure. In the first instance, with the computer withholding identifying details of the individuals involved, any red flag could be examined by a law-enforcement officer to see, based on accumulated experience, whether the indication is of interest.
If so, a warrant could be obtained to expose the identities involved. If not, the record could immediately be expunged. All this could take place in a reasonably aboveboard, legal fashion, open to inspection in court when and if charges are brought or—this would be a good idea—a court is informed of investigations that led to no action.
Our guess is that big data techniques would pop up way too many false positives at first, and only considerable learning and practice would allow such techniques to become a useful tool. At the same time, bringing metadata surveillance out of the shadows would help the Googles, Verizons and Facebooks defend themselves from a wholly unwarranted suspicion that user privacy is somehow better protected by French or British or (heavens) Chinese companies from their own governments than U.S. data is from the U.S. government.
Most of all, it would allow these techniques to be put to work on solving problems that are actual problems for most Americans, which terrorism isn’t.”
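The withhold-then-warrant workflow Jenkins outlines could look roughly like the following sketch. Everything here is hypothetical: the field names, the 100-calls-per-day red flag, and the token vault are invented for illustration, not drawn from any real system.

```python
def pseudonymize(records):
    """Replace identities with opaque tokens; keep the lookup table aside.

    Analysts see only the masked records; the vault maps tokens back to
    identities and is consulted only under a warrant.
    """
    vault = {}
    masked = []
    for i, rec in enumerate(records):
        token = f"subject-{i}"
        vault[token] = rec["identity"]
        masked.append({"token": token, "calls_per_day": rec["calls_per_day"]})
    return masked, vault

def review(masked, vault, warrant_granted):
    """Flag anomalies; unmask with a warrant, otherwise expunge the record."""
    exposed = []
    for rec in masked:
        if rec["calls_per_day"] > 100:           # illustrative red flag
            if warrant_granted(rec["token"]):
                exposed.append(vault[rec["token"]])
            else:
                vault.pop(rec["token"], None)    # expunge the identity
    return exposed

records = [
    {"identity": "Alice", "calls_per_day": 150},
    {"identity": "Bob", "calls_per_day": 3},
]
masked, vault = pseudonymize(records)
flagged = review(masked, vault, warrant_granted=lambda token: True)
```

The design point is the separation: the analytic pass never touches identities, and the unmasking step is a distinct, auditable action that a court can review.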

Predictive Policing: Don’t even think about it


The Economist: “PredPol is one of a range of tools using better data, more finely crunched, to predict crime. They seem to promise better law enforcement. But they also bring worries about privacy, and of justice systems run by machines, not people.
Criminal offences, like infectious disease, form patterns in time and space. A burglary in a placid neighbourhood represents a heightened risk to surrounding properties; the threat shrinks swiftly if no further offences take place. These patterns have spawned a handful of predictive products which seem to offer real insight. During a four-month trial in Kent, 8.5% of all street crime occurred within PredPol’s pink boxes, with plenty more next door to them; predictions from police analysts scored only 5%. An earlier trial in Los Angeles saw the machine score 6% compared with human analysts’ 3%.
Intelligent policing can convert these modest gains into significant reductions in crime…
Predicting and forestalling crime does not solve its root causes. Positioning police in hotspots discourages opportunistic wrongdoing, but may encourage other criminals to move to less likely areas. And while data-crunching may make it easier to identify high-risk offenders—about half of American states use some form of statistical analysis to decide when to parole prisoners—there is little that it can do to change their motivation.
Misuse and overuse of data can amplify biases…But mathematical models might make policing more equitable by curbing prejudice.”
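The trial figures quoted above (8.5% for PredPol versus 5% for analysts in Kent) are hit rates: the share of recorded street crimes that fell inside the predicted boxes. A minimal sketch of that computation, with invented coordinates and box sizes:

```python
def hit_rate(crimes, boxes):
    """Fraction of crime locations falling inside any predicted box.

    crimes: list of (x, y) points; boxes: list of (xmin, ymin, xmax, ymax).
    """
    def inside(pt, box):
        x, y = pt
        xmin, ymin, xmax, ymax = box
        return xmin <= x <= xmax and ymin <= y <= ymax

    if not crimes:
        return 0.0
    hits = sum(1 for pt in crimes if any(inside(pt, b) for b in boxes))
    return hits / len(crimes)

# Illustrative data only: four crimes, one predicted box catching two.
crimes = [(1, 1), (5, 5), (9, 9), (2, 2)]
machine_boxes = [(0, 0, 3, 3)]
print(hit_rate(crimes, machine_boxes))  # prints 0.5
```

Comparing two methods fairly also requires holding the total predicted area constant; a larger box trivially inflates the hit rate, which is why the trials compared boxes of like size.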

Index: Participation and Civic Engagement


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on participation and civic engagement and was originally published in 2013.

  • Percent turnout of voting age population in 2012 U.S. Presidential election: 57.5
  • Percent turnout in 2008, 2004, 2000 elections: 62.3, 60.4, 54.2
  • Change in voting rate in U.S. from 1980 to most recent election: –29
  • Change in voting rate in Slovak Republic from 1980 to most recent election: –42, the lowest rate among democratic countries surveyed
  • Change in voting rate in Russian Federation from 1980 to most recent election: +14, the highest rate among democratic countries surveyed
  • Percent turnout in Australia as of 2011: 95, the highest rate among democratic countries surveyed
  • Percentage point difference in voting rates between high and low educated people in Australia as of 2011: 1
  • Percentage point difference in voting rates between high and low educated people in the U.S. as of 2011:  33
  • Number of Black and Hispanic U.S. voters in comparison to 2008 election: 1.7 million and 1.4 million increase
  • Number of non-Hispanic White U.S. voters in comparison to 2008 election: 2 million decrease, the only example of a race group showing a decrease in net voting from one presidential election to the next
  • Percent of Americans that contact their elected officials between elections: 10
  • Margin of victory in May 2013 Los Angeles mayoral election: 54-46
  • Percent turnout among Los Angeles citizens in May 2013 Los Angeles mayoral election: 19
  • Percent of U.S. adults that used social networking sites in 2012: 60
  • How many of which participated in a political or civic activity online: 2/3
  • Percent of U.S. social media users in 2012 that used social tools to encourage other people to take action on an issue that is important to them: 31
  • Percent of U.S. adults that belonged to a group on a social networking site involved in advancing a political or social issue in 2012: 12
  • Increase in the number of adults who took part in these behaviors in 2008: four-fold
  • Number of U.S. adults that signed up to receive alerts about local issues via email or text messaging in 2010: 1 in 5
  • Percent of U.S. adults that used digital tools to talk to their neighbors and keep informed about community issues in 2010: 20
  • Number of Americans that talked face-to-face with neighbors about community issues in 2010: almost half
  • How many online adults have used such social tools as blogs, social networking sites, and online video as well as email and text alerts to keep informed about government activities: 1/3
  • Percent of U.S. adult internet users that have gone online for raw data about government spending and activities in 2010: 40
  • Of which how many look online to see how federal stimulus money is being spent: 1 in 5
  • Read or download the text of legislation: 22%
  • How many Americans volunteered through or for an organization at least once between September 2011 and September 2012: 64.5 million
  • Median hours spent on volunteer activities during this time: 50
  • Change in volunteer rate compared to the year before: 0.3 decline

Sources

9 models to scale open data – past, present and future


Open Knowledge Foundation Blog: “The possibilities of open data have been enthralling us for 10 years… But that excitement isn’t what matters in the end. What matters is scale – which organisational structures will make this movement explode? This post quickly and provocatively goes through some that haven’t worked (yet!) and some that have.
Ones that are working now
1) Form a community to enter in new data. Open Street Map and MusicBrainz are two big examples. It works as the community is the originator of the data. That said, neither has dominated its industry as much as I thought they would have by now.
2) Sell tools to an upstream generator of open data. This is what CKAN does for central governments (and the new ScraperWiki CKAN tool helps with). It’s what mySociety does, when selling FixMyStreet installs to local councils, thereby publishing their potholes as RSS feeds.
3) Use open data (quietly). Every organisation does this and never talks about it. It’s key to quite old data resellers like Bloomberg. It is what most of ScraperWiki’s professional services customers ask us to do. The value to society is enormous and invisible. The big flaw is that it doesn’t help scale supply of open data.
4) Sell tools to downstream users. This isn’t necessarily open data specific – existing software like spreadsheets and Business Intelligence can be used with open or closed data. Lots of open data is on the web, so tools like the new ScraperWiki which work well with web data are particularly suited to it.
Ones that haven’t worked
5) Collaborative curation. ScraperWiki started as an audacious attempt to create an open data curation community, based on editing scraping code in a wiki. In its original form (now called ScraperWiki Classic) this didn’t scale. …With a few exceptions, notably OpenCorporates, there aren’t yet open data curation projects.
6) General purpose data marketplaces, particularly ones that are mainly reusing open data, haven’t taken off. They might do one day, however I think they need well-adopted higher level standards for data formatting and syncing first (perhaps something like dat, perhaps something based on CSV files).
Ones I expect more of in the future
These are quite exciting models which I expect to see a lot more of.
7) Give labour/money to upstream to help them create better data. This is quite new. The only, and most excellent, example of it is the UK’s National Archives curating the Statute Law Database. They do the work with the help of staff seconded from commercial legal publishers and other parts of Government.
It’s clever because it generates money for upstream, which people trust the most, and which has the most ability to improve data quality.
8) Viral open data licensing. MySQL made lots of money this way, offering proprietary dual licenses of GPLd software to embedded systems makers. In data this could use OKFN’s Open Database License, and organisations would pay when they wanted to mix the open data with their own closed data. I don’t know anyone actively using it, although Chris Taggart from OpenCorporates mentioned this model to me years ago.
9) Corporations release data for strategic advantage. Companies are starting to release their own data for strategic gain. This is very new. Expect more of it.”

Understanding Smart Data Disclosure Policy Success: The Case of Green Button


New Paper by Djoko Sigit Sayogo and Theresa Pardo: “Open data policies are expected to promote innovations that stimulate social, political and economic change. In pursuit of innovation potential, open data has expanded to a wider environment involving government, business and citizens. The US government recently launched such collaboration through a smart data policy supporting energy efficiency called Green Button. This paper explores the implementation of Green Button and identifies motivations and success factors facilitating successful collaboration between public and private organizations to support smart disclosure policy. Analyzing qualitative data from semi-structured interviews with experts involved in Green Button initiation and implementation, this paper presents some key findings. The success of Green Button can be attributed to the interaction between internal and external factors. The external factors consist of both market and non-market drivers: economic factors, technology related factors, regulatory contexts and policy incentives, and some factors that stimulate imitative behavior among the adopters. The external factors create the necessary institutional environment for the Green Button implementation. On the other hand, the acceptance and adoption of Green Button itself is influenced by the fit of Green Button capability to the strategic mission of energy and utility companies in providing energy efficiency programs. We also identify the different roles of government during the different stages of Green Button implementation.”
[Recipient of Best Management/Policy Paper Award, dgo2013]