New research project to map the impact of open budget data


Jonathan Gray at Open Knowledge: “…a new research project to examine the impact of open budget data, undertaken as a collaboration between Open Knowledge and the Digital Methods Initiative at the University of Amsterdam, supported by the Global Initiative for Financial Transparency (GIFT).

The project will include an empirical mapping of who is active around open budget data worldwide, and what the main issues, opportunities and challenges are according to different actors. On the basis of this mapping it will provide a review of the various definitions and conceptions of open budget data, arguments for why it matters, best practices for publication and engagement, as well as applications and outcomes in different countries around the world.

As well as drawing on Open Knowledge’s extensive experience and expertise around open budget data (through projects such as Open Spending), it will utilise innovative tools and methods developed at the University of Amsterdam to harness evidence from the web, social media and collections of documents to inform and enrich our analysis.

As part of this project we’re launching a collaborative bibliography of existing research and literature on open budget data and associated topics which we hope will become a useful resource for other organisations, advocates, policy-makers, and researchers working in this area. If you have suggestions for items to add, please do get in touch.

This project follows on from other research projects we’ve conducted around this area – including on data standards for fiscal transparency, on technology for transparent and accountable public finance, and on mapping the open spending community….(More)”

CrowdFlower Launches Open Data Project


Anthony Ha at Techcrunch: “Crowdsourcing company CrowdFlower allows businesses to tap into a distributed workforce of 5 million contributors for basic tasks like sentiment analysis. Today it’s releasing some of that data to the public through its new Data for Everyone initiative…. The hope is to turn CrowdFlower into a central repository where open data can be found by researchers and entrepreneurs. (Factual was another startup trying to become a hub for open data, though in recent years, it’s become more focused on gathering location data to power mobile ads.)…

As for the data that’s available now, …There’s a lot of Twitter sentiment analysis covering things like attitudes towards brands and products, yogurt (?), and climate change. Among the more recent data sets, I was particularly taken with the gender breakdown of who’s been on the cover of Time magazine and, yes, the analysis of who thought the dress (you know the one) was gold and white versus blue and black…. (More)”
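The Data for Everyone sets described here are essentially labelled tweets. As a purely illustrative sketch (the column names and labels are assumptions, not CrowdFlower's actual schema), a researcher might start by tallying the label distribution in such a dataset:

```python
from collections import Counter

def sentiment_breakdown(rows):
    """Tally sentiment labels in a list of (tweet_text, label) pairs,
    returning each label's share of the total."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Toy rows standing in for a downloaded CSV (labels hypothetical).
rows = [
    ("Love this yogurt brand", "positive"),
    ("Climate change worries me", "negative"),
    ("New phone is fine I guess", "neutral"),
    ("Great product, would buy again", "positive"),
]
shares = sentiment_breakdown(rows)
```

From a breakdown like this one can move on to per-topic comparisons, e.g. whether tweets about a brand skew more positive than tweets about climate change.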

US government and private sector developing ‘precrime’ system to anticipate cyber-attacks


Martin Anderson at The Stack: “The USA’s Office of the Director of National Intelligence (ODNI) is soliciting the involvement of the private and academic sectors in developing a new ‘precrime’ computer system capable of predicting cyber-incursions before they happen, based on the processing of ‘massive data streams from diverse data sets’ – including social media and possibly deanonymised Bitcoin transactions….
At their core, the predictive technologies to be developed in association with the private sector and academia over 3-5 years are charged with the mission ‘to invest in high-risk/high-payoff research that has the potential to provide the U.S. with an overwhelming intelligence advantage over our future adversaries’.
The R&D program is intended to generate completely automated, human-free prediction systems for four categories of event: unauthorised access, Denial of Service (DoS), malicious code, and scans and probes seeking access to systems.
The CAUSE project is an unclassified program, and participating companies and organisations will not be granted access to NSA intercepts. The scope of the project, in any case, seems focused on the analysis of publicly available Big Data, including web searches, social media exchanges and trawling ungovernable avalanches of information in which clues to future maleficent actions are believed to be discernible.
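CAUSE's actual methods are not public, but the last of the four event categories above (scans and probes) lends itself to a simple illustration. This sketch flags sources whose probe volume far exceeds an expected baseline; the event schema, threshold, and field names are all assumptions for illustration only:

```python
from collections import Counter

def flag_probe_sources(events, baseline_rate, factor=3.0):
    """Flag source addresses whose probe count exceeds `factor` times the
    expected baseline rate -- a crude stand-in for the kind of automated
    precursor detection the CAUSE program envisions."""
    counts = Counter(e["src"] for e in events if e["type"] == "probe")
    return sorted(src for src, n in counts.items() if n > factor * baseline_rate)

# Toy event stream: one noisy scanner, one ordinary host.
events = (
    [{"src": "203.0.113.9", "type": "probe"}] * 40
    + [{"src": "198.51.100.2", "type": "probe"}] * 3
    + [{"src": "198.51.100.2", "type": "dos"}]
)
suspects = flag_probe_sources(events, baseline_rate=5)
# flags only 203.0.113.9 (40 probes against a baseline of 5)
```

A real system would of course fuse many more signals (social media, forum chatter, and so on), which is precisely the "white noise" problem the article goes on to describe.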
Program manager Robert Rahmer says: “It is anticipated that teams will be multidisciplinary and might include computer scientists, data scientists, social and behavioral scientists, mathematicians, statisticians, content extraction experts, information theorists, and cyber-security subject matter experts having applied experience with cyber capabilities.”
Battelle, one of the organisations interested in participating in CAUSE, proposes employing Hadoop and Apache Spark as an approach to the data mountain, and includes in its preliminary proposal an intent to ‘de-anonymize Bitcoin sale/purchase activity to capture communication exchanges more accurately within threat-actor forums…’.
Identifying and categorising quality signals in the ‘white noise’ of Big Data is a central plank in CAUSE, and IARPA maintains several offices to deal with different aspects of it. Its pointedly named ‘Office for Anticipating Surprise’ frames the CAUSE project best, since it initiated it. The OAS is occupied with ‘Detecting and forecasting the emergence of new technical capabilities’, ‘Early warning of social and economic crises, disease outbreaks, insider threats, and cyber attacks’ and ‘Probabilistic forecasts of major geopolitical trends and rare events’.
Another department involved is the Office of Incisive Analysis, which is attempting to break down the ‘data static’ problem into manageable mission stages:
1) Large data volumes and varieties – “Providing powerful new sources of information from massive, noisy data that currently overwhelm analysts”
2) Social-Cultural and Linguistic Factors – “Analyzing language and speech to produce insights into groups and organizations.”
3) Improving Analytic Processes – “Dramatic enhancements to the analytic process at the individual and group level.”
The Office of Smart Collection develops ‘new sensor and transmission technologies’, seeking ‘Innovative approaches to gain access to denied environments’ as part of its core mission, while the Office of Safe and Secure Operations concerns itself with ‘Revolutionary advances in science and engineering to solve problems intractable with today’s computers’.
The CAUSE program, which attracted 150 developers, organisations, academics and private companies to the initial event, will announce specific figures about funding later in the year, and practice ‘predictions’ from participants will begin in the summer, in an accelerating and stage-managed program over five years….(More)”

Measuring government impact in a social media world


Arthur Mickoleit & Ryan Androsoff at OECD Insights: “There is hardly a government around the world that has not yet felt the impact of social media on how it communicates and engages with citizens. And while the most prominent early adopters in the public sector have tended to be politicians (think of US President Barack Obama’s impressive use of social media during his 2008 campaign), government offices are also increasingly jumping on the bandwagon. Yes, we are talking about those – mostly bricks-and-mortar – institutions that often toil away from the public gaze, managing the public administration in our countries. As the world changes, they too are increasingly engaging in a very public way through social media.
Research from our recent OECD working paper “Social Media Use by Governments” shows that as of November 2014, out of 34 OECD countries, 28 have a Twitter account for the office representing the top executive institution (head of state, head of government, or government as a whole), and 21 have a Facebook account….
 
But what is the impact governments can or should expect from social media? Is it all just vanity and peer pressure? Surely not.
Take the Spanish national police force (e.g. on Twitter, Facebook & YouTube), a great example of using social media to build long-term engagement, trust and a better public service. These are things so many governments yearn for, and the Spanish police seem to have managed them well.
Or take the Danish “tax daddy” on Twitter – @Skattefar. It started out as the national tax administration’s quest to make it easier for everyone to submit correct tax filings; it is now one of the best examples around of a tax agency gone social.
Government administrations can use social media for internal purposes too. The Government of Canada used public platforms like Twitter and internal platforms like GCpedia and GCconnex to conduct a major employee engagement exercise (Blueprint 2020) to develop a vision for the future of the Canadian federal public service.
And when it comes to raising efficiency in the public sector, read this account of a Dutch research facility’s Director who decided to stop email. Not reduce it, but stop it altogether and replace it with social media.
There are so many other examples that could be cited. But the major question is how can we even begin to appraise the impact of these different initiatives? Because as we’ve known since the 19th century, “if you cannot measure it, you cannot improve it” (quote usually attributed to Lord Kelvin). Some aspects of impact measurement for social media can be borrowed from the private sector with regards to presence, popularity, penetration, and perception. But it’s around purpose that impact measurement agendas will split between the private sector and government. Virtually all companies will want to calculate the return on social media investments based on whether it helps them improve their financial returns. That’s different in the public sector where purpose is rarely defined in commercial terms.
A good impact assessment for social media in the public sector therefore needs to be built around its unique purpose-orientation. This is much more difficult to measure and it will involve a mix of quantitative data (e.g. reach of target audience) and qualitative data (e.g. case studies describing tangible impact). Social Media Use by Governments proposes a framework to start looking at social media measurement in gradual steps – from measuring presence, to popularity, to penetration, to perception, and finally, to purpose-orientation. The aim of this framework is to help governments develop truly relevant metrics and start treating social media activity by governments with the same public management rigour that is applied to other government activities. You can see a table summarising the framework by clicking on the thumbnail below.
This is far from an exact science, but we are beginning the work collaborating with member and partner governments to develop a toolkit that will help decision-makers implement the OECD Recommendation on Digital Government Strategies, including on the issue of social media metrics…(More)”.

Using Flash Crowds to Automatically Detect Earthquakes & Impact Before Anyone Else


Patrick Meier at iRevolutions: “It is said that our planet has a new nervous system: a digital nervous system composed of digital veins and intertwined sensors that capture the pulse of our planet in near real-time. Next generation humanitarian technologies seek to leverage this new nervous system to detect and diagnose the impact of disasters within minutes rather than hours. To this end, LastQuake may be one of the most impressive humanitarian technologies that I have recently come across. Spearheaded by the European-Mediterranean Seismological Center (EMSC), the technology combines “Flashsourcing” with social media monitoring to auto-detect earthquakes before they’re picked up by seismometers or anyone else.


Scientists typically draw on ground-motion prediction algorithms and data on building infrastructure to rapidly assess an earthquake’s potential impact. Alas, ground-motion predictions vary significantly and infrastructure data are rarely available at sufficient resolutions to accurately assess the impact of earthquakes. Moreover, a minimum of three seismometers is needed to calibrate a quake, and the seismic data take several minutes to generate. This explains why the EMSC uses human sensors to rapidly collect relevant data on earthquakes, as these reduce the uncertainties that come with traditional rapid impact assessment methodologies. Indeed, the Center’s important work clearly demonstrates how the Internet coupled with social media is “creating new potential for rapid and massive public involvement by both active and passive means” vis-a-vis earthquake detection and impact assessments. The EMSC can automatically detect new quakes within 80-90 seconds of their occurrence while simultaneously publishing tweets with preliminary information on said quakes, like this one:

[Screenshot of an EMSC tweet]

In reality, the first signals from human sensors (increases in web traffic) can be detected within 15 seconds (!) of a quake…(More)
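EMSC's detection pipeline is not published in this excerpt, but the core idea of flashsourcing, detecting a quake from a sudden surge of visits to an earthquake-information site, can be sketched in a few lines. This is a minimal illustration, assuming a simple trailing-average baseline rather than whatever EMSC actually uses:

```python
def detect_flash_crowd(hits_per_second, window=10, factor=5.0):
    """Return the first second at which traffic exceeds `factor` times
    the trailing-window average -- a toy version of flashsourcing."""
    for t in range(window, len(hits_per_second)):
        baseline = sum(hits_per_second[t - window:t]) / window
        if baseline > 0 and hits_per_second[t] > factor * baseline:
            return t
    return None

# 20 seconds of ordinary traffic (~10 hits/s), then a sudden surge.
traffic = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
           9, 10, 11, 10, 9, 10, 10, 11, 9, 10,
           180, 240]
onset = detect_flash_crowd(traffic)
```

In this toy trace the surge at second 20 is flagged immediately, which mirrors why web-traffic signals can precede the several minutes needed to triangulate a quake from seismometers.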

City Governments Are Using Yelp to Tell You Where Not to Eat


Michael Luca and Luther Lowe at HBR Blog: “…in recent years consumer-feedback platforms like TripAdvisor, Foursquare, and Chowhound have transformed the restaurant industry (as well as the hospitality industry), becoming important guides for consumers. Yelp has amassed about 67 million reviews in the last decade. So it’s logical to think that these platforms could transform hygiene awareness too — after all, people who contribute to review sites focus on some of the same things inspectors look for.

It turns out that one way user reviews can transform hygiene awareness is by helping health departments better utilize their resources. The deployment of inspectors is usually fairly random, which means time is often wasted on spot checks at clean, rule-abiding restaurants. Social media can help narrow the search for violators.
Within a given city or area, it’s possible to merge the entire history of Yelp reviews and ratings — some of which contain telltale words or phrases such as “dirty” and “made me sick” — with the history of hygiene violations and feed them into an algorithm that can predict the likelihood of finding problems at reviewed restaurants. Thus inspectors can be allocated more efficiently.
In San Francisco, for example, we broke restaurants into the top half and bottom half of hygiene scores. In a recent paper, one of us (Michael Luca, with coauthor Yejin Choi and her graduate students) showed that we could correctly classify more than 80% of restaurants into these two buckets using only Yelp text and ratings. In the next month, we plan to hold a contest on DrivenData to get even better algorithms to help cities out (we are jointly running the contest). Similar algorithms could be applied in any city and in other sorts of prediction tasks.
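The published classifier used richer text features than this, but the intuition from the telltale-phrase idea above can be sketched with a naive keyword-based priority score. Everything here (the phrase list, the star weighting, the data) is illustrative, not the authors' actual model:

```python
TELLTALE = ("dirty", "made me sick", "food poisoning", "roach")

def inspection_priority(reviews):
    """Score a restaurant's reviews by telltale hygiene phrases,
    weighting low-starred reviews more heavily."""
    score = 0.0
    for text, stars in reviews:
        hits = sum(text.lower().count(p) for p in TELLTALE)
        score += hits * (6 - stars)  # a 1-star mention counts 5x a 5-star one
    return score

# Two hypothetical restaurants and their reviews.
suspect = [("So dirty, the sushi made me sick", 1), ("Meh service", 3)]
clean = [("Spotless kitchen, lovely staff", 5)]
ranked = sorted([("A", suspect), ("B", clean)],
                key=lambda kv: inspection_priority(kv[1]), reverse=True)
# restaurant "A" rises to the top of the inspection queue
```

A health department could then send inspectors to the highest-scoring restaurants first, rather than choosing spot checks at random.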
Another means for transforming hygiene awareness is through the sharing of health-department data with online review sites. The logic is simple: Diners should be informed about violations before they decide on a destination, rather than after.
Over the past two years, we have been working with cities to help them share inspection data with Yelp through an open-data standard that Yelp created in 2012 to encourage officials to put their information in places that are more useful to consumers. In San Francisco, Los Angeles, Raleigh, and Louisville, Kentucky, customers now see hygiene data alongside Yelp reviews. There’s evidence that users are starting to pay attention to this data — click-through rates are similar to those for other features on Yelp ….

And there’s no reason this type of data sharing should be limited to restaurant-inspection reports. Why not disclose data about dentists’ quality and regulatory compliance via Yelp? Why not use data from TripAdvisor to help spot bedbugs? Why not use Twitter to understand what citizens are concerned about, and what cities can do about it? Uses of social media data for policy, and widespread dissemination of official data through social media, have the potential to become important means of public accountability. (More)

Tired of Being Profiled, a Programmer Turns to Crowdsourcing Cop Reviews


Christopher Moraff at Next City: “…despite the fact that policing is arguably one of the most important and powerful service professions a civilized society can produce, it’s far easier to find out if the plumber you just hired broke someone’s pipe while fixing their toilet than it is to find out if the cop patrolling your neighborhood broke someone’s head while arresting them.
A 31-year-old computer programmer has set out to fix that glitch with a new web-based (and soon to be mobile) crowdsourced rating tool called CopScore that is designed to help communities distinguish police officers who are worthy of praise from those who are not fit to wear the uniform….
CopScore is a work in progress, and, for the time being at least, a one-man show. Hardison does all the coding himself, often working through the night to bring new features online.
Currently in the very early beta stage, the platform works by consolidating information on the service records of individual police officers together with details of their interactions with constituents. The searchable platform includes data gleaned from public sources — such as social media and news articles — cross-referenced with Yelp-style ratings from citizens.

For Hardison, CopScore is as much a personal endeavor as it is a professional one. He says his youthful interest in computer programming — which he took up as a misbehaving fifth-grader under the guiding hand of a concerned teacher — made him the butt of the occasional joke in the predominantly African-American community of North Nashville where he grew up….”(More)

Making emotive games from open data


Katie Collins at WIRED: “Microsoft researcher Kati London’s aim is “to try to get people to think of data in terms of personalities, relationships and emotions”, she tells the audience at the Story Festival in London. Through Project Sentient Data, she uses her background in games development to create fun but meaningful experiences that bridge online interactions and things that are happening in the real world.
One such experience invited children to play against the real-time flow of London traffic through an online game called the Code of Everand. The aim was to test the road safety knowledge of 9-11 year olds and “make alertness something that kids valued”.
The core mechanic of the game was that of a normal world populated by little people, containing spirit channels that only kids could see and go through. Within these spirit channels, the lorries and cars from the real streets became monsters. The children had to assess what kind of dangers the monsters posed and use their tools to dispel them.
“Games are great ways to blur and observe the ways people interact with real-world data,” says London.
In one of her earlier projects back in 2005, London used her knowledge of horticulture to bring artificial intelligence to plants. “Almost every workspace I go into has a half-dead plant in it, so we gave plants the ability to tell us what they need.” It was, she says, an exercise in “humanising data” that led to further projects that saw her create self-aware street signs and a dynamic city map that expressed shame neighbourhood by neighbourhood depending on the open dataset of public complaints in New York.
A further project turned complaint data into cartoons on Instagram every week. London praised the open data initiative in New York, but added that for people to access it, they had to know it existed and know where to find it. The cartoons were a “lightweight” form of “civic engagement” that helped to integrate hyperlocal issues into everyday conversation.
London also gamified community engagement through a project commissioned by the Knight Foundation called Macon Money….(More)”.

Beyond Transparency


Hildy Gottlieb on “How ‘opening up’ can help organizations achieve their missions” in Stanford Social Innovation Review: “…For the past two years, Creating the Future, a social change research and development laboratory, has been experimenting to find the answer to that question. In the process, we have learned that when organizations are more open in their work, it can improve both the work itself and the results in the communities they serve.
In December 2012, Creating the Future’s board voted to open all its board and strategy meetings (including meetings for branding, resource development, and programming) to anyone who wished to attend and participate.
Since our organization is global, we hold our meetings via Google Hangout, and community members participate via a dedicated Twitter hashtag. Everyone is encouraged to participate—through asking questions and sharing observations—as if they are board members, whether or not they are.
This online openness mirrors the kind of inclusive, participatory culture that many grassroots neighborhood groups have fostered in the “real world” for decades. As we’ve studied those groups and experienced open engagement for ourselves, here are some of the things we’ve learned that can apply to any organization, whether they are working at a distance or in person.

What Being Open Makes Possible

1.  Being open adds new thinking to the mix. We can’t overstate this obvious practical benefit for every strategic issue an organization considers. During a recent discussion of employee “paid time off” policies, a participant with no formal relationship to the organization powerfully shifted the board’s conversation and perspectives away from the rigidity of a policy, focusing instead on the values of relationships, outcomes, buy-in, and adaptability. That input helped the board clarify its intent. It ultimately chose to scrap the idea of a certain amount of “paid time off,” in favor of an outcomes-based approach that provides flexibility for both employees and their supervisors.
2. Being open flattens internal communications. Opening all our meetings has led to cross-pollination across every aspect of our organization, providing an ongoing opportunity for sharing information and resources, and for developing everyone’s potential as leaders….
3. Being open walks the talk of the engaged communities we want to see. From the moment we opened the doors to our meetings, people have walked in and found meaningful ways to become part of our work. …
It seems so simple: If we want to engage the community, we just need to open the doors and invite people in!
4. Being open creates meaningful inclusion. Board diversity initiatives are intended to ensure that an organization’s decision-making reflects the experience of the community it serves. In reality, though, there can never be enough seats on a board to accomplish inclusion beyond what often feels like tokenism. Creating the Future’s board doesn’t have to worry about representing the community, because our community members represent themselves. And while this is powerful in an online setting, it is even more powerful when on-the-ground community members are part of a community-based organization’s decision-making fabric.
5. Being open creates more inclusive accountability. During a discussion of cash flow for our young organization, one concerned board member wondered aloud whether adhering to our values might be at cross-purposes with our survival. Our community members went wild via Twitter, expressing that it was that very code of values that drew them to the work in the first place. That reminder helped board members remove scarcity and fear from the conversation so that they could base their decision on what would align with our values and help accomplish the mission.
The needs of our community directly impacted that decision—not because of a bylaws requirement for “voting members” but simply because we encouraged community members to actively take part in the conversation….(More)”

Data for good


NESTA: “This report explores how capturing, sharing and analysing data in new ways can transform how charities work and how social action happens.

Key Findings

  • Citizens Advice (CAB) and DataKind partnered to develop the Civic Dashboard, a tool that mines data from CAB consultations to understand emerging social issues in the UK.
  • Shooting Star Chase volunteers streamlined the referral paths by which children come to be at the hospices, saving up to £90,000 for children’s hospices around the country.
  • In a study of open grant funding data, NCVO identified 33,000 ‘below the radar’ organisations not currently listed in registers and databases on the third sector.
  • In their social media analysis of tweets related to the Somerset Floods, Demos found that 39,000 tweets were related to social action.

New ways of capturing, sharing and analysing data have the potential to transform how community and voluntary sector organisations work and how social action happens. However, while analysing and using data is core to how some of the world’s fastest growing businesses understand their customers and develop new products and services, civil society organisations are still some way off from making the most of this potential.
Over the last 12 months Nesta has grant funded a number of research projects that explore two dimensions of how big and open data can be used for the common good. Firstly, how it can be used by charities to develop better products and services and secondly, how it can help those interested in civil society better understand social action and civil society activity.

  • Citizens Advice Bureau (CAB) and DataKind, a global community of data scientists interested in how data can be used for a social purpose, were grant funded to explore how a data-driven approach to mining the rich data that CAB holds on social issues in the UK could be used to develop a real-time dashboard to identify emerging social issues. The project also explored how data-driven methods could better help other charities such as St Mungo’s and Buttle UK, and how data could be shared more effectively between charities as part of this process, to create collaborative data-driven projects.
  • Five organisations (The RSA, Cardiff University, The Demos Centre for Analysis of Social Media, NCVO and European Alternatives) were grant funded to explore how data-driven methods, such as open data analysis and social media analysis, can help us understand informal social action, often referred to as ‘below the radar activity’, in new ways.
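
The Civic Dashboard's actual implementation is not described here, but the idea of surfacing emerging social issues from consultation records can be sketched simply: compare each topic's recent volume against a baseline period and flag outliers. Topic names, thresholds, and data are all assumptions for illustration:

```python
from collections import Counter

def emerging_issues(recent, baseline, min_count=20, factor=2.0):
    """Flag consultation topics whose recent volume is at least `factor`
    times their baseline volume and above a minimum count -- a toy
    version of a real-time issues dashboard."""
    r, b = Counter(recent), Counter(baseline)
    return sorted(
        topic for topic, n in r.items()
        if n >= min_count and n >= factor * max(b[topic], 1)
    )

# Hypothetical topic tags from two equal-length periods of consultations.
baseline = ["debt"] * 30 + ["housing"] * 10 + ["benefits"] * 25
recent = ["debt"] * 32 + ["housing"] * 40 + ["benefits"] * 24
flagged = emerging_issues(recent, baseline)
# housing jumped from 10 to 40 consultations, so it is flagged
```

A dashboard built on a rule like this would let a charity notice a surge in, say, housing queries weeks before it shows up in annual reporting.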

This paper is not the definitive story of the opportunities in using big and open data for the common good, but it can hopefully provide insight on what can be done and lessons for others interested in exploring the opportunities in these methods….(More).”