Finding Mr. Smith or why anti-corruption needs open data


Martin Tisne: “Anti-corruption groups have been rightly advocating for the release of information on the beneficial or real owners of companies and trusts. The idea is to crack down on tax evasion and corruption by identifying the actual individuals hiding behind several layers of shell companies.
But knowing that “Mr. Smith” is the owner of company X is of no interest, unless you know who Mr. Smith is.
The real interest lies in figuring out that Mr. Smith is linked to company Y, that has been illegally exporting timber from country Z, and that Mr. Smith is the son-in-law of the mining minister of yet another country, who has been accused of embezzling mining industry revenues.
For that, investigative journalists, prosecution authorities, and civil society groups like Global Witness and Transparency International will need access not just to public registries of beneficial ownership but also to contract data, politically exposed persons databases (“PEPs” databases), project-by-project extractive industry data, and trade export/import data.
Unless those datasets are accessible, comparable, linked, it won’t be possible. We are talking about millions of datasets – no problem for computers to crunch, but impossible to go through manually.
This is what is different in the anti-corruption landscape today, compared to 10 years ago. Technology makes it possible. Don’t get me wrong – there are still huge, thorny political obstacles to getting the data even publicly available in the first place. But unless it is open data, I fear those battles will have been in vain.
That’s why we need open data as a topic on the G20 anti-corruption working group.”
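
To make the linking step concrete, here is a minimal Python sketch of joining a beneficial ownership registry to a PEPs database on a normalized name. All records, field names, and the `normalize` and `link` helpers are hypothetical and invented for illustration; none of them come from a real registry. Matching on names alone produces false positives, so this shows only the shape of the join, not a reliable method.

```python
def normalize(name: str) -> str:
    """Lowercase a name, drop punctuation, and sort its words so that
    'SMITH, John' and 'John Smith' produce the same key."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(sorted(kept.split()))


# Hypothetical records, invented for illustration only.
beneficial_owners = [
    {"company": "Company X", "owner": "John Smith"},
    {"company": "Company Y", "owner": "SMITH, John"},
    {"company": "Company W", "owner": "Jane Doe"},
]
pep_database = [
    {"name": "John Smith", "relation": "son-in-law of a mining minister"},
]


def link(owners, peps):
    """Return one combined record per (owner row, PEP entry) pair whose
    normalized names are equal."""
    peps_by_key = {}
    for pep in peps:
        peps_by_key.setdefault(normalize(pep["name"]), []).append(pep)

    matches = []
    for row in owners:
        for pep in peps_by_key.get(normalize(row["owner"]), []):
            matches.append(
                {
                    "company": row["company"],
                    "owner": row["owner"],
                    "relation": pep["relation"],
                }
            )
    return matches


links = link(beneficial_owners, pep_database)
for match in links:
    print(f'{match["company"]}: {match["owner"]} ({match["relation"]})')
```

The join only works because both datasets are machine-readable and share a comparable field. That is the practical meaning of "accessible, comparable, linked" in the passage above.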

Index: The Networked Public


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the networked public and was originally published in 2014.

Global Overview

  • The proportion of global population who use the Internet in 2013: 38.8%, up 3 percentage points from 2012
  • Increase in average global broadband speeds from 2012 to 2013: 17%
  • Percent of internet users surveyed globally that access the internet at least once a day in 2012: 96
  • Hours spent online in 2012 each month across the globe: 35 billion
  • Country with the highest online population, as a percent of total population in 2012: United Kingdom (85%)
  • Country with the lowest online population, as a percent of total population in 2012: India (8%)
  • Trend with the highest growth rate in 2012: Location-based services (27%)
  • Years to reach 50 million users: telephone (75), radio (38), TV (13), internet (4)

Growth Rates in 2014

  • Rate at which the total number of Internet users is growing: less than 10% a year
  • Worldwide annual smartphone growth: 20%
  • Tablet growth: 52%
  • Mobile phone growth: 81%
  • Percentage of all mobile users who are now smartphone users: 30%
  • Amount of all web usage in 2013 accounted for by mobile: 14%
  • Amount of all web usage in 2014 accounted for by mobile: 25%
  • Percentage of money spent on mobile used for app purchases: 68%
  • Growth in the number of Bitcoin wallets between 2013 and 2014: eightfold increase
  • Number of listings on Airbnb in 2014: 550k, 83% growth year on year
  • How many buyers are on Alibaba in 2014: 231MM buyers, 44% growth year on year

Social Media

  • Number of WhatsApp messages on average sent per day: 50 billion
  • Number sent per day on Snapchat: 1.2 billion
  • How many restaurants are registered on GrubHub in 2014: 29,000
  • Amount the sale of digital songs fell in 2013: 6%
  • How much song streaming grew in 2013: 32%
  • Number of photos uploaded and shared every day on Flickr, Snapchat, Instagram, Facebook and WhatsApp combined in 2014: 1.8 billion
  • How many online adults in the U.S. use a social networking site of some kind: 73%
  • Those who use multiple social networking sites: 42%
  • Dominant social networking platform: Facebook, with 71% of online adults
  • Number of Facebook users in 2004, its founding year: 1 million
  • Number of monthly active users on Facebook in September 2013: 1.19 billion, an 18% increase year-over-year
  • How many Facebook users log in to the site daily: 63%
  • Instagram users who log into the service daily: 57%
  • Twitter users who are daily visitors: 46%
  • Number of photos uploaded to Facebook every minute: over 243,000, up 16% from 2012
  • How much of the global internet population is actively using Twitter every month: 21%
  • Number of tweets per minute: 350,000, up 250% from 2012
  • Fastest growing demographic on Twitter: 55-64 year age bracket, up 79% from 2012
  • Fastest growing demographic on Facebook: 45-54 year age bracket, up 46% from 2012
  • How many LinkedIn accounts are created every minute: 120, up 20% from 2012
  • The number of Google searches in 2013: 3.5 million, up 75% from 2012
  • Percent of internet users surveyed globally that use social media in 2012: 90
  • Percent of internet users surveyed globally that use social media daily: 60
  • Time spent social networking, the most popular online activity: 22%, followed by searches (21%), reading content (20%), and emails/communication (19%)
  • The average age at which a child acquires an online presence through their parents in 10 mostly Western countries: six months
  • Number of children in those countries who have a digital footprint by age 2: 81%
  • How many new American marriages between 2005-2012 began by meeting online, according to a nationally representative study: more than one-third 
  • How many of the world’s 505 leaders are on Twitter: 3/4
  • Combined Twitter followers of 505 world leaders: 106 million
  • Combined Twitter followers of Justin Bieber, Katy Perry, and Lady Gaga: 122 million
  • How many times all Wikipedias are viewed per month: nearly 22 billion times
  • How many hits per second: more than 8,000 
  • English Wikipedia’s share of total page views: 47%
  • Number of articles in the English Wikipedia in December 2013: over 4,395,320 
  • Platform that reaches more U.S. adults between ages 18-34 than any cable network: YouTube
  • Number of unique users who visit YouTube each month: more than 1 billion
  • How many hours of video are watched on YouTube each month: over 6 billion, 50% more than 2012
  • Proportion of YouTube traffic that comes from outside the U.S.: 80%
  • Most common activity online, based on an analysis of over 10 million web users: social media
  • People on Twitter who recommend products in their tweets: 53%
  • People who trust online recommendations from people they know: 90%

Mobile and the Internet of Things

  • Number of global smartphone users in 2013: 1.5 billion
  • Number of global mobile phone users in 2013: over 5 billion
  • Percent of U.S. adults that have a cell phone in 2013: 91
  • Share of those cell phones that are smartphones: almost two thirds
  • Mobile Facebook users in March 2013: 751 million, 54% increase since 2012
  • Growth rate of global mobile traffic as a percentage of global internet traffic as of May 2013: 15%, up from 0.9% in 2009
  • How many smartphone owners ages 18–44 “keep their phone with them for all but two hours of their waking day”: 79%
  • Those who reach for their smartphone immediately upon waking up: 62%
  • Those who couldn’t recall a time their phone wasn’t within reach or in the same room: 1 in 4
  • Facebook users who access the service via a mobile device: 73.44%
  • Those who are “mobile only”: 189 million
  • Amount of YouTube’s global watch time that is on mobile devices: almost 40%
  • Number of objects connected globally in the “internet of things” in 2012: 8.7 billion
  • Number of connected objects so far in 2013: over 10 billion
  • Years from tablet introduction for tablets to surpass desktop PC and notebook shipments: less than 3 (over 55 million global units shipped in 2013, vs. 45 million notebooks and 35 million desktop PCs)
  • Number of wearable devices estimated to have been shipped worldwide in 2011: 14 million
  • Projected number of wearable devices in 2016: between 39-171 million
  • How much of the wearable technology market is in the healthcare and medical sector in 2012: 35.1%
  • How many devices in the wearable tech market are fitness or activity trackers: 61%
  • The value of the global wearable technology market in 2012: $750 million
  • The forecasted value of the market in 2018: $5.8 billion
  • How many Americans are aware of wearable tech devices in 2013: 52%
  • Devices that have the highest level of awareness: wearable fitness trackers
  • Level of awareness for wearable fitness trackers amongst American consumers: 1 in 3 consumers
  • Value of digital fitness category in 2013: $330 million
  • How many American consumers surveyed are aware of smart glasses: 29%
  • Smart watch awareness amongst those surveyed: 36%

Access

  • How much of the developed world has mobile broadband subscriptions in 2013: 3/4
  • How much of the developing world has broadband subscription in 2013: 1/5
  • Percent of U.S. adults that had a laptop in 2012: 57
  • How many American adults did not use the internet at home, at work, or via mobile device in 2013: one in five
  • Amount President Obama initiated spending in 2009 in an effort to expand access: $7 billion
  • Number of Americans potentially shut off from jobs, government services, health care and education, among other opportunities due to digital inequality: 60 million
  • American adults with a high-speed broadband connection at home as of May 2013: 7 out of 10
  • Americans aged 18-29 vs. 65+ with a high-speed broadband connection at home as of May 2013: 80% vs. 43%
  • American adults with college education (or more) vs. adults with no high school diploma that have a high-speed broadband connection at home as of May 2013: 89% vs. 37%
  • Percent of U.S. adults with college education (or more) that use the internet in 2011: 94
  • Those with no high school diploma that used the internet in 2011: 43
  • Percent of white American households that used the internet in 2013: 67
  • Black American households that used the internet in 2013: 57
  • States with lowest internet use rates in 2013: Mississippi, Alabama and Arkansas
  • How many American households have only wireless telephones as of the second half of 2012: nearly two in five
  • States with the highest prevalence of wireless-only adults according to predictive modeling estimates: Idaho (52.3%), Mississippi (49.4%), Arkansas (49%)
  • Those with the lowest prevalence of wireless-only adults: New Jersey (19.4%), Connecticut (20.6%), Delaware (23.3%) and New York (23.5%)

The Emerging Power of Big Data


New America Foundation Report on the Chicago experience of using big data: “Big data is transforming the commercial marketplace but it also has the potential to reshape government affairs and urban development. In a new report from the Emerging Leaders Program at the Chicago Council on Global Affairs, Lincoln S. Ellis, a founding member of the World Economic Roundtable, and other authors from the Emerging Leaders Program, explore how big data can be used by mega-cities to meet the challenges they face in an age of resource constraints to improve the lives of their residents.
Using Chicago as a case study, the report examines how the explosion of data availability enables cities to do more with less—to improve government services, fund much needed transportation, provide better education, and guarantee public safety.  And do more with less is what many cities have had to do over the past five years because many cities have had to cut their budgets and reduce the number of public employees in the post-financial crisis economy.  It is also what they will need to continue to do in the future.
“Unfortunately, resource constraints are a consistent feature of the post-crisis global landscape,” argues Ellis.  “Happily, so too is the renaissance in productivity gains garnered by our ability to leverage technology and information to achieve our most important public purposes in a smarter and more efficient way.”
Click here to view the report as a PDF.”

The Art and Science of Data-driven Journalism


Alex Howard for the Tow Center for Digital Journalism: “Journalists have been using data in their stories for as long as the profession has existed. A revolution in computing in the 20th century created opportunities for data integration into investigations, as journalists began to bring technology into their work. In the 21st century, a revolution in connectivity is leading the media toward new horizons. The Internet, cloud computing, agile development, mobile devices, and open source software have transformed the practice of journalism, leading to the emergence of a new term: data journalism. Although journalists have been using data in their stories for as long as they have been engaged in reporting, data journalism is more than traditional journalism with more data. Decades after early pioneers successfully applied computer-assisted reporting and social science to investigative journalism, journalists are creating news apps and interactive features that help people understand data, explore it, and act upon the insights derived from it. New business models are emerging in which data is a raw material for profit, impact, and insight, co-created with an audience that was formerly reduced to passive consumption. Journalists around the world are grappling with the excitement and the challenge of telling compelling stories by harnessing the vast quantity of data that our increasingly networked lives, devices, businesses, and governments produce every day. While the potential of data journalism is immense, the pitfalls and challenges to its adoption throughout the media are similarly significant, from digital literacy to competition for scarce resources in newsrooms. Global threats to press freedom, digital security, and limited access to data create difficult working conditions for journalists in many countries.
A combination of peer-to-peer learning, mentorship, online training, open data initiatives, and new programs at journalism schools rising to the challenge, however, offer reasons to be optimistic about more journalists learning to treat data as a source. (Download the report)”

Selected Readings on Crowdsourcing Tasks and Peer Production


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Technological advances are creating a new paradigm by which institutions and organizations are increasingly outsourcing tasks to an open community, allocating specific needs to a flexible, willing and dispersed workforce. “Microtasking” platforms like Amazon’s Mechanical Turk are a burgeoning source of income for individuals who contribute their time, skills and knowledge on a per-task basis. In parallel, citizen science projects – task-based initiatives in which citizens of any background can help contribute to scientific research – like Galaxy Zoo are demonstrating the ability of lay and expert citizens alike to make small, useful contributions to aid large, complex undertakings. As governing institutions seek to do more with less, looking to the success of citizen science and microtasking initiatives could provide a blueprint for engaging citizens to help accomplish difficult, time-consuming objectives at little cost. Moreover, the incredible success of peer-production projects – best exemplified by Wikipedia – instills optimism regarding the public’s willingness and ability to complete relatively small tasks that feed into a greater whole and benefit the public good. You can learn more about this new wave of “collective intelligence” by following the MIT Center for Collective Intelligence and their annual Collective Intelligence Conference.

Annotated Selected Reading List (in alphabetical order)

Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press, 2006. http://bit.ly/1aaU7Yb.

  • In this book, Benkler “describes how patterns of information, knowledge, and cultural production are changing – and shows that the way information and knowledge are made available can either limit or enlarge the ways people can create and express themselves.”
  • In his discussion on Wikipedia – one of many paradigmatic examples of people collaborating without financial reward – he calls attention to the notable ongoing cooperation taking place among a diversity of individuals. He argues that, “The important point is that Wikipedia requires not only mechanical cooperation among people, but a commitment to a particular style of writing and describing concepts that is far from intuitive or natural to people. It requires self-discipline. It enforces the behavior it requires primarily through appeal to the common enterprise that the participants are engaged in…”

Brabham, Daren C. Using Crowdsourcing in Government. Collaborating Across Boundaries Series. IBM Center for The Business of Government, 2013. http://bit.ly/17gzBTA.

  • In this report, Brabham categorizes government crowdsourcing cases into a “four-part, problem-based typology, encouraging government leaders and public administrators to consider these open problem-solving techniques as a way to engage the public and tackle difficult policy and administrative tasks more effectively and efficiently using online communities.”
  • The proposed four-part typology describes the following types of crowdsourcing in government:
    • Knowledge Discovery and Management
    • Distributed Human Intelligence Tasking
    • Broadcast Search
    • Peer-Vetted Creative Production
  • In his discussion on Distributed Human Intelligence Tasking, Brabham argues that Amazon’s Mechanical Turk and other microtasking platforms could be useful in a number of governance scenarios, including:
    • Governments and scholars transcribing historical document scans
    • Public health departments translating health campaign materials into foreign languages to benefit constituents who do not speak the native language
    • Governments translating tax documents, school enrollment and immunization brochures, and other important materials into minority languages
    • Helping governments predict citizens’ behavior, “such as for predicting their use of public transit or other services or for predicting behaviors that could inform public health practitioners and environmental policy makers”

Boudreau, Kevin J., Patrick Gaule, Karim Lakhani, Christoph Riedl, and Anita Williams Woolley. “From Crowds to Collaborators: Initiating Effort & Catalyzing Interactions Among Online Creative Workers.” Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 14-060. January 23, 2014. https://bit.ly/2QVmGUu.

  • In this working paper, the authors explore the “conditions necessary for eliciting effort from those affecting the quality of interdependent teamwork” and “consider the role of incentives versus social processes in catalyzing collaboration.”
  • The paper’s findings are based on an experiment involving 260 individuals randomly assigned to 52 teams working toward solutions to a complex problem.
  • The authors determined that the level of effort in such collaborative undertakings is sensitive to cash incentives. However, collaboration among teams was driven more by the active participation of teammates than by any monetary reward.

Franzoni, Chiara, and Henry Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy (August 14, 2013). http://bit.ly/HihFyj.

  • In this paper, the authors explore the concept of crowd science, which they define based on two important features: “participation in a project is open to a wide base of potential contributors, and intermediate inputs such as data or problem solving algorithms are made openly available.” The rationale for their study and conceptual framework is the “growing attention from the scientific community, but also policy makers, funding agencies and managers who seek to evaluate its potential benefits and challenges. Based on the experiences of early crowd science projects, the opportunities are considerable.”
  • Based on the study of a number of crowd science projects – including governance-related initiatives like Patients Like Me – the authors identify a number of potential benefits in the following categories:
    • Knowledge-related benefits
    • Benefits from open participation
    • Benefits from the open disclosure of intermediate inputs
    • Motivational benefits
  • The authors also identify a number of challenges:
    • Organizational challenges
      • Matching projects and people
      • Division of labor and integration of contributions
      • Project leadership
    • Motivational challenges
      • Sustaining contributor involvement
      • Supporting a broader set of motivations
      • Reconciling conflicting motivations

Kittur, Aniket, Ed H. Chi, and Bongwon Suh. “Crowdsourcing User Studies with Mechanical Turk.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 453–456. CHI ’08. New York, NY, USA: ACM, 2008. http://bit.ly/1a3Op48.

  • In this paper, the authors examine “[m]icro-task markets, such as Amazon’s Mechanical Turk, [which] offer a potential paradigm for engaging a large number of users for low time and monetary costs. [They] investigate the utility of a micro-task market for collecting user measurements, and discuss design considerations for developing remote micro user evaluation tasks.”
  • The authors conclude that in addition to providing a means for crowdsourcing small, clearly defined, often non-skill-intensive tasks, “Micro-task markets such as Amazon’s Mechanical Turk are promising platforms for conducting a variety of user study tasks, ranging from surveys to rapid prototyping to quantitative measures. Hundreds of users can be recruited for highly interactive tasks for marginal costs within a timeframe of days or even minutes. However, special care must be taken in the design of the task, especially for user measurements that are subjective or qualitative.”

Kittur, Aniket, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth M. Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. “The Future of Crowd Work.” In 16th ACM Conference on Computer Supported Cooperative Work (CSCW 2013), 2012. http://bit.ly/1c1GJD3.

  • In this paper, the authors discuss paid crowd work, which “offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale.” However, they caution that, “it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework.”
  • The authors argue that seven key challenges must be met to ensure that crowd work processes evolve and reach their full potential:
    • Designing workflows
    • Assigning tasks
    • Supporting hierarchical structure
    • Enabling real-time crowd work
    • Supporting synchronous collaboration
    • Controlling quality

Madison, Michael J. “Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo.” In Convening Cultural Commons, 2013. http://bit.ly/1ih9Xzm.

  • This paper explores a “case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place, Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis on the Internet. In the second place, Galaxy Zoo is a highly successful example of peer production, sometimes known as crowdsourcing…In the third place, [Galaxy Zoo] is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies.”
  • Madison concludes that the success of Galaxy Zoo has not been the result of the “character of its information resources (scientific data) and rules regarding their usage,” but rather, the fact that the “community was guided from the outset by a vision of a specific organizational solution to a specific research problem in astronomy, initiated and governed, over time, by professional astronomers in collaboration with their expanding universe of volunteers.”

Malone, Thomas W., Robert Laubacher and Chrysanthos Dellarocas. “Harnessing Crowds: Mapping the Genome of Collective Intelligence.” MIT Sloan Research Paper. February 3, 2009. https://bit.ly/2SPjxTP.

  • In this article, the authors describe and map the phenomenon of collective intelligence – also referred to as “radical decentralization, crowd-sourcing, wisdom of crowds, peer production, and wikinomics” – which they broadly define as “groups of individuals doing things collectively that seem intelligent.”
  • The article is derived from the authors’ work at MIT’s Center for Collective Intelligence, where they gathered nearly 250 examples of Web-enabled collective intelligence. To map the building blocks or “genes” of collective intelligence, the authors used two pairs of related questions:
    • Who is performing the task? Why are they doing it?
    • What is being accomplished? How is it being done?
  • The authors concede that much work remains to be done “to identify all the different genes for collective intelligence, the conditions under which these genes are useful, and the constraints governing how they can be combined,” but they believe that their framework provides a useful start and gives managers and other institutional decisionmakers looking to take advantage of collective intelligence activities the ability to “systematically consider many possible combinations of answers to questions about Who, Why, What, and How.”
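
As a rough illustration of what it means to “systematically consider many possible combinations of answers,” the Python sketch below enumerates every combination of answers to the four questions. The answer options listed under each question are placeholder values assumed for this sketch; they are not given in the summary above, and the function name `enumerate_designs` is likewise hypothetical.

```python
from itertools import product

# Placeholder answer options for each question. These values are
# assumptions for illustration and do not come from the text above.
genes = {
    "Who": ["crowd", "hierarchy"],
    "Why": ["money", "love", "glory"],
    "What": ["create", "decide"],
    "How": ["collection", "collaboration", "group decision", "individual decisions"],
}


def enumerate_designs(options):
    """Return every combination that picks one answer per question."""
    questions = list(options)
    return [
        dict(zip(questions, answers))
        for answers in product(*(options[q] for q in questions))
    ]


all_designs = enumerate_designs(genes)
print(len(all_designs))  # 2 * 3 * 2 * 4 = 48 combinations
print(all_designs[0])
```

Even with a handful of options per question the design space grows multiplicatively, which is one reason a checklist of questions is easier to work through than an unstructured brainstorm.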

Mulgan, Geoff. “True Collective Intelligence? A Sketch of a Possible New Field.” Philosophy & Technology 27, no. 1. March 2014. http://bit.ly/1p3YSdd.

  • In this paper, Mulgan explores the concept of a collective intelligence, a “much talked about but…very underdeveloped” field.
  • With a particular focus on health knowledge, Mulgan “sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to possible intellectual barriers to progress.”
  • He concludes that the “central message that comes from observing real intelligence is that intelligence has to be for something,” and that “turning this simple insight – the stuff of so many science fiction stories – into new theories, new technologies and new applications looks set to be one of the most exciting prospects of the next few years and may help give shape to a new discipline that helps us to be collectively intelligent about our own collective intelligence.”

Sauermann, Henry and Chiara Franzoni. “Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation.” SSRN Working Papers Series. November 28, 2013. http://bit.ly/1o6YB7f.

  • In this paper, Sauermann and Franzoni explore the issue of interest-based motivation in crowd-based knowledge production – in particular the use of the crowd science platform Zooniverse – by drawing on “research in psychology to discuss important static and dynamic features of interest and deriv[ing] a number of research questions.”
  • The authors find that interest-based motivation is often tied to a “particular object (e.g., task, project, topic)” not based on a “general trait of the person or a general characteristic of the object.” As such, they find that “most members of the installed base of users on the platform do not sign up for multiple projects, and most of those who try out a project do not return.”
  • They conclude that “interest can be a powerful motivator of individuals’ contributions to crowd-based knowledge production…However, both the scope and sustainability of this interest appear to be rather limited for the large majority of contributors…At the same time, some individuals show a strong and more enduring interest to participate both within and across projects, and these contributors are ultimately responsible for much of what crowd science projects are able to accomplish.”

Schmitt-Sands, Catherine E. and Richard J. Smith. “Prospects for Online Crowdsourcing of Social Science Research Tasks: A Case Study Using Amazon Mechanical Turk.” SSRN Working Papers Series. January 9, 2014. http://bit.ly/1ugaYja.

  • In this paper, the authors describe an experiment involving the nascent use of Amazon’s Mechanical Turk as a social science research tool. “While researchers have used crowdsourcing to find research subjects or classify texts, [they] used Mechanical Turk to conduct a policy scan of local government websites.”
  • Schmitt-Sands and Smith found that “crowdsourcing worked well for conducting an online policy program and scan.” The microtasked workers were helpful in screening out local governments that either did not have websites or did not have the types of policies and services for which the researchers were looking. However, “if the task is complicated such that it requires ongoing supervision, then crowdsourcing is not the best solution.”

Shirky, Clay. Here Comes Everybody: The Power of Organizing Without Organizations. New York: Penguin Press, 2008. https://bit.ly/2QysNif.

  • In this book, Shirky explores our current era in which, “For the first time in history, the tools for cooperating on a global scale are not solely in the hands of governments or institutions. The spread of the Internet and mobile phones are changing how people come together and get things done.”
  • Discussing Wikipedia’s “spontaneous division of labor,” Shirky argues that “the process is more like creating a coral reef, the sum of millions of individual actions, than creating a car. And the key to creating those individual actions is to hand as much freedom as possible to the average user.”

Silvertown, Jonathan. “A New Dawn for Citizen Science.” Trends in Ecology & Evolution 24, no. 9 (September 2009): 467–471. http://bit.ly/1iha6CR.

  • This article discusses the move from “Science for the people,” a slogan adopted by activists in the 1970s, to “Science by the people,” which is “a more inclusive aim, and is becoming a distinctly 21st century phenomenon.”
  • Silvertown identifies three factors that are responsible for the explosion of activity in citizen science, each of which could be similarly related to the crowdsourcing of skills by governing institutions:
    • “First is the existence of easily available technical tools for disseminating information about products and gathering data from the public.
    • A second factor driving the growth of citizen science is the increasing realisation among professional scientists that the public represent a free source of labour, skills, computational power and even finance.
    • Third, citizen science is likely to benefit from the condition that research funders such as the National Science Foundation in the USA and the Natural Environment Research Council in the UK now impose upon every grantholder to undertake project-related science outreach. This is outreach as a form of public accountability.”

Szkuta, Katarzyna, Roberto Pizzicannella, and David Osimo. “Collaborative approaches to public sector innovation: A scoping study.” Telecommunications Policy. 2014. http://bit.ly/1oBg9GY.

  • In this article, the authors explore cases where government collaboratively delivers online public services, with a focus on success factors and “incentives for services providers, citizens as users and public administration.”
  • The authors focus on six types of collaborative governance projects:
    • Services initiated by government built on government data;
    • Services initiated by government and making use of citizens’ data;
    • Services initiated by civil society built on open government data;
    • Collaborative e-government services; and
    • Services run by civil society and based on citizen data.
  • The cases explored “are all designed in the way that effectively harnesses the citizens’ potential. Services susceptible to collaboration are those that require computing efforts, i.e. many non-complicated tasks (e.g. citizen science projects – Zooniverse) or citizens’ free time in general (e.g. time banks). Those services also profit from unique citizens’ skills and their propensity to share their competencies.”

Cluster mapping


“The U.S. Cluster Mapping Project is a national economic initiative that provides open, interactive data to understand regional clusters and support business, innovation and policy in the United States. It is based at the Institute for Strategy and Competitiveness at Harvard Business School, with support from a number of partners and a federal grant from the U.S. Department of Commerce’s Economic Development Administration.
Research
The project provides a robust cluster mapping database grounded in the leading academic research. Professor Michael Porter pioneered the comprehensive mapping of clusters in the U.S. economy in the early 2000s. The research team from Harvard, MIT, and Temple used the latest Census and industry data to develop a new algorithm to define cluster categories that cover the entire U.S. economy. These categories enable comparative analyses of clusters across any region in the United States….
Impact
Research on the presence of regional clusters has recently oriented economic policy toward addressing the needs of clusters and mobilizing their potential. Four regional partners in Massachusetts, Minnesota, Oregon, and South Carolina produced a set of case studies that discuss how regions have organized economic policy around clusters. These cases form the core of a resource library that aims to disseminate insights and strengthen the community of practice in cluster-based economic development. The project will also take an international scope to benefit cross-border industries in North America and inform collective global dialogue around cluster-based economic development.”

How NYC Open Data and Reddit Saved New Yorkers Over $55,000 a Year


IQuantNY: “NYC generates an enormous amount of data each year, and for the most part, it stays behind closed doors.  But thanks to the Open Data movement, signed into law by Bloomberg in 2012 and championed over the last several years by Borough President Gale Brewer, along with other council members, we now get to see a small slice of what the city knows. And that slice is growing.
There have been some detractors along the way; a senior attorney for the NYPD said in 2012 during a council hearing that releasing NYPD data in csv format was a problem because they were “concerned with the integrity of the data itself” and because “data could be manipulated by people who want ‘to make a point’ of some sort”.  But our democracy is built on the idea of free speech; we let all the information out and then let reason lead the way.
In some ways, Open Data adds another check and balance into government: its citizens.  I’ve watched the perfect example of this check work itself out over the past month.  You may have caught my post that used parking ticket data to identify the fire hydrant in New York City that was generating the most income for the city in the form of fines: $33,000 a year.  And on the next block, the second most profitable hydrant was generating $24,000 a year.  That’s two consecutive blocks with hydrants generating over $55,000 a year. But there was a problem.  In my post, I laid out why these two parking spots were extremely confusing and basically seemed like a trap; there was a wide “curb extension” between the street and the hydrant, making it appear like the hydrant was not by the street.  Additionally, the DOT had painted parking spots right where you would be fined if you parked.
Once the data was out there, the hydrant took on a life of its own.  First, it rose to the top of the nyc sub-reddit.  That is basically one way that the internet voted that this is in fact “interesting”.  And that is how things go from small to big. From there, it travelled to the New York Observer, which was able to get a comment from the DOT. After that, it appeared in the New York Post, the post was republished in Gothamist and finally it even went global in the Daily Mail.
I guess the pressure was on the DOT at this point, as each media source reached out for comment, but what struck me was their response to the Observer:

“While DOT has not received any complaints about this location, we will review the roadway markings and make any appropriate alterations”

Why does someone have to complain in order for the DOT to see problems like this?  In fact, the DOT just redesigned every parking sign in New York because some of the old ones were considered confusing.  But if this hydrant was news to them, it implies that they did not utilize the very strongest source of measuring confusion on our streets: NYC parking tickets….”
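The kind of analysis described above, ranking locations by the fines they generate, can be sketched in a few lines. This is only an illustration and not the author's actual method: the record layout, the location labels, and the fine amounts below are invented for the example, and the real NYC parking ticket dataset would first need to be loaded and cleaned.

```python
from collections import defaultdict

# Hypothetical sample records: (location label, fine in dollars).
# These values are made up for illustration, not taken from NYC data.
tickets = [
    ("Hydrant A", 115),
    ("Hydrant B", 115),
    ("Hydrant A", 115),
    ("Hydrant C", 115),
    ("Hydrant A", 115),
    ("Hydrant B", 115),
]


def rank_locations_by_fines(records):
    """Sum fines per location and return (location, total) pairs, highest first."""
    totals = defaultdict(int)
    for location, fine in records:
        totals[location] += fine
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_locations_by_fines(tickets)
for location, total in ranking:
    print(f"{location}: ${total:,}")
```

A location that sits far above its neighbors in such a ranking is a candidate for the kind of closer look the post describes, since unusually high ticket volume may point to confusing markings rather than unusually careless drivers.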

Cataloging the World


New book on “Paul Otlet and the Birth of the Information Age”: “The dream of capturing and organizing knowledge is as old as history. From the archives of ancient Sumeria and the Library of Alexandria to the Library of Congress and Wikipedia, humanity has wrestled with the problem of harnessing its intellectual output. The timeless quest for wisdom has been as much about information storage and retrieval as creative genius.
In Cataloging the World, Alex Wright introduces us to a figure who stands out in the long line of thinkers and idealists who devoted themselves to the task. Beginning in the late nineteenth century, Paul Otlet, a librarian by training, worked at expanding the potential of the catalog card, the world’s first information chip. From there followed universal libraries and museums, connecting his native Belgium to the world by means of a vast intellectual enterprise that attempted to organize and code everything ever published. Forty years before the first personal computer and fifty years before the first browser, Otlet envisioned a network of “electric telescopes” that would allow people everywhere to search through books, newspapers, photographs, and recordings, all linked together in what he termed, in 1934, a réseau mondial–essentially, a worldwide web.
Otlet’s life achievement was the construction of the Mundaneum–a mechanical collective brain that would house and disseminate everything ever committed to paper. Filled with analog machines such as telegraphs and sorters, the Mundaneum–what some have called a “Steampunk version of hypertext”–was the embodiment of Otlet’s ambitions. It was also short-lived. By the time the Nazis, who were pilfering libraries across Europe to collect information they thought useful, carted away Otlet’s collection in 1940, the dream had ended. Broken, Otlet died in 1944.
Wright’s engaging intellectual history gives Otlet his due, restoring him to his proper place in the long continuum of visionaries and pioneers who have struggled to classify knowledge, from H.G. Wells and Melvil Dewey to Vannevar Bush, Ted Nelson, Tim Berners-Lee, and Steve Jobs. Wright shows that in the years since Otlet’s death the world has witnessed the emergence of a global network that has proved him right about the possibilities–and the perils–of networked information, and his legacy persists in our digital world today, captured for all time…”

Measuring Governance: What’s the point?


Alan Hudson at Global Integrity: “Over the last 10-15 years, the fact that governance – the institutional arrangements and relationships that shape how effectively things get done – plays a central role in shaping countries’ development trajectories has become widely acknowledged (see for instance the World Bank’s World Development Report of 2011). This acknowledgement has developed hand-in-hand with determined efforts to measure various aspects of governance.

This emphasis on governance and the efforts made to measure its patterns and understand its dynamics is very welcome. There’s no doubt that governance matters and measuring “governance” and its various dimensions can play a useful role in drawing attention to problems and opportunities, in monitoring compliance with standards, in evaluating efforts to support reform, and in informing decisions about what reforms to implement and how.

But in my experience, discussions about governance and its measurement sometimes gloss over a number of key questions (for a similar argument see the early sections of Matt Andrews’ piece on “Governance indicators can make sense”). These include questions about: what is being measured – “governance” is a multi-faceted and rather woolly concept (see Francis Fukuyama’s 2013 piece on “What is Governance?” and various responses); who is going to use the data that is generated; how that data might have an impact; and what results are being sought.

I’ve noticed this most recently in discussions about the inclusion of “governance” in the post-2015 development framework of goals, targets and indicators. From what I’ve seen, the understandable enthusiasm for ensuring that governance gains a place in the post-2015 framework can lead to discussions that: skate over the fact that the evidence that particular forms of governance – often labelled as “Good Governance” – lead to better development outcomes is patchy; fail to effectively grapple with the fact that a cookie-cutter approach to governance is unlikely to work across diverse contexts; pay little attention to the ways in which the data generated might actually be used to make a difference; and, give scant consideration to the needs of those who might use the data, particularly citizens and citizens’ groups.

In my view, a failure to address these issues risks inadvertently weakening the case for paying attention to, and measuring, aspects of governance. As the Overseas Development Institute’s excellent report on “Governance targets and indicators for post-2015” put it, in diplomatic language: “including something as a target or indicator does not automatically lead to its improvement and the prize is not just to find governance targets and indicators that can be ‘measured’. Rather, it  may be important to reflect on the pathways through which set targets and indicators are thought to lead to better outcomes and on the incentives that might be generated by different measurement approaches.” (See my working document on “Fiscal Governance and Post-2015” for additional thoughts on the inclusion of governance in the post-2015 framework, including notes toward a theory of change).

More broadly, beyond the confines of the post-2015 debate, the risk – and arguably, in many cases, the reality – is that by paying insufficient attention to some key issues, we end up with a lot of data on various aspects of “governance”, but that that data doesn’t get used as much as it might, isn’t very useful for informing context-specific efforts to improve governance, and has limited impact.

To remedy this situation, I’d suggest that any effort to measure aspects of “governance” or to improve the availability, quality, use and impact of governance data (as the Governance Data Alliance is doing – with a Working Group on Problem Statements and Theories of Change) should answer up-front a series of simple questions:

  • Outcomes: What outcome(s) are you interested in? Are you interested in improving governance for its own sake, because you regard a particular type of governance as intrinsically valuable, and/or because you think, for instance, that improving governance will help to improve service delivery and accelerate progress against poverty? (See Nathaniel Heller’s post on “outputs versus outcomes in open government”)
  • Theory: If your interest is not solely based on the intrinsic value you attach to “governance”, which aspects of “governance” do you think matter in terms of the outcomes – e.g. service delivery and/or reduced poverty – that you’re interested in? What’s the theory of change that links governance to development outcomes? Without such a theory, it’s difficult to decide what to measure!
  • Data: In what ways do you think that data about the aspects of governance that you think are important – for intrinsic or extrinsic reasons – will be used to help to drive progress towards the type of governance that you value? To what use might the data be put, by whom, to do what? Or, from the perspective of data-users, what information do they need to take action to improve governance?

Organizations that are involved in generating governance data no doubt spend time considering these questions. But nonetheless, I think there would be value in making that thinking – and information about whether and how the data gets used, and with what effect – explicit….”

Twitter releasing trove of user data to scientists for research


Joe Silver at ArsTechnica: “Twitter has a 200-million-strong and ever-growing user base that broadcasts 500 million updates daily. It has been lauded for its ability to unsettle repressive political regimes, bring much-needed accountability to corporations that mistreat their customers, and combat other societal ills (whether such characterizations are, in fact, accurate). Now, the company has taken aim at disrupting another important sphere of human society: the scientific research community.
Back in February, the site announced its plan—in collaboration with Gnip—to provide a handful of research institutions with free access to its data sets from 2006 to the present. It’s a pilot program called “Twitter Data Grants,” with the hashtag #DataGrants. At the time, Twitter’s engineering blog explained the plan to enlist grant applications to access its treasure trove of user data:

Twitter has an expansive set of data from which we can glean insights and learn about a variety of topics, from health-related information such as when and where the flu may hit to global events like ringing in the new year. To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data. Our Data Grants program aims to change that by connecting research institutions and academics with the data they need.

In April, Twitter announced that, after reviewing the more than 1,300 proposals submitted from more than 60 different countries, it had selected six institutions to provide with data access. Projects approved included a study of foodborne gastrointestinal illnesses, a study measuring happiness levels in cities based on images shared on Twitter, and a study using geosocial intelligence to model urban flooding in Jakarta, Indonesia. There’s even a project exploring the relationship between tweets and sports team performance.
Twitter did not directly respond to our questions on Tuesday afternoon regarding the specific amount and types of data the company is providing to the six institutions. But in its privacy policy, Twitter explains that most user information is intended to be broadcast widely. As a result, the company likely believes that sharing such information with scientific researchers is well within its rights, as its services “are primarily designed to help you share information with the world,” Twitter says. “Most of the information you provide us is information you are asking us to make public.”
While mining such data sets will undoubtedly aid scientists in conducting experiments for which similar data was previously either unavailable or quite limited, these applications raise some legal and ethical questions. For example, Scientific American has asked whether Twitter will be able to retain any legal rights to scientific findings and whether mining tweets (many of which are not publicly accessible) for scientific research when Twitter users have not agreed to such uses is ethically sound.
In response, computational epidemiologists Caitlin Rivers and Bryan Lewis have proposed guidelines for ethical research practices when using social media data, such as avoiding personally identifiable information and making all the results publicly available….”