Networks and Hierarchies


on whether political hierarchy in the form of the state has met its match in today’s networked world in the American Interest: “…To all the world’s states, democratic and undemocratic alike, the new informational, commercial, and social networks of the internet age pose a profound challenge, the scale of which is only gradually becoming apparent. First email achieved a dramatic improvement in the ability of ordinary citizens to communicate with one another. Then the internet came to have an even greater impact on the ability of citizens to access information. The emergence of search engines marked a quantum leap in this process. The advent of laptops, smartphones, and other portable devices then emancipated electronic communication from the desktop. With the explosive growth of social networks came another great leap, this time in the ability of citizens to share information and ideas.
It was not immediately obvious how big a challenge all this posed to the established state. There was a great deal of cheerful talk about the ways in which the information technology revolution would promote “smart” or “joined-up” government, enhancing the state’s ability to interact with citizens. However, the efforts of Anonymous, Wikileaks and Edward Snowden to disrupt the system of official secrecy, directed mainly against the U.S. government, have changed everything. In particular, Snowden’s revelations have exposed the extent to which Washington was seeking to establish a parasitical relationship with the key firms that operate the various electronic networks, acquiring not only metadata but sometimes also the actual content of vast numbers of phone calls and messages. Techniques of big-data mining, developed initially for commercial purposes, have been adapted to the needs of the National Security Agency.
The most recent, and perhaps most important, network challenge to hierarchy comes with the advent of virtual currencies and payment systems like Bitcoin. Since ancient times, states have reaped considerable benefits from monopolizing or at least regulating the money created within their borders. It remains to be seen how big a challenge Bitcoin poses to the system of national fiat currencies that has evolved since the 1970s and, in particular, how big a challenge it poses to the “exorbitant privilege” enjoyed by the United States as the issuer of the world’s dominant reserve (and transaction) currency. But it would be unwise to assume, as some do, that it poses no challenge at all….”

No silver bullet: De-identification still doesn’t work


Arvind Narayanan and Edward W. Felten: “Paul Ohm’s 2009 article Broken Promises of Privacy spurred a debate in legal and policy circles on the appropriate response to computer science research on re-identification techniques. In this debate, the empirical research has often been misunderstood or misrepresented. A new report by Ann Cavoukian and Daniel Castro is full of such inaccuracies, despite its claims of “setting the record straight.” In a response to this piece, Ed Felten and I point out eight of our most serious points of disagreement with Cavoukian and Castro. The thrust of our arguments is that (i) there is no evidence that de-identification works either in theory or in practice and (ii) attempts to quantify its efficacy are unscientific and promote a false sense of security by assuming unrealistic, artificially constrained models of what an adversary might do. Specifically, we argue that:

  1. There is no known effective method to anonymize location data, and no evidence that it’s meaningfully achievable.
  2. Computing re-identification probabilities based on proof-of-concept demonstrations is silly.
  3. Cavoukian and Castro ignore many realistic threats by focusing narrowly on a particular model of re-identification.
  4. Cavoukian and Castro concede that de-identification is inadequate for high-dimensional data. But nowadays most interesting datasets are high-dimensional.
  5. Penetrate-and-patch is not an option.
  6. Computer science knowledge is relevant and highly available.
  7. Cavoukian and Castro apply different standards to big data and re-identification techniques.
  8. Quantification of re-identification probabilities, which permeates Cavoukian and Castro’s arguments, is a fundamentally meaningless exercise.

Data privacy is a hard problem. Data custodians face a choice between roughly three alternatives: sticking with the old habit of de-identification and hoping for the best; turning to emerging technologies like differential privacy that involve some trade-offs in utility and convenience; and using legal agreements to limit the flow and use of sensitive data. These solutions aren’t fully satisfactory, either individually or in combination, nor is any one approach the best in all circumstances. Change is difficult. When faced with the challenge of fostering data science while preventing privacy risks, the urge to preserve the status quo is understandable. However, this is incompatible with the reality of re-identification science. If a “best of both worlds” solution exists, de-identification is certainly not that solution. Instead of looking for a silver bullet, policy makers must confront hard choices.”
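Point 4 above, on high-dimensional data, can be illustrated with a toy calculation. The sketch below is not taken from Narayanan and Felten's response. It assumes a small invented table (the field names and values are hypothetical) and measures what fraction of records become unique as more attributes are considered together.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birth_year, sex, favorite_genre).
# All values are invented for illustration only.
RECORDS = [
    ("08540", 1975, "F", "jazz"),
    ("08540", 1975, "F", "rock"),
    ("08540", 1975, "M", "jazz"),
    ("08540", 1982, "M", "jazz"),
    ("08544", 1982, "M", "rock"),
    ("08544", 1982, "M", "folk"),
]


def unique_fraction(records, num_attributes):
    """Fraction of records that are unique on their first `num_attributes` fields."""
    keys = [record[:num_attributes] for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


def uniqueness_by_dimension(records):
    """Unique fraction for 1, 2, ... up to all attributes, in order."""
    width = len(records[0])
    return [unique_fraction(records, n) for n in range(1, width + 1)]


if __name__ == "__main__":
    for n, fraction in enumerate(uniqueness_by_dimension(RECORDS), start=1):
        print(f"{n} attribute(s): {fraction:.2f} of records unique")
```

In this toy table no record is unique on ZIP code alone, while every record is unique once all four attributes are combined. Real datasets are far larger, but the same effect is why removing names alone offers little protection when many attributes are released together.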

Introduction to Open Geospatial Consortium (OGC) Standards


Joseph McGenn; Dominic Taylor; Gail Millin-Chalabi (Editor); Kamie Kitmitto (Editor) at Jorum: “The onset of the Information Age and Digital Revolution has created a knowledge based society where the internet acts as a global platform for the sharing of information. In a geospatial context, this resulted in an advancement of techniques in how we acquire, study and share geographic information and with the development of Geographic Information Systems (GIS), locational services, and online mapping, spatial data has never been more abundant. The transformation to this digital era has not been without its drawbacks, and a forty year lack of common policies to data sharing has resulted in compatibility issues and great diversity in how software and data are delivered. Essential to the sharing of spatial information is interoperability, where different programmes can exchange and open data from various sources seamlessly. Applying universal standards across a sector provides interoperable solutions. The Open Geospatial Consortium (OGC) facilitates interoperability by providing open standard specifications which organisations can use to develop geospatial software. This means that two separate pieces of software or platforms, if developed using open standard specifications, can exchange data without compatibility issues. By defining these specifications and standards the OGC plays a crucial role in how geospatial information is shared on a global scale. Standard specifications are the invisible glue that holds information systems together, without which, data sharing generally would be an arduous task. On some level they keep the world spinning and this course will instil some appreciation for them from a geospatial perspective. This course introduces users to the OGC and all the common standards in the context of geoportals and mapping solutions.
These standards are defined and explored using a number of platforms and interoperability is demonstrated in a practical sense. Finally, users will implement these standards to develop their own platforms for sharing geospatial information.”
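To make the interoperability point concrete, here is a minimal sketch of how a client might build a request under one OGC standard, the Web Map Service (WMS) 1.3.0 GetMap operation. The course description does not say which standards or servers it uses, so the endpoint and layer name below are hypothetical placeholders. Only the query parameter names come from the WMS specification.

```python
from urllib.parse import urlencode


def build_wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Return a WMS 1.3.0 GetMap URL.

    `bbox` is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude first.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks the server for its default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server and layer, for illustration only.
    url = build_wms_getmap_url(
        "https://example.org/geoserver/wms",
        layer="demo:boundaries",
        bbox=(-8.0, 49.0, 2.0, 61.0),
    )
    print(url)
```

Because the parameter names are fixed by the standard, the same function should work against any compliant WMS server by changing only the base URL and layer name, which is the practical meaning of interoperability here.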

U.S. Secretary of Commerce Penny Pritzker Announces Expansion and Enhancement of Commerce Data Programs


Press Release from the U.S. Secretary of Commerce: “Department will hire first-ever Chief Data Officer

As “America’s Data Agency,” the Department of Commerce is prepared and well-positioned to foster the next phase in the open data revolution. In line with President Obama’s Year of Action, U.S. Secretary of Commerce Penny Pritzker today announced a series of steps taken to enhance and expand the data programs at the Department.
“Data is a key pillar of the Department’s “Open for Business Agenda,” and for the first time, we have made it a department-wide strategic priority,” said Secretary of Commerce Penny Pritzker. “No other department can rival the reach, depth and breadth of the Department of Commerce’s data programs. The Department of Commerce is working to unleash more of its data to strengthen the nation’s economic growth; make its data easier to access, understand, and use; and maximize the return of data investments for businesses, entrepreneurs, government, taxpayers, and communities.”
Secretary Pritzker made a number of major announcements today as a special guest speaker at the Environmental Systems Research Institute’s (Esri) User Conference in San Diego, California. She discussed the power and potential of open data, recognizing that data not only enable start-ups and entrepreneurs, move markets, and empower companies large and small, but also touch the lives of Americans every day.
In her remarks, Secretary Pritzker outlined new ways the Department of Commerce is working to unlock the potential of even more open data to make government smarter, including the following:
Chief Data Officer
Today, Secretary Pritzker announced the Commerce Department will hire its first-ever Chief Data Officer. This leader will be responsible for developing and implementing a vision for the future of the diverse data resources at Commerce.
The new Chief Data Officer will pull together a platform for all data sets; instigate and oversee improvements in data collection and dissemination; and ensure that data programs are coordinated, comprehensive, and strategic.
The Chief Data Officer will hold the key to unlocking more government data to help support a data-enabled Department and economy.
Trade Developer Portal
The International Trade Administration has launched its “Developer Portal,” an online toolkit to put diverse sets of trade and investment data in a single place, making it easier for the business community to use and better tap into the 95 percent of American customers that live overseas.
In creating this portal, the Commerce Department is making its data public to software developers, giving them access to authoritative information on U.S. exports and international trade to help U.S. businesses export and expand their operations in overseas markets. The developer community will be able to integrate the data into applications and mashups to help U.S. business owners compete abroad while also creating more jobs here at home.
Data Advisory Council
Open data requires open dialogue. To facilitate this, the Commerce Department is creating a data advisory council, comprised of 15 private sector leaders that will advise the Department on the best use of government data.
This new advisory council will help Commerce maximize the value of its data by:

  • discovering how to deliver data in more usable, timely, and accessible ways;
  • improving how data is utilized and shared to make businesses and governments more responsive, cost-effective, and efficient;
  • better anticipating customers’ needs; and
  • collaborating with the private sector to develop new data products and services.

The council’s primary focus will be on the accessibility and usability of Commerce data, as well as the transformation of the Department’s supporting infrastructure and procedures for managing data.
These data leaders will represent a broad range of business interests—reflecting the wide range of scientific, statistical, and other data that the Department of Commerce produces. Members will serve two-year terms and will meet about four times a year. The advisory council will be housed within the Economics and Statistics Administration.”
Commerce data inform decisions that help make government smarter, keep businesses more competitive and better inform citizens about their own communities – with the potential to guide up to $3.3 trillion in investments in the United States each year.

The open data imperative


Paper by Geoffrey Boulton in Insights: the UKSG journal: “The information revolution of recent decades is a world historical event that is changing the lives of individuals, societies and economies and with major implications for science, research and learning. It offers profound opportunities to explore phenomena that were hitherto beyond our power to resolve, and at the same time is undermining the process whereby concurrent publication of scientific concept and evidence (data) permitted scrutiny, replication and refutation and that has been the bedrock of scientific progress and of ‘self-correction’ since the inception of the first scientific journals in the 17th century. Open publication, release and sharing of data are vital habits that need to be redefined and redeveloped for the modern age by the research community if it is to exploit technological opportunities, maintain self-correction and maximize the contribution of research to human understanding and welfare.”

To improve quality, value of patient data, get them involved, study says


Joseph Conn at Vital Signs: “The key to the future use of patient-generated data is focusing on data that patients want to produce, own and use and making it easy for them to produce it.
At least, that’s the take of four co-authors from Duke University in an article in this month’s healthcare policy journal Health Affairs. The July issue is chockablock with articles on the many forms and uses of Big Data.
“We observe that the key to high-quality, patient-generated data is to have immediate and actionable data so that patients experience the importance of the data for their own care as well as research purposes,” the authors said in “Assessing the Value of Patient-Generated Data to Comparative Effectiveness Research.”
Patient-generated data, which the authors describe as patient-reported outcomes, or PRO, will be “critical to developing the evidence base that informs decisions made by patients, providers and policymakers in pursuit of high-value medical care,” they predict.
“The easier it is for patients and clinicians to navigate (personal data) the more relevant that information will be to patient care, the more invested patients and clinics will be in contributing high-quality data, and the better the data in the big-data ecosystem will be,” they write.
“Analyses show that data quality improves over time and that the amount of missing data declines as patients experience the attention to their symptoms and actions that result from the information they provide,” the authors say…”

Selected Readings on Crowdsourcing Expertise


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Crowdsourcing enables leaders and citizens to work together to solve public problems in new and innovative ways. New tools and platforms enable citizens with differing levels of knowledge, expertise, experience and abilities to collaborate and solve problems together. Identifying experts, or individuals with specialized skills, knowledge or abilities with regard to a specific topic, and incentivizing their participation in crowdsourcing information, knowledge or experience to achieve a shared goal can enhance the efficiency and effectiveness of problem solving.

Annotated Selected Reading List (in alphabetical order)

Börner, Katy, Michael Conlon, Jon Corson-Rikert, and Ying Ding. “VIVO: A Semantic Approach to Scholarly Networking and Discovery.” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 17, 2012): 1–178. http://bit.ly/17huggT.

  • This e-book “provides an introduction to VIVO…a tool for representing information about research and researchers — their scholarly works, research interests, and organizational relationships.”
  • VIVO is a response to the fact that, “Information for scholars — and about scholarly activity — has not kept pace with the increasing demands and expectations. Information remains siloed in legacy systems and behind various access controls that must be licensed or otherwise negotiated before access. Information representation is in its infancy. The raw material of scholarship — the data and information regarding previous work — is not available in common formats with common semantics.”
  • Providing access to structured information on the work and experience of a diversity of scholars enables improved expert finding: “identifying and engaging experts whose scholarly works is of value to one’s own. To find experts, one needs rich data regarding one’s own work and the work of potential related experts.” The authors argue that expert finding is of increasing importance since “[m]ulti-disciplinary and inter-disciplinary investigation is increasingly required to address complex problems.”

Bozzon, Alessandro, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. “Choosing the Right Crowd: Expert Finding in Social Networks.” In Proceedings of the 16th International Conference on Extending Database Technology, 637–648. EDBT ’13. New York, NY, USA: ACM, 2013. http://bit.ly/18QbtY5.

  • This paper explores the challenge of selecting experts within the population of social networks by considering the following problem: “given an expertise need (expressed for instance as a natural language query) and a set of social network members, who are the most knowledgeable people for addressing that need?”
  • The authors come to the following conclusions:
    • “profile information is generally less effective than information about resources that they directly create, own or annotate;
    • resources which are produced by others (resources appearing on the person’s Facebook wall or produced by people that she follows on Twitter) help increasing the assessment precision;
    • Twitter appears the most effective social network for expertise matching, as it very frequently outperforms all other social networks (either combined or alone);
    • Twitter appears as well very effective for matching expertise in domains such as computer engineering, science, sport, and technology & games, but Facebook is also very effective in fields such as locations, music, sport, and movies & tv;
    • surprisingly, LinkedIn appears less effective than other social networks in all domains (including computer science) and overall.”

Brabham, Daren C. “The Myth of Amateur Crowds.” Information, Communication & Society 15, no. 3 (2012): 394–410. http://bit.ly/1hdnGJV.

  • Unlike most of the related literature, this paper focuses on bringing attention to the expertise already being tapped by crowdsourcing efforts rather than determining ways to identify more dormant expertise to improve the results of crowdsourcing.
  • Brabham comes to two central conclusions: “(1) crowdsourcing is discussed in the popular press as a process driven by amateurs and hobbyists, yet empirical research on crowdsourcing indicates that crowds are largely self-selected professionals and experts who opt-in to crowdsourcing arrangements; and (2) the myth of the amateur in crowdsourcing ventures works to label crowds as mere hobbyists who see crowdsourcing ventures as opportunities for creative expression, as entertainment, or as opportunities to pass the time when bored. This amateur/hobbyist label then undermines the fact that large amounts of real work and expert knowledge are exerted by crowds for relatively little reward and to serve the profit motives of companies.”

Dutton, William H. Networking Distributed Public Expertise: Strategies for Citizen Sourcing Advice to Government. One of a Series of Occasional Papers in Science and Technology Policy, Science and Technology Policy Institute, Institute for Defense Analyses, February 23, 2011. http://bit.ly/1c1bpEB.

  • In this paper, a case is made for more structured and well-managed crowdsourcing efforts within government. Specifically, the paper “explains how collaborative networking can be used to harness the distributed expertise of citizens, as distinguished from citizen consultation, which seeks to engage citizens — each on an equal footing.” Instead of looking for answers from an undefined crowd, Dutton proposes “networking the public as advisors” by seeking to “involve experts on particular public issues and problems distributed anywhere in the world.”
  • Dutton argues that expert-based crowdsourcing can be successful for government for a number of reasons:
    • Direct communication with a diversity of independent experts
    • The convening power of government
    • Compatibility with open government and open innovation
    • Synergy with citizen consultation
    • Building on experience with paid consultants
    • Speed and urgency
    • Centrality of documents to policy and practice.
  • He also proposes a nine-step process for government to foster bottom-up collaboration networks:
    • Do not reinvent the technology
    • Focus on activities, not the tools
    • Start small, but capable of scaling up
    • Modularize
    • Be open and flexible in finding and going to communities of experts
    • Do not concentrate on one approach to all problems
    • Cultivate the bottom-up development of multiple projects
    • Experience networking and collaborating — be a networked individual
    • Capture, reward, and publicize success.

Goel, Gagan, Afshin Nikzad and Adish Singla. “Matching Workers with Tasks: Incentives in Heterogeneous Crowdsourcing Markets.” Under review by the International World Wide Web Conference (WWW). 2014. http://bit.ly/1qHBkdf

  • Combining the notions of crowdsourcing expertise and crowdsourcing tasks, this paper focuses on the challenge within platforms like Mechanical Turk related to intelligently matching tasks to workers.
  • The authors’ call for more strategic assignment of tasks in crowdsourcing markets is based on the understanding that “each worker has certain expertise and interests which define the set of tasks she can and is willing to do.”
  • Focusing on developing meaningful incentives based on varying levels of expertise, the authors sought to create a mechanism that, “i) is incentive compatible in the sense that it is truthful for agents to report their true cost, ii) picks a set of workers and assigns them to the tasks they are eligible for in order to maximize the utility of the requester, iii) makes sure total payments made to the workers doesn’t exceed the budget of the requester.”

Gubanov, D., N. Korgin, D. Novikov and A. Kalkov. E-Expertise: Modern Collective Intelligence. Springer, Studies in Computational Intelligence 558, 2014. http://bit.ly/U1sxX7

  • In this book, the authors focus on “organization and mechanisms of expert decision-making support using modern information and communication technologies, as well as information analysis and collective intelligence technologies (electronic expertise or simply e-expertise).”
  • The book, which “addresses a wide range of readers interested in management, decision-making and expert activity in political, economic, social and industrial spheres,” is broken into five chapters:
    • Chapter 1 (E-Expertise) discusses the role of e-expertise in decision-making processes. The procedures of e-expertise are classified, their benefits and shortcomings are identified, and the efficiency conditions are considered.
    • Chapter 2 (Expert Technologies and Principles) provides a comprehensive overview of modern expert technologies. A special emphasis is placed on the specifics of e-expertise. Moreover, the authors study the feasibility and reasonability of employing well-known methods and approaches in e-expertise.
    • Chapter 3 (E-Expertise: Organization and Technologies) describes some examples of up-to-date technologies to perform e-expertise.
    • Chapter 4 (Trust Networks and Competence Networks) deals with the problems of expert finding and grouping by information and communication technologies.
    • Chapter 5 (Active Expertise) treats the problem of expertise stability against any strategic manipulation by experts or coordinators pursuing individual goals.

Holst, Cathrine. “Expertise and Democracy.” ARENA Report No 1/14, Center for European Studies, University of Oslo. http://bit.ly/1nm3rh4

  • This report contains a set of 16 papers focused on the concept of “epistocracy,” meaning the “rule of knowers.” The papers inquire into the role of knowledge and expertise in modern democracies and especially in the European Union (EU). Major themes are: expert-rule and democratic legitimacy; the role of knowledge and expertise in EU governance; and the European Commission’s use of expertise.
    • Expert-rule and democratic legitimacy
      • Papers within this theme concentrate on issues such as the “implications of modern democracies’ knowledge and expertise dependence for political and democratic theory.” Topics include the accountability of experts, the legitimacy of expert arrangements within democracies, the role of evidence in policy-making, how expertise can be problematic in democratic contexts, and “ethical expertise” and its place in epistemic democracies.
    • The role of knowledge and expertise in EU governance
      • Papers within this theme concentrate on “general trends and developments in the EU with regard to the role of expertise and experts in political decision-making, the implications for the EU’s democratic legitimacy, and analytical strategies for studying expertise and democratic legitimacy in an EU context.”
    • The European Commission’s use of expertise
      • Papers within this theme concentrate on how the European Commission uses expertise and in particular the European Commission’s “expert group system.” Topics include the European Citizens’ Initiative, analytic-deliberative processes in EU food safety, the operation of EU environmental agencies, and the autonomy of various EU agencies.

King, Andrew and Karim R. Lakhani. “Using Open Innovation to Identify the Best Ideas.” MIT Sloan Management Review, September 11, 2013. http://bit.ly/HjVOpi.

  • In this paper, King and Lakhani examine different methods for opening innovation, where, “[i]nstead of doing everything in-house, companies can tap into the ideas cloud of external expertise to develop new products and services.”
  • The three types of open innovation discussed are: opening the idea-creation process, competitions where prizes are offered and designers bid with possible solutions; opening the idea-selection process, ‘approval contests’ in which outsiders vote to determine which entries should be pursued; and opening both idea generation and selection, an option used especially by organizations focused on quickly changing needs.

Long, Chengjiang, Gang Hua and Ashish Kapoor. “Active Visual Recognition with Expertise Estimation in Crowdsourcing.” 2013 IEEE International Conference on Computer Vision. December 2013. http://bit.ly/1lRWFur.

  • This paper is focused on improving the crowdsourced labeling of visual datasets from platforms like Mechanical Turk. The authors note that, “Although it is cheap to obtain large quantity of labels through crowdsourcing, it has been well known that the collected labels could be very noisy. So it is desirable to model the expertise level of the labelers to ensure the quality of the labels. The higher the expertise level a labeler is at, the lower the label noises he/she will produce.”
  • Based on the need for identifying expert labelers upfront, the authors developed an “active classifier learning system which determines which users to label which unlabeled examples” from collected visual datasets.
  • The researchers’ experiments in identifying expert visual dataset labelers led to findings demonstrating that the “active selection” of expert labelers is beneficial in cutting through the noise of crowdsourcing platforms.
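The idea of modeling labeler expertise can be sketched in a much simpler form than the authors' active learning system. The example below is not their method. It assumes a handful of gold-standard items with known answers, estimates each labeler's accuracy from them, and then weights each vote on a new binary-labeled item by the log-odds of that accuracy. All labeler names and data are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical gold-standard items with known labels.
GOLD = {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "dog"}

# Hypothetical answers each labeler gave on the gold items.
GOLD_ANSWERS = {
    "expert": {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "dog"},
    "noisy_a": {"img1": "cat", "img2": "cat", "img3": "dog", "img4": "dog"},
    "noisy_b": {"img1": "dog", "img2": "dog", "img3": "cat", "img4": "cat"},
}


def estimate_accuracy(gold, answers, smoothing=1.0):
    """Smoothed fraction of gold items answered correctly (always strictly in (0, 1))."""
    correct = sum(1 for item, truth in gold.items() if answers.get(item) == truth)
    return (correct + smoothing) / (len(gold) + 2 * smoothing)


def weighted_vote(votes, accuracy):
    """Pick the label with the highest total log-odds weight.

    votes: {labeler: label}; accuracy: {labeler: estimated accuracy}.
    """
    scores = defaultdict(float)
    for labeler, label in votes.items():
        p = accuracy[labeler]
        scores[label] += math.log(p / (1 - p))
    return max(scores, key=scores.get)


if __name__ == "__main__":
    accuracy = {name: estimate_accuracy(GOLD, ans) for name, ans in GOLD_ANSWERS.items()}
    new_votes = {"expert": "cat", "noisy_a": "dog", "noisy_b": "dog"}
    print(weighted_vote(new_votes, accuracy))
```

With these toy numbers a plain majority vote would pick "dog", while the weighted vote follows the one labeler who was consistently right on the gold items. The two noisy labelers are right only half the time, so their log-odds weight is zero.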

Noveck, Beth Simone. “‘Peer to Patent’: Collective Intelligence, Open Review, and Patent Reform.” Harvard Journal of Law & Technology 20, no. 1 (Fall 2006): 123–162. http://bit.ly/HegzTT.

  • This law review article introduces the idea of crowdsourcing expertise to mitigate the challenge of patent processing. Noveck argues that, “access to information is the crux of the patent quality problem. Patent examiners currently make decisions about the grant of a patent that will shape an industry for a twenty-year period on the basis of a limited subset of available information. Examiners may neither consult the public, talk to experts, nor, in many cases, even use the Internet.”
  • Peer-to-Patent, which launched three years after this article, is based on the idea that, “The new generation of social software might not only make it easier to find friends but also to find expertise that can be applied to legal and policy decision-making. This way, we can improve upon the Constitutional promise to promote the progress of science and the useful arts in our democracy by ensuring that only worthy ideas receive that ‘odious monopoly’ of which Thomas Jefferson complained.”

Ober, Josiah. “Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment.” American Political Science Review 107, no. 01 (2013): 104–122. http://bit.ly/1cgf857.

  • In this paper, Ober argues that, “A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world.”
  • Ober describes an approach to decision-making that aggregates expertise across multiple domains. This “Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest.”

Sims, Max H., Jeffrey Bigham, Henry Kautz and Marc W. Halterman. “Crowdsourcing medical expertise in near real time.” Journal of Hospital Medicine 9, no. 7, July 2014. http://bit.ly/1kAKvq7.

  • In this article, the authors discuss the development of a mobile application called DocCHIRP, which was created because, “although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered.”
  • The DocCHIRP pilot project used a “system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers.”
  • Healthcare providers (HCPs) sought to gain intelligence from the crowd, which included 85 registered users, on questions related to medication, complex medical decision making, standard of care, administrative, testing and referrals.
  • The authors believe that, “if future iterations of the mobile crowdsourcing applications can address…adoption barriers and support the organic growth of the crowd of HCPs,” then “the approach could have a positive and transformative effect on how providers acquire relevant knowledge and care for patients.”

Spina, Alessandro. “Scientific Expertise and Open Government in the Digital Era: Some Reflections on EFSA and Other EU Agencies.” In Foundations of EU Food Law and Policy, eds. A. Alemanno and S. Gabbi. Ashgate, 2014. http://bit.ly/1k2EwdD.

  • In this paper, Spina “presents some reflections on how the collaborative and crowdsourcing practices of Open Government could be integrated in the activities of EFSA [European Food Safety Authority] and other EU agencies,” with a particular focus on “highlighting the benefits of the Open Government paradigm for expert regulatory bodies in the EU.”
  • Spina argues that the “crowdsourcing of expertise and the reconfiguration of the information flows between European agencies and the public could represent a concrete possibility of modernising the role of agencies with a new model that has a low financial burden and an almost immediate effect on the legal governance of agencies.”
  • He concludes that, “It is becoming evident that in order to guarantee that the best scientific expertise is provided to EU institutions and citizens, EFSA should strive to use the best organisational models to source science and expertise.”

GitHub: A Swiss Army knife for open government


FCW: “Today, more than 300 government agencies are using the platform for public and private development. Cities (Chicago, Philadelphia, San Francisco), states (New York, Washington, Utah) and countries (United Kingdom, Australia) are sharing code and paving a new road to civic collaboration….

In addition to a rapidly growing code collection, the General Services Administration’s new IT development shop has created a “/Developer program” to “provide comprehensive support for any federal agency engaged in the production or use of APIs.”
The Consumer Financial Protection Bureau has built a full-blown website on GitHub to showcase the software and design work its employees are doing.
Most of the White House’s repos relate to Drupal-driven websites, but the Obama administration has also shared its iOS and Android apps, which together have been forked nearly 400 times.

Civic-focused organizations — such as the OpenGov Foundation, the Sunlight Foundation and the Open Knowledge Foundation — are also actively involved with original projects on GitHub. Those projects include the OpenGov Foundation’s Madison document-editing tool touted by the likes of Rep. Darrell Issa (R-Calif.) and the Open Knowledge Foundation’s CKAN, which powers hundreds of government data platforms around the world.
According to GovCode, an aggregator of public government open-source projects hosted on GitHub, there have been hundreds of individual contributors and nearly 90,000 code commits, which involve making a set of tentative changes permanent.
The nitty-gritty
Getting started on GitHub is similar to the process for other social networking platforms. Users create individual accounts and can set up “organizations” for agencies or cities. They can then create repositories (or repos) to collaborate on projects through an individual or organizational account. Other developers or organizations can download repo code for reuse or repurpose it in their own repositories (called forking), and make it available to others to do the same.
Collaborative aspects of GitHub include pull requests that allow developers to submit and accept updates to repos that build on and grow an open-source project. There are wikis, gists (code snippet sharing) and issue tracking for bugs, feature requests, or general questions and answers.
GitHub provides free code hosting for all public repos. Upgrade offerings include personal and organizational plans based on the number of private repos. For organizations that want a self-hosted GitHub development environment, GitHub Enterprise, used by the likes of CFPB, allows for self-hosted, private repos behind a firewall.
GitHub’s core user interface can be unwelcoming or even intimidating to the nondeveloper, but GitHub’s Pages package offers Web-hosting features that include domain mapping and lightweight content management tools such as static site generator Jekyll and text editor Atom.
Notable government projects that use Pages are the White House’s Project Open Data, 18F’s /Developer Program, CFPB’s Open Tech website and New York’s Open Data Handbook. Indeed, Wired recently commented that the White House’s open-data GitHub efforts “could help fix government.”…
See also: GitHub for Government (GovLab)

Meet the UK start-ups changing the world with open data


Sophie Curtis in The Telegraph: “Data is more accessible today than anyone could have imagined 10 or 20 years ago. From corporate databases to social media and embedded sensors, data is exploding, with total worldwide volume expected to reach 6.6 zettabytes by 2020.
Open data is information that is available for anyone to use, for any purpose, at no cost. For example, the Department for Education publishes open data about the performance of schools in England, so that companies can create league tables and citizens can find the best-performing schools in their catchment area.
Governments worldwide are working to open up more of their data. Since January 2010, more than 18,500 UK government data sets have been released via the data.gov.uk web portal, creating new opportunities for organisations to build innovative digital services.
Businesses are also starting to realise the value of making their non-personal data freely available, with open innovation leading to the creation of products and services that they can benefit from….

Now a range of UK start-ups are working with the ODI to build businesses using open data, and have already unlocked a total of £2.5 million worth of investments and contracts.
Mastodon C joined the ODI start-up programme at its inception in December 2012. Shortly after joining, the company teamed up with Ben Goldacre and Open Healthcare UK, and embarked on a project investigating the use of branded statins over the far cheaper generic versions.
The data analysis identified potential efficiency savings to the NHS of £200 million. The company is now also working with the Technology Strategy Board and Nesta to help them gain better insight into their data.
Another start-up, CarbonCulture, is a community platform designed to help people use resources more efficiently. The company uses high-tech metering to monitor carbon use in the workplace and help clients save money.
Organisations such as 10 Downing Street, Tate, Cardiff Council, the GLA and the UK Parliament are using the company’s digital tools to monitor and improve their energy consumption. CarbonCulture has also helped the Department of Energy and Climate Change reduce its gas use by 10 per cent.
Spend Network’s business is built on collecting the spend statements and tender documents published by government in the UK and Europe and then publishing this data openly so that anyone can use it. The company currently hosts over £1.2 trillion of transactions from the UK and over 1.8 million tenders from across Europe.
One of the company’s major breakthroughs was creating the first national, open spend analysis for central and local government. This was used to uncover a 45 per cent delay in the UK’s tendering process, holding up £22 billion of government funds to the economy.
Meanwhile, TransportAPI uses open data feeds from Traveline, Network Rail and Transport for London to provide nationwide timetables, departure and infrastructure information across all modes of public transport.
TransportAPI currently has 700 developers and organisations signed up to its platform, including individual taxpayers and public sector organisations like universities and local authorities. Travel portals, hyperlocal sites and business analytics are also integrating features, such as the ‘nearest transport’ widget, into their websites.
These are just four examples of how start-ups are using open data to create new digital services. The ODI this week announced seven new open data start-ups joining the programme, covering 3D printed learning materials, helping disabled communities, renewable energy markets, and smart cities….”

Liberating Data to Transform Health Care


Erika G. Martin, Natalie Helbig, and Nirav R. Shah on New York’s Open Data Experience in JAMA: “The health community relies on governmental survey, surveillance, and administrative data to track epidemiologic trends, identify risk factors, and study the health care delivery system. Since 2009, a quiet “open data” revolution has occurred. Catalyzed by President Obama’s open government directive, federal, state, and local governments are releasing deidentified data meeting 4 “open” criteria: public accessibility, availability in multiple formats, free of charge, and unlimited use and distribution rights.1 As of February 2014, HealthData.gov, the federal health data repository, has more than 1000 data sets, and Health Data NY, New York’s health data site, has 48 data sets with supporting charts and maps. Data range from health interview surveys to administrative transactions. The implicit logic is that making governmental data readily available will improve government transparency; increase opportunities for research, mobile health application development, and data-driven quality improvement; and make health-related information more accessible. Together, these activities have the potential to improve health care quality, reduce costs, facilitate population health planning and monitoring, and empower health care consumers to make better choices and live healthier lives.”