Request for Proposals: Exploring the Implications of Government Release of Large Datasets


“The Berkeley Center for Law & Technology and Microsoft are issuing this request for proposals (RFP) to fund scholarly inquiry to examine the civil rights, human rights, security and privacy issues that arise from recent initiatives to release large datasets of government information to the public for analysis and reuse.  This research may help ground public policy discussions and drive the development of a framework to avoid potential abuses of this data while encouraging greater engagement and innovation.
This RFP seeks to:

    • Gain knowledge of the impact of the online release of large amounts of data generated by citizens’ interactions with government
    • Imagine new possibilities for technical, legal, and regulatory interventions that avoid abuse
    • Begin building a body of research that addresses these issues

– BACKGROUND –

 
Governments at all levels are releasing large datasets for analysis by anyone for any purpose—“Open Data.”  Entrepreneurs may use Open Data to create new products and services, and citizens may use it to gain insight into the government.  A plethora of time-saving and other useful applications have emerged from Open Data feeds, including more accurate traffic information, real-time arrival of public transportation, and information about crimes in neighborhoods.  Sometimes governments release large datasets in order to encourage the development of unimagined new applications.  For instance, New York City has made over 1,100 databases available, some of which contain information that can be linked to individuals, such as a parking violation database containing license plate numbers and car descriptions.
Data held by the government is often implicitly or explicitly about individuals—acting in roles that have recognized constitutional protection, such as lobbyist, signatory to a petition, or donor to a political cause; in roles that require special protection, such as victim of, witness to, or suspect in a crime; in the role of businessperson submitting proprietary information to a regulator or obtaining a business license; and in the role of ordinary citizen.  While open government is often presented as an unqualified good, sometimes Open Data can identify individuals or groups, leading to a more transparent citizenry.  The citizen who foresees this growing transparency may be less willing to engage with government, as these transactions may be documented and released in a dataset for anyone to use for any imaginable purpose—including to deanonymize the database—forever.  Moreover, some groups of citizens may have few options or no choice as to whether to engage in governmental activities.  Hence, open data sets may have a disparate impact on certain groups.  The potential impact of large-scale data and analysis on civil rights is an area of growing concern.  A number of civil rights and media justice groups banded together in February 2014 to endorse the “Civil Rights Principles for the Era of Big Data,” and the potential of new data systems to undermine longstanding civil rights protections was flagged as a “central finding” of a recent policy review by White House adviser John Podesta.
The Berkeley Center for Law & Technology (BCLT) and Microsoft are issuing this request for proposals in an effort to better understand the implications and potential impact of the release of data related to U.S. citizens’ interactions with their local, state and federal governments.  BCLT and Microsoft will fund up to six grants, with a combined total of $300,000.  Grantees will be required to participate in a workshop to present and discuss their research at the Berkeley Technology Law Journal (BTLJ) Spring Symposium.  All grantees’ papers will be published in a dedicated monograph.  Grantees’ papers that approach the issues from a legal perspective may also be published in the BTLJ.  We may also hold a follow-up workshop in New York City or Washington, DC.
While we are primarily interested in funding proposals that address issues related to the policy impacts of Open Data, many of these issues are intertwined with the general societal implications of “big data.”  As a result, proposals that explore Open Data from a big data perspective are welcome; however, proposals solely focused on big data are not.  We are open to proposals that address the following difficult questions.  We are also open to all methods and disciplines, and are particularly interested in proposals from cross-disciplinary teams.

    • To what extent does existing Open Data made available by city and state governments affect individual profiling?  Do the effects change depending on the level of aggregation (neighborhood vs. city)?  What releases of information could foreseeably cause discrimination in the future?  Will different groups in society be disproportionately impacted by Open Data?
    • Should the use of Open Data be governed by a code of conduct or subject to a review process before being released? In order to enhance citizen privacy, should governments develop guidelines to release sampled or perturbed data, instead of entire datasets? When datasets contain potentially identifiable information, should there be a notice-and-comment proceeding that includes proposed technological solutions to anonymize, de-identify or otherwise perturb the data?
    • Is there something fundamentally different about government services and the government’s collection of citizens’ data for basic needs in modern society, such as power and water, that requires governments to exercise greater due care than commercial entities?
    • Companies have legal and practical mechanisms to shield data submitted to government from public release.  What mechanisms do individuals have, or should they have, to address misuse of Open Data?  Could developments in the constitutional right to informational privacy, as articulated in Whalen and Westinghouse Electric Co., address Open Data privacy issues?
    • Collecting data costs money, and its release could affect civil liberties.  Yet it is being given away freely, sometimes to immensely profitable firms.  Should governments license data for a fee and/or impose limits on its use, given its value?
    • The privacy principle of “collection limitation” is under siege, with many arguing that use restrictions will be more efficacious for protecting privacy and more workable for big data analysis.  Does the potential of Open Data justify eroding state and federal privacy act collection limitation principles?   What are the ethical dimensions of a government system that deprives the data subject of the ability to obscure or prevent the collection of data about a sensitive issue?  A move from collection restrictions to use regulation raises a number of related issues, detailed below.
    • Are use restrictions efficacious in creating accountability?  Consumer reporting agencies are regulated by use restrictions, yet they are not known for their accountability.  How could use regulations be implemented in the context of Open Data efficaciously?  Can a self-learning algorithm honor data use restrictions?
    • If an Open Dataset were regulated by a use restriction, how could individuals police wrongful uses?  How would plaintiffs overcome the likely defenses or proof of facts in a use regulation system, such as a burden to prove that data were analyzed and the product of that analysis was used in a certain way to harm the plaintiff?  Will plaintiffs ever be able to beat First Amendment defenses?
    • The President’s Council of Advisors on Science and Technology big data report emphasizes that analysis is not a “use” of data.  Such an interpretation suggests that NSA metadata analysis and large-scale scanning of communications do not raise privacy issues.  What are the ethical and legal implications of the “analysis is not use” argument in the context of Open Data?
    • Open Data celebrates the idea that information collected by the government can be used by another person for various kinds of analysis.  When analysts are not involved in the collection of data, they are less likely to understand its context and limitations.  How do we ensure that this knowledge is maintained in a use regulation system?
    • Former President William Clinton was admitted under a pseudonym for a procedure at a New York Hospital in 2004.  The hospital detected 1,500 attempts by its own employees to access the President’s records.  With snooping such a tempting activity, how could incentives be crafted to cause self-policing of government data and the self-disclosure of inappropriate uses of Open Data?
    • It is clear that data privacy regulation could hamper some big data efforts.  However, many examples of big data successes hail from highly regulated environments, such as health care and financial services—areas with statutory, common law, and IRB protections.  What are the contours of privacy law that are compatible with big data and Open Data success and which are inherently inimical to it?
    • In recent years, the problem of “too much money in politics” has been addressed with increasing disclosure requirements.  Yet, distrust in government remains high, and individuals identified in donor databases have been subjected to harassment.  Is the answer to problems of distrust in government even more Open Data?
    • What are the ethical and epistemological implications of encouraging government decision-making based upon correlation analysis, without a rigorous understanding of cause and effect?  Are there decisions that should not be left to just correlational proof? While enthusiasm for data science has increased, scientific journals are elevating their standards, with special scrutiny focused on hypothesis-free, multiple comparison analysis. What could legal and policy experts learn from experts in statistics about the nature and limits of open data?…
      To submit a proposal, visit the Conference Management Toolkit (CMT) here.
      Once you have created a profile, the site will allow you to submit your proposal.
      If you have questions, please contact Chris Hoofnagle, principal investigator on this project.”

How Three Startups Are Using Data to Renew Public Trust In Government


Mark Hall: “Chances are that when you think about the word government, it is with a negative connotation.  Your less-than-stellar opinion of government may be caused by everything from Washington’s dirty politics to the long lines at your local DMV.  Regardless of the reason, local, state and national politics have frequently garnered a bad reputation.  People feel like governments aren’t working for them.  We have limited information, visibility and insight into what’s going on and why.  Yes, the data is public information, but it’s difficult to access and sift through.
Good news. Things are changing fast.
Innovative startups are emerging and they are changing the way we access government information at all levels.
Here are three tech startups that are taking a unique approach to opening up government data:
1. OpenGov is a Mountain View-based software company that enables government officials and local residents to easily parse through the city’s financial data.
Founded by a team with extensive technology and finance experience, this startup has already signed up some of the largest cities in the movement, including the City of Los Angeles.  OpenGov’s approach pairs data with good design in a manner that makes it easy to use.  Historically, information like expenditures of public funds existed in a silo within the mayor’s office or city manager’s office, diminishing the accountability of public employees.  Imagine you are a citizen interested in seeing how much your city spent on a particular matter.
Now you can find out within just a few clicks.
This data is always of great importance, but it could also become increasingly critical during events like local elections.  This level of openness and accessibility to data will be game-changing.
2. FiscalNote is a one-year-old startup that uses analytical signals and intelligent government data to map legislation and predict outcomes.
Headquartered in Washington D.C., the company has developed a search layer and unique algorithm that makes tracking legislative data extremely easy. If you are an organization that has vested interests in specific legislative bills, tools by FiscalNote can give you insights into its progress and likelihood of being passed or held up. Want to know if your local representative favors a bill that could hurt your industry? Find out early and take the steps necessary to minimize the impact. Large corporations and special interest groups have traditionally held lobbying power with elected officials. This technology is important because small businesses, nonprofits and organizations now have an additional tool to see a changing legislative landscape in ways that were previously unimaginable.
3. Civic Industries is a San Francisco startup that allows citizens and local government officials to easily access data that previously required a trip down to city hall.  Building permits, code enforcements, upcoming government projects and construction data are now openly available within a few clicks.
Civic Insight maps various projects in your community and enables you to see all the projects with the corresponding start and completion dates, along with department contacts.
Accountability for public planning is no longer confined to city workers in the back office.  Responsibility is made clear.  The startup also pushes underutilized city resources like empty storefronts and abandoned buildings to the forefront in an effort to drive action, either by residents or government officials.
So What’s Next?
While these three startups are using data to push government transparency in the right direction, more work is needed…”

Chief Executive of Nesta on the Future of Government Innovation


Interview between Rahim Kanani and Geoff Mulgan, CEO of NESTA and member of the MacArthur Research Network on Opening Governance: “Our aspiration is to become a global center of expertise on all kinds of innovation, from how to back creative business start-ups and how to shape innovations tools such as challenge prizes, to helping governments act as catalysts for new solutions,” explained Geoff Mulgan, chief executive of Nesta, the UK’s innovation foundation. In an interview with Mulgan, we discussed their new report, published in partnership with Bloomberg Philanthropies, which highlights 20 of the world’s top innovation teams in government. Mulgan and I also discussed the founding and evolution of Nesta over the past few years, and leadership lessons from his time inside and outside government.
Rahim Kanani: When we talk about ‘innovations in government’, isn’t that an oxymoron?
Geoff Mulgan: Governments have always innovated. The Internet and World Wide Web both originated in public organizations, and governments are constantly developing new ideas, from public health systems to carbon trading schemes, online tax filing to high speed rail networks.  But they’re much less systematic at innovation than the best in business and science.  There are very few job roles, especially at senior levels, few budgets, and few teams or units.  So although there are plenty of creative individuals in the public sector, they succeed despite, not because of the systems around them. Risk-taking is punished not rewarded.   Over the last century, by contrast, the best businesses have learned how to run R&D departments, product development teams, open innovation processes and reasonably sophisticated ways of tracking investments and returns.
Kanani: This new report, published in partnership with Bloomberg Philanthropies, highlights 20 of the world’s most effective innovation teams in government working to address a range of issues, from reducing murder rates to promoting economic growth. Before I get to the results, how did this project come about, and why is it so important?
Mulgan: If you fail to generate new ideas, test them and scale the ones that work, it’s inevitable that productivity will stagnate and governments will fail to keep up with public expectations, particularly when waves of new technology—from smart phones and the cloud to big data—are opening up dramatic new possibilities.  Mayor Bloomberg has been a leading advocate for innovation in the public sector, and in New York he showed the virtues of energetic experiment, combined with rigorous measurement of results.  In the UK, organizations like Nesta have approached innovation in a very similar way, so it seemed timely to collaborate on a study of the state of the field, particularly since we were regularly being approached by governments wanting to set up new teams and asking for guidance.
Kanani: Where are some of the most effective innovation teams working on these issues, and how did you find them?
Mulgan: In our own work at Nesta, we’ve regularly sought out the best innovation teams that we could learn from, and this study made it possible to do that more systematically, focusing in particular on teams within national and city governments.  They vary greatly, but all the best ones are achieving impact with relatively slim resources.  Some are based in central governments, like MindLab in Denmark, which has pioneered the use of design methods to reshape government services, from small business licensing to welfare.  SITRA in Finland has been going for decades as a public technology agency, and more recently has switched its attention to innovation in public services, for example providing mobile tools to help patients manage their own healthcare.  In the city of Seoul, the Mayor set up an innovation team to accelerate the adoption of ‘sharing’ tools, so that people could share things like cars, freeing money for other things.  In South Australia the government set up an innovation agency that has been pioneering radical ways of helping troubled families, mobilizing families to help other families.
Kanani: What surprised you the most about the outcomes of this research?
Mulgan: Perhaps the biggest surprise has been the speed with which this idea is spreading.  Since we started the research, we’ve come across new teams being created in dozens of countries, from Canada and New Zealand to Cambodia and Chile.  China has set up a mobile technology lab for city governments.  Mexico City and many others have set up labs focused on creative uses of open data.  A batch of cities across the US supported by Bloomberg Philanthropies—from Memphis and New Orleans to Boston and Philadelphia—are now showing impressive results and persuading others to copy them.
 

Open Data for economic growth: the latest evidence


Andrew Stott at the World Bank OpenData Blog: “One of the key policy drivers for Open Data has been to drive economic growth and business innovation. There’s a growing amount of evidence and analysis not only for the total potential economic benefit but also for some of the ways in which this is coming about. This evidence is summarised and reviewed in a new World Bank paper published today.
There’s a range of studies that suggest that the potential prize from Open Data could be enormous – including an estimate of $3-5 trillion a year globally from McKinsey Global Institute and an estimate of $13 trillion cumulative over the next 5 years in the G20 countries.  There are supporting studies of the value of Open Data to certain sectors in certain countries – for instance $20 billion a year to Agriculture in the US – and of the value of key datasets such as geospatial data.  All these support the conclusion that the economic potential is at least significant – although with a range from “significant” to “extremely significant”!
At least some of this benefit is already being realised by new companies that have sprung up to deliver new, innovative, data-rich services and by older companies improving their efficiency by using open data to optimise their operations. Five main business archetypes have been identified – suppliers, aggregators, enrichers, application developers and enablers. What’s more there are at least four companies which did not exist ten years ago, which are driven by Open Data, and which are each now valued at around $1 billion or more. Somewhat surprisingly the drive to exploit Open Data is coming from outside the traditional “ICT sector” – although the ICT sector is supplying many of the tools required.
It’s also becoming clear that if countries want to maximise their gain from Open Data the role of government needs to go beyond simply publishing some data on a website. Governments need to be:

  • Suppliers – of the data that business need
  • Leaders – making sure that municipalities, state owned enterprises and public services operated by the private sector also release important data
  • Catalysts – nurturing a thriving ecosystem of data users, coders and application developers and incubating new, data-driven businesses
  • Users – using Open Data themselves to overcome the barriers to using data within government and innovating new ways to use the data they collect to improve public services and government efficiency.

Nevertheless, most of the evidence for big economic benefits for Open Data comes from the developed world. So on Wednesday the World Bank is holding an open seminar to examine critically “Can Open Data Boost Economic Growth and Prosperity” in developing countries. Please join us and join the debate!

Selected Readings on Sentiment Analysis


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of sentiment analysis was originally published in 2014.

Sentiment Analysis is a field of Computer Science that uses techniques from natural language processing, computational linguistics, and machine learning to predict subjective meaning from text. The term opinion mining is often used interchangeably with Sentiment Analysis, although it is technically a subfield focusing on the extraction of opinions (the umbrella under which sentiment, evaluation, appraisal, attitude, and emotion all lie).

The rise of Web 2.0 and increased information flow has led to an increase in interest towards Sentiment Analysis — especially as applied to social networks and media. Events causing large spikes in media — such as the 2012 Presidential Election Debates — are especially ripe for analysis. Such analyses raise a variety of implications for the future of crowd participation, elections, and governance.
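The readings below treat sentiment analysis at a research level; as a purely illustrative sketch of the basic idea (not drawn from any of the works listed), a minimal lexicon-based scorer maps text to a polarity score by counting positive and negative words. The word lists here are toy examples; real systems use machine learning over much richer linguistic features.

```python
# Minimal lexicon-based sentiment scorer: counts positive vs. negative
# words and returns a polarity in [-1, 1]. Toy lexicons for illustration.

POSITIVE = {"good", "great", "support", "agree", "excellent", "win"}
NEGATIVE = {"bad", "poor", "oppose", "disagree", "terrible", "lose"}

def polarity(text: str) -> float:
    """Return (pos - neg) / (pos + neg), or 0.0 if no sentiment words."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

print(polarity("The debate was great and I agree with the outcome"))  # 1.0
print(polarity("A terrible, poor performance"))                       # -1.0
```

Applied to debate transcripts or social media streams, a classifier along these lines (suitably trained) is what lets analysts aggregate public opinion around events like the election debates mentioned above.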

Annotated Selected Reading List (in alphabetical order)

Choi, Eunsol et al. “Hedge detection as a lens on framing in the GMO debates: a position paper.” Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics 13 Jul. 2012: 70-79. http://bit.ly/1wweftP

  • Understanding the ways in which participants in public discussions frame their arguments is important for understanding how public opinion is formed. This paper adopts the position that it is time for more computationally-oriented research on problems involving framing. In the interests of furthering that goal, the authors propose the following question: In the controversy regarding the use of genetically-modified organisms (GMOs) in agriculture, do pro- and anti-GMO articles differ in whether they choose to adopt a more “scientific” tone?
  • Prior work on the rhetoric and sociology of science suggests that hedging may distinguish popular-science text from text written by professional scientists for their colleagues. The paper proposes a detailed approach to studying whether hedge detection can be used to understand scientific framing in the GMO debates, and provides corpora to facilitate this study. Some of the preliminary analyses suggest that hedges occur less frequently in scientific discourse than in popular text, a finding that contradicts prior assertions in the literature.

Michael, Christina, Francesca Toni, and Krysia Broda. “Sentiment analysis for debates.” (Unpublished MSc thesis). Department of Computing, Imperial College London (2013). http://bit.ly/Wi86Xv

  • This project aims to expand on existing solutions used for automatic sentiment analysis on text in order to capture support/opposition and agreement/disagreement in debates. In addition, it looks at visualizing the classification results for enhancing the ease of understanding the debates and for showing underlying trends. Finally, it evaluates proposed techniques on an existing debate system for social networking.

Murakami, Akiko, and Rudy Raymond. “Support or oppose?: classifying positions in online debates from reply activities and opinion expressions.” Proceedings of the 23rd International Conference on Computational Linguistics: Posters 23 Aug. 2010: 869-875. https://bit.ly/2Eicfnm

  • In this paper, the authors propose a method for the task of identifying the general positions of users in online debates, i.e., support or oppose the main topic of an online debate, by exploiting local information in their remarks within the debate. An online debate is a forum where each user posts an opinion on a particular topic while other users state their positions by posting their remarks within the debate. The supporting or opposing remarks are made by directly replying to the opinion, or indirectly to other remarks (to express local agreement or disagreement), which makes the task of identifying users’ general positions difficult.
  • A prior study has shown that a link-based method, which completely ignores the content of the remarks, can achieve higher accuracy for the identification task than methods based solely on the contents of the remarks. In this paper, it is shown that incorporating the textual content of the remarks into the link-based method can yield higher accuracy in the identification task.

Pang, Bo, and Lillian Lee. “Opinion mining and sentiment analysis.” Foundations and Trends in Information Retrieval 2.1-2 (2008): 1-135. http://bit.ly/UaCBwD

  • This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Its focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. It includes material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

Ranade, Sarvesh et al. “Online debate summarization using topic directed sentiment analysis.” Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining 11 Aug. 2013: 7. http://bit.ly/1nbKtLn

  • Social networking sites provide users a virtual community interaction platform to share their thoughts, life experiences and opinions. Online debate forum is one such platform where people can take a stance and argue in support or opposition of debate topics. An important feature of such forums is that they are dynamic and grow rapidly. In such situations, effective opinion summarization approaches are needed so that readers need not go through the entire debate.
  • This paper aims to summarize online debates by extracting highly topic-relevant and sentiment-rich sentences. The proposed approach takes into account topic-relevant, document-relevant and sentiment-based features to capture topic opinionated sentences. ROUGE (Recall-Oriented Understudy for Gisting Evaluation, a set of metrics and a software package for comparing an automatically produced summary or translation against human-produced ones) scores are used to evaluate the system. The system significantly outperforms several baseline systems and shows improvement over the state-of-the-art opinion summarization system. The results verify that topic-directed sentiment features are most important for generating effective debate summaries.
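To make the ROUGE parenthetical concrete: the simplest variant, ROUGE-1 recall, measures what fraction of the reference summary's unigrams appear in the system summary (with counts clipped). The sketch below is a toy illustration with invented sentences, not the evaluation setup used in the paper.

```python
from collections import Counter

def rouge1_recall(candidate: str, reference: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams matched
    (with clipped counts) by the candidate summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cnt, cand[w]) for w, cnt in ref.items())
    return overlap / sum(ref.values())

# 5 of the reference's 6 unigram tokens appear in the candidate.
print(rouge1_recall("the debate favored the motion",
                    "the debate strongly favored the motion"))  # ≈ 0.833
```

Production evaluations use the full ROUGE package (ROUGE-2, ROUGE-L, etc.) rather than a hand-rolled metric, but the underlying overlap computation is as shown.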

Schneider, Jodi. “Automated argumentation mining to the rescue? Envisioning argumentation and decision-making support for debates in open online collaboration communities.” http://bit.ly/1mi7ztx

  • Argumentation mining, a relatively new area of discourse analysis, involves automatically identifying and structuring arguments. Following a basic introduction to argumentation, the authors describe a new possible domain for argumentation mining: debates in open online collaboration communities.
  • Based on our experience with manual annotation of arguments in debates, the authors propose argumentation mining as the basis for three kinds of support tools, for authoring more persuasive arguments, finding weaknesses in others’ arguments, and summarizing a debate’s overall conclusions.

Demos for Democracy


The GovLab presents Demos for Democracy, an ongoing series of live, interactive online demos featuring designers and builders of the latest innovative governance platforms, tools and methods to foster greater openness and collaboration in how we govern.
Who: remesh, founded by PhD students Andrew Konya and Aaron Slodov, is an online public platform that offers a community, group, nation or planet of people the ability to speak with one voice that represents the collective thinking of all people within the group. remesh was prototyped at a HacKSU hackathon early in 2013 and has been under development over the past year.
What: Join us for a live demonstration of how remesh works before their official public launch. Participants will be given a link to test the platform during the live Google hangout.  More information on what remesh does can be found here.
When: July 29, 2014, 2:00 – 2:30 PM EST
Where: Online via Google Hangouts on Air. To RSVP and join, go to the Hangout Link. This event will be live tweeted at #democracydemos.
Bios:
Andrew Konya (CEO/Founder) is a PhD student in computational/theoretical physics at Kent State University. With extensive experience developing and implementing mathematical models for natural and man-made systems, Andrew brings a creative and versatile technical toolbox. This expertise, in concert with his passion for linguistics, led him to develop the first mathematical framework for collective speech. His goal is the completion of a conversation platform, built on this framework, which can make conversations between countries in conflict a viable alternative to war.
Aaron Slodov (COO/Founder) is a power systems engineering PhD student at Case Western Reserve University. Previously an engineer at both Google and Meetup.com, Aaron is experienced in the tech landscape and understands many of the current problems in the space. Through remesh’s technology, he hopes to bring significant, paradigm-shifting change to the way we communicate and interact with our world.
RSVP and JOIN
We hope to see you on Tuesday! If you have any questions, email us at [email protected].

The Quiet Movement to Make Government Fail Less Often


In The New York Times: “If you wanted to bestow the grandiose title of “most successful organization in modern history,” you would struggle to find a more obviously worthy nominee than the federal government of the United States.

In its earliest stirrings, it established a lasting and influential democracy. Since then, it has helped defeat totalitarianism (more than once), established the world’s currency of choice, sent men to the moon, built the Internet, nurtured the world’s largest economy, financed medical research that saved millions of lives and welcomed eager immigrants from around the world.

Of course, most Americans don’t think of their government as particularly successful. Only 19 percent say they trust the government to do the right thing most of the time, according to Gallup. Some of this mistrust reflects a healthy skepticism that Americans have always had toward centralized authority. And the disappointing economic growth of recent decades has made Americans less enamored of nearly every national institution.

But much of the mistrust really does reflect the federal government’s frequent failures – and progressives in particular will need to grapple with these failures if they want to persuade Americans to support an active government.

When the federal government is good, it’s very, very good. When it’s bad (or at least deeply inefficient), it’s the norm.

The evidence is abundant. Of the 11 large programs for low- and moderate-income people that have been subject to rigorous, randomized evaluation, only one or two show strong evidence of improving most beneficiaries’ lives. “Less than 1 percent of government spending is backed by even the most basic evidence of cost-effectiveness,” writes Peter Schuck, a Yale law professor, in his new book, “Why Government Fails So Often,” a sweeping history of policy disappointments.

As Mr. Schuck puts it, “the government has largely ignored the ‘moneyball’ revolution in which private-sector decisions are increasingly based on hard data.”

And yet there is some good news in this area, too. The explosion of available data has made evaluating success – in the government and the private sector – easier and less expensive than it used to be. At the same time, a generation of data-savvy policy makers and researchers has entered government and begun pushing it to do better. They have built on earlier efforts by the Bush and Clinton administrations.

The result is a flowering of experiments to figure out what works and what doesn’t.

New York City, Salt Lake City, New York State and Massachusetts have all begun programs to link funding for programs to their success: The more effective they are, the more money they and their backers receive. The programs span child care, job training and juvenile recidivism.

The approach is known as “pay for success,” and it’s likely to spread to Cleveland, Denver and California soon. David Cameron’s conservative government in Britain is also using it. The Obama administration likes the idea, and two House members – Todd Young, an Indiana Republican, and John Delaney, a Maryland Democrat – have introduced a modest bill to pay for a version known as “social impact bonds.”

The White House is also pushing for an expansion of randomized controlled trials to evaluate government programs. Such trials, Mr. Schuck notes, are “the gold standard” for any kind of evaluation. Using science as a model, researchers randomly select some people to enroll in a government program and others not to enroll. The researchers then study the outcomes of the two groups….”
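The logic of such a trial can be illustrated with a small simulation: assign people to a program at random, then compare the average outcomes of the enrolled and non-enrolled groups. This is a minimal sketch, not any agency's actual methodology; the population size, outcome scores, and the 3-point "program effect" are all made-up numbers for illustration.

```python
# Toy randomized controlled trial: random assignment, then a
# comparison of mean outcomes between treatment and control.
import random

random.seed(42)

participants = list(range(1000))
random.shuffle(participants)
treatment = set(participants[:500])  # randomly enrolled in the program

def simulated_outcome(person_id):
    """Toy outcome: a noisy base score plus a small boost if enrolled."""
    base = random.gauss(50, 10)
    return base + (3.0 if person_id in treatment else 0.0)

outcomes = {p: simulated_outcome(p) for p in participants}

control = [p for p in participants if p not in treatment]
treat_mean = sum(outcomes[p] for p in treatment) / len(treatment)
control_mean = sum(outcomes[p] for p in control) / len(control)

print(f"treatment mean:   {treat_mean:.1f}")
print(f"control mean:     {control_mean:.1f}")
print(f"estimated effect: {treat_mean - control_mean:.1f}")
```

Because assignment is random, the two groups differ (on average) only in program enrollment, so the gap in means estimates the program's effect; with real programs, the hard part is collecting comparable outcome data for both groups.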

Networks and Hierarchies


On whether political hierarchy in the form of the state has met its match in today’s networked world, in the American Interest: “…To all the world’s states, democratic and undemocratic alike, the new informational, commercial, and social networks of the internet age pose a profound challenge, the scale of which is only gradually becoming apparent. First email achieved a dramatic improvement in the ability of ordinary citizens to communicate with one another. Then the internet came to have an even greater impact on the ability of citizens to access information. The emergence of search engines marked a quantum leap in this process. The advent of laptops, smartphones, and other portable devices then emancipated electronic communication from the desktop. With the explosive growth of social networks came another great leap, this time in the ability of citizens to share information and ideas.
It was not immediately obvious how big a challenge all this posed to the established state. There was a great deal of cheerful talk about the ways in which the information technology revolution would promote “smart” or “joined-up” government, enhancing the state’s ability to interact with citizens. However, the efforts of Anonymous, Wikileaks and Edward Snowden to disrupt the system of official secrecy, directed mainly against the U.S. government, have changed everything. In particular, Snowden’s revelations have exposed the extent to which Washington was seeking to establish a parasitical relationship with the key firms that operate the various electronic networks, acquiring not only metadata but sometimes also the actual content of vast numbers of phone calls and messages. Techniques of big-data mining, developed initially for commercial purposes, have been adapted to the needs of the National Security Agency.
The most recent, and perhaps most important, network challenge to hierarchy comes with the advent of virtual currencies and payment systems like Bitcoin. Since ancient times, states have reaped considerable benefits from monopolizing or at least regulating the money created within their borders. It remains to be seen how big a challenge Bitcoin poses to the system of national fiat currencies that has evolved since the 1970s and, in particular, how big a challenge it poses to the “exorbitant privilege” enjoyed by the United States as the issuer of the world’s dominant reserve (and transaction) currency. But it would be unwise to assume, as some do, that it poses no challenge at all….”

Brief survey of crowdsourcing for data mining


Paper by Guo Xintong, Wang Hongzhi, Yangqiu Song, and Gao Hong in Expert Systems with Applications: “Crowdsourcing allows large-scale and flexible invocation of human input for data gathering and analysis, which introduces a new paradigm of data mining process. Traditional data mining methods often require the experts in analytic domains to annotate the data. However, it is expensive and usually takes a long time. Crowdsourcing enables the use of heterogeneous background knowledge from volunteers and distributes the annotation process to small portions of efforts from different contributions. This paper reviews the state of the art in crowdsourcing for data mining in recent years. We first review the challenges and opportunities of data mining tasks using crowdsourcing, and summarize their framework. Then we highlight several exemplar works in each component of the framework, including question designing, data mining and quality control. Finally, we discuss the limitations of crowdsourcing for data mining and suggest related areas for future research.”
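One quality-control step in frameworks like the one the paper surveys is aggregating redundant answers from multiple workers into a single label. A common baseline is majority voting; the sketch below is a generic illustration of that idea (the item names and worker answers are invented, and this is not code from the paper).

```python
# Majority-vote aggregation of crowdsourced labels: each item is
# labeled by several workers, and the most common answer wins.
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label among the workers' answers."""
    return Counter(labels).most_common(1)[0][0]

# item -> labels collected from three crowd workers (illustrative data)
crowd_answers = {
    "review_1": ["spam", "spam", "ham"],
    "review_2": ["ham", "ham", "ham"],
    "review_3": ["spam", "ham", "spam"],
}

consensus = {item: majority_vote(votes) for item, votes in crowd_answers.items()}
print(consensus)  # {'review_1': 'spam', 'review_2': 'ham', 'review_3': 'spam'}
```

More sophisticated schemes weight workers by estimated reliability, but majority voting shows why redundancy is the basic lever: a single careless annotator is outvoted.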

Accessible Law for the Internet Age


America Decoded: “America’s Laws Are the People’s Public Property. The State Decoded software provides you with a people-friendly way to access your local, state, and federal legal code.

  • Careful organization by article and section makes browsing a breeze.
  • A site-wide search allows you to find the laws you’re looking for by topic.
  • Scroll-over definitions translate legal jargon into common English.
  • Downloadable legal code lets you take the law into your own hands.
  • Best of all, everything on the site remains cost- and restriction-free.”

(See Video)