Civic Technology: Open Data and Citizen Volunteers as a Resource for North Carolina Local Governments


Report by John B. Stephens: “Civic technology is an emergent area of practice where IT experts and citizens without specialized IT skills volunteer their time using government-provided open data to improve government services or otherwise create public benefit. Civic tech, as it is often referred to, draws on longer-standing practices, particularly e-government and civic engagement. It is also a new form of citizen–government co-production, building on the trend of greater government transparency.

This report is designed to help North Carolina local government leaders:

  • Define civic technology practices and describe North Carolina civic tech resources
  • Highlight accomplishments and ongoing projects in civic tech (in North Carolina and beyond)
  • Identify opportunities and challenges for North Carolina local governments in civic tech
  • Provide a set of resources for education and involvement in civic tech….(More)”.

Research reveals de-identified patient data can be re-identified


Vanessa Teague, Chris Culnane and Ben Rubinstein in PhysOrg: “In August 2016, Australia’s federal Department of Health published medical billing records of about 2.9 million Australians online. These records, drawn from the Medicare Benefits Scheme (MBS) and the Pharmaceutical Benefits Scheme (PBS), contained 1 billion lines of historical health data from the records of around 10 per cent of the population.

These longitudinal records were de-identified, a process intended to prevent a person’s identity from being connected with the information, and were made public on the government’s open data website as part of its policy on accessible public data.

We found that patients can be re-identified, without decryption, through a process of linking the unencrypted parts of the record with known information about the individual.

Our findings replicate those of similar studies of other de-identified datasets:

  • A few mundane facts taken together often suffice to isolate an individual.
  • Some patients can be identified by name from publicly available information.
  • Decreasing the precision of the data, or perturbing it statistically, makes re-identification gradually harder, but only at a substantial cost to utility.

The first step is examining a patient’s uniqueness according to medical procedures such as childbirth. Some individuals are unique given public information, and many patients are unique given a few basic facts, such as year of birth or the date a baby was delivered….
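
To make this linkage idea concrete, here is a minimal Python sketch of the uniqueness test described above (not the researchers' code; the records, column names, and quasi-identifiers below are hypothetical):

```python
import pandas as pd

# Hypothetical de-identified records; column names are our own invention.
records = pd.DataFrame({
    "patient_id":    [1, 2, 3, 4, 5, 6],
    "year_of_birth": [1975, 1975, 1988, 1990, 1990, 1990],
    "delivery_date": ["2010-03-02", "2011-07-15", None, None, None, "2012-01-20"],
})

quasi_identifiers = ["year_of_birth", "delivery_date"]

# Group rows by their quasi-identifier combination; a group of size 1 means
# that combination isolates a single patient in the dataset.
group_sizes = records.groupby(quasi_identifiers, dropna=False)["patient_id"].transform("size")
unique_fraction = (group_sizes == 1).mean()

print(f"{unique_fraction:.0%} of records are unique on {quasi_identifiers}")
# -> 67% of records are unique on ['year_of_birth', 'delivery_date']
```

Even on this toy table, two mundane facts isolate most patients; on a billion-line longitudinal dataset the effect is far stronger.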

The second step is examining uniqueness according to the characteristics of commercial datasets we know of but cannot access directly. There are high uniqueness rates that would allow linking with a commercial pharmaceutical dataset, and with the billing data available to a bank. This means that ordinary people, not just the prominent ones, may be easily re-identifiable by their bank or insurance company…

These de-identification methods were bound to fail, because they were trying to achieve two inconsistent aims: the protection of individual privacy and publication of detailed individual records. De-identification is very unlikely to work for other rich datasets in the government’s care, like census data, tax records, mental health records, penal information and Centrelink data.

While the ambition of making more data more easily available to facilitate research, innovation and sound public policy is a good one, there is an important technical and procedural problem to solve: there is no good solution for publishing sensitive complex individual records that protects privacy without substantially degrading the usefulness of the data.

Some data can be safely published online, such as information about government, aggregations of large collections of material, or data that is differentially private. For sensitive, complex data about individuals, a much more controlled release in a secure research environment is a better solution. The Productivity Commission recommends a “trusted user” model, and techniques like dynamic consent also give patients greater control and visibility over their personal information….(More).
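
As a point of contrast, the “differentially private” release mentioned above can be illustrated in a few lines: publish a noisy aggregate rather than individual records. This is a toy sketch of the standard Laplace mechanism, not a proposal from the article; the query and parameters are hypothetical:

```python
import numpy as np

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes it by at most 1), so Laplace noise of scale 1/epsilon suffices.
    """
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical aggregate: number of patients billed for some procedure.
print(dp_count(true_count=12_345, epsilon=0.1))
```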

From Territorial to Functional Sovereignty: The Case of Amazon


Essay by Frank Pasquale: “…Who needs city housing regulators when AirBnB can use data-driven methods to effectively regulate room-letting, then house-letting, and eventually urban planning generally? Why not let Amazon have its own jurisdiction or charter city, or establish special judicial procedures for Foxconn? Some vanguardists of functional sovereignty believe online rating systems could replace state occupational licensure—so rather than having government boards credential workers, a platform like LinkedIn could collect star ratings on them.

In this and later posts, I want to explain how this shift from territorial to functional sovereignty is creating a new digital political economy. Amazon’s rise is instructive. As Lina Khan explains, “the company has positioned itself at the center of e-commerce and now serves as essential infrastructure for a host of other businesses that depend upon it.” The “everything store” may seem like just another service in the economy—a virtual mall. But when a firm combines tens of millions of customers with a “marketing platform, a delivery and logistics network, a payment service, a credit lender, an auction house…a hardware manufacturer, and a leading host of cloud server space,” as Khan observes, it’s not just another shopping option.

Digital political economy helps us understand how platforms accumulate power. With online platforms, it’s not a simple narrative of “best service wins.” Network effects have been on the cyberlaw (and digital economics) agenda for over twenty years. Amazon’s dominance has demonstrated how network effects can be self-reinforcing. The more merchants there are selling on (or to) Amazon, the better shoppers can be assured that they are searching all possible vendors. The more shoppers there are, the more vendors consider Amazon a “must-have” venue. As crowds build on either side of the platform, the middleman becomes ever more indispensable. Oh, sure, a new platform can enter the market—but until it gets access to the 480 million items Amazon sells (often at deep discounts), why should the median consumer defect to it? If I want garbage bags, do I really want to go over to Target.com to re-enter all my credit card details, create a new log-in, read the small print about shipping, and hope that this retailer can negotiate a better deal with Glad? Or do I, à la Sunstein, want a predictive shopping purveyor that intimately knows my past purchase habits, with satisfaction just a click away?

As artificial intelligence improves, the tracking of shopping into the Amazon groove will tend to become ever more rational for both buyers and sellers. Like a path through a forest trod ever clearer of debris, it becomes the natural default. To examine just one of many centripetal forces sucking money, data, and commerce into online behemoths, play out game theoretically how the possibility of online conflict redounds in Amazon’s favor. If you have a problem with a merchant online, do you want to pursue it as a one-off buyer? Or as someone whose reputation has been established over dozens or hundreds of transactions—and someone who can credibly threaten to deny Amazon hundreds or thousands of dollars of revenue each year? The same goes for merchants: The more tribute they can pay to Amazon, the more likely they are to achieve visibility in search results and attention (and perhaps even favor) when disputes come up. What Bruce Schneier said about security is increasingly true of commerce online: You want to be in the good graces of one of the neo-feudal giants who bring order to a lawless realm. Yet few hesitate to think about exactly how the digital lords might use their data advantages against those they ostensibly protect.

Forward-thinking legal thinkers are helping us grasp these dynamics. For example, Rory van Loo has described the status of the “corporation as courthouse”—that is, when platforms like Amazon run dispute resolution schemes to settle conflicts between buyers and sellers. Van Loo describes both the efficiency gains that an Amazon settlement process might have over small claims court, and the potential pitfalls for consumers (such as opaque standards for deciding cases). I believe that, on top of such economic considerations, we may want to consider the political economic origins of e-commerce feudalism. For example, as consumer rights shrivel, it’s rational for buyers to turn to Amazon (rather than overwhelmed small claims courts) to press their case. The evisceration of class actions, the rise of arbitration, boilerplate contracts—all these make the judicial system an increasingly vestigial organ in consumer disputes. Individuals rationally turn to online giants for powers to impose order that libertarian legal doctrine stripped from the state. And in so doing, they reinforce the very dynamics that led to the state’s etiolation in the first place….(More)”.

Accountability of AI Under the Law: The Role of Explanation


Paper by Finale Doshi-Velez and Mason Kortz: “The ubiquity of systems using artificial intelligence or “AI” has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before—applications range from clinical decision support to autonomous driving and predictive policing. That said, our AIs continue to lag in common sense reasoning [McCarthy, 1960], and thus there exist legitimate concerns about the intentional and unintentional negative consequences of AI systems [Bostrom, 2003, Amodei et al., 2016, Sculley et al., 2014]. How can we take advantage of what AI systems have to offer, while also holding them accountable?

In this work, we focus on one tool: explanation. Questions about a legal right to explanation from AI systems were recently debated in the EU General Data Protection Regulation [Goodman and Flaxman, 2016, Wachter et al., 2017a], and thus thinking carefully about when and how explanation from AI systems might improve accountability is timely. Good choices about when to demand explanation can help prevent negative consequences from AI systems, while poor choices may not only fail to hold AI systems accountable but also hamper the development of much-needed beneficial AI systems.

Below, we briefly review current societal, moral, and legal norms around explanation, and then focus on the different contexts under which explanation is currently required under the law. We find that there exists great variation around when explanation is demanded, but there also exist important consistencies: when demanding explanation from humans, what we typically want to know is whether and how certain input factors affected the final decision or outcome.
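
As a rough illustration of that standard, the following sketch (ours, not the paper's; the model, feature names, and data are all hypothetical) asks a model the same question we would ask a human decision-maker: would the decision have changed had a given input factor been different?

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training data: three input factors and a binary decision.
X = rng.normal(size=(200, 3))                 # [income, age, prior_defaults]
y = (X[:, 0] - 2 * X[:, 2] > 0).astype(int)   # toy "loan approved" label

model = LogisticRegression().fit(X, y)

applicant = np.array([[0.5, 0.1, 1.2]])
baseline = model.predict_proba(applicant)[0, 1]

# Counterfactual probe: neutralize one factor at a time and watch how the
# decision probability moves -- "did this input affect the outcome, and how?"
for i, name in enumerate(["income", "age", "prior_defaults"]):
    probe = applicant.copy()
    probe[0, i] = 0.0
    delta = model.predict_proba(probe)[0, 1] - baseline
    print(f"{name}: change in approval probability = {delta:+.3f}")
```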

These consistencies allow us to list the technical considerations that must be addressed if we desire AI systems that can provide the kinds of explanations currently required of humans under the law. Contrary to popular wisdom of AI systems as indecipherable black boxes, we find that this level of explanation should generally be technically feasible but may sometimes be practically onerous—there are certain aspects of explanation that may be simple for humans to provide but challenging for AI systems, and vice versa. As an interdisciplinary team of legal scholars, computer scientists, and cognitive scientists, we recommend that for the present, AI systems can and should be held to a similar standard of explanation as humans currently are; in the future we may wish to hold an AI to a different standard….(More)”

From #Resistance to #Reimagining governance


Stefaan G. Verhulst in Open Democracy: “…There is no doubt that #Resistance (and its associated movements) holds genuine transformative potential. But for the change it brings to be meaningful (and positive), we need to ask the question: What kind of government do we really want?

Working to maintain the status quo or simply returning to, for instance, a pre-Trump reality cannot provide for the change we need to counter the decline in trust, the rise of populism and the complex social, economic and cultural problems we face. We need a clear articulation of alternatives. Without such an articulation, there is a danger of a certain hollowness and dispersion of energies. The call for #Resistance requires a more concrete – and ultimately more productive – program that is concerned not just with rejecting or tearing down, but with building up new institutions and governance processes. What’s needed, in short, is not simply #Resistance but #Reimagining.

Below, I suggest six shifts that can help us reimagine governance for the twenty-first century. Several of these shifts are enabled by recent technological changes (e.g., the advent of big data, blockchain and collective intelligence) as well as other emerging methods such as design thinking, behavioral economics, and agile development.

Some of the shifts I suggest have been experimented with, but they have often been developed in an ad hoc manner without a full understanding of how they could make a more systemic impact. Part of the purpose of this paper is to begin the process of a more systematic enquiry; the following amounts to a preliminary outline or blueprint for reimagined governance for the twenty-first century.

  • Shift 1: from gatekeeper to platform…
  • Shift 2: from inward to user-and-problem orientation…
  • Shift 3: from closed to open…
  • Shift 4: from deliberation to collaboration and co-creation…
  • Shift 5: from ideology to evidence-based…
  • Shift 6: from centralized to distributed… (More)

Code and Clay, Data and Dirt: Five Thousand Years of Urban Media


Book by Shannon Mattern: “For years, pundits have trumpeted the earth-shattering changes that big data and smart networks will soon bring to our cities. But what if cities have long been built for intelligence, maybe for millennia? In Code and Clay, Data and Dirt, Shannon Mattern advances the provocative argument that our urban spaces have been “smart” and mediated for thousands of years.

Offering powerful new ways of thinking about our cities, Code and Clay, Data and Dirt goes far beyond the standard historical concepts of origins, development, revolutions, and the accomplishments of an elite few. Mattern shows that in their architecture, laws, street layouts, and civic knowledge—and through technologies including the telephone, telegraph, radio, printing, writing, and even the human voice—cities have long negotiated a rich exchange between analog and digital, code and clay, data and dirt, ether and ore.

Mattern’s vivid prose takes readers through a historically and geographically broad range of stories, scenes, and locations, synthesizing a new narrative for our urban spaces. Taking media archaeology to the city’s streets, Code and Clay, Data and Dirt reveals new ways to write our urban, media, and cultural histories….(More)”.

Business Models For Sustainable Research Data Repositories


OECD Report: “In 2007, the OECD Principles and Guidelines for Access to Research Data from Public Funding were published and in the intervening period there has been an increasing emphasis on open science. At the same time, the quantity and breadth of research data have massively expanded. So-called “Big Data” is no longer limited to areas such as particle physics and astronomy, but is ubiquitous across almost all fields of research. This is generating exciting new opportunities, but also challenges.

The promise of open research data is that they will not only accelerate scientific discovery and improve reproducibility, but they will also speed up innovation and improve citizen engagement with research. In short, they will benefit society as a whole. However, for the benefits of open science and open research data to be realised, these data need to be carefully and sustainably managed so that they can be understood and used by both present and future generations of researchers.

Data repositories – based in local and national research institutions and international bodies – are where the long-term stewardship of research data takes place and hence they are the foundation of open science. Yet good data stewardship is costly and research budgets are limited. So, the development of sustainable business models for research data repositories needs to be a high priority in all countries. Surprisingly, perhaps, little systematic analysis has been done on income streams, costs, value propositions, and business models for data repositories, and that is the gap this report attempts to address, from a science policy perspective…..

This project was designed to take up the challenge and to contribute to a better understanding of how research data repositories are funded, and what developments are occurring in their funding. Central questions included:

  • How are data repositories currently funded, and what are the key revenue sources?
  • What innovative revenue sources are available to data repositories?
  • How do revenue sources fit together into sustainable business models?
  • What incentives for, and means of, optimising costs are available?
  • What revenue sources and business models are most acceptable to key stakeholders?…(More)”

Victims of Sexual Harassment Have a New Resource: AI


MIT Technology Review (The Download): “If you have ever dealt with sexual harassment in the workplace, there is now a private online place for you to go for help. Botler AI, a startup based in Montreal, on Wednesday launched a system that provides free information and guidance to those who have been sexually harassed and are unsure of their legal rights.

Using deep learning, the AI system was trained on more than 300,000 U.S. and Canadian criminal court documents, including over 57,000 documents and complaints related to sexual harassment. Using this information, the software predicts whether the situation explained by the user qualifies as sexual harassment, and notes which laws may have been violated under the criminal code. It then generates an incident report that the user can hand over to relevant authorities….

The tool starts by asking simple questions that can guide the software, like what state you live in and when the incident occurred. Then, you explain your situation in plain language. The software then creates a report based on that account and what it has learned from the court cases on which it was trained.
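
At its core, the qualification step described above is a text-classification problem. The following is a deliberately simplified sketch of that idea, not Botler AI's actual deep-learning system; the training texts, labels, and model choice are hypothetical stand-ins for the court documents the company used:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical stand-ins for the labelled court documents.
train_texts = [
    "supervisor made repeated unwanted sexual comments at work",
    "coworker demanded sexual favors in exchange for a promotion",
    "dispute over unpaid overtime wages",
    "disagreement about parking spaces at the office",
]
train_labels = [1, 1, 0, 0]  # 1 = complaint qualified as sexual harassment

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_texts, train_labels)

# A user's plain-language account, scored against what the model learned.
account = ["my manager keeps sending me sexually explicit messages"]
print(classifier.predict_proba(account)[0, 1])  # estimated probability
```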

The company’s ultimate goal is to provide free legal tools to help with a multitude of issues, not just sexual harassment. In this, Botler isn’t alone—a similar company called DoNotPay started as an automated way to fight parking tickets but has since expanded massively (see “This Chatbot Will Help You Sue Anyone“)….(More).

The Wikipedia competitor that’s harnessing blockchain for epistemological supremacy


Peter Rubin at Wired: “At the time of this writing, the opening sentence of Larry Sanger’s Everipedia entry is pretty close to his Wikipedia entry. It describes him as “an American Internet project developer … best known as co-founder of Wikipedia.” By the time you read this, however, it may well mention a new, more salient fact—that Sanger recently became the Chief Information Officer of Everipedia itself, a site that seeks to become a better version of the online encyclopedia than the one he founded back in 2001. To do that, Sanger’s new employer is trying something that no other player in the space has done: moving to a blockchain.

Oh, blockchain, that decentralized “global ledger” that provides the framework for cryptocurrencies like Bitcoin (as well as a thousand explainer videos, and seemingly a thousand startups’ business plans). Blockchain already stands to make medical patient data easier to move and improve food safety; now, Everipedia’s founders hope, it will allow for a more powerful, accountable encyclopedia.

Here’s how it’ll work. Everipedia already uses a points system in which creating articles and making approved edits earns “IQ.” In January, when the site moves over to a blockchain, Everipedia will convert IQ scores to a token-based currency, giving all existing editors an allotment proportionate to their IQ—and giving them a real, financial stake in Everipedia. From then on, creating and curating articles will allow users to earn tokens, which act as virtual shares of the platform. To prevent bad actors from trying to cash in with ill-founded or deliberately false articles and edits, Everipedia will force users to put up a token of their own in order to submit. If their work is accepted, they get their token back, plus a little bit for their contribution; if not, they lose their token. The assumption is that other users, motivated by the desire to maintain the site’s value, will actively seek to prevent such efforts….
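
The incentive logic is simple enough to sketch. The toy model below is an illustration of the stake-and-reward scheme as described, not Everipedia's actual token contract; the stake and reward amounts are hypothetical:

```python
from dataclasses import dataclass

STAKE = 1.0    # token put at risk per submission (hypothetical value)
REWARD = 0.1   # the "little bit" earned on acceptance (hypothetical value)

@dataclass
class Editor:
    name: str
    tokens: float

def settle_submission(editor: Editor, accepted: bool) -> None:
    """Lock the stake, then return it with a reward or forfeit it."""
    editor.tokens -= STAKE
    if accepted:
        editor.tokens += STAKE + REWARD  # stake back, plus contribution reward
    # if rejected, the stake is forfeited, deterring bad-faith submissions

alice = Editor("alice", tokens=10.0)
settle_submission(alice, accepted=True)   # 10.0 -> 10.1
settle_submission(alice, accepted=False)  # 10.1 ->  9.1
print(alice.tokens)
```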

This isn’t the first time a company has proposed a decentralized blockchain-based encyclopedia; earlier this year, a company called Lunyr announced similar plans. However, judging from Lunyr’s most recent roadmap, Everipedia will beat it to market with room to spare….(More)”.

Big data in social and psychological science: theoretical and methodological issues


Paper by Lin Qiu, Sarah Hian May Chan and David Chan in the Journal of Computational Social Science: “Big data presents unprecedented opportunities to understand human behavior on a large scale. It has been increasingly used in social and psychological research to reveal individual differences and group dynamics. There are a few theoretical and methodological challenges in big data research that require attention. In this paper, we highlight four issues, namely data-driven versus theory-driven approaches, measurement validity, multi-level longitudinal analysis, and data integration. They represent common problems that social scientists often face in using big data. We present examples of these problems and propose possible solutions….(More)”.