The Perils of Experimentation


Paper by Michael A. Livermore: “More than eighty years after Justice Brandeis coined the phrase “laboratories of democracy,” the concept of policy experimentation retains its currency as a leading justification for decentralized governance. This Article examines the downsides of experimentation, and in particular the potential for decentralization to lead to the production of information that exacerbates public choice failures. Standard accounts of experimentation and policy learning focus on information concerning the social welfare effects of alternative policies. But learning can also occur along a political dimension as information about ideological preferences, campaign techniques, and electoral incentives is revealed. Both types of information can be put to use in the policy arena by a host of individual and institutional actors that have a wide range of motives, from public-spirited concern for the general welfare to a desire to maximize personal financial returns. In this complex environment, there is no guarantee that the information that is generated by experimentation will lead to social benefits. This Article applies this insight to prior models of federalism developed in the legal and political science literature to show that decentralization can lead to the over-production of socially harmful information. As a consequence, policy makers undertaking a decentralization calculation should seek a level of decentralization that best balances the costs and benefits of information production. To illustrate the legal and policy implications of the arguments developed here, this Article examines two contemporary environmental rulemakings of substantial political, legal, and economic significance: a rule to define the jurisdictional reach of the Clean Water Act; and a rule to limit greenhouse gas emissions from the electricity generating sector….(More)”.

 

Private Data and the Public Good


Gideon Mann‘s remarks on the occasion of the Robert Khan distinguished lecture at The City College of New York on 5/22/16, on the risks and opportunities around a specific aspect of this relationship: the broader need for computer science to engage with the real world. Right now, a key aspect of this relationship is being built around the risks and opportunities of the emerging role of data.

Ultimately, I believe that these relationships, between computer science and the real world, between data science and real problems, hold the promise to vastly increase our public welfare. And today, we, the people in this room, have a unique opportunity to debate and define a more moral data economy….

The hybrid research model proposes something different. The hybrid research model embeds, as it were, researchers as practitioners. The thought was always that you would be going about your regular run of business, would face a need to innovate to solve a crucial problem, and would do something novel. At that point, you might choose to work some extra time and publish a paper explaining your innovation. In practice, this model rarely works as expected. Tight deadlines mean the innovation that people do in their normal course of business is incremental.

This model separates research from scientific publication and shortens the time window of research to what can be realized in a few years. For me, this always felt like a tremendous loss with respect to the older, so-called “ivory tower” research model. It didn’t seem at all clear how this kind of model would produce the sea change of thought engendered by Shannon’s work, nor did it seem that Claude Shannon would ever want to work there. This kind of environment would never support a free-standing wonder like the robot mouse that Shannon worked on. Moreover, I always believed that crucial to research is publication and participation in the scientific community. Without this engagement, it feels like something different — innovation, perhaps.

It is clear that the monopolistic environment that enabled AT&T to support this ivory tower research doesn’t exist anymore.

Now, the hybrid research model was one model of research at Google, but there is another model as well, the moonshot model as exemplified by Google X. Google X brought together focused research teams to drive research and development around a particular project — Google Glass and the self-driving car being two notable examples. Here the focus isn’t research but building a new product, with research as potentially a crucial blocking issue. Since the goal of Google X is directly to develop a new product, by definition they don’t publish papers along the way, but they’re not as tied to short-term deliverables as the rest of Google is. However, they are again decidedly un-Bell-Labs-like — a secretive, tightly focused, non-publishing group. DeepMind is a similarly constituted initiative — working, for example, on a best-in-the-world Go-playing algorithm, with publications happening sparingly.

Unfortunately, both of these approaches, the hybrid research model and the moonshot model, stack the deck toward a particular kind of research — research that leads to relatively short-term products that generate corporate revenue. While this kind of research is good for society, it isn’t the only kind of research that we need. We urgently need research that is long-term and that is undertaken even without a clear, immediate financial return. In some sense this is a “tragedy of the commons,” where a shared public good (the commons) is not supported because everyone can benefit from it without giving back. Academic research is thus a non-rival, non-excludable good, and so will reasonably be underfunded. In certain cases, this takes on an ethical dimension — particularly in health care, where the choice of what diseases to study and address has a tremendous potential to affect human life. Should we research heart disease or malaria? This decision makes a huge impact on global human health, but is vastly informed by the potential profit from each of these various medicines….

Private Data means research is out of reach

The larger point that I want to make is that in the absence of places where long-term research can be done in industry, academia has a tremendous opportunity. Unfortunately, it is actually quite difficult to do the work that needs to be done in academia, since many of the resources needed to push the state of the art are only found in industry: in particular, data.

Of course, academia also lacks machine resources, but this is a simpler problem to fix — it’s a matter of money. Resources from the government could go toward enabling research groups to build their own data centers or to acquire computational resources from the market, e.g. Amazon. This is aided by the compute philanthropy that Google and Microsoft practice, granting compute cycles to academic organizations.

But the data problem is much harder to address. The data being collected and generated at private companies could enable amazing discoveries and research, but it is impossible for academics to access. The lack of access to private data from companies actually has much more significant effects than inhibiting research. In particular, the consumer-level data collected by social networks and internet companies could do much more than ad targeting.

Just for public health — suicide prevention, addiction counseling, mental health monitoring — there is enormous potential in the use of our online behavior to aid the most needy, and academia and non-profits are set up to enable this work, while companies are not.

To give one example, anorexia and other eating disorders are vicious killers. Twenty million women and 10 million men suffer from a clinically significant eating disorder at some time in their life, and sufferers of eating disorders have the highest mortality rate of any mental health disorder — with a jaw-dropping estimated mortality rate of 10%, both directly from injuries sustained by the disorder and from suicide resulting from the disorder.

Eating disorders are distinctive in that sufferers often seek out confirmatory information: blogs, images and pictures that glorify and validate what sufferers see as “lifestyle” choices. Browsing behavior that seeks out images and guidance on how to starve yourself is a key indicator that someone is suffering. Tumblr, Pinterest and Instagram are places where people host and seek out this information. Tumblr has tried to help address this severe mental health issue by banning blogs that advocate for self-harm and by adding public service announcements to searches for queries related to anorexia. But clearly this is not the be-all and end-all of work that could be done to detect and assist people at risk of dying from eating disorders. Moreover, this data could also help us understand the nature of those disorders themselves…..
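To make the kind of query-level intervention described above concrete, here is a minimal, purely hypothetical Python sketch that attaches a public-service message to searches matching a list of risk-related terms. The term list, message text, and function are illustrative assumptions, not any platform’s actual implementation.

```python
# Hypothetical sketch of a query-level PSA intervention, loosely modeled on the
# approach described above. Terms, message, and function names are illustrative
# only; they do not reflect any platform's real system.

RISK_TERMS = {"thinspo", "pro-ana", "pro-mia", "thigh gap"}  # illustrative terms

PSA_MESSAGE = (
    "If you or someone you know is struggling with an eating disorder, "
    "help is available: https://www.nationaleatingdisorders.org"
)

def annotate_search(query: str, results: list[str]) -> dict:
    """Attach a public-service message to results for at-risk queries."""
    at_risk = any(term in query.lower() for term in RISK_TERMS)
    return {
        "query": query,
        "results": results,
        "psa": PSA_MESSAGE if at_risk else None,
    }

if __name__ == "__main__":
    print(annotate_search("thinspo tips", ["post-1", "post-2"]))
```

A real deployment would of course need far more careful term curation, multilingual coverage, and clinical input; the sketch only shows where such a hook could sit in a search pipeline.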

There is probably a role for a data ombudsman within private organizations — someone to protect the interests of the public’s data inside an organization, much like a ‘public editor’ at a newspaper, depending on how the role is set up. They would be there to protect and articulate the interests of the public, which probably means both sides: making sure a company’s data is used for public good where appropriate, and making sure the public’s ‘right’ to privacy is appropriately safeguarded (and probably making sure the public is informed when their data is compromised).

Next, we need a platform to enable collaboration around social good between companies, and between companies and academics. This platform would give trusted users access to a wide variety of data and speed the process of research.

Finally, I wonder if there is a way that government could support research sabbaticals inside of companies. Clearly, the opportunities for this research far outstrip what is currently being done…(more)”

How Open Data Is Creating New Opportunities in the Public Sector


Martin Yan at GovTech: “Increased availability of open data in turn increases the ease with which citizens and their governments can collaborate, as well as equipping citizens to be active in identifying and addressing issues themselves. Technology developers are able to explore innovative uses of open data in combination with digital tools, new apps or other products that can tackle recognized inefficiencies. Currently, both the public and private sectors are teeming with such apps and projects….

Open data has proven to be a catalyst for the creation of new tools across industries and public-sector uses. Examples of a few successful projects include:

  • Citymapper — The popular real-time public transport app uses open data from Apple, Google, Cyclestreets, OpenStreetMaps and more sources to help citizens navigate cities. Features include A-to-B trip planning with ETA, real-time departures, bike routing, transit maps, public transport line status, real-time disruption alerts and integration with Uber.
  • Dataverse Project — This project from Harvard’s Institute for Quantitative Social Science makes it easy to share, explore and analyze research data. By simplifying access to this data, the project allows researchers to replicate others’ work to the benefit of all.
  • Liveplasma — An interactive search engine, Liveplasma lets users listen to music and view a web-like visualization of similar songs and artists, seeing how they are related and enabling discovery. Content from YouTube is streamed into the data visualizations.
  • Provenance — The England-based online platform lets users trace the origin and history of a product, also providing its manufacturing information. The mission is to encourage transparency in the practices of the corporations that produce the products we all use.

These examples demonstrate open data’s reach, value and impact well beyond the public sector. As open data continues to be put to wider use, the results will not be limited to increased efficiency and reduced wasteful spending in government, but will also create economic growth and jobs due to the products and services using the information as a foundation.

However, in the end, it won’t be the data alone that solves issues. Rather, it will be dependent on individual citizens, developers and organizations to see the possibilities, take up the call to arms and use this available data to introduce changes that make our world better….(More)”

La Primaire Wants To Help French Voters Bypass Traditional Parties


Federico Guerrini in Forbes: “French people, like the citizens of many other countries, have little confidence in their government or in their members of parliament.

A recent study by the Center for Political Research of Sciences Po (CEVIPOF) in Paris shows that while residents still trust, in part, their local officials, only 37% of them on average feel the same about those belonging to the National Assembly, the Senate or the executive.

Three years earlier, when asked in another poll what sprang to mind first when thinking of politics, their first answer was “disgust”.

With this sort of background, it is perhaps unsurprising that a number of activists have decided to try and find new ways to boost political participation, using crowdsourcing, smartphone applications and online platforms to look for candidates outside of the usual circles.

There are several civic tech initiatives in place in France right now. One of the most fascinating is called LaPrimaire.org.

It’s an online platform whose main aim is to organize an open primary election, select a suitable candidate, and allow him to run for President in the 2017 elections.

Launched in April by Thibauld Favre and David Guez, an engineer and a lawyer by trade, both with no connection to the political establishment, it has so far attracted 164 self-proposed candidates and some 26,000 voters. Anyone can be elected, as long as they live in France, do not belong to any political party and have a clean criminal record.


A different class of possible candidates, also present on the website, is composed of the so-called “citoyens plébiscités”: VIPs, politicians or celebrities that backers of LaPrimaire.org think should run for president. In both cases, in order to qualify for the next phase of the selection, these people have to secure the votes of at least 500 supporters by July 14….(More)”

Transparency reports make AI decision-making accountable


Phys.org: “Machine-learning algorithms increasingly make decisions about credit, medical diagnoses, personalized recommendations, advertising and job opportunities, among other things, but exactly how they do so usually remains a mystery. Now, new measurement methods developed by Carnegie Mellon University researchers could provide important insights into this process.

 Was it a person’s age, gender or education level that had the most influence on a decision? Was it a particular combination of factors? CMU’s Quantitative Input Influence (QII) measures can provide the relative weight of each factor in the final decision, said Anupam Datta, associate professor of computer science and electrical and computer engineering.

“Demands for algorithmic transparency are increasing as the use of algorithmic decision-making systems grows and as people realize the potential of these systems to introduce or perpetuate racial or sex discrimination or other social harms,” Datta said.

“Some companies are already beginning to provide transparency reports, but work on the computational foundations for these reports has been limited,” he continued. “Our goal was to develop measures of the degree of influence of each factor considered by a system, which could be used to generate transparency reports.”

These reports might be generated in response to a particular incident—why an individual’s loan application was rejected, or why police targeted an individual for scrutiny or what prompted a particular medical diagnosis or treatment. Or they might be used proactively by an organization to see if an artificial intelligence system is working as desired, or by a regulatory agency to see whether a decision-making system inappropriately discriminated between groups of people….(More)”
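To make the idea concrete, below is a simplified, unofficial Python sketch in the spirit of intervention-based influence measures such as QII: randomize one input at a time and measure how often the model’s decisions change. This is not the CMU researchers’ actual method or code; the function, model interface, and usage are assumptions made for illustration.

```python
# Simplified, unofficial sketch of an intervention-style influence measure.
# It randomizes a single feature and records how often predictions flip,
# which approximates that feature's weight in the final decision.
import numpy as np

def influence(model, X, feature, n_samples=30, rng=None):
    """Average fraction of decisions that change when `feature` is resampled."""
    if rng is None:
        rng = np.random.default_rng(0)
    baseline = model.predict(X)
    changes = []
    for _ in range(n_samples):
        X_perturbed = X.copy()
        # Break the link between this feature and the outcome by shuffling it.
        X_perturbed[:, feature] = rng.permutation(X[:, feature])
        changes.append(np.mean(model.predict(X_perturbed) != baseline))
    return float(np.mean(changes))

# Usage sketch (assumes a fitted scikit-learn-style classifier `clf`,
# a NumPy feature matrix `X`, and a list `feature_names`):
# weights = {name: influence(clf, X, j) for j, name in enumerate(feature_names)}
```

A transparency report of the kind described above could then list these per-factor weights alongside the decision in question.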

Nudge 2.0: A broader toolkit for lasting behavior change


Cait Lamberton and Benjamin Castleman in the Huffington Post: “Nudges are all around us. Chances are that someone has nudged you today—even if you didn’t realize it. Maybe it was your doctor’s office, sending you a text message about an upcoming appointment. Or maybe it was an airline website, urging you to make a reservation because “only three tickets are left at this price.” In fact, the private sector has been nudging us in one way or another for at least 75 years, since the heyday of the Madison Avenue Ad Men.

It’s taken a few generations, but the public sector is starting to catch on. In policy domains ranging from consumer finance and public health to retirement planning and education, researchers are applying behavioral science insights to help people make more informed decisions that lead to better long-term outcomes.

Sometimes these nudges take the form of changing the rules that determine whether someone participates in a program or not (like switching the default so people are automatically enrolled in a retirement savings plan unless they opt out, rather than only enrolling people who actively sign up for the program). But oftentimes, nudges can be as simple as sending people simplified information about opportunities that are available to them, or reminders about important tasks they have to complete in order to participate in beneficial programs.

A growing body of research demonstrates that nudges like these, despite being low touch and costing very little, can lead to substantial improvements in educational outcomes, whether it’s parents reading more to their children, middle school students completing more class assignments, or college students successfully persisting in college….

As impressive as these results have been, many of the early nudge studies in education have focused on fairly low-hanging fruit. We’re often helping people follow through on an intention they already have, or informing them about opportunities or resources that they didn’t know or were confused about. What’s less clear, however, is how well these strategies can support sustained behavior change, like going to school every day or avoiding substance abuse….

But what if we want to change someone’s direction? In real-world terms, what if a student is struggling in school but isn’t even considering looking for help? What if their lives are too busy for them to search for or meet with a tutor on a consistent basis? What if they have a nagging feeling that they’re just not the kind of person who succeeds in school, so they don’t see the point in even trying?

For these types of behavior change, we need an expanded nudge toolkit — what we’ll call Nudge 2.0. These strategies go beyond information simplification, reminders, and professional assistance, and address the decision-making person more holistically: people’s identity, their psychology, their emotions, and the competing forces that vie for their attention….(More)”

All European scientific articles to be freely accessible by 2020


EU Presidency: “All scientific articles in Europe must be freely accessible as of 2020. EU member states want to achieve optimal reuse of research data. They are also looking into a European visa for foreign start-up founders.

And, according to the new Innovation Principle, new European legislation must take account of its impact on innovation. These are the main outcomes of the meeting of the Competitiveness Council in Brussels on 27 May.

Sharing knowledge freely

Under the presidency of Netherlands State Secretary for Education, Culture and Science Sander Dekker, the EU ministers responsible for research and innovation decided unanimously to take these significant steps. Mr Dekker is pleased that these ambitions have been translated into clear agreements to maximise the impact of research. ‘Research and innovation generate economic growth and more jobs and provide solutions to societal challenges,’ the state secretary said. ‘And that means a stronger Europe. To achieve that, Europe must be as attractive as possible for researchers and start-ups to locate here and for companies to invest. That calls for knowledge to be freely shared. The time for talking about open access is now past. With these agreements, we are going to achieve it in practice.’

Open access

Open access means that scientific publications on the results of research supported by public and public-private funds must be freely accessible to everyone. That is not yet the case. The results of publicly funded research are currently not accessible to people outside universities and knowledge institutions. As a result, teachers, doctors and entrepreneurs do not have access to the latest scientific insights that are so relevant to their work, and universities have to take out expensive subscriptions with publishers to gain access to publications.

Reusing research data

From 2020, all scientific publications on the results of publicly funded research must be freely available. It must also be possible to optimally reuse research data. To achieve that, the data must be made accessible, unless there are well-founded reasons for not doing so, for example intellectual property rights or security or privacy issues….(More)”

Time for sharing data to become routine: the seven excuses for not doing so are all invalid


Paper by Richard Smith and Ian Roberts: “Data are more valuable than scientific papers, but researchers are incentivised to publish papers, not to share data. Patients are the main beneficiaries of data sharing, but researchers have several incentives not to share: others might use their data to get ahead in the academic rat race; they might be scooped; their results might not be replicable; competitors may reach different conclusions; their data management might be exposed as poor; patient confidentiality might be breached; and technical difficulties make sharing impossible. All of these barriers can be overcome and researchers should be rewarded for sharing data. Data sharing must become routine….(More)”

Data Science Ethical Framework


UK Cabinet Office: “Data science provides huge opportunities for government. Harnessing new forms of data with increasingly powerful computer techniques increases operational efficiency, improves public services and provides insight for better policymaking.

We want people in government to feel confident using data science techniques to innovate. This guidance is intended to bring together relevant laws and best practice, to give teams robust principles to work with.

The publication is a first version that we are asking the public, experts, civil servants and other interested parties to help us perfect and iterate. This will include taking on evidence from a public dialogue on data science ethics. It was published on 19 May by the Minister for Cabinet Office, Matt Hancock. If you would like to help us iterate the framework, find out how to get in touch at the end of this blog. See Data Science Ethical Framework (PDF, 8.28MB, 17 pages).

Improving patient care by bridging the divide between doctors and data scientists


 at The Conversation: “While wonderful new medical discoveries and innovations are in the news every day, doctors struggle daily with using information and techniques available right now while carefully adopting new concepts and treatments. As a practicing doctor, I deal with uncertainties and unanswered clinical questions all the time…. At the moment, a report from the National Academy of Medicine tells us, most doctors base most of their everyday decisions on guidelines from (sometimes biased) expert opinions or small clinical trials. It would be better if they were from multicenter, large, randomized controlled studies, with tightly controlled conditions ensuring the results are as reliable as possible. However, those are expensive and difficult to perform, and even then often exclude a number of important patient groups on the basis of age, disease and sociological factors.

Part of the problem is that health records are traditionally kept on paper, making them hard to analyze en masse. As a result, most of what medical professionals might have learned from experiences was lost – or at least was inaccessible to another doctor meeting with a similar patient.

A digital system would collect and store as much clinical data as possible from as many patients as possible. It could then use information from the past – such as blood pressure, blood sugar levels, heart rate and other measurements of patients’ body functions – to guide future doctors to the best diagnosis and treatment of similar patients.
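As a rough illustration of the “similar patients” idea described above, here is a minimal Python sketch that indexes past patients by a few routine measurements and retrieves the closest matches for a new patient. The measurements, example data, and scikit-learn-based nearest-neighbor approach are illustrative assumptions, not a description of any deployed clinical system.

```python
# Minimal sketch of similar-patient retrieval: scale a few routine vital-sign
# measurements and look up the nearest historical cases for a new patient.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Illustrative historical records: [systolic BP, blood glucose, heart rate]
past_patients = np.array([
    [120, 90, 72],
    [160, 180, 95],
    [135, 110, 80],
    [110, 85, 65],
])

scaler = StandardScaler().fit(past_patients)
index = NearestNeighbors(n_neighbors=2).fit(scaler.transform(past_patients))

new_patient = np.array([[150, 170, 90]])
distances, neighbors = index.kneighbors(scaler.transform(new_patient))
print("Most similar past patients:", neighbors[0])  # indices of comparable cases
```

In practice the feature set, normalization, and any downstream treatment suggestions would need extensive clinical validation; the sketch only shows the retrieval mechanics.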

Industrial giants such as Google, IBM, SAP and Hewlett-Packard have also recognized the potential for this kind of approach, and are now working on how to leverage population data for the precise medical care of individuals.

Collaborating on data and medicine

At the Laboratory of Computational Physiology at the Massachusetts Institute of Technology, we have begun to collect large amounts of detailed patient data in the Medical Information Mart in Intensive Care (MIMIC). It is a database containing information from 60,000 patient admissions to the intensive care units of the Beth Israel Deaconess Medical Center, a Boston teaching hospital affiliated with Harvard Medical School. The data in MIMIC has been meticulously scoured so individual patients cannot be recognized, and is freely shared online with the research community.
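For readers curious what working with the shared data can look like, here is a small Python sketch that summarizes a MIMIC-style admissions table with pandas. It assumes the CSV export and column names commonly documented for MIMIC-III (an ADMISSIONS table with SUBJECT_ID, HADM_ID, ADMITTIME and DISCHTIME); check the documentation of the release you actually obtain before relying on these names.

```python
# Sketch of a first-pass summary over a MIMIC-style ADMISSIONS table.
# Column names follow the commonly documented MIMIC-III CSV export and may
# differ in other releases.
import pandas as pd

admissions = pd.read_csv("ADMISSIONS.csv", parse_dates=["ADMITTIME", "DISCHTIME"])

n_patients = admissions["SUBJECT_ID"].nunique()
n_admissions = admissions["HADM_ID"].nunique()
avg_stay_days = (admissions["DISCHTIME"] - admissions["ADMITTIME"]).dt.days.mean()

print(f"{n_patients} patients, {n_admissions} admissions, "
      f"average stay {avg_stay_days:.1f} days")
```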

But the database itself is not enough. We bring together front-line clinicians (such as nurses, pharmacists and doctors) to identify questions they want to investigate, and data scientists to conduct the appropriate analyses of the MIMIC records. This gives caregivers and patients the best individualized treatment options in the absence of a randomized controlled trial.

Bringing data analysis to the world

At the same time, we are working to bring these data-enabled systems for supporting medical decisions to countries with limited health care resources, where research is considered an expensive luxury. Often these countries have few or no medical records – even on paper – to analyze. We can help them collect health data digitally, creating the potential to significantly improve medical care for their populations.

This task is the focus of Sana, a collection of technical, medical and community experts from across the globe that is also based in our group at MIT. Sana has designed a digital health information system specifically for use by health providers and patients in rural and underserved areas.

At its core is an open-source system that uses cellphones – common even in poor and rural nations – to collect, transmit and store all sorts of medical data. It can handle not only basic patient data such as height and weight, but also photos and X-rays, ultrasound videos, and electrical signals from a patient’s brain (EEG) and heart (ECG).
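To give a flavor of what such a record might look like in practice, here is a hypothetical Python sketch of an encounter payload that a phone-based client could package and upload when connectivity is available. The field names and endpoint are illustrative assumptions only; they are not Sana’s actual data model or API.

```python
# Hypothetical encounter record a mobile client might assemble and upload.
# All field names and the endpoint URL are placeholders for illustration.
import base64
import json
from urllib import request

encounter = {
    "patient_id": "demo-0001",
    "collected_by": "community-health-worker-17",
    "observations": {
        "height_cm": 162,
        "weight_kg": 58,
        "heart_rate_bpm": 74,
    },
    # Binary data such as photos, ultrasound clips, or ECG traces could be
    # attached as base64-encoded blobs alongside the structured fields.
    "attachments": [
        {"type": "photo", "data": base64.b64encode(b"<jpeg bytes>").decode()},
    ],
}

req = request.Request(
    "https://example.org/api/encounters",  # placeholder endpoint
    data=json.dumps(encounter).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# response = request.urlopen(req)  # sent once the phone has connectivity
```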

Partnering with universities and health organizations, Sana organizes training sessions (which we call “bootcamps”) and collaborative workshops (called “hackathons”) to connect nurses, doctors and community health workers at the front lines of care with technology experts in or near their communities. In 2015, we held bootcamps and hackathons in Colombia, Uganda, Greece and Mexico. The bootcamps teach students in technical fields like computer science and engineering how to design and develop health apps that can run on cellphones. Immediately following the bootcamp, the medical providers join the group and the hackathon begins…At the end of the day, though, the purpose is not the apps….(More)