Private Data and the Public Good


Gideon Mann’s remarks on the occasion of the Robert Kahn distinguished lecture at The City College of New York on 5/22/16: “…and opportunities about a specific aspect of this relationship, the broader need for computer science to engage with the real world. Right now, a key aspect of this relationship is being built around the risks and opportunities of the emerging role of data.

Ultimately, I believe that these relationships, between computer science and the real world, between data science and real problems, hold the promise to vastly increase our public welfare. And today, we, the people in this room, have a unique opportunity to debate and define a more moral data economy….

The hybrid research model proposes something different. It embeds, as it were, researchers as practitioners. The thought was always that you would be going about your regular run of business, would face a need to innovate to solve a crucial problem, and would do something novel. At that point, you might choose to work some extra time and publish a paper explaining your innovation. In practice, this model rarely works as expected. Tight deadlines mean the innovation that people do in their normal course of business is incremental.

This model separates research from scientific publication, and shortens the time window of research to what can be realized in a few years. For me, this always felt like a tremendous loss with respect to the older so-called “ivory tower” research model. It didn’t seem at all clear how this kind of model would produce the sea change of thought engendered by Shannon’s work, nor did it seem that Claude Shannon would ever want to work there. This kind of environment would never support the freestanding wonder, like the robot mouse that Shannon worked on. Moreover, I always believed that crucial to research is publication and participation in the scientific community. Without this engagement, it feels like something different: innovation, perhaps.

It is clear that the monopolistic environment that enabled AT&T to support this ivory tower research doesn’t exist anymore.

Now, the hybrid research model was one model of research at Google, but there is another model as well: the moonshot model, as exemplified by Google X. Google X brought together focused research teams to drive research and development around a particular project, Google Glass and the self-driving car being two notable examples. Here the focus isn’t research, but building a new product, with research as potentially a crucial blocking issue. Since the goal of Google X is directly to develop a new product, by definition they don’t publish papers along the way, but they’re not as tied to short-term deliverables as the rest of Google is. However, they are again decidedly un-Bell-Labs-like: a secretive, tightly focused, non-publishing group. DeepMind is a similarly constituted initiative, working, for example, on a best-in-the-world Go-playing algorithm, with publications happening sparingly.

Unfortunately, both of these approaches, the hybrid research model and the moonshot model, stack the deck towards a particular kind of research: research that leads to relatively short-term products that generate corporate revenue. While this kind of research is good for society, it isn’t the only kind of research that we need. We urgently need research that is long-term, and that is undertaken even without a clear local financial impact. In some sense this is a “tragedy of the commons”, where a shared public good (the commons) is not supported because everyone can benefit from it without giving back. Academic research is a non-rival, non-excludible good, and thus can reasonably be expected to be underfunded. In certain cases, this takes on an ethical dimension, particularly in health care, where the choice of what diseases to study and address has a tremendous potential to affect human life. Should we research heart disease or malaria? This decision makes a huge impact on global human health, but is vastly informed by the potential profit from each of these various medicines….

Private Data means research is out of reach

The larger point that I want to make is that, in the absence of places where long-term research can be done in industry, academia has a tremendous potential opportunity. Unfortunately, it is actually quite difficult to do the work that needs to be done in academia, since many of the resources needed to push the state of the art are only found in industry: in particular, data.

Of course, academia also lacks machine resources, but this is a simpler problem to fix: it’s a matter of money. Resources from the government could go to enabling research groups to build their own data centers or to acquire computational resources from the market, e.g., Amazon. This is aided by the compute philanthropy that Google and Microsoft practice, which grants compute cycles to academic organizations.

But the data problem is much harder to address. The data being collected and generated at private companies could enable amazing discoveries and research, but is impossible for academics to access. The lack of access to private data from companies actually has much more significant effects than inhibiting research. In particular, the consumer-level data collected by social networks and internet companies could do much more than ad targeting.

Just for public health (suicide prevention, addiction counseling, mental health monitoring) there is enormous potential in the use of our online behavior to aid the most needy, and academia and non-profits are set up to enable this work, while companies are not.

To give one example, anorexia and eating disorders are vicious killers. 20 million women and 10 million men suffer from a clinically significant eating disorder at some time in their life, and eating disorders have the highest mortality rate of any mental health disorder, with a jaw-dropping estimated mortality rate of 10%, both directly from injuries sustained by the disorder and by suicide resulting from the disorder.

Eating disorders are particular in that sufferers often seek out confirmatory information: blogs, images and pictures that glorify and validate what sufferers see as “lifestyle” choices. Browsing behavior that seeks out images and guidance on how to starve yourself is a key indicator that someone is suffering. Tumblr, Pinterest, and Instagram are places where people host and seek out this information. Tumblr has tried to help address this severe mental health issue by banning blogs that advocate for self-harm and by adding PSAs to searches for terms related to anorexia. But clearly, this is not the be-all and end-all of work that could be done to detect and assist people at risk of dying from eating disorders. Moreover, this data could also help understand the nature of those disorders themselves….

There is probably a role for a data ombudsman within private organizations — someone to protect the interests of the public’s data inside of an organization. Like a ‘public editor’ in a newspaper according to how you’ve set it up. There to protect and articulate the interests of the public, which means probably both sides — making sure a company’s data is used for public good where appropriate, and making sure the ‘right’ to privacy of the public is appropriately safeguarded (and probably making sure the public is informed when their data is compromised).

Next, we need a platform to enable collaboration around social good among companies and between companies and academics. This platform would enable trusted users to have access to a wide variety of data, and speed the process of research.

Finally, I wonder if there is a way that government could support research sabbaticals inside of companies. Clearly, the opportunities for this research far outstrip what is currently being done…(more)”

Infomocracy (Novel)


Malka Older’s debut novel: “It’s been twenty years and two election cycles since Information, a powerful search engine monopoly, pioneered the switch from warring nation-states to global micro-democracy. The corporate coalition party Heritage has won the last two elections. With another election on the horizon, the Supermajority is in tight contention, and everything’s on the line.

With power comes corruption. For Ken, this is his chance to do right by the idealistic Policy1st party and get a steady job in the big leagues. For Domaine, the election represents another staging ground in his ongoing struggle against the pax democratica. For Mishima, a dangerous Information operative, the whole situation is a puzzle: how do you keep the wheels running on the biggest political experiment of all time, when so many have so much to gain?…(More)”

#OpenZika project


World Community Grid: “In February 2016, the World Health Organization declared the Zika virus to be a global public health emergency due to its rapid spread and new concerns about its link to a rise in neurological conditions.

The virus is rapidly spreading in new geographic areas such as the Americas, where people have not been previously exposed to the disease and therefore have little immunity to it. In April 2016, the Centers for Disease Control announced that a rise in severe neurological disorders, especially in children, has been linked to the Zika virus. Some pregnant women who have contracted the Zika virus have given birth to infants with a condition called microcephaly, which results in brain development issues typically leading to severe mental deficiencies. In other cases, paralysis and other neurological problems can occur, even in adults.

Problem

Currently, there is no vaccine to provide immunity to the disease and no antiviral drug for curing Zika, although various efforts are underway. Even though the virus was first identified in 1947, there has been little research since then, because the symptoms of the infection are usually mild. However, new data on links between Zika and microcephaly or other neurological issues have revealed that the disease may not be so benign, prompting the need for intensified research efforts.

Proposed Solution

The OpenZika project on World Community Grid aims to identify drug candidates to treat the Zika virus in someone who has been infected. The project will target proteins that the Zika virus likely uses to survive and spread in the body, based on what is known from similar diseases, such as dengue virus and yellow fever. In order to develop an anti-Zika drug, researchers need to identify which of millions of chemical compounds might be effective at interfering with these key proteins. The effectiveness of each compound will be tested in virtual experiments, called “docking calculations,” performed on World Community Grid volunteers’ computers and Android devices. These calculations would help researchers focus on the most likely compounds that may eventually lead to an antiviral medicine….(More)”

If you build it… will they come?


Laura Bacon at Omidyar Network: “What do datasets on Danish addresses, Indonesian elections, Singapore Dengue Fever, Slovakian contracts, Uruguayan health service provision, and Global weather systems have in common? Read on to learn more…

On May 12, 2016, more than 40 nations’ leaders gathered in London for an Anti-Corruption Summit, convened by UK Prime Minister David Cameron. Among the commitments made, 40 countries pledged to make their procurement processes open by default, with 14 countries specifically committing to publish to the Open Contracting Data Standard.

This conference and these commitments can be seen as part of a larger global norm toward openness and transparency, also embodied by the Open Government Partnership, Open Data Charter, and increasing numbers of Open Data Portals.

As government data is increasingly published openly in the public domain, valid questions have been raised about what impact the data will have: As governments release this data, will it be accessed and used? Will it ultimately improve lives, root out corruption, hold answers to seemingly intractable problems, and lead to economic growth?*

Omidyar Network — having supported several Open Data organizations and platforms such as Open Data Institute, Open Knowledge, and Web Foundation — sought data-driven answers to these questions. After a public call for proposals, we selected NYU’s GovLab to conduct research on the impact open data has already had. Not the potential or prospect of impact, but past proven impact. The GovLab research team, led by Stefaan Verhulst, investigated a variety of sectors — health, education, elections, budgets, contracts, etc. — in a variety of locations, spanning five continents.

Their findings are promising and exciting, demonstrating that open data is changing the world by empowering people, improving governance, solving public problems, and leading to innovation. A summary is contained in this Key Findings report, and is accompanied by many open data case studies posted in this Open Data Impact Repository.

Of course, stories such as this are not 100% rosy, and the report is clear about the challenges ahead. There are plenty of cases in which open data has had minimal impact. There are cases where there was negative impact. And there are obstacles to open data reaching its full potential: namely, open data projects that don’t respond to citizens’ questions and needs, a lack of technical capacity on either the data provider or data user side, inadequate protections for privacy and security, and a shortage of resources.

But this research holds good news: Danish addresses, Indonesian elections, Singapore Dengue Fever, Slovakian contracts, Uruguayan health service provision, Global weather systems, and others were all opened up. And all changed the world by empowering citizens, improving governance, solving public problems, and leading to innovation. Please see this report for more….(More)”

See also odimpact.org

How to implement “open innovation” in city government


Victor Mulas at the World Bank: “City officials are facing increasingly complex challenges. As urbanization rates grow, cities face higher demand for services from a larger and more densely distributed population. On the other hand, rapid changes in the global economy are affecting cities that struggle to adapt to these changes, often resulting in economic depression and population drain.

“Open innovation” is the latest buzzword circulating in forums on how to address the increased volume and complexity of challenges for cities and governments in general.

But, what is open innovation?

Traditionally, public services were designed and implemented by a group of public officials. Open innovation allows us to design these services with multiple actors, including those who stand to benefit from the services, resulting in more targeted and better tailored services, often implemented through partnership with these stakeholders. Open innovation allows cities to be more productive in providing services while addressing increased demand and higher complexity of services to be delivered.

New York, Barcelona, Amsterdam and many other cities have been experimenting with this concept, introducing challenges for entrepreneurs to address common problems or inviting stakeholders to co-create new services. Open innovation has gone from being a “buzzword” to another tool in the city officials’ toolbox.

However, even cities that embrace open innovation are still struggling to implement it beyond a few specific areas. This is understandable, as introducing open innovation practically requires a new way of doing things for city governments, which tend to be complex and bureaucratic organizations.

Counting on an engaged mayor is not enough to bring about this kind of transformation. Changing the behavior of city officials requires their buy-in; it can’t be done top-down.

We have been introducing open innovation to cities and governments for the last three years in Chile, Colombia, Egypt and Mozambique. We have addressed specific challenges and iteratively designed and tested a systematic methodology to introduce open innovation in government through both top-down and bottom-up approaches. We have tested this methodology in Colombia (Cali, Barranquilla and Manizales) and Chile (metropolitan area of Gran Concepción). We have identified “internal champions” (i.e., government officials who advocate the new methodology), and external stakeholders organized in an “innovation hub” that provides long-term sustainability and scalability of interventions. We believe that this methodology is easily applicable beyond cities to other government entities at the regional and national levels. …To understand how the methodology works in practice, we describe in this report the process and its results in its application in the city area of Gran Concepción, in Chile. For this activity, the urban transport sector was selected, and the targets of intervention were the regional and municipal government departments in charge of urban transport in the area of Gran Concepción. The activity in Chile resulted in a threefold impact:

  1. It catalyzed the adoption of the bottom-up smart city model following this new methodology throughout Chile; and
  2. It expanded the implementation and mainstreaming of the methodologies developed and tested through this activity in other World Bank projects.

More information about this activity in Chile can be found on the Smart City Gran Concepcion webpage…(More)”

Open data + increased disclosure = better public-private partnerships


David Bloomgarden and Georg Neumann at Fomin Blog: “The benefits of open and participatory public procurement are increasingly being recognized by international bodies such as the Group of 20 major economies, the Organisation for Economic Co-operation and Development, and multilateral development banks. Value for money, more competition, and better goods and services for citizens all result from increased disclosure of contract data. Greater openness is also an effective tool to fight fraud and corruption.

However, because public-private partnerships (PPPs) are planned over a long timeframe and involve a large number of groups, implementing greater levels of openness in disclosure is complicated. This complexity can be a challenge to good design. Finding a structured and transparent approach to managing PPP contract data is fundamental for a project to be accepted and used by its local community….

In open contracting, all data is disclosed during the public procurement process, from the planning stage, to the bidding and awarding of the contract, to the monitoring of the implementation. A global open source data standard, which is already being implemented in countries as diverse as Canada, Paraguay, and Ukraine, is used to publish that data. Using open data throughout the contracting process provides opportunities to innovate in managing bids, fixing problems, and integrating feedback as needed. Open contracting contributes to the overall social and environmental sustainability of infrastructure investments.

In the case of Mexico’s airport, the project publishes details of awarded contracts, including visualizing the flow of funds and detailing the full amounts of awarded contracts and renewable agreements. Standardized, timely, and open data that follow global standards such as the Open Contracting Data Standard will make this information useful for analysis of value for money, cost-benefit, sustainability, and monitoring performance. Crucially, open contracting will shift the focus from the inputs into a PPP, to the outputs: the goods and services being delivered.

Benefits of open data for PPPs

We think that better and open data will lead to better PPPs. Here’s how:

1. Using user feedback to fix problems

The Brazilian state of Minas Gerais has been a leader in transparent PPP contracts with full proactive disclosure of the contract terms, as well as of other relevant project information—a practice that puts a government under more scrutiny but makes for better projects in the long run.

According to Marcos Siqueira, former head of the PPP Unit in Minas Gerais, “An adequate transparency policy can provide enough information to users so they can become contract watchdogs themselves.”

For example, a public-private contract was signed in 2014 to build a $300 million waste treatment plant for 2.5 million people in the metropolitan area of Belo Horizonte, the capital of Minas Gerais. As the team members conducted appraisals, they disclosed them on the Internet. In addition, the team held around 20 public meetings and identified all the stakeholders in the project. One notable result of the sharing and discussion of this information was the relocation of the facility to a less-populated area. When the project went to the bidding phase, it was much closer to the expectations of its various stakeholders.

2. Making better decisions on contracts and performance

Chile has been a leader in developing PPPs (which it refers to as concessions) for several decades, in a range of sectors: urban and inter-urban roads, seaports, airports, hospitals, and prisons. The country tops the list for the best enabling environment for PPPs in Latin America and the Caribbean, as measured by Infrascope, an index produced by the Economist Intelligence Unit and the Multilateral Investment Fund of the IDB Group.

Chile’s distinction is that it discloses information on performance of PPPs that are underway. The government’s Concessions Unit regularly publishes summaries of the projects during their different phases, including construction and operation. The reports are non-technical, yet include all the necessary information to understand the scope of the project…(More)”

The Small World Initiative: An Innovative Crowdsourcing Platform for Antibiotics


Ana Maria Barral et al. in FASEB Journal: “The Small World Initiative™ (SWI) is an innovative program that encourages students to pursue careers in science and sets forth a unique platform to crowdsource new antibiotics. It centers around an introductory biology course through which students perform original hands-on field and laboratory research in the hunt for new antibiotics. Through a series of student-driven experiments, students collect soil samples, isolate diverse bacteria, test their bacteria against clinically relevant microorganisms, and characterize those showing inhibitory activity. This is particularly relevant since over two thirds of antibiotics originate from soil bacteria or fungi. SWI’s approach also provides a platform to crowdsource antibiotic discovery by tapping into the intellectual power of many people concurrently addressing a global challenge, and advances promising candidates into the drug development pipeline. This unique class approach harnesses the power of active learning to achieve both educational and scientific goals… We will discuss our preliminary student evaluation results, which show the compelling impact of the program in comparison to traditional introductory courses. Ultimately, the mission of the program is to provide an evidence-based approach to teaching introductory biology concepts in the context of a real-world problem. This approach has been shown to be particularly impactful on underrepresented STEM talent pools, including women and minorities….(More)”

Global sharing of HIV vaccine research


Springwise: “Noticing that a global, collaborative effort is missing in the world of HIV vaccine research, scientists came together to make it a reality. Populated by research from the Collaboration for AIDS Vaccine Discovery — an international network of laboratories — DataSpace is a partnership between the Statistical Center for HIV/AIDS Research and Prevention, data management and software development company LabKey, and technology product development company Artefact.

Through pooled research results, scientists hope to make data more accessible and comparable. Two aspects make the platform particularly powerful. The Artefact team hand-coded a number of research points to allow results from multiple studies to be compared like-for-like. And rather than discard the findings of failed or inconclusive studies, DataSpace includes them in analysis, vastly increasing the volume of available information.

Material is added as study results become available, creating a constantly developing resource. Being able to quickly test ideas online helps researchers make serendipitous connections and avoid duplicating efforts….(More)”

Insights On Collective Problem-Solving: Complexity, Categorization And Lessons From Academia


Part 3 of an interview series by Henry Farrell for the MacArthur Research Network on Opening Governance: “…Complexity theorists have devoted enormous energy and attention to thinking about how complex problems, in which different factors interact in ways that are hard to predict, can best be solved. One key challenge is categorizing problems, so as to understand which approaches are best suited to addressing them.

Scott Page is the Leonid Hurwicz Collegiate Professor of Complex Systems at the University of Michigan, Ann Arbor, and one of the world’s foremost experts on diversity and problem-solving. I asked him a series of questions about how we might use insights from academic research to think better about how problem solving works.

Henry: One of the key issues of collective problem-solving is what you call the ‘problem of problems’ – the question of identifying which problems we need to solve. This is often politically controversial – e.g., it may be hard to get agreement that global warming, or inequality, or long prison sentences are a problem. How do we best go about identifying problems, given that people may disagree?

Scott: In a recent big think paper on the potential of diversity for collective problem solving in Scientific American, Katherine Phillips writes that group members must feel validated, that they must share a commitment to the group, and they must have a common goal if they are going to contribute. This implies that you won’t succeed in getting people to collaborate by setting an agenda from on high and then seeking to attract diverse people to further that agenda.

One way of starting to tackle the problem of problems is to steal a rule of thumb from Getting to Yes, by getting people to think about their broad interests rather than the position that they’re starting from. People often agree on their fundamental desires but disagree on how they can be achieved. For example, nearly everyone wants less crime, but they may disagree over whether they think the solution to crime involves tackling poverty or imposing longer prison sentences. If you can get them to focus on their common interest in solving crime rather than their disagreements, you’re more likely to get them to collaborate usefully.

Segregation amplifies the problem of problems. We live in towns and neighborhoods segregated by race, income, ideology, and human capital. Democrats live near Democrats and Republicans near Republicans. Consensus requires integration. We must work across ideologies. Relatedly, opportunity requires more than access. Many people grow up not knowing any engineers, dentists, doctors, lawyers, and statisticians. This isolation narrows the set of careers they consider and it reduces the diversity of many professions. We cannot imagine lives we do not know.

Henry: Once you get past the problem of problems, you still need to identify which kind of problem you are dealing with. You identify three standard types of problems: solution problems, selection problems and optimization problems. What – very briefly – are the key differences between these kinds of problems?

Scott: I’m constantly pondering the potential set of categories in which collective intelligence can emerge. I’m teaching a course on collective intelligence this semester and the undergraduates and I developed an acronym SCARCE PIGS to describe the different types of domains. Here’s the brief summary:

  • Predict: when individuals combine information, models, or measurements to estimate a future event, guess an answer, or classify an event. Examples might involve betting markets, or combined efforts to guess a quantity, such as Francis Galton’s example of people at a fair trying to guess the weight of a steer.
  • Identify: when individuals have local, partial, or possibly erroneous knowledge and collectively can find an object. Here, an example is DARPA’s Red Balloon project.
  • Solve: when individuals apply and possibly combine higher order cognitive processes and analytic tools for the purpose of finding or improving a solution to a task. Innocentive and similar organizations provide examples of this.
  • Generate: when individuals apply diverse representations, heuristics, and knowledge to produce something new. An everyday example is creating a new building.
  • Coordinate: when individuals adopt similar actions, behaviors, beliefs, or mental frameworks by learning through local interactions. Ordinary social conventions such as people greeting each other are good examples.
  • Cooperate: when individuals take actions, not necessarily in their self interest, that collectively produce a desirable outcome. Here, think of managing common pool resources (e.g. fishing boats not overfishing an area that they collectively control).
  • Arrange: when individuals manipulate items in a physical or virtual environment for their own purposes resulting in an organization of that environment. As an example, imagine a student co-op which keeps twenty types of hot sauce in its pantry. If each student puts whichever hot sauce she uses in the front of the pantry, then on average, the hot sauces will be arranged according to popularity, with the most favored hot sauces in the front and the least favored lost in the back.
  • Respond: when individuals react to external or internal stimuli creating collective responses that maintain system level functioning. For example, when yellow jackets attack a predator to maintain the colony, they are displaying this kind of problem solving.
  • Emerge: when individual parts create a whole that has categorically distinct and new functionalities. The most obvious example of this is the human brain….(More)”
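The “Predict” category in the list above can be made concrete with a small sketch. The guesses and the true weight below are made-up numbers for illustration, not Galton’s data, and the function names are hypothetical. The sketch averages individual guesses and checks a standard algebraic identity (sometimes called the diversity prediction theorem): the crowd’s squared error equals the average individual squared error minus the spread of the guesses around the crowd’s mean.

```python
from statistics import mean


def crowd_estimate(guesses):
    """Combine individual guesses into one collective estimate (simple mean)."""
    return mean(guesses)


def squared_error(estimate, truth):
    """Squared distance between an estimate and the true value."""
    return (estimate - truth) ** 2


# Hypothetical guesses (in pounds) and a hypothetical true weight.
guesses = [1050.0, 1300.0, 1180.0, 1240.0, 1120.0]
true_weight = 1200.0

crowd = crowd_estimate(guesses)
crowd_error = squared_error(crowd, true_weight)

# Average of each individual's squared error against the truth.
average_individual_error = mean(squared_error(g, true_weight) for g in guesses)

# How far the guesses spread around the crowd's own estimate.
diversity = mean(squared_error(g, crowd) for g in guesses)

print("crowd estimate:", crowd)
print("crowd squared error:", crowd_error)
print("average individual squared error:", average_individual_error)
print("diversity of guesses:", diversity)
```

Because the diversity term can never be negative, the crowd’s squared error is never larger than the average individual’s squared error, which is one way to see why combining varied guesses can help.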

The Biggest Hope for Ending Corruption Is Open Public Contracting


Gavin Hayman at the Huffington Post: “This week the British Prime Minister David Cameron is hosting an international anti-corruption summit. The scourge of anonymous shell companies and hidden identities rightly seizes the public’s imagination. We can all picture the suitcases of cash and tropical islands involved. As well as acting on offshore and onshore money laundering havens, world leaders at the summit should also be asking themselves where all this money is being stolen from in the first place.

The answer is mostly from public contracting: government spending through private companies to deliver works, goods and services to citizens. It is technical, dull and universally obscure. But it is the single biggest item of spending by government – amounting to a staggering $9,500,000,000,000 each year. This concentration of money, government discretion, and secrecy makes public contracting so vulnerable to corruption. Data on prosecutions tracked by the OECD Anti-Bribery Convention shows that roughly 60% of bribes were paid to win public contracts.

Corruption in contracting deprives ordinary people of vital goods and services, and sometimes even kills: I was one of many Londoners moved by Ai Weiwei’s installation that memorialised the names of thousands of children killed in China’s Sichuan earthquake in 2008. Their supposedly earthquake-proof schools collapsed on them like tofu.

Beyond corruption, inefficiency and mismanagement of public contracts cost countries billions. Governments just don’t seem to know what they are buying, when, from whom, and whether they got a good price.

This problem can be fixed. But it will require a set of innovations best described as open contracting: using accessible open data and better engagement so that citizens, government and business can follow the money in government contracts from planning to tendering to performance and closure. The coordination required can be hard work but it is achievable: any country can make substantial progress on open contracting with some political leadership. My organisation supports an open data standard and a free global helpdesk to assist governments, civil society, and business in this transition….(More)”