Stefaan Verhulst
Richard Waters in the Financial Times: “The ‘open data’ movement has produced a deluge of publicly available information this decade, as governments like those in the UK and US have released large volumes of data for general use.
But the flood has left researchers and data scientists with a problem: how do they find the best data sets, ensure these are accurate and up to date, and combine them with other sources of information?
The most ambitious in a spate of start-ups trying to tackle this problem is set to be unveiled on Monday, when data.world opens for limited release. A combination of online repository and social network, the site is designed to be a central platform to support the burgeoning activity around freely available data.
The aim closely mirrors that of GitHub, which has been credited with spurring the open-source software movement by becoming both a place to store and find free programs and a crowdsourcing tool for identifying the most useful.
“We are at an inflection point,” said Jeff Meisel, chief marketing officer for the US Census Bureau. A “massive amount of data” has been released under open data provisions, he said, but “what hasn’t been there are the tools, the communities, the infrastructure to make that data easier to mash up”….
Data.world plans to seed its site with about a thousand data sets and attract academics as its first users, said Mr Hurt. By letting users create personal profiles on the site, follow others and collaborate around the information they are working on, the site hopes to create the kind of social dynamic that makes it more useful the more it is used.
An attraction of the service is the ability to upload data in any format and then use common web standards to link different data sets and create mash-ups with the information, said Dean Allemang, an expert in online data….(More)”
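To make the linking point concrete, below is a minimal sketch, using Python’s rdflib, of how common web standards let two independently published data sets attach facts to the same identifier, so that a mash-up becomes a simple query. The namespaces, identifiers and values are invented for illustration and do not reflect data.world’s actual implementation.

```python
from rdflib import Graph, Literal, Namespace

# Hypothetical namespaces for two independently published open data sets
CENSUS = Namespace("http://example.org/census/")
HEALTH = Namespace("http://example.org/health/")

g = Graph()
county = CENSUS["county/travis-tx"]  # a shared identifier both publishers reuse

# Data set 1: a population figure keyed to the shared URI
g.add((county, CENSUS.population, Literal(1290188)))

# Data set 2: a health indicator published elsewhere, attached to the same URI
g.add((county, HEALTH.uninsuredRate, Literal(0.17)))

# Because both facts hang off one identifier, the "mash-up" is just a query
for _, predicate, value in g.triples((county, None, None)):
    print(predicate, value)
```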
Pål Sundsøy at arXiv: “The present study provides the first evidence that illiteracy can be reliably predicted from standard mobile phone logs. By deriving a broad set of mobile phone indicators reflecting users’ financial, social and mobility patterns, we show how supervised machine learning can be used to predict individual illiteracy in an Asian developing country, externally validated against a large-scale survey. On average the model performs 10 times better than random guessing, with 70% accuracy. Further, we show how individual illiteracy can be aggregated and mapped geographically at cell tower resolution. Geographical mapping of illiteracy is crucial for knowing where illiterate people are and where to direct resources. In underdeveloped countries such mappings are often based on outdated household surveys with low spatial and temporal resolution. One in five people worldwide struggle with illiteracy, and it is estimated that illiteracy costs the global economy more than 1 trillion dollars each year. These results potentially enable cost-effective, questionnaire-free investigation of illiteracy-related questions on an unprecedented scale…(More)”.
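For readers curious what such a supervised-learning setup looks like in outline, here is a hedged sketch using scikit-learn. The features, labels and model are placeholders; the paper derives its indicators from raw call detail records and validates against survey data, which this toy example does not reproduce.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix: one row per subscriber, columns standing in for
# phone-log indicators of the kind the paper describes (financial, social and
# mobility patterns). Values and labels here are random, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))      # e.g. top-up amount, contacts, mobility radius, SMS ratio
y = rng.integers(0, 2, size=1000)   # survey-derived label: 1 = illiterate, 0 = literate

# A supervised classifier evaluated with cross-validation, standing in for
# the externally validated model in the study
clf = GradientBoostingClassifier()
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("cross-validated AUC:", scores.mean())
```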
Julio Saez-Rodriguez et al in Nature: “The generation of large-scale biomedical data is creating unprecedented opportunities for basic and translational science. Typically, the data producers perform initial analyses, but the most informative methods may well reside with other groups. Crowdsourcing the analysis of complex and massive data has emerged as a framework to find robust methodologies. When the crowdsourcing is done in the form of collaborative scientific competitions, known as Challenges, the validation of the methods is inherently addressed. Challenges also encourage open innovation, create collaborative communities to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories….(More)”
Book by Amal El Fallah Seghrouchni, Fuyuki Ishikawa, Laurent Hérault, and Hideyuki Tokuda: “Smart cities are a new vision for urban development. They integrate information and communication technology infrastructures – in the domains of artificial intelligence, distributed and cloud computing, and sensor networks – into a city, to improve quality of life for its citizens and support sustainable growth. This book explores various concepts for the development of these new technologies (including agent-oriented programming, broadband infrastructures, wireless sensor networks, Internet-based networked applications, open data and open platforms), and how they can provide smart services and enablers in a range of public domains.
The most significant research, both established and emerging, is brought together to enable academics and practitioners to investigate the possibilities of smart cities, and to generate the knowledge and solutions required to develop and maintain them…(More)”
Pete Masters at MissingMaps: “Downloading MapSwipe means that in a matter of minutes you can contribute to the work being done around the world by MSF, the Red Cross and others, and all from the comfort of your own phone!
In a humanitarian crisis, the location of the most vulnerable people is fundamental information for delivering food, shelter, medical care and other services where they are most needed. And, although it may be hard to believe, millions of people around the world are not represented on any accessible map.
Part of the Missing Maps project, MapSwipe enables anyone with a smartphone to contribute to the mapping of these vulnerable communities. Download the app, choose a mission, read the instructions and get started! One tap identifies features, a second tap indicates possible features (if you’re not sure) and a third flags areas of cloudy or poor-quality imagery. If there is nothing to identify, just swipe to the next image! You can download imagery for offline MapSwiping on an underground train or plane, so say goodbye to your wasted commuter hours on Candy Crush or Angry Birds!
Developed by Médecins Sans Frontières / Doctors Without Borders (MSF), MapSwipe serves a direct purpose for NGOs. For example, when MSF responds to major disease outbreaks with mass vaccination campaigns, hundreds of teams have to cover enormous areas (as in the measles outbreak in the Democratic Republic of Congo last year). Now, with MapSwipe, we can give vaccination campaign coordinators a super-fast snapshot of where the population clusters are, helping them to send their teams to the locations where they are most needed to achieve maximum vaccination coverage.
The Missing Maps project is a collaboration in which a large and committed community of NGOs, academic institutes, companies and, above all, individual mappers map vulnerable areas in OpenStreetMap. By using MapSwipe to identify where communities are located, you also give these mappers the ability to use their talents to map the towns and villages in these areas without having to search through miles of jungle and bush to find them, saving time and helping to put valuable data into the hands of field teams even faster….(More)”
Philip Hunter at the EMBO Journal: “Personal health and medical data are a valuable commodity for a number of sectors, from public health agencies to academic researchers to pharmaceutical companies. Moreover, “big data” companies are increasingly interested in tapping into this resource. One such firm is Google, whose subsidiary DeepMind was granted access to medical records on 1.6 million patients who had been treated at some time by three major hospitals in London, UK, in order to develop a diagnostic app. The public discussion it raised was just another sign of the long-running tensions between drug companies, privacy advocates, regulators, legislators, insurers and patients about privacy, consent, rights of access and ownership of medical data that is generated in pharmacies, hospitals and doctors’ surgeries. In addition, the rapid growth of eHealth will add even more health data from mobile phones, portable diagnostic devices and other sources.
These developments are driving efforts to create a legal framework for protecting confidentiality, controlling communication and governing access rights to data. Existing data protection and human rights laws are being modified to account for personal medical and health data, in parallel with the campaign for greater transparency and access to clinical trial data. Healthcare agencies in particular will have to revise their procedures for handling medical or research data that is associated with patients.
Google’s foray into medical data demonstrates the key role of health agencies, in this case the Royal Free NHS Trust, which operates the three London hospitals that granted DeepMind access to patient data. Royal Free approached DeepMind with a request to develop an app for detecting acute kidney injury, which, according to the Trust, affects more than one in six inpatients….(More)”
Alexandra Flynn at Osgoode Digital Commons: “Municipal staff and politicians are moving aside to let someone else make budget decisions – community residents. This practice, known as participatory budgeting or PB, is a completely different way of managing public money. It allows the public both to identify projects and programs that they want to see in their neighbourhoods, and to vote on which ones to fund. The process was developed twenty-five years ago and there are now over 1,500 participatory budgets around the world …
There is no one-size-fits-all model for participatory budgeting. UN-Habitat suggests that the following are essential pieces for the introduction of a participatory budgeting process: the will of the mayor, public interest, clarity on administration and the decision-making process, education tools on the budgeting process, widely distributed information on the participatory budgeting process through all possible means, and information on infrastructure and public service shortfalls. UN-Habitat recommends that participatory budgeting should not be used if honesty and transparency are lacking in local administration. Municipal governments should be clear that the final decision rests with the elected representatives of the local authority and that the process does not replace representative democracy with direct referendums.
Municipalities may want to consider the following issues when implementing participatory budgeting in their communities….(More)”
New book by Francesco Guala: “Understanding Institutions proposes a new unified theory of social institutions that combines the best insights of philosophers and social scientists who have written on this topic. Francesco Guala presents a theory that combines the features of three influential views of institutions: as equilibria of strategic games, as regulative rules, and as constitutive rules.
Guala explains key institutions like money, private property, and marriage, and develops a much-needed unification of equilibrium- and rules-based approaches. Although he uses game theory concepts, the theory is presented in a simple, clear style that is accessible to a wide audience of scholars working in different fields. Outlining and discussing various implications of the unified theory, Guala addresses venerable issues such as reflexivity, realism, Verstehen, and fallibilism in the social sciences. He also critically analyses the theory of “looping effects” and “interactive kinds” defended by Ian Hacking, and asks whether it is possible to draw a demarcation between social and natural science using the criteria of causal and ontological dependence. Focusing on current debates about the definition of marriage, Guala shows how these abstract philosophical issues have important practical and political consequences.
Moving beyond specific cases to general models and principles, Understanding Institutions offers new perspectives on what institutions are, how they work, and what they can do for us….(More)”
Stefaan G. Verhulst at Positive Returns (Medium): “Over the last few years we have seen growing recognition of the potential of “civic tech,” or the use of technology that “empowers citizens to make government more accessible, efficient and effective” (definition provided in “Engines of Change”). One commentator recently described civic tech as “the next big thing.” At the same time, we are yet to witness a true tech-enabled transformation of how government works and how citizens engage with institutions and with each other to solve societal problems. In many ways, civic tech still operates under the radar and often lacks broad acceptance. So how do we accelerate and expand the civic tech sector? How can we build a civic tech field that can stand the test of time?
The “Engines of Change” report written for Omidyar Network by Purpose seeks to provide an answer to these questions in the context of the United States….
Given the new insights gained from the report, how to move forward? How to translate its findings into a strategy that seeks to improve people’s lives and addresses societal problems by leveraging technology? What emerges from reading the report, and reflecting on how fields and movements have been built in other areas (e.g., the digital learning movement by the MacArthur Foundation or the Hewlett Foundation’s efforts to build a conflict resolution field), is a set of design principles that, when applied consistently, may generate a truly lasting civic tech movement. These principles include:
- Define a common problem that matters enough to work on collectively and identify a unique opportunity to solve it. Most successful movements seek to solve hard problems. So what is the problem that civic tech seeks to address? …
- Encourage experimentation. As it stands, there is no shortage of experimentation with new platforms and tools in the civic tech space. What is missing, however, is the type of assessment that uncovers whether or not such efforts are actually working, and why or why not. Rather than viewing experimentation as simply “trying new things,” the field could embrace “fast-cycle action research” to understand both more quickly, and more precisely, when an innovation works, for whom, and under what conditions.
- Establish an evidence base and a common set of metrics. While there is good reason to believe that breakthrough solutions may come from using technology, there are still too few studies measuring exactly how impactful civic tech is. Without a deeper understanding of whether, when, why and to what extent an intervention has made an impact, the civic tech movement will lack credibility. To accelerate the rate of experimentation and create more agile institutions capable of piloting civic tech solutions, we need research that will enable the sector to move away from “faith-based” initiatives toward “evidence-based” ones. The TicTec conference, the Opening Governance Research Network and the recently launched Open Governance Research Exchange are some initiatives that seek to address this shortcoming. Yet more analysis and translation of current findings into clear baselines of impact against common metrics is needed to make the sector more reliable.
- Develop a Network Infrastructure…
- Identify the signal…
As every engineer knows, building engines requires a set of basic design principles. Similarly, transforming the civic tech sector into a sustainable engine of change may require the implementation of the principles outlined above. Let’s build a civic tech sector to last….(More)”
Arun Sundararajan in Fortune: “….Despite some regulators’ fears, the sharing economy may not result in the decline of regulation but rather in its opposite, providing a basis upon which society can develop more rational, ethical, and participatory models of regulation. But what regulation looks like, as well as who actually creates and enforce the regulation, is also bound to change.
There are three emerging models – peer regulation, self-regulatory organizations, and data-driven delegation – that promise a regulatory future for the sharing economy best aligned with society’s interests. In the adapted book excerpt that follows, I explain how the third of these approaches, of delegating enforcement of regulations to companies that store critical data on consumers, can help mitigate some of the biases Airbnb guests may face, and why this is a superior alternative to the “open data” approach of transferring consumer information to cities and state regulators.
Consider a different problem: collecting hotel occupancy taxes from hundreds of thousands of Airbnb hosts rather than from a handful of corporate hotel chains. The delegation of tax collection to Airbnb, something a growing number of cities are experimenting with, has a number of advantages. It is likely to yield higher tax revenues and greater compliance than a system where hosts are required to register directly with the government, which is something occasional hosts seem reluctant to do. It also sidesteps privacy concerns resulting from mandates that digital platforms like Airbnb turn over detailed user data to the government. There is also significant opportunity for the platform to build credibility as it starts to take on quasi-governmental roles like this.
There is yet another advantage, and the one I believe will be the most significant in the long run. It asks a platform to leverage its data to ensure compliance with a set of laws in a manner geared towards delegating responsibility to the platform. You might say that the task in question here — computing tax owed, collecting, and remitting it — is technologically trivial. True. But I like this structure because of the potential it represents. It could be a precursor for much more exciting delegated possibilities.
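A hedged sketch of that delegated task, with the tax rate, booking records and aggregation all invented for illustration: the point is that the platform can compute and remit accurately from data it already holds, without passing host-level records to the regulator.

```python
# Invented rate and booking records, for illustration only
OCCUPANCY_TAX_RATE = 0.14  # e.g. a 14% transient occupancy tax

def tax_owed(nightly_rate: float, nights: int) -> float:
    """Tax the platform would collect from the guest at booking time."""
    return round(nightly_rate * nights * OCCUPANCY_TAX_RATE, 2)

bookings = [
    {"host": "host_123", "nightly_rate": 120.0, "nights": 3},
    {"host": "host_456", "nightly_rate": 85.0, "nights": 2},
]

# The platform remits one aggregate payment to the city, so individual
# host records never need to be handed over to the regulator.
total_remittance = sum(tax_owed(b["nightly_rate"], b["nights"]) for b in bookings)
print(f"Quarterly remittance to city: ${total_remittance:.2f}")
```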
For a couple of decades now, companies of different kinds have been mining the large sets of “data trails” customers provide through their digital interactions. This generates insights of business and social importance. One such effort we are all familiar with is credit card fraud detection. When an unusual pattern of activity is detected, you get a call from your bank’s security team. Sometimes your card is blocked temporarily. The enthusiasm of these digital security systems is sometimes a nuisance, but it stems from your credit card company using sophisticated machine learning techniques to identify patterns that prior experience has told it are associated with a stolen card. It saves billions of dollars in taxpayer and corporate funds by detecting and blocking fraudulent activity swiftly.
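As a rough illustration of the kind of pattern detection described above (not any card network’s actual system), an off-the-shelf anomaly detector can be fitted to ordinary transactions and used to flag unusual ones; the features and numbers below are invented.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy transaction features: amount ($), hour of day, distance from home (km).
# Real fraud systems use far richer, labelled data; this only illustrates
# the pattern-detection idea described above.
rng = np.random.default_rng(42)
normal_transactions = np.column_stack([
    rng.normal(40, 15, 500),   # typical purchase amounts
    rng.normal(14, 4, 500),    # mostly daytime activity
    rng.normal(5, 3, 500),     # close to home
])

model = IsolationForest(contamination=0.01, random_state=0).fit(normal_transactions)

suspicious = np.array([[2500.0, 3.0, 4200.0]])  # large purchase, 3am, far away
print(model.predict(suspicious))  # -1 flags the transaction as anomalous
```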
A more recent visible example of the power of mining large data sets of customer interaction came in 2008, when Google engineers announced that they could predict flu outbreaks using data collected from Google searches, and track the spread of flu outbreaks in real time, providing information well ahead of what was available from the Centers for Disease Control and Prevention’s (CDC) own tracking systems. The Google system’s performance deteriorated after a couple of years, but its impact on public perception of what might be possible using “big data” was immense.
It seems highly unlikely that such a system would have emerged if Google had been asked to hand over anonymized search data to the CDC. In fact, there would probably have been widespread public backlash to this on privacy grounds. Besides, the reason why this capability emerged organically from within Google is partly a consequence of Google having one of the highest concentrations of computer science and machine learning talent in the world.
Similar approaches hold great promise as a regulatory approach for sharing economy platforms. Consider the issue of discriminatory practices. There has long been anecdotal evidence that some yellow cabs in New York discriminate against some nonwhite passengers. There have been similar concerns that such behavior may start to manifest on ridesharing platforms and in other peer-to-peer markets for accommodation and labor services.
For example, a 2014 study by Benjamin Edelman and Michael Luca of Harvard suggested that African American hosts might have lower pricing power than white hosts on Airbnb. While the study did not conclusively establish that the difference is due to guests discriminating against African American hosts, a follow-up study suggested that guests with “distinctively African American names” were less likely to receive favorable responses for their requests to Airbnb hosts. This research raises a red flag about the need for vigilance as the lines between personal and professional blur.
One solution would be to apply machine-learning techniques to identify patterns associated with discriminatory behavior. No doubt, many platforms are already using such systems….(More)”
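A minimal sketch of what such a check could look like, with counts and groupings invented for illustration: a simple statistical test on acceptance rates across guest groups. A production system would use far richer features and control for listing, price, dates and host history.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Invented audit counts a platform could assemble from its own logs:
# booking-request outcomes broken down by guest group.
#                     accepted  declined
requests = np.array([
    [640, 360],   # group A guests
    [520, 480],   # group B guests
])

chi2, p_value, _, _ = chi2_contingency(requests)
print(f"chi2 = {chi2:.1f}, p = {p_value:.4f}")
# A very small p-value flags an acceptance-rate gap worth investigating further.
```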