We Need a Data-Rich Picture of What’s Killing the Planet


Clive Thompson at Wired: “…Marine litter isn’t the only hazard whose contours we can’t fully see. The United Nations has 93 indicators to measure the environmental dimensions of “sustainable development,” and amazingly, the UN found that we have little to no data on 68 percent of them—like how rapidly land is being degraded, the rate of ocean acidification, or the trade in poached wildlife. Sometimes this is because we haven’t collected it; in other cases some data exists but hasn’t been shared globally, or it’s in a myriad of incompatible formats. No matter what, we’re flying blind. “And you can’t manage something if you can’t measure it,” says David Jensen, the UN’s head of environmental peacebuilding.

In other words, if we’re going to help the planet heal and adapt, we need a data revolution. We need to build a “digital ecosystem for the environment,” as Jensen puts it.

The good news is that we’ve got the tools. If there’s one thing tech excels at (for good and ill), it’s surveillance, right? We live in a world filled with cameras and pocket computers, titanic cloud computing, and the eerily sharp insights of machine learning. And this stuff can be used for something truly worthwhile: studying the planet.

There are already some remarkable cases of tech helping to break through the fog. Consider Global Fishing Watch, a nonprofit that tracks the world’s fishing vessels, looking for overfishing. They use everything from GPS-like signals emitted by ships to satellite infrared imaging of ship lighting, plugged into neural networks. (It’s massive, cloud-scale data: over 60 million data points per day, making the AI more than 90 percent accurate at classifying what type of fishing activity a boat is engaged in.)
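
The classification step can be pictured with a toy version of the same technique: summarise each vessel track into movement features and train a classifier on labelled examples. The sketch below is a minimal illustration with synthetic tracks and assumed features; it is not Global Fishing Watch’s actual model, which runs on real AIS data at vastly larger scale.

```python
# A minimal sketch, assuming synthetic tracks: each track is an (N, 2) array
# of [speed_knots, heading_degrees] reports. Not Global Fishing Watch's model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def track_features(track):
    """Summarise one vessel track: mean speed, speed variability, turniness."""
    speeds, headings = track[:, 0], track[:, 1]
    turns = np.abs(np.diff(headings))
    turns = np.minimum(turns, 360 - turns)  # heading wraps around at 360
    return [speeds.mean(), speeds.std(), turns.mean()]

rng = np.random.default_rng(0)
# Toy intuition: fishing-like movement is slow and turn-heavy; transiting is fast and straight.
fishing = [np.column_stack([rng.normal(3, 1, 50), rng.uniform(0, 360, 50)])
           for _ in range(100)]
transit = [np.column_stack([rng.normal(12, 1, 50), np.cumsum(rng.normal(0, 2, 50)) % 360])
           for _ in range(100)]

X = np.array([track_features(t) for t in fishing + transit])
y = np.array([1] * len(fishing) + [0] * len(transit))  # 1 = fishing-like

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([track_features(fishing[0])]))  # expected: [1]
```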

“If a vessel is spending its time in an area that has little tuna and a lot of sharks, that’s questionable,” says Brian Sullivan, cofounder of the project and a senior program manager at Google Earth Outreach. Crucially, Global Fishing Watch makes its data open to anyone—so now the National Geographic Society is using it to lobby for new marine preserves, and governments and nonprofits use it to target illicit fishing.

If we want better environmental data, we’ll need for-profit companies with the expertise and high-end sensors to pitch in too. Planet, a firm with an array of 140 satellites, takes daily snapshots of the entire Earth. Customers like insurance and financial firms love that sort of data. (It helps them understand weather and climate risk.) But Planet also offers it to services like Global Forest Watch, which maps deforestation and makes the information available to anyone (like activists who help bust illegal loggers). Meanwhile, Google’s skill in cloud-based data crunching helps illuminate the state of surface water: Google digitized 30 years of measurements from around the globe—extracting some from ancient magnetic tapes—then created an easy-to-use online tool that lets resource-poor countries figure out where their water needs protecting….(More)”.
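
That water data is exposed through Google Earth Engine, which can be queried programmatically. Here is a rough sketch using the Earth Engine Python API and the public JRC Global Surface Water dataset; the asset ID, band name, and example region are assumptions, not details from the article.

```python
# A rough sketch, assuming the Earth Engine Python API and the public JRC
# Global Surface Water asset; asset ID and band name may differ by version.
import ee

ee.Initialize()  # requires an authenticated Earth Engine account

gsw = ee.Image("JRC/GSW1_4/GlobalSurfaceWater")
occurrence = gsw.select("occurrence")  # % of observed months each pixel held water

# A small bounding box (lon/lat), e.g. part of the Lake Chad basin.
region = ee.Geometry.Rectangle([13.0, 12.5, 15.5, 14.5])

stats = occurrence.reduceRegion(
    reducer=ee.Reducer.mean(),
    geometry=region,
    scale=1000,        # metres per pixel; coarse to keep the request cheap
    maxPixels=1e9,
)
print(stats.getInfo())  # {'occurrence': <mean % water occurrence in region>}
```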

Access to Data in Connected Cars and the Recent Reform of the Motor Vehicle Type Approval Regulation


Paper by Wolfgang Kerber and Daniel Moeller: “The need for regulatory solutions for access to the in-vehicle data and resources of connected cars is one of the major controversial and unsolved policy issues. Last year the EU revised the Motor Vehicle Type Approval Regulation, which already included a FRAND-like solution for access to repair and maintenance information (RMI) to protect competition in the automotive aftermarkets. However, the transition to connected cars significantly changes the technological conditions for this regulatory solution. This paper analyzes the reform of the type approval regulation and shows that the regulatory solutions for access to RMI are so far insufficient to deal with the challenges that come with increased connectivity, e.g. with regard to the new remote diagnostic, repair and maintenance services. An important result of the paper is therefore that the transition to connected cars will require a further reform of the rules for regulated access to RMI (especially with regard to data access, interoperability, and safety/security issues). However, our analysis also suggests that the basic approach of the current regulated access regime for RMI in the type approval regulation can be a model for developing general solutions to the currently unsolved problems of access to in-vehicle data and resources in the ecosystem of connected driving….(More)”.

The Education Data Collaborative: A new kind of partnership.


About: “Whether we work within schools or as part of the broader ecosystem of parent-teacher associations and philanthropic, nonprofit, and volunteer organizations, we need data to guide decisions about investing our time and resources.

This data is typically expensive to gather, often unvalidated (e.g. self-reported), and commonly available only to those who collect or report it. It can even be hard to ask for data when it’s not clear what’s available. At the same time, information – in the form of discrete research, report-card style PDFs, or static websites – is everywhere. The result is that many already resource-thin organizations that could be collaborating around strategies to help kids advance spend a lot of time in isolation collecting and searching for data.

In the past decade, we’ve seen solid progress in addressing part of the problem: the emergence of connected longitudinal data systems (LDS). These warehouses and linked databases contain data that can help us understand how students progress over time. No personally identifiable information (or PII) is shared, yet the data can reveal where interventions are most needed. Because these systems are typically designed for researchers and policy professionals, they are rarely accessible to the educators, parents, and partners – arts, sports, academic enrichment (e.g. STEM), mentoring, and family support programs – that play such important roles in helping young people learn and succeed…
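
A toy example illustrates the privacy-preserving design: row-level records stay inside the system, and only aggregates, with small cells suppressed, are released. The column names, metric, and threshold below are illustrative assumptions; real systems commonly require a minimum cell size of 10 or more.

```python
# A minimal sketch, assuming invented column names: row-level records stay
# inside the LDS; only aggregates with small cells suppressed are released.
import pandas as pd

records = pd.DataFrame({
    "school":   ["A", "A", "A", "B", "B", "B", "B", "B", "C"],
    "grade":    [9, 9, 9, 9, 9, 9, 10, 10, 9],
    "on_track": [1, 0, 1, 1, 1, 0, 0, 1, 1],  # e.g. on track to graduate
})

# Only cohort-level aggregates leave the system; no PII is exported.
out = (records.groupby(["school", "grade"])
              .agg(n=("on_track", "size"), pct_on_track=("on_track", "mean")))

# Suppress cells too small to publish without re-identification risk.
# Threshold of 3 keeps this toy example readable; real systems often use 10+.
MIN_CELL = 3
out.loc[out["n"] < MIN_CELL, "pct_on_track"] = None
print(out)
```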

“We need open tools for the ecosystem – parents, volunteers, non-profit organizations and the foundations and agencies that support them. These partners can realize significant benefit from the same kind of data that policy makers and education leaders hold in their LDS.


That’s why we’re launching the Education Data Collaborative. Working together, we can build tools that help us use data to improve the design, efficacy, and impact of programs and interventions and find new ways to work with public education systems to achieve great things for kids. …Data collaboratives, data trusts, and other kinds of multi-sector data partnerships are among the most important civic innovations to emerge in the past decade….(More)”

The Age of Digital Interdependence


Report of the High-level Panel on Digital Cooperation: “The immense power and value of data in the modern economy can and must be harnessed to meet the SDGs, but this will require new models of collaboration. The Panel discussed potential pooling of data in areas such as health, agriculture and the environment to enable scientists and thought leaders to use data and artificial intelligence to better understand issues and find new ways to make progress on the SDGs. Such data commons would require criteria for establishing relevance to the SDGs, standards for interoperability, rules on access and safeguards to ensure privacy and security.

Anonymised data – information that is rendered anonymous in such a way that the data subject is not or no longer identifiable – about progress toward the SDGs is generally less sensitive and controversial than the use of personal data of the kind companies such as Facebook, Twitter or Google may collect to drive their business models, or facial and gait data that could be used for surveillance. However, personal data can also serve development goals, if handled with proper oversight to ensure its security and privacy.

For example, individual health data is extremely sensitive – but many people’s health data, taken together, can allow researchers to map disease outbreaks, compare the effectiveness of treatments and improve understanding of conditions. Aggregated data from individual patient cases was crucial to containing the Ebola outbreak in West Africa. Private and public sector healthcare providers around the world are now using various forms of electronic medical records. These help individual patients by making it easier to personalise health services, but the public health benefits require these records to be interoperable.
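
A minimal sketch of that aggregation step, using invented field names and toy data: individual case records, far too sensitive to share on their own, are reduced to district-by-week counts that can be shared safely.

```python
# A minimal sketch, assuming invented field names and toy data: sensitive case
# reports are aggregated into an outbreak picture before anything is shared.
import pandas as pd

cases = pd.DataFrame({
    "patient_id": ["p1", "p2", "p3", "p4", "p5"],  # never leaves the source system
    "district":   ["North", "North", "East", "East", "East"],
    "onset_date": pd.to_datetime(
        ["2014-08-01", "2014-08-03", "2014-08-02", "2014-08-09", "2014-08-10"]),
})

# Aggregate to district x epidemiological week; identifiers are dropped.
weekly = (cases
          .assign(week=cases["onset_date"].dt.to_period("W"))
          .groupby(["district", "week"])
          .size()
          .rename("new_cases")
          .reset_index())
print(weekly)  # safe to share: counts only, no patient-level rows
```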

There is scope to launch collaborative projects to test the interoperability of data, standards and safeguards across the globe. The World Health Assembly’s consideration of a global strategy for digital health in 2020 presents an opportunity to launch such projects, which could initially be aimed at global health challenges such as Alzheimer’s and hypertension.

Improved digital cooperation on a data-driven approach to public health has the potential to lower costs, build new partnerships among hospitals, technology companies, insurance providers and research institutes and support the shift from treating diseases to improving wellness. Appropriate safeguards are needed to ensure the focus remains on improving health care outcomes. With testing, experience and necessary protective measures as well as guidelines for the responsible use of data, similar cooperation could emerge in many other fields related to the SDGs, from education to urban planning to agriculture…(More)”.

Data & Policy: A new venue to study and explore policy–data interaction


Opening editorial by Stefaan G. Verhulst, Zeynep Engin and Jon Crowcroft: “…Policy–data interactions or governance initiatives that use data have been the exception rather than the norm, isolated prototypes and trials rather than an indication of real, systemic change. There are various reasons for the generally slow uptake of data in policymaking, and several factors will have to change if the situation is to improve. ….

  • Despite the number of successful prototypes and small-scale initiatives, policy makers’ understanding of data’s potential and its value proposition generally remains limited (Lutes, 2015). There is also limited appreciation of the advances data science has made in the last few years. This is a major limiting factor; we cannot expect policy makers to use data if they do not recognize what data and data science can do.
  • The recent (and justifiable) backlash against how certain private companies handle consumer data has had something of a reverse halo effect: There is a growing lack of trust in the way data is collected, analyzed, and used, and this often leads to a certain reluctance (or simply risk-aversion) on the part of officials and others (Engin, 2018).
  • Despite several high-profile open data projects around the world, much (probably the majority) of data that could be helpful in governance remains either privately held or otherwise hidden in silos (Verhulst and Young, 2017b). There remains a shortage not only of data but, more specifically, of high-quality and relevant data.
  • With few exceptions, the technical capacities of officials remain limited, and this has obviously negative ramifications for the potential use of data in governance (Giest, 2017).
  • It’s not just a question of limited technical capacities. There is often a vast conceptual and values gap between the policy and technical communities (Thompson et al., 2015; Uzochukwu et al., 2016); sometimes it seems as if they speak different languages. Compounding this difference in world views is the fact that the two communities rarely interact.
  • Yet, data about the use of data – and evidence of its impact – remain sparse. The impetus to use more data in policy making is stymied by limited scholarship and a weak evidential basis showing whether and how data can be helpful. Without such evidence, data advocates are limited in their ability to make the case for more data initiatives in governance.
  • Data are not only changing the way policy is developed, but they have also reopened the debate around theory- versus data-driven methods in generating scientific knowledge (Lee, 1973; Kitchin, 2014; Chivers, 2018; Dreyfuss, 2017), thus directly questioning the evidence base for the utilization and implementation of data within policy making. A number of associated challenges are being discussed, such as: (i) the traceability and reproducibility of research outcomes (due to “black box processing”); (ii) the use of correlation instead of causation as the basis of analysis; (iii) biases and uncertainties present in large historical datasets that cause replication and, in some cases, amplification of human cognitive biases and imperfections; and (iv) the incorporation of existing human knowledge and domain expertise into the scientific knowledge generation processes—among many other topics (Castelvecchi, 2016; Miller and Goodchild, 2015; Obermeyer and Emanuel, 2016; Provost and Fawcett, 2013).
  • Finally, we believe that there should be a sound underpinning for a new theory of what we call Policy–Data Interactions. To date, in reaction to the proliferation of data in the commercial world, theories of data management, privacy, and fairness have emerged. From the Human–Computer Interaction world, a manifesto of principles of Human–Data Interaction (Mortier et al., 2014) has found traction; it aims to reduce the asymmetry of power present in current design considerations of systems of data about people. However, we need a consistent, symmetric approach to the consideration of systems of policy and data and how they interact with one another.

All these challenges are real, and they are sticky. We are under no illusions that they will be overcome easily or quickly….

During the past four conferences, we have hosted an incredibly diverse range of dialogues and examinations by key global thought leaders, opinion leaders, practitioners, and the scientific community (Data for Policy, 2015, 2016, 2017, 2019). What became increasingly obvious was the need for a dedicated venue to deepen and sustain the conversations and deliberations beyond the limitations of an annual conference. This leads us to today and the launch of Data & Policy, which aims to confront and mitigate the barriers to greater use of data in policy making and governance.

Data & Policy is a venue for peer-reviewed research and discussion about the potential for and impact of data science on policy. Our aim is to provide a nuanced and multistranded assessment of the potential and challenges involved in using data for policy and to bridge the “two cultures” of science and humanism—as C.P. Snow famously described in his lecture on “Two Cultures and the Scientific Revolution” (Snow, 1959). By doing so, we also seek to bridge the two other dichotomies that limit an examination of datafication and its interaction with policy from various angles: the divide between practice and scholarship; and between private and public…

So these are our principles: scholarly, pragmatic, open-minded, interdisciplinary, focused on actionable intelligence, and, most of all, innovative in how we share insight and push at the boundaries of what we already know and what already exists. We are excited to launch Data & Policy with the support of Cambridge University Press and University College London, and we’re looking for partners to help us build it as a resource for the community. If you’re reading this manifesto it means you have at least a passing interest in the subject; we hope you will be part of the conversation….(More)”.

How to use data for good — 5 priorities and a roadmap


Stefaan Verhulst at apolitical: “…While the overarching message emerging from these case studies was promising, several barriers were identified that, if not addressed systematically, could undermine the potential of data science to address critical public needs and limit the opportunity to scale the practice more broadly.

Below we summarise the five priorities that emerged through the workshop for the field moving forward.

1. Become People-Centric

Much of the data currently used for drawing insights involves or is generated by people.

These insights have the potential to impact people’s lives in many positive and negative ways. Yet, the people and the communities represented in this data are largely absent when practitioners design and develop data for social good initiatives.

To ensure data is a force for positive social transformation (i.e., it addresses real people’s needs and impacts lives in a beneficial way), we need to experiment with new ways to engage people at the design, implementation, and review stages of data initiatives beyond simply asking for their consent.

(Photo credit: Image from the people-led innovation report)

As we explain in our People-Led Innovation methodology, different segments of people can play multiple roles ranging from co-creation to commenting, reviewing and providing additional datasets.

The key is to ensure their needs are front and center, and that data science for social good initiatives seek to address questions related to real problems that matter to society at large (a key concern that led The GovLab to instigate the 100 Questions Initiative).

2. Establish Data About the Use of Data (for Social Good)

Many data for social good initiatives remain fledgling.

As currently designed, the field often struggles to translate sound data projects into positive change. As a result, many potential stakeholders—private sector and government “owners” of data as well as public beneficiaries—remain unsure about the value of using data for social good, especially against the background of high risks and transaction costs.

The field needs to overcome such limitations if data insights and their benefits are to spread. For that, we need hard evidence about data’s positive impact. Ironically, the field is held back by an absence of good data on the use of data—a lack of reliable empirical evidence that could guide new initiatives.

The field needs to prioritise developing a far more solid evidence base and “business case” to move data for social good from a good idea to reality.

3. Develop End-to-End Data Initiatives

Too often, data for social good initiatives focus on the “data-to-knowledge” pipeline without considering how to move “knowledge into action.”

As such, the impact remains limited, and many efforts never reach an audience that can actually act upon the insights generated. Without becoming more sophisticated in our efforts to provide end-to-end projects that take “data from knowledge to action,” the positive impact of data will be limited….

4. Invest in Common Trust and Data Steward Mechanisms 

For data for social good initiatives (including data collaboratives) to flourish and scale, there must be substantial trust between all parties involved, and amongst the public at large.

Establishing such a platform of trust requires each actor to invest in developing essential trust mechanisms such as data governance structures, contracts, and dispute resolution methods. Today, designing and establishing these mechanisms take tremendous time, energy, and expertise. These high transaction costs result from the lack of common templates and the need to design governance structures from scratch each time…

5. Build Bridges Across Cultures

As C.P. Snow famously described in his lecture on “Two Cultures and the Scientific Revolution,” we must bridge the “two cultures” of science and humanism if we are to solve the world’s problems….

To implement these five priorities we will need experimentation at the operational but also the institutional level. This involves the establishment of “data stewards” within organisations who can accelerate data for social good initiatives in a responsible manner, integrating the five priorities above….(More)”

We should extend EU bank data sharing to all sectors


Carlos Torres Vila in the Financial Times: “Data is now driving the global economy — just look at the list of the world’s most valuable companies. They collect and exploit the information that users generate through billions of online interactions taking place every day. 


But companies are hoarding data too, preventing others, including the users to whom the data relates, from accessing and using it. This is true of traditional groups such as banks, telcos and utilities, as well as the large digital enterprises that rely on “proprietary” data.

Global and national regulators must address this problem by forcing companies to give users an easy way to share their own data, if they so choose. This is the logical consequence of personal data belonging to users. There is also the potential for enormous socio-economic benefits if we can create consent-based free data flows.

We need data-sharing across companies in all sectors in a real-time, standardised way — not at a speed and in a format dictated by the companies that stockpile user data. These new rules should apply to all electronic data generated by users, whether provided directly or observed during their online interactions with any provider, across geographic borders and in any sector. This could include everything from geolocation history and electricity consumption to recent web searches, pension information or even most recently played songs.

This won’t be easy to achieve in practice, but the good news is that we already have a framework that could be the model for a broader solution. The UK’s Open Banking system provides a tantalising glimpse of what may be possible. In Europe, the regulation known as the Payment Services Directive 2 allows banking customers to share data about their transactions with multiple providers via secure, structured IT interfaces. We are already seeing this unlock new business models and drive competition in digital financial services. But these rules do not go far enough — they only apply to payments history, and that isn’t enough to push forward a data-driven economic revolution across other sectors of the economy. 
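
In practice, those structured interfaces are ordinary authenticated REST calls made with a token the user granted through a consent flow. The sketch below is modelled loosely on UK Open Banking conventions; the host, endpoint path, and response fields are hypothetical placeholders rather than any real provider’s API.

```python
# A minimal sketch of a consent-based data pull in the Open Banking style.
# The host, endpoint, and response fields are hypothetical placeholders,
# modelled loosely on UK Open Banking conventions, not a real provider's API.
import requests

ACCESS_TOKEN = "token-issued-after-user-consent"  # granted via an OAuth-style consent flow
BASE_URL = "https://api.example-bank.com/open-banking/v3"  # hypothetical host

resp = requests.get(
    f"{BASE_URL}/accounts/acc-123/transactions",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Accept": "application/json",
    },
    timeout=10,
)
resp.raise_for_status()

# Hypothetical response shape: a list of transaction objects.
for tx in resp.json().get("transactions", []):
    print(tx["date"], tx["amount"], tx["description"])
```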

We need a global framework with common rules across regions and sectors. This has already happened in financial services: after the 2008 financial crisis, the G20 strengthened global banking standards and created the Financial Stability Board. The rules, while not perfect, have delivered uniformity which has strengthened the system. 

We need a similar global push for common rules on the use of data. While it will be difficult to achieve consensus on data, and undoubtedly more difficult still to implement and enforce it, I believe that now is the time to decide what we want. The involvement of the G20 in setting up global standards will be essential to realising the potential that data has to deliver a better world for all of us. There will be complaints about the cost of implementation. I know first hand how expensive it can be to simultaneously open up and protect sensitive core systems. 

The alternative is siloed data that holds back innovation. There will also be justified concerns that easier data sharing could lead to new user risks. Security must be a non-negotiable principle in designing intercompany interfaces and protecting access to sensitive data. But Open Banking shows that these challenges are resolvable. …(More)”.

EU countries and car manufacturers to share information to improve road safety


Press Release: “EU member states, car manufacturers and navigation systems suppliers will share information on road conditions with the aim of improving road safety. Cora van Nieuwenhuizen, Minister of Infrastructure and Water Management, agreed this today with four other EU countries during the ITS European Congress in Eindhoven. These agreements mean that millions of motorists in the Netherlands will have access to more information on unsafe road conditions along their route.

The data on road conditions registered by modern cars is steadily improving: for instance, information on iciness, wrong-way drivers and breakdowns in emergency lanes. This kind of data can be instantly shared with road authorities and other vehicles following the same route. Drivers can then adapt their driving behaviour appropriately so that accidents and delays are prevented….
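
One way to picture this kind of sharing is as a small structured hazard message that a vehicle pushes to road authorities and nearby cars. The schema below is an invented illustration, not the actual format agreed by the partners.

```python
# A minimal sketch, assuming an invented schema: the kind of structured hazard
# message a connected car might broadcast. Not the partnership's actual format.
from dataclasses import dataclass, asdict
from enum import Enum
import json

class Hazard(Enum):
    ICE = "ice"
    WRONG_WAY_DRIVER = "wrong_way_driver"
    BREAKDOWN_EMERGENCY_LANE = "breakdown_emergency_lane"

@dataclass
class HazardReport:
    hazard: Hazard
    lat: float
    lon: float
    road: str
    timestamp_utc: str  # ISO 8601
    confidence: float   # 0..1, sensor confidence

report = HazardReport(Hazard.ICE, 51.44, 5.47, "A2", "2019-06-03T07:15:00Z", 0.92)
payload = {**asdict(report), "hazard": report.hazard.value}
print(json.dumps(payload))  # what would be pushed to the shared service
```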

The partnership was announced today at the ITS European Congress, the largest European event in the fields of smart mobility and the digitalisation of transport. Among other things, various demonstrations were given on how sharing this type of data contributes to road safety. In the year ahead, the car manufacturers BMW, Volvo, Ford and Daimler, the EU member states Germany, the Netherlands, Finland, Spain and Luxembourg, and navigation system suppliers TomTom and HERE will be sharing data. This means that millions of motorists across the whole of Europe will receive road safety information in their car. Talks on participating in the partnership are also being conducted with other European countries and companies.

ADAS

At the ITS congress, Minister Van Nieuwenhuizen and several dozen parties today also signed an agreement on raising awareness of advanced driver assistance systems (ADAS) and their safe use. Examples of ADAS include automatic braking systems, and blind spot detection and lane keeping systems. Using these driver assistance systems correctly makes driving a car safer and more sustainable. The agreement therefore also includes the launch of the online platform “slimonderweg.nl” where road users can share information on the benefits and risks of ADAS.

Minister Van Nieuwenhuizen: “Motorists are often unaware of all the capabilities modern cars offer. Yet correctly using driver assistance systems really can increase road safety. From today, dozens of parties are going to start working on raising awareness of ADAS and improving and encouraging the safe use of such systems so that more motorists can benefit from them.”

Connected Transport Corridors

Today at the congress, progress was also made regarding the transport of goods. For example, at the end of this year lorries on three transport corridors in our country will be sharing logistics data. This involves more than just information on environmental zones, availability of parking, recommended speeds and predicted arrival times at terminals. Other new technologies will be used in practice on a large scale, including prioritisation at smart traffic lights and driving in convoy. Preparatory work on the corridors around Amsterdam and Rotterdam and in the southern Netherlands has started…..(More)”.

Can tracking people through phone-call data improve lives?


Amy Maxmen in Nature: “After an earthquake tore through Haiti in 2010, killing more than 100,000 people, aid agencies spread across the country to work out where the survivors had fled. But Linus Bengtsson, a graduate student studying global health at the Karolinska Institute in Stockholm, thought he could answer the question from afar. Many Haitians would be using their mobile phones, he reasoned, and those calls would pass through phone towers, which could allow researchers to approximate people’s locations. Bengtsson persuaded Digicel, the biggest phone company in Haiti, to share data from millions of call records from before and after the quake. Digicel replaced the names and phone numbers of callers with random numbers to protect their privacy.

Bengtsson’s idea worked. The analysis wasn’t completed or verified quickly enough to help people in Haiti at the time, but in 2012, he and his collaborators reported that the population of Haiti’s capital, Port-au-Prince, dipped by almost one-quarter soon after the quake and slowly rose over the next 11 months. That result aligned with an intensive, on-the-ground survey conducted by the United Nations.
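
The core of the approach can be sketched in a few lines: assign each pseudonymised subscriber a “home” tower per period, for example their most frequent tower, and compare the pre- and post-event population of each area. The field names and toy records below are invented; the real analysis drew on millions of call records.

```python
# A minimal sketch, assuming invented field names and toy records: assign each
# pseudonymised subscriber a "home" tower per period, then compare populations.
import pandas as pd

cdr = pd.DataFrame({
    "user":   ["u1", "u1", "u1", "u2", "u2", "u3", "u3", "u3"],
    "tower":  ["PAP", "PAP", "LEO", "PAP", "PAP", "PAP", "CAP", "CAP"],
    "period": ["pre", "post", "post", "pre", "post", "pre", "post", "post"],
})

# Home tower = the tower a user appears at most often within each period.
home = (cdr.groupby(["user", "period"])["tower"]
           .agg(lambda s: s.mode().iloc[0])
           .rename("home_tower")
           .reset_index())

# Population per area, before vs. after the event.
counts = home.groupby(["period", "home_tower"]).size().unstack(fill_value=0)
print(counts)  # comparing the "pre" and "post" rows shows net displacement
```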

Humanitarians and researchers were thrilled. Telecommunications companies scrutinize call-detail records to learn about customers’ locations and phone habits and improve their services. Researchers suddenly realized that this sort of information might help them to improve lives. Even basic population statistics are murky in low-income countries where expensive household surveys are infrequent, and where many people don’t have smartphones, credit cards and other technologies that leave behind a digital trail, making remote-tracking methods used in richer countries too patchy to be useful.

Since the earthquake, scientists working under the rubric of ‘data for good’ have analysed calls from tens of millions of phone owners in Pakistan, Bangladesh, Kenya and at least two dozen other low- and middle-income nations. Humanitarian groups say that they’ve used the results to deliver aid. And researchers have combined call records with other information to try to predict how infectious diseases travel, and to pinpoint locations of poverty, social isolation, violence and more (see ‘Phone calls for good’)….(More)”.

A Symphony, Not a Solo: How Collective Management Organisations Can Embrace Innovation and Drive Data Sharing in the Music Industry


Paper by David Osimo, Laia Pujol Priego, Turo Pekari and Ano Sirppiniemi: “…data is becoming a fundamental source of competitive advantage in music, just as in other sectors, and streaming services in particular are generating large volumes of new data offering unique insights into customer taste and behavior. (As the Financial Times recently put it, the music industry is having its “moneyball” moment.) But how are the different players getting ready for this change?

This policy brief aims to look at the question from the perspective of CMOs, the organisations charged with redistributing royalties from music users to music rightsholders (such as musical authors and publishers).

The paper is divided into three sections. Part I will look at the current positioning of CMOs in this new data-intensive ecosystem. Part II will discuss how greater data sharing and reuse can maximize innovation, comparing the music industry with other industries. Part III will make policy and business-model reform recommendations for CMOs to stimulate data-driven innovation, internally and in the industry as a whole….(More)”