Our data, our society, our health: a vision for inclusive and transparent health data science in the UK and Beyond


Paper by Elizabeth Ford et al in Learning Health Systems: “The last six years have seen sustained investment in health data science in the UK and beyond, which should result in a data science community that is inclusive of all stakeholders, working together to use data to benefit society through the improvement of public health and wellbeing.

However, opportunities made possible through the innovative use of data are still not being fully realised, resulting in research inefficiencies and avoidable health harms. In this paper we identify the most important barriers to achieving higher productivity in health data science. We then draw on previous research, domain expertise, and theory, to outline how to go about overcoming these barriers, applying our core values of inclusivity and transparency.

We believe a step-change can be achieved through meaningful stakeholder involvement at every stage of research planning, design and execution; team-based data science; as well as harnessing novel and secure data technologies. Applying these values to health data science will safeguard a social license for health data research, and ensure transparent and secure data usage for public benefit….(More)”.

PayStats helps assess the impact of the low-emission area Madrid Central


BBVA API Market: “How do town-planning decisions affect a city’s routines? How can data help assess and make decisions? The granularity and detailed information offered by PayStats allowed Madrid’s city council to draw a more accurate map of consumer behavior and gain an objective measurement of the impact of the traffic restriction measures on commercial activity.

In this case, 20 million aggregate and anonymized transactions with BBVA cards and any other card at BBVA POS terminals were analyzed to study the effect of the changes made by Madrid’s city council to road access to the city center.

The BBVA PayStats API is targeted at all kinds of organizations including the public sector, as in this case. Madrid’s city council used it to find out how restricting car access to Madrid Central impacted Christmas shopping. From information gathered between December 1 2018 and January 7 2019, a comparison was made between data from the last two Christmases as well as the increased revenue in Madrid Central (Gran Vía and five subareas) vs. the increase in the entire city.

According to the report drawn up by council experts, 5.984 billion euros were spent across the city. The sample shows a 3.3% increase in spending in Madrid when compared to the same time the previous year; this goes up to 9.5% in Gran Vía and reaches 8.6% in the central area….(More)”.

How data collected from mobile phones can help electricity planning


Article by Eduardo Alejandro Martínez Ceseña, Joseph Mutale, Mathaios Panteli, and Pierluigi Mancarella in The Conversation: “Access to reliable and affordable electricity brings many benefits. It supports the growth of small businesses, allows students to study at night and protects health by offering an alternative cooking fuel to coal or wood.

Great efforts have been made to increase electrification in Africa, but rates remain low. In sub-Saharan Africa only 42% of urban areas have access to electricity, just 22% in rural areas.

This is mainly because there’s not enough sustained investment in electricity infrastructure, many systems can’t reliably support energy consumption or the price of electricity is too high.

Innovation is often seen as the way forward. For instance, cheaper and cleaner technologies, like solar storage systems deployed through mini grids, can offer a more affordable and reliable option. But, on their own, these solutions aren’t enough.

To design the best systems, planners must know where on- or off-grid systems should be placed, how big they need to be and what type of energy should be used for the most effective impact.

The problem is reliable data – like village size and energy demand – needed for rural energy planning is scarce or non-existent. Some can be estimated from records of human activities – like farming or access to schools and hospitals – which can show energy needs. But many developing countries have to rely on human activity data from incomplete and poorly maintained national census. This leads to inefficient planning.

In our research we found that data from mobile phones offer a solution. They provide a new source of information about what people are doing and where they’re located.

In sub-Saharan Africa, there are more people with mobile phones than access to electricity, as people are willing to commute to get a signal and/or charge their phones.

This means that there’s an abundance of data – that’s constantly updated and available even in areas that haven’t been electrified – that could be used to optimise electrification planning….

We were able to use mobile data to develop a countrywide electrification strategy for Senegal. Although Senegal has one of the highest access to electricity rates in sub-Saharan Africa, just 38% of people in rural areas have access.

By using mobile data we were able to identify the approximate size of rural villages and access to education and health facilities. This information was then used to size and cost different electrification options and select the most economic one for each zone – whether villages should be connected to the grids, or where off-grid systems – like solar battery systems – were a better option.

To collect the data we randomly selected mobile phone data from 450,000 users from Senegal’s main telecoms provider, Sonatel, to understand exactly how information from mobile phones could be used. This includes the location of the user and the characteristics of the place they live….(More)”
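The selection step described in the excerpt (cost each electrification option for a zone, then keep the cheapest) can be sketched in a few lines. This is only an illustration and not the authors' model: the function name `cheapest_option`, the two-option cost formulas, and every default cost figure below are hypothetical placeholders.

```python
def cheapest_option(households, grid_distance_km,
                    grid_cost_per_km=10_000.0,
                    grid_cost_per_household=200.0,
                    offgrid_cost_per_household=600.0):
    """Return (option_name, cost) for the least-cost option in one zone.

    All cost parameters are hypothetical placeholders, not figures
    from the study.
    """
    costs = {
        # Grid extension: a line to the village plus a per-household connection.
        "grid": grid_distance_km * grid_cost_per_km
                + households * grid_cost_per_household,
        # Off-grid (e.g. solar plus battery): cost scales with households only.
        "off_grid": households * offgrid_cost_per_household,
    }
    option = min(costs, key=costs.get)
    return option, costs[option]


# A large village close to the grid versus a small, remote one.
print(cheapest_option(households=100, grid_distance_km=1))
print(cheapest_option(households=20, grid_distance_km=5))
```

Under these made-up costs, the large village near the grid is cheaper to connect, while the small remote village favours an off-grid system. In the excerpt, the village size that feeds such a calculation is what the mobile phone data is used to estimate.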

Data Trusts as an AI Governance Mechanism


Paper by Chris Reed and Irene YH Ng: “This paper is a response to the Singapore Personal Data Protection Commission consultation on a draft AI Governance Framework. It analyses the five data trust models proposed by the UK Open Data Institute and identifies that only the contractual and corporate models are likely to be legally suitable for achieving the aims of a data trust.

The paper further explains how data trusts might be used in the governance of AI, and investigates the barriers which Singapore’s data protection law presents to the use of data trusts and how those barriers might be overcome. Its conclusion is that a mixed contractual/corporate model, with an element of regulatory oversight and audit to ensure consumer confidence that data is being used appropriately, could produce a useful AI governance tool…(More)”.

Visualizing where rich and poor people really cross paths—or don’t


Ben Paynter at Fast Company: “…It’s an idea that’s hard to visualize unless you can see it on a map. So MIT Media Lab collaborated with the location intelligence firm Cuebiq to build one. The result is called the Atlas of Inequality and harvests the anonymized location data from 150,000 people who opted in to Cuebiq’s Data For Good Initiative to track their movement for scientific research purposes. After isolating the general area (based on downtime) where each subject lived, MIT Media Lab could estimate what income bracket they occupied. The group then used data from a six-month period between late 2016 and early 2017 to figure out where these people traveled, and how their paths overlapped.

[Screenshot: Atlas of Inequality]

The result is an interactive view of just how filtered, sheltered, or sequestered many people’s lives really are. That’s an important thing to be reminded of at a time when the U.S. feels increasingly ideologically and economically divided. “Economic inequality isn’t just limited to neighborhoods, it’s part of the places you visit every day,” the researchers say in a mission statement about the Atlas….(More)”.

The Palgrave Handbook of Global Health Data Methods for Policy and Practice


Book edited by Sarah B. Macfarlane and Carla AbouZahr: “This handbook compiles methods for gathering, organizing and disseminating data to inform policy and manage health systems worldwide. Contributing authors describe national and international structures for generating data and explain the relevance of ethics, policy, epidemiology, health economics, demography, statistics, geography and qualitative methods to describing population health. The reader, whether a student of global health, public health practitioner, programme manager, data analyst or policymaker, will appreciate the methods, context and importance of collecting and using global health data….(More)”.

Toward an Open Data Bias Assessment Tool Measuring Bias in Open Spatial Data


Working Paper by Ajjit Narayanan and Graham MacDonald: “Data is a critical resource for government decisionmaking, and in recent years, local governments, in a bid for transparency, community engagement, and innovation, have released many municipal datasets on publicly accessible open data portals. In recent years, advocates, reporters, and others have voiced concerns about the bias of algorithms used to guide public decisions and the data that power them.

Although significant progress is being made in developing tools for algorithmic bias and transparency, we could not find any standardized tools available for assessing bias in open data itself. In other words, how can policymakers, analysts, and advocates systematically measure the level of bias in the data that power city decisionmaking, whether an algorithm is used or not?

To fill this gap, we present a prototype of an automated bias assessment tool for geographic data. This new tool will allow city officials, concerned residents, and other stakeholders to quickly assess the bias and representativeness of their data. The tool allows users to upload a file with latitude and longitude coordinates and receive simple metrics of spatial and demographic bias across their city.

The tool is built on geographic and demographic data from the Census and assumes that the population distribution in a city represents the “ground truth” of the underlying distribution in the data uploaded. To provide an illustrative example of the tool’s use and output, we test our bias assessment on three datasets—bikeshare station locations, 311 service request locations, and Low Income Housing Tax Credit (LIHTC) building locations—across a few, hand-selected example cities….(More)”
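The core comparison described in the excerpt (treating the population distribution as the "ground truth" and checking how far the uploaded data departs from it) can be illustrated with a minimal sketch. This is not the authors' tool: it assumes each uploaded point has already been assigned to a geographic unit such as a census tract, and the function name `representation_gaps` and the sample figures are hypothetical.

```python
from collections import Counter


def representation_gaps(point_units, population_by_unit):
    """Per unit: share of data points minus share of population.

    Positive values mean a unit is over-represented in the data relative
    to its population; negative values mean it is under-represented.
    Points in units missing from population_by_unit still count toward
    the total but get no entry in the result.
    """
    total_points = len(point_units)
    total_population = sum(population_by_unit.values())
    if total_points == 0 or total_population == 0:
        raise ValueError("need at least one data point and a non-zero population")
    counts = Counter(point_units)
    return {
        unit: counts.get(unit, 0) / total_points - population / total_population
        for unit, population in population_by_unit.items()
    }


# Hypothetical example: four data points spread over three tracts.
population = {"A": 500, "B": 300, "C": 200}
points = ["A", "A", "A", "B"]
for unit, gap in representation_gaps(points, population).items():
    print(unit, round(gap, 2))
```

In this made-up example, tract A holds half the population but three quarters of the data points, while tract C has a fifth of the population and no points at all. A real assessment would first have to map latitude and longitude coordinates onto Census geographies, which this sketch leaves out.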

Circular City Data


First Volume of Circular City, A Research Journal by New Lab edited by André Corrêa d’Almeida: “…Circular City Data is the topic being explored in the first iteration of New Lab’s The Circular City program, which looks at data and knowledge as the energy, flow, and medium of collaboration. Circular data refers to the collection, production, and exchange of data, and business insights, between a series of collaborators around a shared set of inquiries. In some scenarios, data may be produced by start-ups and of high value to the city; in other cases, data may be produced by the city and of potential value to the public, start-ups, or enterprise companies. The conditions that need to be in place to safely, ethically, and efficiently extrapolate the highest potential value from data are what this program aims to uncover.

Similar to living systems, urban systems can be enhanced if the total pool of data available, i.e., energy, can be democratized and decentralized and data analytics used widely to positively impact quality of life. The abundance of data available, the vast differences in capacity across organizations to handle it, and the growing complexity of urban challenges provides an opportunity to test how principles of circular city data can help establish new forms of public and private partnerships that make cities more economically prosperous, livable, and resilient. Though we talk of an overabundance of data, it is often still not visible or tactically wielded at the local level in a way that benefits people.

Circular City Data is an effort to build a safe environment whereby start-ups, city agencies, and larger firms can collect, produce, access and exchange data, as well as business insights, through transaction mechanisms that do not necessarily require currency, i.e., through reciprocity. Circular data is data that travels across a number of stakeholders, helping to deliver insights and make clearer the opportunities where such stakeholders can work together to improve outcomes. It includes cases where a set of “circular” relationships need to be in place in order to produce such data and business insights. For example, if an AI company lacks access to raw data from the city, they won’t be able to provide valuable insights to the city. Or, Numina required an established relationship with the DBP in order to access infrastructure necessary for them to install their product and begin generating data that could be shared back with them. ***

Next, the case study documents and explains how The Circular City program was conceived, designed, and implemented, with the goal of offering lessons for scalability at New Lab and replicability in other cities around the world. The three papers that follow investigate and methodologically test the value of circular data applied to three different, but related, urban challenges: economic growth, mobility, and resilience. At the end, the conclusion offers a meta-analysis of the value of circular city data for the future of cities and presents, integrated, the tools developed in each paper that can be used for implementation and scaling-up of a circular city program…(More).

Contents

  • Introduction to The Circular City Research Program (André Corrêa d’Almeida)
  • The Circular City Program: The Case Study (André Corrêa d’Almeida and Caroline McHeffey)  
  • Circular Data for a Circular City: Value Propositions for Economic Development (Stefaan G. Verhulst, Andrew Young, and Andrew J. Zahuranec)  
  • Circular Data for a Circular City: Value Propositions for Mobility (Arnaud Sahuguet)
  • Circular Data for a Circular City: Value Propositions for Resilience and Sustainability (Nilda Mesa)
  • Conclusion (André Corrêa d’Almeida)


Africa Data Revolution Report 2018


Report by Jean-Paul Van Belle et al: “The Africa Data Revolution Report 2018 delves into the recent evolution and current state of open data – with an emphasis on Open Government Data – in the African data communities. It explores key countries across the continent, researches a wide range of open data initiatives, and benefits from global thematic expertise. This second edition improves on process, methodology and collaborative partnerships from the first edition.

It draws from country reports, existing global and continental initiatives, and key experts’ input, in order to provide a deep analysis of the actual impact of open data in the African context. In particular, this report features a dedicated Open Data Barometer survey as well as a special 2018 Africa Open Data Index regional edition surveying the status and impact of open data and dataset availability in 30 African countries. The research is complemented with six in-depth qualitative case studies featuring the impact of open data in Kenya, South Africa (Cape Town), Ghana, Rwanda, Burkina Faso and Morocco. The report was critically reviewed by an eminent panel of experts.

Findings: In some governments, there is a slow iterative cycle between innovation, adoption, resistance and re-alignment before finally resulting in Open Government Data (OGD) institutionalization and eventual maturity. There is huge diversity between African governments in embracing open data, and each country presents a complex and unique picture. In several African countries, there appears to be genuine political will to open up government based datasets, not only for increased transparency but also to achieve economic impacts, social equity and stimulate innovation.

The role of open data intermediaries is crucial and has been insufficiently recognized in the African context. Open data in Africa needs a vibrant, dynamic, open and multi-tier data ecosystem if the datasets are to make a real impact. Citizens are rarely likely to access open data themselves. But the democratization of information and communication platforms has opened up opportunities among a large and diverse set of intermediaries to explore and combine relevant data sources, sometimes with private or leaked data. The news media, NGOs and advocacy groups, and to a much lesser extent academics and social or profit-driven entrepreneurs have shown that OGD can create real impact on the achievement of the SDGs…

The report encourages national policy makers and international funding or development agencies to consider the status, impact and future of open data in Africa on the basis of this research. Other stakeholders working with or for open data can hopefully also learn from what is happening on the continent. It is hoped that the findings and recommendations contained in the report will form the basis of a robust, informed and dynamic debate around open government data in Africa….(More)”.

Data Trusts: Ethics, Architecture and Governance for Trustworthy Data Stewardship


Web Science Institute Paper by Kieron O’Hara: “In their report on the development of the UK AI industry, Wendy Hall and Jérôme Pesenti recommend the establishment of data trusts, “proven and trusted frameworks and agreements” that will “ensure exchanges [of data] are secure and mutually beneficial” by promoting trust in the use of data for AI. Hall and Pesenti leave the structure of data trusts open, and the purpose of this paper is to explore the questions of (a) what existing structures can data trusts exploit, and (b) what relationship do data trusts have to trusts as they are understood in law?

The paper defends the following thesis: A data trust works within the law to provide ethical, architectural and governance support for trustworthy data processing.

Data trusts are therefore both constraining and liberating. They constrain: they respect current law, so they cannot render currently illegal actions legal. They are intended to increase trust, and so they will typically act as further constraints on data processors, adding the constraints of trustworthiness to those of law. Yet they also liberate: if data processors are perceived as trustworthy, they will get improved access to data.

Most work on data trusts has up to now focused on gaining and supporting the trust of data subjects in data processing. However, all actors involved in AI – data consumers, data providers and data subjects – have trust issues which data trusts need to address.

Furthermore, it is not only personal data that creates trust issues; the same may be true of any dataset whose release might involve an organisation risking competitive advantage. The paper addresses four areas….(More)”.