data collaboratives

Investing in Data Saves Lives

Curated on June 1, 2021 by Stefaan Verhulst

Mark Lowcock and Raj Shah at Project Syndicate: “…Our experience of building a predictive model, and its use by public-health officials in these countries, showed that this approach could lead to better humanitarian outcomes. But it was also a reminder that significant data challenges, regarding both gaps and quality, limit the viability and accuracy of such models for the world’s most vulnerable countries. For example, data on the prevalence of cardiovascular diseases was 4-7 years old in several poorer countries, and not available at all for Sudan and South Sudan.

Globally, we are still missing about 50% of the data needed to respond effectively in countries experiencing humanitarian emergencies. OCHA and The Rockefeller Foundation are cooperating to provide early insight into crises, during and beyond the COVID-19 pandemic. But realizing the full potential of our approach depends on the contributions of others.

So, as governments, development banks, and major humanitarian and development agencies reflect on the first year of the pandemic response, as well as on discussions at the recent World Bank Spring Meetings, they must recognize the crucial role data will play in recovering from this crisis and preventing future ones. Filling gaps in critical data should be a top priority for all humanitarian and development actors.

Governments, humanitarian organizations, and regional development banks thus need to invest in data collection, data-sharing infrastructure, and the people who manage these processes. Likewise, these stakeholders must become more adept at responsibly sharing their data through open data platforms that maintain rigorous interoperability standards.

Where data are not available, the private sector should develop new sources of information through innovative methods such as using anonymized social-media data or call records to understand population movement patterns….(More)”.

Next-generation nowcasting to improve decision making in a crisis

Curated on May 27, 2021 by Stefaan Verhulst

Frank Gerhard, Marie-Paule Laurent, Kyriakos Spyrounakos, and Eckart Windhagen at McKinsey: “In light of the limitations of the traditional models, we recommend a modified approach to nowcasting that uses country- and industry-specific expertise to boil down the number of variables to a selected few for each geography or sector, depending on the individual economic setting. Given the specific selection of each core variable, the relationships between the variables will be relatively stable over time, even during a major crisis. Admittedly, the more variables used, the easier it is to explain an economic shift; however, using more variables also means a greater chance of a break in some of the statistical relationships, particularly in response to an exogenous shock.

This revised nowcasting model will be more flexible and robust in periods of economic stress. It will provide economically intuitive outcomes, include the consideration of complementary, high-frequency data, and offer access to economic insights that are at once timely and unique.

[Exhibit: Nowcast for Q1 2021 shows differing recovery speeds by sector and geography.]

For example, consumer spending can be estimated in different US cities by combining data such as wages from business applications and footfall from mobility trend reports. As a more complex example: eurozone capitalization rates are, at the time of the writing of this article, available only through January 2021. However, a revamped nowcasting model can estimate current capitalization rates in various European countries by employing a handful of real-time and high-frequency variables for each, such as retail confidence indicators, stock-exchange indices, price expectations, construction estimates, base-metals prices and output, and even deposits into financial institutions. The choice of variable should, of course, be guided by industry and sector experts.

Similarly, published figures for gross value added (GVA) at the sector level in Europe are available only up to the second quarter of 2020. However, by utilizing selected variables, the new approach to nowcasting can provide an estimate of GVA through the first quarter of 2021. It can also highlight the different experiences of each region and industry sector in the recent recovery. Note that the sectors reliant on in-person interactions and of a nonessential nature have been slow to recover, as have the countries more reliant on international markets (exhibit)….(More)”.
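The idea in the excerpt above, estimating a lagging official figure from a handful of expert-chosen, high-frequency indicators, can be sketched as a small regression. This is only an illustration under stated assumptions, not the model the authors describe: the two indicators, all numbers, and the function names (`fit_nowcast`, `nowcast`) are hypothetical, and ordinary least squares stands in for whatever estimation method is actually used.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("indicators are collinear; drop or replace one")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_nowcast(indicators: Sequence[Sequence[float]],
                target: Sequence[float]) -> List[float]:
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    rows = [[1.0] + list(r) for r in indicators]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def nowcast(coefs: Sequence[float], latest: Sequence[float]) -> float:
    """Apply fitted coefficients to the latest indicator readings."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], latest))


# Synthetic history (hypothetical): one row per quarter with two
# high-frequency indicators, e.g. a confidence index and a mobility index.
indicators = [(2.0, 4.0), (4.0, 0.0), (6.0, 8.0), (8.0, 4.0), (10.0, 12.0)]
# Synthetic published figure for the same quarters, built for this example
# as 1.0 + 0.5 * first indicator + 0.25 * second indicator.
published = [3.0, 3.0, 6.0, 6.0, 9.0]

coefs = fit_nowcast(indicators, published)
# Indicators are already available for the current quarter; the official
# figure is not, so it is estimated from the fitted relationship.
estimate = nowcast(coefs, (12.0, 8.0))
```

Keeping the indicator set small, as the excerpt recommends, also keeps the regression well-conditioned; the collinearity check in `solve_linear_system` is one simple way to notice when two chosen variables carry the same information.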

Enabling Trusted Data Collaboration in Society

Curated on May 13, 2021 by Stefaan Verhulst

Launch of Public Beta of the Data Responsibility Journey Mapping Tool: “Data Collaboratives, the purpose-driven reuse of data in the public interest, have demonstrated their ability to unlock the societal value of siloed data and create real-world impacts. Data collaboration has been key in generating new insights and action in areas like public health, education, crisis response, and economic development, to name a few. Designing and deploying a data collaborative, however, is a complex undertaking, subject to risks of misuse of data as well as missed use of data that could have provided public value if used effectively and responsibly.

Today, The GovLab is launching the public beta of a new tool intended to help Data Stewards (responsible data leaders across sectors) and other decision-makers assess and mitigate risks across the life cycle of a data collaborative. The Data Responsibility Journey is an assessment tool for Data Stewards to identify and mitigate risks, establish trust, and maximize the value of their work. Informed by The GovLab’s long-standing research and practice in the field, and myriad consultations with data responsibility experts across regions and contexts, the tool aims to support decision-making in public agencies, civil society organizations, large businesses, small businesses, and humanitarian and development organizations, in particular.

The Data Responsibility Journey guides users through important questions and considerations across the lifecycle of data stewardship and collaboration: Planning, Collecting, Processing, Sharing, Analyzing, and Using. For each stage, users are asked to consider whether important data responsibility issues have been taken into account as part of their implementation strategy. When users flag an issue as in need of more attention, it is automatically added to a customized data responsibility strategy report providing actionable recommendations, relevant tools and resources, and key internal and external stakeholders that could be engaged to help operationalize these data responsibility actions…(More)”.
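The flag-and-report flow described above can be pictured with a small data structure. The six stage names come from the excerpt; everything else here (the `Consideration` class, `build_strategy_report`, and the sample questions) is a hypothetical sketch and does not reflect how the actual tool is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List

# Lifecycle stages as listed in the announcement.
STAGES = ["Planning", "Collecting", "Processing", "Sharing", "Analyzing", "Using"]


@dataclass
class Consideration:
    """One data responsibility question a user reviews (hypothetical model)."""
    stage: str
    question: str
    needs_attention: bool = False


def build_strategy_report(items: List[Consideration]) -> Dict[str, List[str]]:
    """Collect flagged questions, grouped by stage in lifecycle order."""
    for item in items:
        if item.stage not in STAGES:
            raise ValueError(f"unknown stage: {item.stage}")
    report: Dict[str, List[str]] = {}
    for stage in STAGES:
        flagged = [c.question for c in items
                   if c.stage == stage and c.needs_attention]
        if flagged:
            report[stage] = flagged
    return report


# Illustrative questions only; they are not taken from the tool.
answers = [
    Consideration("Sharing", "Is there a data sharing agreement?", True),
    Consideration("Planning", "Is the purpose of reuse documented?", False),
    Consideration("Collecting", "Was consent obtained where required?", True),
]
report = build_strategy_report(answers)
```

Only the flagged items reach the report, and they appear in lifecycle order regardless of the order in which they were answered.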

Data for Good Collaboration

Curated on May 12, 2021 by Stefaan Verhulst

Research Report by Swinburne University of Technology’s Social Innovation Research Institute: “…partnered with the Lord Mayor’s Charitable Foundation, Entertainment Assist, Good Cycles and Yooralla Disability Services, to create the data for good collaboration. The project had two aims:

Build organisational data capacity through knowledge sharing about data literacy, expertise and collaboration
Deliver data insights through a methodology of collaborative data analytics

This report presents key findings from our research partnership, which involved the design and delivery of a series of webinars that built data literacy; and participatory data capacity-building workshops facilitated by teams of social scientists and data scientists. It also draws on interviews with participants, reflecting on the benefits and opportunities data literacy can offer to individuals and organisations in the not-for-profit and NGO sectors…(More)”.

Developing a Data Reuse Strategy for Solving Public Problems

Curated on May 6, 2021 by Stefaan Verhulst

The Data Stewards Academy…A self-directed learning program from the Open Data Policy Lab (The GovLab): “Communities across the world face unprecedented challenges. Strained by climate change, crumbling infrastructure, growing economic inequality, and the continued costs of the COVID-19 pandemic, institutions need new ways of solving public problems and improving how they operate.

In recent years, data has been increasingly used to inform policies and interventions targeted at these issues. Yet, many of these data projects, data collaboratives, and open data initiatives remain scattered. As we enter into a new age of data use and re-use, a third wave of open data, it is more important than ever to be strategic and purposeful, to find new ways to connect the demand for data with its supply to meet institutional objectives in a socially responsible way.

This self-directed learning program, adapted from a selective executive education course, will help data stewards (and aspiring data stewards) develop a data re-use strategy to solve public problems. Noting the ways data resources can inform their day-to-day and strategic decision-making, the course provides learners with ways they can use data to improve how they operate and pursue goals in the public’s interests. By working differently—using agile methods and data analytics—public, private, and civil sector leaders can promote data re-use and reduce data access inequities in ways that advance their institution’s goals.

In this self-directed learning program, we will teach participants how to develop a 21st century data strategy. Participants will learn:

Why It Matters: A discussion of the three waves of open data and how data re-use has proven to be transformative;
The Current State of Play: Current practice around data re-use, including deficits of current approaches and the need to shift from ad hoc engagements to more systematic, sustainable, and responsible models;
Defining Demand: Methodologies for how organizations can formulate questions that data can answer; and make data collaboratives more purposeful;
Mapping Supply: Methods for organizations to discover and assess the open and private data needed to answer the questions at hand that potentially may be available to them;
Matching Supply with Demand: Operational models for connecting and meeting the needs of supply- and demand-side actors in a sustainable way;
Identifying Risks: Overview of the risks that can emerge in the course of data re-use;
Mitigating Risks and Other Considerations: Technical, legal and contractual issues that can be leveraged or may arise in the course of data collaboration and other data work; and
Institutionalizing Data Re-use: Suggestions for how organizations can incorporate data re-use into their organizational structure and foster future collaboration and data stewardship.

The Data Stewardship Executive Education Course was designed and implemented by program leads Stefaan Verhulst, co-founder and chief research development officer at the GovLab, and Andrew Young, The GovLab’s knowledge director, in close collaboration with a global network of expert faculty and advisors. It aims to….(More)”.

WHO, Germany launch new global hub for pandemic and epidemic intelligence

Curated on May 5, 2021 by Stefaan Verhulst

Press Release: “The World Health Organization (WHO) and the Federal Republic of Germany will establish a new global hub for pandemic and epidemic intelligence, data, surveillance and analytics innovation. The Hub, based in Berlin and working with partners around the world, will lead innovations in data analytics across the largest network of global data to predict, prevent, detect, prepare for and respond to pandemic and epidemic risks worldwide.

H.E. German Federal Chancellor Dr Angela Merkel said: “The current COVID-19 pandemic has taught us that we can only fight pandemics and epidemics together. The new WHO Hub will be a global platform for pandemic prevention, bringing together various governmental, academic and private sector institutions. I am delighted that WHO chose Berlin as its location and invite partners from all around the world to contribute to the WHO Hub.”

The WHO Hub for Pandemic and Epidemic Intelligence is part of WHO’s Health Emergencies Programme and will be a new collaboration of countries and partners worldwide, driving innovations to increase availability and linkage of diverse data; develop tools and predictive models for risk analysis; and to monitor disease control measures, community acceptance and infodemics. Critically, the WHO Hub will support the work of public health experts and policy-makers in all countries with insights so they can take rapid decisions to prevent and respond to future public health emergencies.

“We need to identify pandemic and epidemic risks as quickly as possible, wherever they occur in the world. For that aim, we need to strengthen the global early warning surveillance system with improved collection of health-related data and inter-disciplinary risk analysis,” said Jens Spahn, German Minister of Health. “Germany has consistently been committed to support WHO’s work in preparing for and responding to health emergencies, and the WHO Hub is a concrete initiative that will make the world safer.”

Working with partners globally, the WHO Hub will drive a scale-up in innovation for existing forecasting and early warning capacities in WHO and Member States. At the same time, the WHO Hub will accelerate global collaborations across public and private sector organizations, academia, and international partner networks. It will help them to collaborate and co-create the necessary tools for managing and analyzing data for early warning surveillance. It will also promote greater access to data and information….(More)”.

Responsible Data Science

Curated on May 5, 2021 by Stefaan Verhulst

Book by Peter Bruce and Grant Fleming: “The increasing popularity of data science has resulted in numerous well-publicized cases of bias, injustice, and discrimination. The widespread deployment of “Black box” algorithms that are difficult or impossible to understand and explain, even for their developers, is a primary source of these unanticipated harms, making modern techniques and methods for manipulating large data sets seem sinister, even dangerous. When put in the hands of authoritarian governments, these algorithms have enabled suppression of political dissent and persecution of minorities. To prevent these harms, data scientists everywhere must come to understand how the algorithms that they build and deploy may harm certain groups or be unfair.

Responsible Data Science delivers a comprehensive, practical treatment of how to implement data science solutions in an even-handed and ethical manner that minimizes the risk of undue harm to vulnerable members of society. Both data science practitioners and managers of analytics teams will learn how to:

Improve model transparency, even for black box models
Diagnose bias and unfairness within models using multiple metrics
Audit projects to ensure fairness and minimize the possibility of unintended harm…(More)”

Mapping the United Nations Fundamental Principles of Official Statistics against new and big data sources

Curated on May 3, 2021 by Stefaan Verhulst

Paper by Dominik Rozkrut, Olga Świerkot-Strużewska, and Gemma Van Halderen: “Never has there been a more exciting time to be an official statistician. The data revolution is responding to the demands of the COVID-19 pandemic and a complex sustainable development agenda to improve how data is produced and used, to close data gaps to prevent discrimination, to build capacity and data literacy, to modernize data collection systems and to liberate data to promote transparency and accountability. But can all data be liberated in the production and communication of official statistics? This paper explores the UN Fundamental Principles of Official Statistics in the context of eight new and big data sources. The paper concludes each data source can be used for the production of official statistics in adherence with the Fundamental Principles and argues these data sources should be used if National Statistical Systems are to adhere to the first Fundamental Principle of compiling and making available official statistics that honor citizens’ entitlement to public information….(More)”.

Principles and Practices for a Federal Statistical Agency

Curated on April 30, 2021 by Stefaan Verhulst

Book by the National Academies of Sciences, Engineering, and Medicine: “Government statistics are widely used to inform decisions by policymakers, program administrators, businesses and other organizations as well as households and the general public. Principles and Practices for a Federal Statistical Agency, Seventh Edition will assist statistical agencies and units, as well as other agencies engaged in statistical activities, to carry out their responsibilities to provide accurate, timely, relevant, and objective information for public and policy use. This report will also inform legislative and executive branch decision makers, data users, and others about the characteristics of statistical agencies that enable them to serve the public good….(More)”

Building on a year of open data: progress and promise

Curated on April 29, 2021 by Stefaan Verhulst

Jennifer Yokoyama at Microsoft: “…The biggest takeaway from our work this past year – and the one thing I hope any reader of this post will take away – is that data collaboration is a spectrum. From the presence (or absence) of data to how open that data is to the trust level of the collaboration participants, these factors may necessarily lead to different configurations and different goals, but they can all lead to more open data and innovative insights and discoveries.

Here are a few other lessons we have learned over the last year:

Principles set the foundation for stakeholder collaboration: When we launched the Open Data Campaign, we adopted five principles that guide our contributions and commitments to trusted data collaborations: Open, Usable, Empowering, Secure and Private. These principles underpin our participation, but importantly, organizations can build on them to establish responsible ways to share and collaborate around their data. The London Data Commission, for example, established a set of data sharing principles for public- and private-sector organizations to ensure alignment and to guide the participating groups in how they share data.
There is value in pilot projects: Traditionally, data collaborations with several stakeholders require time – often including a long runway for building the collaboration, plus the time needed to execute on the project and learn from it. However, our learnings show short-term projects that experiment and test data collaborations can provide valuable insights. The London Data Commission did exactly that with the launch of four short-term pilot projects. Due to the success of the pilots, the partners are exploring how they can be expanded upon.
Open data doesn’t require new data: Identifying data to share does not always mean it must be newly shared data; sometimes the data was narrowly shared, but can be shared more broadly, made more accessible or analyzed for a different purpose. Microsoft’s environmental indicator data is an example of data that was already disclosed in certain venues, but was then made available to the Linux Foundation’s OS-Climate Initiative to be consumed through analytics, thereby extending its reach and impact…

To get started, we suggest that emerging data collaborations make use of the wealth of existing resources. When embarking on data collaborations, we leveraged many of the definitions, toolkits and guides from leading organizations in this space. As examples, resources such as the Open Data Institute’s Data Ethics Canvas are extremely useful as a framework to develop ethical guidance. Additionally, The GovLab’s Open Data Policy Lab and Executive Course on Data Stewardship, both supported by Microsoft, highlight important case studies, governance considerations and frameworks when sharing data. If you want to learn more about the exciting work our partners are doing, check out the latest posts from the Open Data Institute and GovLab…(More)”. See also Open Data Policy Lab.