Learning to Share: Lessons on Data-Sharing from Beyond Social Media

Paper by CDT: “What role has social media played in society? Did it influence the rise of Trumpism in the U.S. and the passage of Brexit in the UK? What about the way authoritarians exercise power in India or China? Has social media undermined teenage mental health? What about its role in building social and community capital, promoting economic development, and so on?

To answer these and other important policy-related questions, researchers such as academics, journalists, and others need access to data from social media companies. However, this data is generally not available to researchers outside of social media companies and, where it is available, it is often insufficient, meaning that we are left with incomplete answers.

Governments on both sides of the Atlantic have passed or proposed legislation to address the problem by requiring social media companies to provide certain data to vetted researchers (Vogus, 2022a). Researchers themselves have thought a lot about the problem, including the specific types of data that can further public interest research, how researchers should be vetted, and the mechanisms companies can use to provide data (Vogus, 2022b).

For their part, social media companies have sanctioned some methods to share data to certain types of researchers through APIs (e.g., for researchers with university affiliations) and with certain limitations (such as limits on how much and what types of data are available). In general, these efforts have been insufficient. In part, this is due to legitimate concerns such as the need to protect user privacy or to avoid revealing company trade secrets. But, in some cases, the lack of sharing is due to other factors such as lack of resources or knowledge about how to share data effectively or resistance to independent scrutiny.

The problem is complex but not intractable. In this report, we look to other industries where companies share data with researchers through different mechanisms while also addressing concerns around privacy. In doing so, our analysis contributes to current public and corporate discussions about how to safely and effectively share social media data with researchers. We review experiences based on the governance of clinical trials, electricity smart meters, and environmental impact data…(More)”

New WHO policy requires sharing of all research data

Press release: “Science and public health can benefit tremendously from sharing and reuse of health data. Sharing data allows us to have the fullest possible understanding of health challenges, to develop new solutions, and to make decisions using the best available evidence.

The Research for Health department has helped spearhead the launch of a new policy from the Science Division which covers all research undertaken by or with support from WHO. The goal is to make sure that all research data is shared equitably, ethically and efficiently. Through this policy, WHO indicates its commitment to transparency in order to reach the goal of one billion more people enjoying better health and well-being.

The WHO policy is accompanied by practical guidance to enable researchers to develop and implement a data management and sharing plan, before the research has even started. The guide provides advice on the technical, ethical and legal considerations to ensure that data, even patient data, can be shared for secondary analysis without compromising personal privacy. Data sharing is now a requirement for research funding awarded by WHO and TDR.

“We have seen the problems caused by the lack of data sharing on COVID-19,” said Dr. Soumya Swaminathan, WHO Chief Scientist. “When data related to research activities are shared ethically, equitably and efficiently, there are major gains for science and public health.”

The policy to share data from all research funded or conducted by WHO, and practical guidance to do so, can be found here…(More)”.

Using real-time indicators for economic decision-making in government: Lessons from the Covid-19 crisis in the UK

Paper by David Rosenfeld: “When the UK went into lockdown in mid-March 2020, government was faced with the dual challenge of managing the impact of closing down large parts of the economy and responding effectively to the pandemic. Policy-makers needed to make rapid decisions regarding, on the one hand, the extent of restrictions on movement and economic activity to limit the spread of the virus, and on the other, the amount of support that would be provided to individuals and businesses affected by the crisis. Traditional, official statistics, such as gross domestic product (GDP) or unemployment, which get released on a monthly basis and with a lag, could not be relied upon to monitor the situation and guide policy decisions.

In response, teams of data scientists and statisticians pivoted to develop alternative indicators, leading to an unprecedented amount of innovation in how statistics and data were used in government. This ranged from monitoring sewage water for signs of Covid-19 infection to the Office for National Statistics (ONS) developing a new range of ‘faster indicators’ of economic activity using online job vacancies and data on debit and credit card expenditure from the Clearing House Automated Payment System (CHAPS).

The ONS received generally positive reviews for its performance during the crisis (The Economist, 2022), in contrast to the 2008 financial crisis when policy-makers did not realise the extent of the recession until subsequent revisions to GDP estimates were made. Partly in response to this, the Independent Review of UK Economic Statistics (HM Treasury, 2016) recommended improvements to the use of administrative data and alternative indicators as well as to data science capability to exploit both the extra granularity and the timeliness of new data sources.

This paper reviews the elements that contributed to successes in using real-time data during the pandemic as well as the challenges faced during this period, with a view to distilling some lessons for future use in government. Section 2 provides an overview of real-time indicators (RTIs) and how they were used in the UK during the Covid-19 crisis. The next sections analyse the factors that underpinned the successes (or lack thereof) in using such indicators: section 3 addresses skills, section 4 infrastructure, and section 5 legal frameworks and processes. Section 6 concludes with a summary of the main lessons for governments that hope to make greater use of RTIs…(More)”.

‘Very Harmful’ Lack of Data Blunts U.S. Response to Outbreaks

Article by Sharon LaFraniere: “After a middle-aged woman tested positive for Covid-19 in January at her workplace in Fairbanks, public health workers sought answers to questions vital to understanding how the virus was spreading in Alaska’s rugged interior.

The woman, they learned, had underlying conditions and had not been vaccinated. She had been hospitalized but had recovered. Alaska and many other states have routinely collected that kind of information about people who test positive for the virus. Part of the goal is to paint a detailed picture of how one of the worst scourges in American history evolves and continues to kill hundreds of people daily, despite determined efforts to stop it.

But most of the information about the Fairbanks woman — and tens of millions more infected Americans — remains effectively lost to state and federal epidemiologists. Decades of underinvestment in public health information systems has crippled efforts to understand the pandemic, stranding crucial data in incompatible data systems so outmoded that information often must be repeatedly typed in by hand. The data failure, a salient lesson of a pandemic that has killed more than one million Americans, will be expensive and time-consuming to fix.

The precise cost in needless illness and death cannot be quantified. The nation’s comparatively low vaccination rate is clearly a major factor in why the United States has recorded the highest Covid death rate among large, wealthy nations. But federal experts are certain that the lack of comprehensive, timely data has also exacted a heavy toll.

“It has been very harmful to our response,” said Dr. Ashish K. Jha, who leads the White House effort to control the pandemic. “It’s made it much harder to respond quickly.”

Details of the Fairbanks woman’s case were scattered among multiple state databases, none of which connect easily to the others, much less to the Centers for Disease Control and Prevention, the federal agency in charge of tracking the virus. Nine months after she fell ill, her information was largely useless to epidemiologists because it was impossible to synthesize most of it with data on the roughly 300,000 other Alaskans and the 95 million-plus other Americans who have gotten Covid…(More)”.

Towards an international data governance framework

Paper by Steve MacFeely et al: “The CCSA argued that a Global Data Compact (GDC) could provide a framework to ensure that data are safeguarded as a global public good and as a resource to achieve equitable and sustainable development. This compact, by promoting common objectives, would help avoid fragmentation where each country or region adopts their own approach to data collection, storage, and use. A coordinated approach would give individuals and enterprises confidence that data relevant to them carries protections and obligations no matter where they are collected or used…

The universal principles and standards should set out the elements of responsible and ethical handling and sharing of data and data products. The compact should also move beyond simply establishing ethical principles and create a global architecture that includes standards and incentives for compliance. Such an architecture could be the foundation for rethinking the data economy, promoting open data, encouraging data exchange, fostering innovation and facilitating international trade. It should build upon the existing canon of international human rights and other conventions, laws and treaties that set out useful principles and compliance mechanisms.

Such a compact will require a new type of global architecture. Modern data ecosystems are not controlled by states alone, so any Compact, Geneva Convention, Commons, or Bretton Woods type agreement will require a multitude of stakeholders and signatories – states, civil society, and the private sector at the very least. This would be very different to any international agreement that currently exists. Therefore, to support a GDC, a new global institution or platform may be needed to bring together the many data communities and ecosystems, that comprise not only national governments, private sector and civil society but also participants in specific fields, such as artificial intelligence, digital and IT services. Participants would maintain and update data standards, oversee accountability frameworks, and support mechanisms to facilitate the exchange and responsible use of data. The proposed Global Digital Compact which has been proposed as part of Our Common Agenda will also need to address the challenges of bringing many different constituencies together and may point the way…(More)”

A Massive LinkedIn Study Reveals Who Actually Helps You Get That Job

Article by Viviane Callier: “If you want a new job, don’t just rely on friends or family. According to one of the most influential theories in social science, you’re more likely to nab a new position through your “weak ties,” loose acquaintances with whom you have few mutual connections. Sociologist Mark Granovetter first laid out this idea in a 1973 paper that has garnered more than 65,000 citations. But the theory, dubbed “the strength of weak ties,” after the title of Granovetter’s study, lacked causal evidence for decades. Now a sweeping study that looked at more than 20 million people on the professional social networking site LinkedIn over a five-year period finally shows that forging weak ties does indeed help people get new jobs. And it reveals which types of connections are most important for job hunters…Along with job seekers, policy makers could also learn from the new paper. “One thing the study highlights is the degree to which algorithms are guiding fundamental, baseline, important outcomes, like employment and unemployment,” Aral says. The role that LinkedIn’s People You May Know function plays in gaining a new job demonstrates “the tremendous leverage that algorithms have on employment and probably other factors of the economy as well.” It also suggests that such algorithms could create bellwethers for economic changes: in the same way that the Federal Reserve looks at the Consumer Price Index to decide whether to hike interest rates, Aral suggests, networks such as LinkedIn might provide new data sources to help policy makers parse what is happening in the economy. “I think these digital platforms are going to be an important source of that,” he says…(More)”

The Public Good and Public Attitudes Toward Data Sharing Through IoT

Paper by Karen Mossberger, Seongkyung Cho and Pauline Cheong: “The Internet of Things has created a wealth of new data that is expected to deliver important benefits for IoT users and for society, including for the public good. Much of the literature has focused on data collection through individual adoption of IoT devices, and big data collection by companies with accompanying fears of data misuse. While citizens also increasingly produce data as they move about in public spaces, less is known about citizen support for data collection in smart city environments, or for data sharing for a variety of public-regarding purposes. Through a nationally representative survey of over 2,000 respondents as well as interviews, we explore the willingness of citizens to share their data with different parties and in various circumstances, using the contextual integrity framework, the literature on the ‘publicness’ of organizations, and public value creation. We describe the results of the survey across different uses, for data sharing from devices and for data collection in public spaces. We conduct multivariate regression to predict individual characteristics that influence attitudes toward use of IoT data for public purposes. Across different contexts, from half to 2/3 of survey respondents were willing to share data from their own IoT devices for public benefits, and 80-93% supported the use of sensors in public places for a variety of collective benefits. Yet government is less trusted with this data than other organizations with public purposes, such as universities, nonprofits and health care institutions. Trust in government, among other factors, was significantly related to data sharing and support for smart city data collection. Cultivating trust through transparent and responsible data stewardship will be important for future use of IoT data for public good…(More)”.

Superhuman science: How artificial intelligence may impact innovation

Working paper by Ajay Agrawal, John McHale, and Alexander Oettl: “New product innovation in fields like drug discovery and material science can be characterized as combinatorial search over a vast range of possibilities. Modeling innovation as a costly multi-stage search process, we explore how improvements in Artificial Intelligence (AI) could affect the productivity of the discovery pipeline in allowing improved prioritization of innovations that flow through that pipeline. We show how AI-aided prediction can increase the expected value of innovation and can increase or decrease the demand for downstream testing, depending on the type of innovation, and examine how AI can reduce costs associated with well-defined bottlenecks in the discovery pipeline. Finally, we discuss the critical role that policy can play to mitigate potential market failures associated with access to and provision of data as well as the provision of training necessary to more closely approach the socially optimal level of productivity enhancing innovations enabled by this technology…(More)”.

Trust Based Resolving of Conflicts for Collaborative Data Sharing in Online Social Networks

Paper by Nisha P. Shetty et al: “Twenty-first century, the era of Internet, social networking platforms like Facebook and Twitter play a predominant role in everybody’s life. Ever increasing adoption of gadgets such as mobile phones and tablets have made social media available all times. This recent surge in online interaction has made it imperative to have ample protection against privacy breaches to ensure a fine grained and a personalized data publishing online. Privacy concerns over communal data shared amongst multiple users are not properly addressed in most of the social media. The proposed work deals with effectively suggesting whether or not to grant access to the data which is co-owned by multiple users. Conflicts in such scenario are resolved by taking into consideration the privacy risk and confidentiality loss observed if the data is shared. For secure sharing of data, a trust framework based on the user’s interest and interaction parameters is put forth. The proposed work can be extended to any data sharing multiuser platform….(More)”.

Can Artificial Intelligence Improve Gender Equality? Evidence from a Natural Experiment

Paper by Zhengyang Bao and Difang Huang: “Gender stereotypes and discriminatory practices in the education system are important reasons for women’s under-representation in many fields. How to create a gender-neutral learning environment when teachers’ gender composition and mindset are slow to change? Artificial intelligence (AI)’s recent development provides a way to achieve this goal. Engineers can make AI trainers appear gender neutral and not take gender-related information as input. We use data from a natural experiment where AI trainers replace some human teachers for a male-dominated strategic board game to test the effectiveness of such AI training. The introduction of AI improves boys’ and girls’ performance faster and reduces the pre-existing gender gap. Class recordings suggest that AI trainers’ gender-neutral emotional status can partly explain the improvement in gender equality. We provide the first evidence demonstrating AI’s potential to promote equality for society…(More)”.