Business Models For Sustainable Research Data Repositories


OECD Report: “In 2007, the OECD Principles and Guidelines for Access to Research Data from Public Funding were published and in the intervening period there has been an increasing emphasis on open science. At the same time, the quantity and breadth of research data has massively expanded. So-called “Big Data” is no longer limited to areas such as particle physics and astronomy, but is ubiquitous across almost all fields of research. This is generating exciting new opportunities, but also challenges.

The promise of open research data is that they will not only accelerate scientific discovery and improve reproducibility, but they will also speed up innovation and improve citizen engagement with research. In short, they will benefit society as a whole. However, for the benefits of open science and open research data to be realised, these data need to be carefully and sustainably managed so that they can be understood and used by both present and future generations of researchers.

Data repositories – based in local and national research institutions and international bodies – are where the long-term stewardship of research data takes place and hence they are the foundation of open science. Yet good data stewardship is costly and research budgets are limited. So, the development of sustainable business models for research data repositories needs to be a high priority in all countries. Surprisingly, perhaps, little systematic analysis has been done on income streams, costs, value propositions, and business models for data repositories, and that is the gap this report attempts to address, from a science policy perspective….

This project was designed to take up the challenge and to contribute to a better understanding of how research data repositories are funded, and what developments are occurring in their funding. Central questions included:

  • How are data repositories currently funded, and what are the key revenue sources?
  • What innovative revenue sources are available to data repositories?
  • How do revenue sources fit together into sustainable business models?
  • What incentives for, and means of, optimising costs are available?
  • What revenue sources and business models are most acceptable to key stakeholders?…(More)”

There’s more to evidence-based policies than data: why it matters for healthcare


 at The Conversation: “The big question is: how can countries strengthen their health systems to deliver accessible, affordable and equitable care when they are often under-financed and governed in complex ways?

One answer lies in governments developing policies and programmes that are informed by evidence of what works or doesn’t. This should include what we would call “traditional data”, but should also include a broader definition of evidence. This would mean including, for example, information from citizens and stakeholders as well as programme evaluations. In this way, policies can be made more relevant for the people they affect.

Globally there is an increasing appreciation for this sort of policymaking that relies on a broader definition of evidence. Countries such as South Africa, Ghana and Thailand provide good examples.

What is evidence?

Using evidence to inform the development of health care has grown out of the use of science to choose the best decisions. It is based on data being collected in a methodical way. This approach is useful but it can’t always be neatly applied to policymaking. There are several reasons for this.

The first is that there are many different types of evidence. Evidence is more than data, even though the terms are often used to mean the same thing. For example, there is statistical and administrative data, research evidence, citizen and stakeholder information as well as programme evaluations.

The challenge is that some of these are valued more than others. More often than not, statistical data is more valued in policymaking. But both researchers and policymakers must acknowledge that for policies to be sound and comprehensive, different phases of the policymaking process require different types of evidence.

Secondly, data-as-evidence is only one input into policymaking. Policymakers face a long list of pressures they must respond to, including time, resources, political obligations and unplanned events.

Researchers may push technically excellent solutions designed in research environments. But policymakers may have other priorities in mind: are the solutions being put to them practical and affordable? Policymakers also face the limitations of having to balance various constituents while straddling the constraints of the bureaucracies they work in.

Researchers must recognise that policymakers themselves are a source of evidence of what works or doesn’t. They are able to draw on their own experiences, those of their constituents, history and their contextual knowledge of the terrain.

What this boils down to is that for policies that are based on evidence to be effective, fewer ‘push/pull’ models of evidence need to be used. Instead the models where evidence is jointly fashioned should be employed.

This means that policymakers, researchers and other key actors (like health managers or communities) must come together as soon as a problem is identified. They must first understand each other’s ideas of evidence and come to a joint conclusion of what evidence would be appropriate for the solution.

In South Africa, for example, the Department of Environmental Affairs has developed a four-phase process to policymaking. In the first phase, researchers and policymakers come together to set the agenda and agree on the needed solution. Their joint decision is then reviewed before research is undertaken and interpreted together….(More)”.

Big data in social and psychological science: theoretical and methodological issues


Paper by Lin Qiu, Sarah Hian May Chan and David Chan in the Journal of Computational Social Science: “Big data presents unprecedented opportunities to understand human behavior on a large scale. It has been increasingly used in social and psychological research to reveal individual differences and group dynamics. There are a few theoretical and methodological challenges in big data research that require attention. In this paper, we highlight four issues, namely data-driven versus theory-driven approaches, measurement validity, multi-level longitudinal analysis, and data integration. They represent common problems that social scientists often face in using big data. We present examples of these problems and propose possible solutions….(More)”.

Enhancing social impact through better monitoring, evaluation, and learning


Deloitte: “Social sector organizations tackle some of the world’s most difficult and complex challenges on a daily basis. And, just as in other industries, getting the right data and information at the right time is essential to understanding what an organization needs to achieve, whether it is doing what it set out to do, and what impact its efforts are actually having. Yet, despite marked advances in the tools and methods for monitoring, evaluation, and learning in the social sector, as well as a growing number of bright spots in practice emerging in the field, there is broad dissatisfaction across the sector about how data is—or is not—used….

Based on our interviews, the research team identified three characteristics that participants within and outside the social sector believe should be defining pillars of a better future for monitoring, evaluation, and learning. These three characteristics are purpose, perspective, and alignment with other actors….(More)”


The Engineers and the Political System


Aaron Timms at the Los Angeles Review of Books: “Engineers enjoy a prestige in China that connects them to political power far more directly than in the United States. ….America, by contrast, has historically been governed by lawyers. That remains true today: there are 218 lawyers in Congress and 208 former businesspeople, according to the Congressional Research Service, but only eight engineers. (Science is even more severely underrepresented, with just three members in the House.) It’s unlikely that that balance will tilt meaningfully in favor of STEM-ers in the near term. But in another sense, the growing cultural capital of the engineers will inevitably translate to political power, whatever its form.

The engineering profession today is broad, much broader than it was in 1921 when Thorstein Veblen published The Engineers and the Price System, his classic pamphlet on industrial sabotage and government by technocrats. Engineering has outgrown the four traditional branches (chemical, civil, electrical, mechanical) to include all the professions in which the laws of mathematics and science are applied to real-world problems…. In a way that was never the case for previous generations, engineering today is politics, and politics engineering. Power is coming for the engineers, but are the engineers ready for power?

…tech smarts do not port easily to politics. However violently Silicon Valley pushes the story that it’s here to fix things for all of us, building an algorithm and coming up with intelligent ways to improve society are not the same thing. The triumph of the engineers is that they’ve managed to convince so many people otherwise.

This victory is more than simply economic or mechanical; engineering has also come to permeate the language of politics itself. Zuckerberg’s doe-eyed both-sidesism is the latest expression of the idea, nourished through the Clinton years and the height of the evidence-based policy movement, that facts offer the surest solution to knotty political problems. This is, we already know, a temple built on sand, ignoring as it does the intractably political nature of politics; hence the failure of “figures” and “facts” and “evidence” to do anything to shift positions on gun reform or voter fraud. But it’s a temple with enduring bipartisan appeal, and the engineers have come along at the right moment to give it a fresh lick of paint. If thinking like an engineer is the new way to do business, engineerialism, in politics, is the new centrism — rule by experts remarketed for the innovation age. It might be generations before a Veblenian technocrat calls the White House home, but no presidency can match the power engineers already have — a power to define progress, a power without check….(More)”.

Solving Public Problems with Data


Dinorah Cantú-Pedraza and Sam DeJohn at The GovLab: “….To serve the goal of more data-driven and evidence-based governing, The GovLab at NYU Tandon School of Engineering this week launched “Solving Public Problems with Data,” a new online course developed with support from the Laura and John Arnold Foundation.

This online lecture series helps those working for the public sector, or simply in the public interest, learn to use data to improve decision-making. Through real-world examples and case studies — captured in 10 video lectures from leading experts in the field — the new course outlines the fundamental principles of data science and explores ways practitioners can develop a data analytical mindset. Lectures in the series include:

  1. Introduction to evidence-based decision-making (Quentin Palfrey, formerly of MIT)
  2. Data analytical thinking and methods, Part I (Julia Lane, NYU)
  3. Machine learning (Gideon Mann, Bloomberg LP)
  4. Discovering and collecting data (Carter Hewgley, Johns Hopkins University)
  5. Platforms and where to store data (Arnaud Sahuguet, Cornell Tech)
  6. Data analytical thinking and methods, Part II (Daniel Goroff, Alfred P. Sloan Foundation)
  7. Barriers to building a data practice (Beth Blauer, Johns Hopkins University and GovEx)
  8. Data collaboratives (Stefaan G. Verhulst, The GovLab)
  9. Strengthening a data analytic culture (Amen Ra Mashariki, ESRI)
  10. Data governance and sharing (Beth Simone Noveck, NYU Tandon/The GovLab)

The goal of the lecture series is to enable participants to define and leverage the value of data to achieve improved outcomes and equities, reduced cost and increased efficiency in how public policies and services are created. No prior experience with computer science or statistics is necessary or assumed. In fact, the course is designed precisely to serve public professionals seeking an introduction to data science….(More)”.

Science’s Next Frontier? It’s Civic Engagement


Louise Lief at Discover Magazine: “…As a lay observer who has explored scientists’ relationship to the public, I have often wondered why many scientists and scientific institutions continue to rely on what is known as the “deficit model” of science communication, despite its well-documented shortcomings and even a backfire effect. This approach views the public as “empty vessels” or “warped minds” ready to be set straight with facts. Perhaps many scientists continue to use it because it’s familiar and mimics classroom instruction. But it’s not doing the job.

Scientists spend much of their time with the public defending science, and little time building trust.

Many scientists also give low priority to trust building. At the 2016 American Association for the Advancement of Science conference, Michigan State University professor John C. Besley showed these results (right) of a survey of scientists’ priorities for engaging with the public online.

Scientists are focusing on the frustrating, reactive task of defending science, spending little time establishing bonds of trust with the public, which comes in last as a professional priority. How much more productive their interactions with the public – and through them, policymakers — would be if establishing trust was a top priority!

There is evidence that the public is hungry for such exchanges. When Research!America asked the public in 2016 how important is it for scientists to inform elected officials and the public about their research and its impact on society, 84 percent said it was very or somewhat important — a number that ironically mirrors the percentage of Americans who cannot name a scientist….

This means scientists need to go even further, venturing into unfamiliar local venues where science may not be mentioned but where communities gather to discuss their problems. Interesting new opportunities to do this are emerging nationwide. In 2014 the Chicago Community Trust, one of the nation’s largest community foundations, launched a series of dinners across the city through a program called On the Table, to discuss community problems and brainstorm possible solutions. In 2014, the first year, almost 10,000 city residents participated. In 2017, almost 100,000 Chicago residents took part. Recently the Trust added a grants component to the program, awarding more than $135,000 in small grants to help participants translate their ideas into action….(More)”.

Democracy is dead: long live democracy!


Helen Margetts in OpenDemocracy: “In the course of the World Forum for Democracy 2017, and in political commentary more generally, social media are blamed for almost everything that is wrong with democracy. They are held responsible for pollution of the democratic environment through fake news, junk science, computational propaganda and aggressive micro-targeting. In turn, these phenomena have been blamed for the rise of populism, political polarization, far-right extremism and radicalisation, waves of hate against women and minorities, post-truth, the end of representative democracy, fake democracy and ultimately, the death of democracy. It feels like the tirade of relatives of the deceased at the trial of the murderer. It is extraordinary how much of this litany is taken almost as given, the most gloomy prognoses as certain visions of the future.

Yet actually we know rather little about the relationship between social media and democracy. Because ten years of the internet and social media have challenged everything we thought we knew. They have injected volatility and instability into political systems, bringing a continual cast of unpredictable events. They bring into question normative models of democracy – by which we might understand the macro-level shifts at work – seeming to make possible the highest hopes and worst fears of republicanism and pluralism.

They have transformed the ecology of interest groups and mobilizations. They have challenged élites and ruling institutions, bringing regulatory decay and policy sclerosis. They create undercurrents of political life that burst to the surface in seemingly random ways, making fools of opinion polls and pollsters. And although the platforms themselves generate new sources of real-time transactional data that might be used to understand and shape this changed environment, most of this data is proprietary and inaccessible to researchers, meaning that the revolution in big data and data science has passed by democracy research.

What do we know? The value of tiny acts

Certainly digital media are entwined with every democratic institution and the daily lives of citizens. When deciding whether to vote, to support, to campaign, to demonstrate, to complain – digital media are with us at every step, shaping our information environment and extending our social networks by creating hundreds or thousands of ‘weak ties’, particularly for users of social media platforms such as Facebook or Instagram….(More)”.

When Data Science Destabilizes Democracy and Facilitates Genocide


Rachel Thomas in Fast.AI on “What is the ethical responsibility of data scientists?”: “…‘What we’re talking about is a cataclysmic change… What we’re talking about is a major foreign power with sophistication and ability to involve themselves in a presidential election and sow conflict and discontent all over this country… You bear this responsibility. You’ve created these platforms. And now they are being misused,’ Senator Feinstein said this week in a senate hearing. Who has created a cataclysmic change? Who bears this large responsibility? She was talking to executives at tech companies and referring to the work of data scientists.

Data science can have a devastating impact on our world, as illustrated by inflammatory Russian propaganda being shown on Facebook to 126 million Americans leading up to the 2016 election (and the subject of the senate hearing described above) or by lies spread via Facebook that are fueling ethnic cleansing in Myanmar. Over half a million Rohingya have been driven from their homes due to systematic murder, rape, and burning. Data science is foundational to Facebook’s newsfeed, in determining what content is prioritized and who sees what….

The examples of bias in data science are myriad and include….

You can do awesome and meaningful things with data science (such as diagnosing cancer, stopping deforestation, increasing farm yields, and helping patients with Parkinson’s disease), and you can (often unintentionally) enable terrible things with data science, as the examples in this post illustrate. Being a data scientist entails both great opportunity, as well as great responsibility, to use our skills to not make the world a worse place. Ultimately, doing data science is about humans, not just the users of our products, but everyone who will be impacted by our work. (More)”.

Understanding Corporate Data Sharing Decisions: Practices, Challenges, and Opportunities for Sharing Corporate Data with Researchers


Leslie Harris at the Future of Privacy Forum: “Data has become the currency of the modern economy. A recent study projects the global volume of data to grow from about 0.8 zettabytes (ZB) in 2009 to more than 35 ZB in 2020, most of it generated within the last two years and held by the corporate sector.

As the cost of data collection and storage becomes cheaper and computing power increases, so does the value of data to the corporate bottom line. Powerful data science techniques, including machine learning and deep learning, make it possible to search, extract and analyze enormous sets of data from many sources in order to uncover novel insights and engage in predictive analysis. Breakthrough computational techniques allow complex analysis of encrypted data, making it possible for researchers to protect individual privacy, while extracting valuable insights.

At the same time, these newfound data sources hold significant promise for advancing scholarship and shaping more impactful social policies, supporting evidence-based policymaking and more robust government statistics, and shaping more impactful social interventions. But because most of this data is held by the private sector, it is rarely available for these purposes, posing what many have argued is a serious impediment to scientific progress.

A variety of reasons have been posited for the reluctance of the corporate sector to share data for academic research. Some have suggested that the private sector doesn’t realize the value of their data for broader social and scientific advancement. Others suggest that companies have no “chief mission” or public obligation to share. But most observers describe the challenge as complex and multifaceted. Companies face a variety of commercial, legal, ethical, and reputational risks that serve as disincentives to sharing data for academic research, with privacy – particularly the risk of reidentification – an intractable concern. For companies, striking the right balance between the commercial and societal value of their data, the privacy interests of their customers, and the interests of academics presents a formidable dilemma.

To be sure, there is evidence that some companies are beginning to share for academic research. For example, a number of pharmaceutical companies are now sharing clinical trial data with researchers, and a number of individual companies have taken steps to make data available as well. What is more, companies are also increasingly providing open or shared data for other important “public good” activities, including international development, humanitarian assistance and better public decision-making. Some are contributing to data collaboratives that pool data from different sources to address societal concerns. Yet, it is still not clear whether and to what extent this “new era of data openness” will accelerate data sharing for academic research.

Today, the Future of Privacy Forum released a new study, Understanding Corporate Data Sharing Decisions: Practices, Challenges, and Opportunities for Sharing Corporate Data with Researchers. In this report, we aim to contribute to the literature by seeking the “ground truth” from the corporate sector about the challenges they encounter when they consider making data available for academic research. We hope that the impressions and insights gained from this first look at the issue will help formulate further research questions, inform the dialogue between key stakeholders, and identify constructive next steps and areas for further action and investment….(More)”.