Social Research in Times of Big Data: The Challenges of New Data Worlds and the Need for a Sociology of Social Research

Paper by Rainer Diaz-Bone et al: “The phenomenon of big data does not only deeply affect current societies but also poses crucial challenges to social research. This article argues for moving towards a sociology of social research in order to characterize the new qualities of big data and its deficiencies. We draw on the neopragmatist approach of economics of convention (EC) as a conceptual basis for such a sociological perspective.

This framework suggests investigating processes of quantification in their interplay with orders of justifications and logics of evaluation. Methodological issues such as the question of the “quality of big data” must accordingly be discussed in their deep entanglement with epistemic values, institutional forms, and historical contexts and as necessarily implying political issues such as who controls and has access to data infrastructures. On this conceptual basis, the article uses the example of health to discuss the challenges of big data analysis for social research.

Phenomena such as the rise of new and massive privately owned data infrastructures, the economic valuation of huge amounts of connected data, or the movement of “quantified self” are presented as indications of a profound transformation compared to established forms of doing social research. Methodological and epistemological, but also institutional and political, strategies are presented to face the risk of being “outperformed” and “replaced” by big data analysis as is already done in big US American and Chinese Internet enterprises. In conclusion, we argue that the sketched developments have important implications both for research practices and methods teaching in the era of big data…(More)”.

Philanthropy and the Future of Science and Technology

Book by Evan Michelson: “An increasingly important and often overlooked issue in science and technology policy is recognizing the role that philanthropies play in setting the direction of research. In an era where public and private resources for science are strained, the practices that foundations adopt to advance basic and applied research need to be better understood. This first-of-its-kind study provides a detailed assessment of the current state of science philanthropy. This examination is particularly timely, given that science philanthropies will have an increasingly important and outsized role to play in advancing responsible innovation and in shaping how research is conducted.

Philanthropy and the Future of Science and Technology surveys the landscape of contemporary philanthropic involvement in science and technology by combining theoretical insights drawn from the responsible research and innovation (RRI) framework with empirical analysis investigating an array of detailed examples and case studies. Insights from interviews conducted with foundation representatives, scholars, and practitioners from a variety of sectors add real-world perspective. A wide range of philanthropic interventions are explored, focusing on support for individuals, institutions, and networks, with attention paid to the role that science philanthropies play in helping to establish and coordinate multi-sectoral funding partnerships. Novel approaches to science philanthropy are also considered, including the emergence of crowdfunding and the development of new institutional mechanisms to advance scientific research. The discussion concludes with an imaginative look into the future, outlining a series of lessons learned that can guide how new and established science philanthropies operate and envisioning alternative scenarios for the future that can inform how science philanthropy progresses over the coming decades.

This book offers a major contribution to the advancement of philanthropic investment in science and technology. Thus, it will be of considerable interest to researchers and students in public policy, public administration, political science, science and technology studies, sociology of science, and related disciplines….(More)”.

Peer-Reviewed Scientific Journals Don’t Really Do Their Job

Article by Simine Vazire: “THE RUSH FOR scientific cures and treatments for Covid-19 has opened the floodgates of direct communication between scientists and the public. Instead of waiting for their work to go through the slow process of peer review at scientific journals, scientists are now often going straight to print themselves, posting write-ups of their work to public servers as soon as they’re complete. This disregard for the traditional gatekeepers has led to grave concerns among both scientists and commentators: Might not shoddy science—and dangerous scientific errors—make its way into the media, and spread before an author’s fellow experts can correct it? As two journalism professors suggested in an op-ed last month for The New York Times, it’s possible the recent spread of so-called preprints has only “sown confusion and discord with a general public not accustomed to the high level of uncertainty inherent in science.”

There’s another way to think about this development, however. Instead of showing (once again) that formal peer review is vital for good science, the last few months could just as well suggest the opposite. To me, at least—someone who’s served as an editor at seven different journals, and editor in chief at two—the recent spate of decisions to bypass traditional peer review gives the lie to a pair of myths that researchers have encouraged the public to believe for years: First, that peer-reviewed journals publish only trustworthy science; and second, that trustworthy science is published only in peer-reviewed journals.

Scientists allowed these myths to spread because it was convenient for us. Peer-reviewed journals came into existence largely to keep government regulators off our backs. Scientists believe that we are the best judges of the validity of each other’s work. That’s very likely true, but it’s a huge leap from that to “peer-reviewed journals publish only good science.” The most selective journals still allow flawed studies—even really terribly flawed ones—to be published all the time. Earlier this month, for instance, the journal Proceedings of the National Academy of Sciences put out a paper claiming that mandated face coverings are “the determinant in shaping the trends of the pandemic.” PNAS is a very prestigious journal, and their website claims that they are an “authoritative source” that works “to publish only the highest quality scientific research.” However, this paper was quickly and thoroughly criticized on social media; by last Thursday, 45 researchers had signed a letter formally calling for its retraction.

Now the jig is up. Scientists are writing papers that they want to share as quickly as possible, without waiting the months or sometimes years it takes to go through journal peer review. So they’re ditching the pretense that journals are a sure-fire quality control filter, and sharing their papers as self-published PDFs. This might be just the shakeup that peer review needs….(More)”.

How Facebook, Twitter and other data troves are revolutionizing social science

Heidi Ledford at Nature: “Elizaveta Sivak spent nearly a decade training as a sociologist. Then, in the middle of a research project, she realized that she needed to head back to school.

Sivak studies families and childhood at the National Research University Higher School of Economics in Moscow. In 2015, she studied the movements of adolescents by asking them in a series of interviews to recount ten places that they had visited in the past five days. A year later, she had analysed the data and was feeling frustrated by the narrowness of relying on individual interviews, when a colleague pointed her to a paper analysing data from the Copenhagen Networks Study, a ground-breaking project that tracked the social-media contacts, demographics and location of about 1,000 students, with five-minute resolution, over five months. She knew then that her field was about to change. “I realized that these new kinds of data will revolutionize social science forever,” she says. “And I thought that it’s really cool.”

With that, Sivak decided to learn how to program, and join the revolution. Now, she and other computational social scientists are exploring massive and unruly data sets, extracting meaning from society’s digital imprint. They are tracking people’s online activities; exploring digitized books and historical documents; interpreting data from wearable sensors that record a person’s every step and contact; conducting online surveys and experiments that collect millions of data points; and probing databases that are so large that they will yield secrets about society only with the help of sophisticated data analysis.

Over the past decade, researchers have used such techniques to pick apart topics that social scientists have chased for more than a century: from the psychological underpinnings of human morality, to the influence of misinformation, to the factors that make some artists more successful than others. One study uncovered widespread racism in algorithms that inform health-care decisions; another used mobile-phone data to map impoverished regions in Rwanda.

“The biggest achievement is a shift in thinking about digital behavioural data as an interesting and useful source”, says Markus Strohmaier, a computational social scientist at the GESIS Leibniz Institute for the Social Sciences in Cologne, Germany.

Not everyone has embraced that shift. Some social scientists are concerned that the computer scientists flooding into the field with ambitions as big as their data sets are not sufficiently familiar with previous research. Another complaint is that some computational researchers look only at patterns and do not consider the causes, or that they draw weighty conclusions from incomplete and messy data — often gained from social-media platforms and other sources that are lacking in data hygiene.

The barbs fly both ways. Some computational social scientists who hail from fields such as physics and engineering argue that many social-science theories are too nebulous or poorly defined to be tested.

This all amounts to “a power struggle within the social-science camp”, says Marc Keuschnigg, an analytical sociologist at Linköping University in Norrköping, Sweden. “Who in the end succeeds will claim the label of the social sciences.”

But the two camps are starting to merge. “The intersection of computational social science with traditional social science is growing,” says Keuschnigg, pointing to the boom in shared journals, conferences and study programmes. “The mutual respect is growing, also.”…(More)”.

The people solving mysteries during lockdown

Frank Swain at the BBC: “For almost half a century, Benedictine monks in Herefordshire dutifully logged the readings of a rain gauge on the grounds of Belmont Abbey, recording the quantity of rain that had fallen each month without fail. That is, until 1948, when measurements were suspended while the abbot waited for someone to repair a bullet hole in the gauge funnel.

How the bullet hole came to be there is still a mystery, but it’s just one of the stories uncovered by a team of 16,000 volunteers who have been taking part in Rainfall Rescue, a project to digitise hand-written records of British weather. The documents, held by the Met Office, contain 3.5 million datapoints and stretch as far back as 1820.

Ed Hawkins, a climate scientist at the University of Reading, leads the project. “It launched at the end of March, we realised people would have a lot of spare time on their hands,” he explains. “It was completed in 16 days. I was expecting 16 weeks, not 16 days… the volunteers absolutely blitzed it.” He says the data will be used to improve future weather predictions and climate modelling.

With millions of people trapped at home during the pandemic, citizen science projects are seeing a boom in engagement. Rainfall Rescue uses a platform called Zooniverse, which hosts dozens of projects covering everything from artworks to zebra. While the projects generally have scientific aims, many allow people to also contribute some good to the world. 

Volunteers can scour satellite images for rural houses across Africa so they can be connected to the electricity grid, for example. Another – led by researchers at the University of Nottingham in the UK – is hunting for signs of modern slavery in the shape of brick kilns in South Asia (although the project has faced some criticism for being an over-simplified way of looking at modern slavery).

Others are trying to track the spread of invasive species in the ocean from underwater photographs, or identify earthquakes and tremors by speeding up the seismic signals so they become audible and can be classified by sharp-eared volunteers. “You could type in data on old documents, count penguins, go to the Serengeti and look at track camera images – it’s an incredible array,” says Hawkins. “Whatever you’re interested in there’s something for you.”…(More)”.

A Practical Guide for Establishing an Evidence Centre

Report by Alliance for Useful Evidence: “Since 2013, Nesta and the Alliance for Useful Evidence have supported the development of more than eight evidence centres. This report draws on insight from our own experience, published material and interviews with senior leaders from a range of evidence intermediaries.

The report identifies five common ingredients that contribute to successful evidence centres:

  1. Clear objectives: Good knowledge of the centre’s intended user group(s), clear outcomes to work towards and an evidence-informed theory of change.
  2. Robust organisational development: Commitment to create an independent and sustainable organisation with effective governance and the right mix of skills and experience, over a timescale that will be sufficient to make a difference.
  3. Engaged users: Understanding users’ evidence needs and working collaboratively with them to increase their capability, motivation and opportunity to use evidence in their decision-making.
  4. Rigorous curation and creation of evidence: A robust and transparent approach to selecting and generating high-quality evidence for the centre’s users.
  5. A focus on impact: Commitment to learn from the centre’s activities, including successes and failures, so that you can increase your effectiveness in achieving your objectives…(More)”.

The “Social” Side of Big Data: Teaching BD Analytics to Political Science Students

Case report by Giampiero Giacomello and Oltion Preka: “In an increasingly technology-dependent world, it is not surprising that STEM (Science, Technology, Engineering, and Mathematics) graduates are in high demand. This state of affairs, however, has made the public overlook the fact that not only are computing and artificial intelligence naturally interdisciplinary, but a huge portion of generated data comes from human–computer interactions and is thus social in character and nature. Hence, social science practitioners should be in demand too, but this does not seem to be the case. One of the reasons for such a situation is that political and social science departments worldwide tend to remain in their “comfort zone” and see their disciplines quite traditionally, but by doing so they cut themselves off from many positions today. The authors believed that these conditions should and could be changed and thus in a few years created a specifically tailored course for students in Political Science. This paper examines the experience of the last year of such a program, which, after several tweaks and adjustments, is now fully operational. The results and students’ appreciation are quite remarkable. Hence the authors considered the experience worth sharing, so that colleagues in social and political science departments may feel encouraged to follow and replicate such an example….(More)”.

Science Alone Can’t Solve Covid-19. The Humanities Must Help

Article by Anna Magdalena Elsner and Vanesa Rampton: “…To judge by news reports, the humanities are “nice to have” — think of the entertainment value of balcony music or an online book club — but not essential for helping resolve the crisis. But as the impacts of public health measures ripple through societies, languages, and cultures, thinking critically about our reaction to SARS-CoV-2 is as important as new scientific findings about the virus. The humanities can contribute to a deeper understanding of the entrenched mentalities and social dynamics that have informed society’s response to this crisis. And by encouraging us to turn a mirror on our own selves, they prompt us to question whether we are the rational individuals that we aspire to be, and whether we are sufficiently equipped, as a society, to solve our own problems.

WE ARE CREATURES of stories. Scholarship in the medical humanities has persistently emphasized that narratives are crucial for how humans experience illness. For instance, Felicity Callard, a professor of human geography, has written about how a lack of “narrative anchors” during the early days of the Covid-19 pandemic led to confusion over what counts as a “mild” symptom and what the “normal” course of the disease looks like, ultimately heightening the suffering the disease caused. Existing social conditions, previous illnesses and disabilities, a sense of precarity — all of these factors influence our attitude toward disease and how it affects the way we exist in the world.

We are entangled with nature. We tend to imagine a human world separate from natural laws, but the novel coronavirus reminds us of the extent to which we are intricately bound up with the life around us. As philosopher David Benatar has noted, the emergence of the new coronavirus is most likely a result of our treatment of nonhuman animals. The virus has forced us to alter our behavior, likely triggering higher rates of anxiety, depression, and other stress-related responses. In essence, it has shown how what we think of as “non-human” can become a fundamental part of our lives in unexpected ways.

We react to crises in predictable fashion, and with foreseeable cognitive and moral failings. A growing body of work suggests that, although we want to act on knowledge, it is our nature to react instinctively and short-sightedly. Images of overcapacity intensive care units, for example, galvanize us to comply with lockdown restrictions, even as we have much more difficulty acting prudentially to prevent the emergence of such viruses. The desire for a quick solution has fueled a race for a vaccine, even though — as historian of science David Jones has noted — failures and false starts have been recurring themes in past attempts to handle epidemics. Even if a vaccine were available, it wouldn’t erase the striking disparities in health outcomes across class, race, and gender…(More)”.

Eye-catching advances in some AI fields are not real

Matthew Hutson at Science: “Artificial intelligence (AI) just seems to get smarter and smarter. Each iPhone learns your face, voice, and habits better than the last, and the threats AI poses to privacy and jobs continue to grow. The surge reflects faster chips, more data, and better algorithms. But some of the improvement comes from tweaks rather than the core innovations their inventors claim—and some of the gains may not exist at all, says Davis Blalock, a computer science graduate student at the Massachusetts Institute of Technology (MIT). Blalock and his colleagues compared dozens of approaches to improving neural networks—software architectures that loosely mimic the brain. “Fifty papers in,” he says, “it became clear that it wasn’t obvious what the state of the art even was.”

The researchers evaluated 81 pruning algorithms, programs that make neural networks more efficient by trimming unneeded connections. All claimed superiority in slightly different ways. But they were rarely compared properly—and when the researchers tried to evaluate them side by side, there was no clear evidence of performance improvements over a 10-year period. The result, presented in March at the Machine Learning and Systems conference, surprised Blalock’s Ph.D. adviser, MIT computer scientist John Guttag, who says the uneven comparisons themselves may explain the stagnation. “It’s the old saw, right?” Guttag said. “If you can’t measure something, it’s hard to make it better.”

Researchers are waking up to the signs of shaky progress across many subfields of AI. A 2019 meta-analysis of information retrieval algorithms used in search engines concluded the “high-water mark … was actually set in 2009.” Another study in 2019 reproduced seven neural network recommendation systems, of the kind used by media streaming services. It found that six failed to outperform much simpler, nonneural algorithms developed years before, when the earlier techniques were fine-tuned, revealing “phantom progress” in the field. In another paper posted on arXiv in March, Kevin Musgrave, a computer scientist at Cornell University, took a look at loss functions, the part of an algorithm that mathematically specifies its objective. Musgrave compared a dozen of them on equal footing, in a task involving image retrieval, and found that, contrary to their developers’ claims, accuracy had not improved since 2006. “There’s always been these waves of hype,” Musgrave says….(More)”.
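The pruning idea described in the excerpt above (trimming unneeded connections from a neural network) can be illustrated with a minimal sketch of magnitude pruning, one common heuristic that zeroes out the weights with the smallest absolute values. This is an illustration under stated assumptions, not any of the 81 algorithms the researchers evaluated: the function name `magnitude_prune`, the `sparsity` parameter, and the toy `layer` weights are all hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` (a list of rows) in which the fraction
    `sparsity` of entries with the smallest absolute value is set to 0.0.

    Illustrative only: real pruning methods differ in how they score
    connections and whether they retrain the network afterwards.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Collect every (row, column) position and order by weight magnitude.
    positions = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    positions.sort(key=lambda p: abs(weights[p[0]][p[1]]))

    # Number of connections to remove, rounded down.
    n_prune = int(len(positions) * sparsity)

    # Work on a copy so the caller's weights are left untouched.
    pruned = [list(row) for row in weights]
    for i, j in positions[:n_prune]:
        pruned[i][j] = 0.0
    return pruned


# Hypothetical toy layer with 2 x 3 weights.
layer = [[0.8, -0.05, 0.3],
         [0.01, -0.6, 0.1]]
pruned_layer = magnitude_prune(layer, 0.5)
print(pruned_layer)
```

Comparing such methods fairly would mean fixing the same network, data set, and sparsity level across all of them, which is the kind of side-by-side evaluation the excerpt says was rarely done.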

Why open science is critical to combatting COVID-19

Article by the OECD: “…In January 2020, 117 organisations – including journals, funding bodies, and centres for disease prevention – signed a statement titled “Sharing research data and findings relevant to the novel coronavirus outbreak”, committing to provide immediate open access for peer-reviewed publications at least for the duration of the outbreak, to make research findings available via preprint servers, and to share results immediately with the World Health Organization (WHO). This was followed in March by the Public Health Emergency COVID-19 Initiative, launched by 12 countries at the level of chief science advisors or equivalent, calling for open access to publications and machine-readable access to data related to COVID-19, which resulted in an even stronger commitment by publishers.

The Open COVID Pledge was launched in April 2020 by an international coalition of scientists, lawyers, and technology companies, and calls on authors to make all intellectual property (IP) under their control available, free of charge, and without encumbrances to help end the COVID-19 pandemic, and reduce the impact of the disease….

Remaining challenges

While clinical, epidemiological and laboratory data about COVID-19 is widely available, including genomic sequencing of the pathogen, a number of challenges remain:

  • Not all data is sufficiently findable, accessible, interoperable and reusable (FAIR).
  • Sources of data tend to be dispersed; even though many pooling initiatives are under way, curation needs to be operated “on the fly”.
  • Personal health records need to be readily accessible for sharing, pending the patient’s consent. Legislation aimed at fostering interoperability and avoiding information blocking is yet to be passed in many OECD countries. Access across borders is even more difficult under current data protection frameworks in most OECD countries.
  • In order to achieve the dual objectives of respecting privacy while ensuring access to machine readable, interoperable and reusable clinical data, the Virus Outbreak Data Network (VODAN) proposes to create FAIR data repositories which could be used by incoming algorithms (virtual machines) to ask specific research questions.
  • In addition, many issues arise around the interpretation of data – this can be illustrated by the widely followed epidemiological statistics. Typically, the statistics concern “confirmed cases”, “deaths” and “recoveries”. Each of these items seems to be treated differently in different countries, and is sometimes subject to methodological changes within the same country.
  • Specific standards for COVID-19 data therefore need to be established, and this is one of the priorities of the UK COVID-19 Strategy. A working group within Research Data Alliance has been set up to propose such standards at an international level.
  • In some cases it could be inferred that the transparency of the statistics may have guided governments to restrict testing in order to limit the number of “confirmed cases” and avoid the rapid rise of numbers. Lower testing rates can in turn reduce the efficiency of quarantine measures, lowering the overall efficiency of combating the disease….(More)”.