Text as Data: A New Framework for Machine Learning and the Social Sciences


Book by Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart: “From social media posts and text messages to digital government documents and archives, researchers are bombarded with a deluge of text reflecting the social world. This textual data gives unprecedented insights into fundamental questions in the social sciences, humanities, and industry. Meanwhile new machine learning tools are rapidly transforming the way science and business are conducted. Text as Data shows how to combine new sources of data, machine learning tools, and social science research design to develop and evaluate new insights.

Text as Data is organized around the core tasks in research projects using text—representation, discovery, measurement, prediction, and causal inference. The authors offer a sequential, iterative, and inductive approach to research design. Each research task is presented complete with real-world applications, example methods, and a distinct style of task-focused research.

Bridging many divides—computer science and social science, the qualitative and the quantitative, and industry and academia—Text as Data is an ideal resource for anyone wanting to analyze large collections of text in an era when data is abundant and computation is cheap, but the enduring challenges of social science remain…(More)”.

Big data, computational social science, and other recent innovations in social network analysis


Paper by David Tindall, John McLevey, Yasmin Koop-Monteiro, and Alexander Graham: “While sociologists have studied social networks for about one hundred years, recent developments in data, technology, and methods of analysis provide opportunities for social network analysis (SNA) to play a prominent role in the new research world of big data and computational social science (CSS). In our review, we focus on four broad topics: (1) Collecting Social Network Data from the Web, (2) Non-traditional and Bipartite/Multi-mode Networks, including Discourse and Semantic Networks, and Social-Ecological Networks, (3) Recent Developments in Statistical Inference for Networks, and (4) Ethics in Computational Network Research…(More)”.
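The bipartite (two-mode) networks mentioned in point (2) link two distinct node sets, for example authors and the documents or hashtags they use, and are often projected into a one-mode network before analysis. A minimal sketch of that idea using Python’s networkx library; the author–document example and node names are illustrative assumptions, not taken from the paper:

```python
import networkx as nx
from networkx.algorithms import bipartite

# Build a small two-mode (bipartite) network: authors on one side,
# documents they contributed to on the other.
B = nx.Graph()
authors = ["author_A", "author_B", "author_C"]
documents = ["doc_1", "doc_2"]
B.add_nodes_from(authors, bipartite=0)
B.add_nodes_from(documents, bipartite=1)
B.add_edges_from([
    ("author_A", "doc_1"),
    ("author_B", "doc_1"),
    ("author_B", "doc_2"),
    ("author_C", "doc_2"),
])

# Project onto the author side: two authors are tied if they share a document.
author_net = bipartite.projected_graph(B, authors)
print(list(author_net.edges()))
```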

The giant plan to track diversity in research journals


Article by Holly Else & Jeffrey M. Perkel: “In the next year, researchers should expect to face a sensitive set of questions whenever they send their papers to journals, and when they review or edit manuscripts. More than 50 publishers representing over 15,000 journals globally are preparing to ask scientists about their race or ethnicity — as well as their gender — in an initiative that’s part of a growing effort to analyse researcher diversity around the world. Publishers say that this information, gathered and stored securely, will help to analyse who is represented in journals, and to identify whether there are biases in editing or review that sway which findings get published. Pilot testing suggests that many scientists support the idea, although not all.

The effort comes amid a push for a wider acknowledgement of racism and structural racism in science and publishing — and the need to gather more information about it. In any one country, such as the United States, ample data show that minority groups are under-represented in science, particularly at senior levels. But data on how such imbalances are reflected — or intensified — in research journals are scarce. Publishers haven’t systematically looked, in part because journals are international and there has been no measurement framework for race and ethnicity that made sense to researchers of many cultures.

“If you don’t have the data, it is very difficult to understand where you are at, to make changes, set goals and measure progress,” says Holly Falk-Krzesinski, vice-president of research intelligence at the Dutch publisher Elsevier, who is working with the joint group and is based in Chicago, Illinois.

In the absence of data, some scientists have started measuring for themselves. Computational researchers are scouring the literature using software that tries to estimate racial and ethnic diversity across millions of published research articles, and to examine biases in who is represented or cited. Separately, over the past two years, some researchers have criticized publishers for not having diversity data already, and especially for being slow to collate information about small groups of elite decision makers: journal editors and editorial boards. At least one scientist has started publicizing those numbers himself….(More)”.
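The software described above typically works by inferring a probable ethnicity category from author names and then aggregating shares across a corpus. A heavily simplified, hypothetical sketch of that logic in Python; the name-to-category table and the sample authorship records are invented for illustration, and real tools rely on large labelled datasets and probabilistic classifiers, with well-documented error rates and ethical caveats:

```python
from collections import Counter

# Hypothetical hand-made lookup standing in for a trained name classifier.
NAME_MODEL = {
    "garcia": "Hispanic",
    "nguyen": "Asian",
    "smith": "White",
    "okafor": "Black",
}

def classify_author(surname: str) -> str:
    """Return a coarse inferred category for a surname, or 'unknown'."""
    return NAME_MODEL.get(surname.lower(), "unknown")

def corpus_diversity(author_surnames: list[str]) -> dict[str, float]:
    """Share of each inferred category among authors in a corpus."""
    counts = Counter(classify_author(s) for s in author_surnames)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

# Illustrative authorship records, not real data.
sample = ["Garcia", "Nguyen", "Smith", "Smith", "Okafor", "Ivanov"]
print(corpus_diversity(sample))
```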

NIH issues a seismic mandate: share data publicly


Max Kozlov at Nature: “In January 2023, the US National Institutes of Health (NIH) will begin requiring most of the 300,000 researchers and 2,500 institutions it funds annually to include a data-management plan in their grant applications — and to eventually make their data publicly available.

Researchers who spoke to Nature largely applaud the open-science principles underlying the policy — and the global example it sets. But some have concerns about the logistical challenges that researchers and their institutions will face in complying with it. Namely, they worry that the policy might exacerbate existing inequities in the science-funding landscape and could be a burden for early-career scientists, who do the lion’s share of data collection and are already stretched thin.

Because the vast majority of laboratories and institutions don’t have data managers who organize and curate data, the policy — although well-intentioned — will probably put a heavy burden on trainees and early-career principal investigators, says Lynda Coughlan, a vaccinologist at the University of Maryland School of Medicine in Baltimore, who has been leading a research team for fewer than two years and is worried about what the policy will mean for her.

Jorgenson says that, although the policy might require researchers to spend extra time organizing their data, it’s an essential part of conducting research, and the potential long-term boost in public trust for science will justify the extra effort…(More)”.

The chronic growing pains of communicating science online


Dominique Brossard and Dietram A. Scheufele at Science: “Almost a decade ago, we wrote, “Without applied research on how to best communicate science online, we risk creating a future where the dynamics of online communication systems have a stronger impact on public views about science than the specific research that we as scientists are trying to communicate”. Since then, the footprint of subscription-based news content has slowly shrunk. Meanwhile, microtargeted information increasingly dominates social media, curated and prioritized algorithmically on the basis of audience demographics, an abundance of digital trace data, and other consumer information. Partly as a result, hyperpolarized public attitudes on issues such as COVID-19 vaccines or climate change emerge and grow in separate echo chambers.

Scientists have been slow to adapt to a shift in power in the science information ecosystem—changes that are not likely to reverse. The business-as-usual response to this challenge from many parts of the scientific community—especially in science, technology, engineering, and mathematics fields—has been frustrating to those who conduct research on science communication. Many scientists-turned-communicators continue to see online communication environments mostly as tools for resolving information asymmetries between experts and lay audiences. As a result, they blog, tweet, and post podcasts and videos to promote public understanding and excitement about science. To be fair, this has been driven most recently by a demand from policy-makers and from audiences interested in policy and decision-relevant science during the COVID-19 pandemic.

Unfortunately, social science research suggests that rapidly evolving online information ecologies are likely to be minimally responsive to scientists who upload content—however engaging it may seem—to TikTok or YouTube. In highly contested national and global information environments, the scientific community is just one of many voices competing for attention and public buy-in about a range of issues, from COVID-19 to artificial intelligence to genetic engineering, among other topics. This competition for public attention has produced at least three urgent lessons that the scientific community must face as online information environments rapidly displace traditional, mainstream media….(More)”.

By focusing on outputs, rather than people, we misunderstand the real impact of research


Blog by Paul Nightingale and Rebecca Vine: “Increases in funding for research come with a growing expectation that researchers will do more to improve social welfare, economic prosperity and more broadly foster innovation. It is widely accepted that innovation is a key driver of long-term economic growth and that public funding for research complements private investment. What is more contested is how research delivers impact: whether it comes from the kind of linear process of knowledge transfer from researcher to user that is sought for and often narrated in REF impact case studies, or whether the indirect effects of research, such as expertise, networks, instrumentation, methods and trained students, are as important as the discoveries….

One reason research is so important is that, as the economy has changed, demand for experts has increased. As we noted in a Treasury report over 20 years ago, often the most valuable output of research is ‘talent, not technology’. The ‘post-graduate premium’ that a Master’s qualification adds to starting salaries is evidence of this. But why is expertise so valuable? Experts don’t just know more than novices; they understand things differently, drawing on more abstract, ‘deeper’ representations. Research on chess grandmasters, for example, shows that they understand chess-piece configurations by seeing patterns. They can see a Sicilian defence, while novices just see a selection of chess pieces. Their expertise enables them to configure chess positions more effectively and solve problems more rapidly. They draw different conclusions from novices, typically starting closer to more robust solutions, finding solutions faster, and exploring fewer dead-ends….

Research is extremely important because innovation requires more diverse and deeper stocks of knowledge. Academics with field expertise and highly developed research skills can play a valuable and important role co-producing research and creating impact. These observations are drawn from our ESRC-funded research collaboration with the UK government – known as Project X. Within a year Project X became the mechanism to coordinate the Cabinet Office’s areas of research interest (ARIs) for government major project delivery. This required a sophisticated governance structure and the careful coordination of a mixed portfolio of practice-focused and theoretical research…(More)”.

Crowdsourcing research questions in science


Paper by Susanne Beck, Tiare-Maria Brasseur, Marion Poetz and Henry Sauermann: “Scientists are increasingly crossing the boundaries of the professional system by involving the general public (the crowd) directly in their research. However, this crowd involvement tends to be confined to empirical work and it is not clear whether and how crowds can also be involved in conceptual stages such as formulating the questions that research is trying to address. Drawing on five different “paradigms” of crowdsourcing and related mechanisms, we first discuss potential merits of involving crowds in the formulation of research questions (RQs). We then analyze data from two crowdsourcing projects in the medical sciences to describe key features of RQs generated by crowd members and compare the quality of crowd contributions to that of RQs generated in the conventional scientific process. We find that the majority of crowd contributions are problem restatements that can be useful to assess problem importance but provide little guidance regarding potential causes or solutions. At the same time, crowd-generated research questions frequently cross disciplinary boundaries by combining elements from different fields within and especially outside medicine. Using evaluations by professional scientists, we find that the average crowd contribution has lower novelty and potential scientific impact than professional research questions, but comparable practical impact. Crowd contributions outperform professional RQs once we apply selection mechanisms at the level of individual contributors or across contributors. Our findings advance research on crowd and citizen science, crowdsourcing and distributed knowledge production, as well as the organization of science. We also inform ongoing policy debates around the involvement of citizens in research in general, and agenda setting in particular…(More)”.
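The selection mechanisms the authors mention can be thought of as simple filters applied to evaluator ratings: keeping only the best-rated question from each contributor, or the top few across all contributors. A hedged sketch of that logic in Python; the data structure, field names and example scores are illustrative assumptions, not the paper’s actual procedure:

```python
from dataclasses import dataclass

@dataclass
class Contribution:
    contributor: str
    question: str
    rating: float  # e.g. an average evaluator score for novelty/impact

def best_per_contributor(contributions: list[Contribution]) -> list[Contribution]:
    """Within-contributor selection: keep each person's highest-rated question."""
    best: dict[str, Contribution] = {}
    for c in contributions:
        if c.contributor not in best or c.rating > best[c.contributor].rating:
            best[c.contributor] = c
    return list(best.values())

def top_k_overall(contributions: list[Contribution], k: int = 5) -> list[Contribution]:
    """Across-contributor selection: keep the k highest-rated questions overall."""
    return sorted(contributions, key=lambda c: c.rating, reverse=True)[:k]

# Illustrative contributions, not data from the study.
pool = [
    Contribution("crowd_member_1", "Does X affect Y in patients over 65?", 3.2),
    Contribution("crowd_member_1", "Is treatment Z cost-effective?", 4.1),
    Contribution("crowd_member_2", "Can wearables detect early symptoms?", 4.6),
]
print([c.question for c in best_per_contributor(pool)])
print([c.question for c in top_k_overall(pool, k=2)])
```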

Citizen science and the right to research: building local knowledge of climate change impacts


Paper by Sarita Albagli & Allan Yu Iwama: “The article presents results of a research project aiming to develop theoretical and empirical contributions on participatory approaches and methods of citizen science for risk mapping and adaptation to climate change. In the first part, the paper presents a review of the literature on key concepts and perspectives related to participatory citizen science, introducing the concept of the “right to research”. It highlights the mutual fertilization with participatory mapping methods to deal with disaster situations associated with climate change. In the second part, the paper describes the results and conclusions of an action-research project conducted on the coastline between the states of São Paulo and Rio de Janeiro, Brazil, in 2017–2018. It involved affected communities as protagonists in mapping and managing risks of natural disasters caused by extreme climate events, by combining citizen science approaches and methods with Participatory Geographic Information Systems (PGIS) and social cartography. The article concludes by pointing out the contributions and limits of the “right to research” as a relevant Social Science approach to reframe citizen science from a democratic view….(More)”.

Public Provides NASA with New Innovations through Prize Competitions, Crowdsourcing, Citizen Science Opportunities


NASA Report: “Whether problem-solving during the pandemic, establishing a long-term presence at the Moon, or advancing technology to adapt to life in space, NASA has leveraged open innovation tools to inspire solutions to some of our most timely challenges – while using the creativity of everyone from garage tinkerers to citizen scientists and students of all ages.

Open Innovation: Boosting NASA Higher, Faster, and Farther highlights some of those breakthroughs, which accelerate space technology development and discovery while giving the public a gateway to work with NASA. Open innovation initiatives include problem-focused challenges and prize competitions, data hackathons, citizen science, and crowdsourcing projects that invite the public to lend their skills, ideas, and time to support NASA research and development programs.

NASA engaged the public with 56 public prize competitions and challenges and 14 citizen science and crowdsourcing activities over fiscal years 2019 and 2020. NASA awarded $2.2 million in prize money, and members of the public submitted over 11,000 solutions during that period.

“NASA’s accomplishments have hardly been NASA’s alone. Tens of thousands more individuals from academic institutions, private companies, and other space agencies also contribute to these solutions. Open innovation expands the NASA community and broadens the agency’s capacity for innovation and discovery even further,” said Amy Kaminski, Prizes, Challenges, and Crowdsourcing program executive at NASA Headquarters in Washington. “We harness the perspectives, expertise, and enthusiasm of ‘the crowd’ to gain diverse solutions, speed up projects, and reduce costs.”

This edition of the publication highlights:

  • How NASA used open innovation tools to accelerate the pace of problem-solving during the COVID-19 pandemic, enabling a sprint of creativity to create valuable solutions in support of this global crisis
  • How NASA invited everyone to embrace the Moon as a technological testing ground through public prize competitions and challenges, sparking development that could help prolong human stays on the Moon and lay the foundation for human exploration to Mars and beyond  
  • How citizen scientists gather, sort, and upload data, resulting in fruitful partnerships between the public and NASA scientists
  • How NASA’s student-focused challenges have changed lives and positively impacted underserved communities…(More)”.

Evidence Commission issues wake-up call and path forward for relying on evidence


Press Release and Report by Global Commission on Evidence: “‘Slow burn’ societal challenges like educational achievement, health-system performance and climate change have taken a backseat to the global pandemic, now entering its third year. But a global commission report, released today, finds that decision-makers responding to present-day societal challenges and tomorrow’s crises have an unprecedented opportunity to build on what has worked in using evidence before and during the pandemic.

“Since the start of the COVID-19 pandemic, I’ve never before seen so much interest – from political leaders of many political persuasions and in diverse countries – in drawing on evidence to inform their response,” said John Lavis, co-lead of the secretariat for The Global Commission on Evidence to Address Societal Challenges. “This is an incredible opportunity to dramatically up our game in supporting political leaders to use evidence to address societal challenges at a global, national and local level.” … Among its top eight recommendations are the following:
• Wake-up call — Decision-makers, evidence intermediaries and impact-oriented evidence producers should recognize the scale and nature of the problem.
• Resolution by multilateral organizations — The UN, the G20 and other multilateral organizations should endorse a resolution that commits these multilateral organizations and their member states to broaden their conception of evidence, and to support evidence-related global public goods and equitably distributed capacities to produce, share and use evidence.
• Landmark report — The World Bank should dedicate an upcoming World Development Report to providing the design of the evidence architecture needed globally, regionally and nationally, including the required investments in evidence related global public goods and in equitably distributed capacities to produce, share and use evidence.
• National (and sub-national) evidence-support systems — Every national (and sub-national) government should review their existing evidence-support system (and broader evidence infrastructure), fill the gaps both internally and through partnerships, and report publicly on their progress.
• Evidence in everyday life — Citizens should consider making decisions about their and their families’ well-being based on best evidence; spending their money on products and services that are backed by best evidence; volunteering their time and donating money to initiatives that use evidence to make decisions about what they do and how they do it; and supporting politicians who commit to using best evidence to address societal challenges and who commit (along with others) to supporting the use of evidence in everyday life.
• Dedicated evidence intermediaries — Dedicated evidence intermediaries should step forward to fill gaps left by government, provide continuity if staff turn-over in government is frequent, and leverage strong connections to global networks.
• News and social-media platforms — News and social-media platforms should build relationships with dedicated evidence intermediaries who can help leverage sources of best evidence, and with evidence producers who can help communicate evidence effectively, as well as ensure their algorithms present best evidence and combat misinformation….(More)”.