Research Project Management and Leadership


Book by P. Alison Paprica: “The project management approaches used by millions of people internationally are often too detailed or constraining to be applied to research. In this handbook, project management expert P. Alison Paprica presents guidance specifically developed to help with the planning, management, and leadership of research.

Research Project Management and Leadership provides simplified versions of globally utilized project management tools, such as the work breakdown structure to visualize scope, and offers guidance on processes, including a five-step process to identify and respond to risks. The complementary leadership guidance in the handbook is presented in the form of interview write-ups with 19 Canadian and international research leaders, each of whom describes a situation where leadership skills were important, how they responded, and what they learned. The accessible language and practical guidance in the handbook make it a valuable resource for everyone from principal investigators leading multimillion-dollar projects to graduate students planning their thesis research. The book aims to help readers understand which management and leadership tools, processes, and practices are helpful in different circumstances, and how to implement them in research settings…(More)”.

How tracking animal movement may save the planet


Article by Matthew Ponsford: “Researchers have been dreaming of an Internet of Animals. They’re getting closer to monitoring 100,000 creatures—and revealing hidden facets of our shared world…. There was something strange about the way the sharks were moving between the islands of the Bahamas.

Tiger sharks tend to hug the shoreline, explains marine biologist Austin Gallagher, but when he began tagging the 1,000-pound animals with satellite transmitters in 2016, he discovered that these predators turned away from it, toward two ancient underwater hills made of sand and coral fragments that stretch out 300 miles toward Cuba. They were spending a lot of time “crisscrossing, making highly tortuous, convoluted movements” to be near them, Gallagher says. 

It wasn’t immediately clear what attracted sharks to the area: while satellite images clearly showed the subsea terrain, they didn’t pick up anything out of the ordinary. It was only when Gallagher and his colleagues attached 360-degree cameras to the animals that they were able to confirm what they were so drawn to: vast, previously unseen seagrass meadows—a biodiverse habitat that offered a smorgasbord of prey.   

The discovery did more than solve a minor mystery of animal behavior. Using the data they gathered from the sharks, the researchers were able to map an expanse of seagrass stretching across 93,000 square kilometers of Caribbean seabed—extending the total known global seagrass coverage by more than 40%, according to a study Gallagher’s team published in 2022. This revelation could have huge implications for efforts to protect threatened marine ecosystems—seagrass meadows are a nursery for one-fifth of key fish stocks and habitats for endangered marine species—and also for all of us above the waves, as seagrasses can capture carbon up to 35 times faster than tropical rainforests. 

Animals have long been able to offer unique insights about the natural world around us, acting as organic sensors picking up phenomena that remain invisible to humans. More than 100 years ago, leeches signaled storms ahead by slithering out of the water; canaries warned of looming catastrophe in coal mines until the 1980s; and mollusks that close when exposed to toxic substances are still used to trigger alarms in municipal water systems in Minneapolis and Poland…(More)”.

Language Machinery


Essay by Richard Hughes Gibson: “… current debates about writing machines are not as fresh as they seem. As is quietly acknowledged in the footnotes of scientific papers, much of the intellectual infrastructure of today’s advances was laid decades ago. In the 1940s, the mathematician Claude Shannon demonstrated that language use could be both described by statistics and imitated with statistics, whether those statistics were in human heads or a machine’s memory. Shannon, in other words, was the first statistical language modeler, which makes ChatGPT and its ilk his distant brainchildren. Shannon never tried to build such a machine, but some astute early readers of his work recognized that computers were primed to translate his paper-and-ink experiments into a powerful new medium. In writings now discussed largely in niche scholarly and computing circles, these readers imagined—and even made preliminary sketches of—machines that would translate Shannon’s proposals into reality. These readers likewise raised questions about the meaning of such machines’ outputs and wondered what the machines revealed about our capacity to write.

The current barrage of commentary has largely neglected this backstory, and our discussions suffer for forgetting that issues that appear novel to us belong to the mid-twentieth century. Shannon and his first readers were the original residents of the headspace in which so many of us now find ourselves. Their ambitions and insights have left traces on our discourse, just as their silences and uncertainties haunt our exchanges. If writing machines constitute a “philosophical event” or a “prompt for philosophizing,” then I submit that we are already living in the event’s aftermath, which is to say, in Shannon’s aftermath. Amid the rampant speculation about a future dominated by writing machines, I propose that we turn in the other direction to listen to field reports from some of the first people to consider what it meant to read and write in Shannon’s world…(More)”.
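To make Shannon’s idea concrete (that language use can be imitated with statistics), here is a minimal word-bigram sketch in Python. It is not from Gibson’s essay; the toy corpus, the use of word bigrams, and the seed word are arbitrary choices, but the mechanism of counting what tends to follow what and then sampling from those counts is essentially Shannon’s paper-and-ink experiment in executable form.

```python
import random
from collections import defaultdict

# Toy corpus; Shannon worked from printed English, but any text will do here.
corpus = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat saw the dog and the dog saw the cat"
).split()

# Estimate bigram statistics: for each word, record which words follow it.
successors = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word].append(next_word)

def generate(seed_word="the", length=12, rng=random.Random(0)):
    """Imitate the corpus by repeatedly sampling a statistically likely next word."""
    words = [seed_word]
    for _ in range(length - 1):
        candidates = successors.get(words[-1])
        if not candidates:  # dead end: no observed successor
            break
        words.append(rng.choice(candidates))
    return " ".join(words)

print(generate())
```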

Toward a 21st Century National Data Infrastructure: Managing Privacy and Confidentiality Risks with Blended Data


Report by the National Academies of Sciences, Engineering, and Medicine: “Protecting privacy and ensuring confidentiality in data is a critical component of modernizing our national data infrastructure. The use of blended data – combining previously collected data sources – presents new considerations for responsible data stewardship. Toward a 21st Century National Data Infrastructure: Managing Privacy and Confidentiality Risks with Blended Data provides a framework for managing disclosure risks that accounts for the unique attributes of blended data and poses a series of questions to guide considered decision-making.

Technical approaches to manage disclosure risk have advanced. Recent federal legislation, regulation, and guidance have broadly described the roles and responsibilities for stewardship of blended data. The report, drawing on the panel’s review of both technical and policy approaches, addresses these emerging opportunities and the new challenges and responsibilities they present. The report underscores that trade-offs in disclosure risks, disclosure harms, and data usefulness are unavoidable and are central considerations when planning data-release strategies, particularly for blended data…(More)”.
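The report’s framework is not reproduced in the excerpt above, so as one concrete and widely used illustration of what a disclosure-risk check on blended data can involve, the sketch below computes k-anonymity over a set of assumed quasi-identifiers with pandas. The column names, the hypothetical records, and the release threshold of k ≥ 3 are illustrative assumptions, not the panel’s recommendations.

```python
import pandas as pd

# Hypothetical blended file: survey records linked to administrative data.
blended = pd.DataFrame({
    "age_band": ["30-39", "30-39", "40-49", "40-49", "40-49", "60-69"],
    "zip3":     ["021",   "021",   "100",   "100",   "100",   "981"],
    "sex":      ["F",     "F",     "M",     "M",     "F",     "F"],
    "income":   [54000,   61000,   72000,   68000,   59000,   83000],
})

# Quasi-identifiers: fields that, in combination, might re-identify a person.
quasi_identifiers = ["age_band", "zip3", "sex"]

# k-anonymity: the size of the smallest group sharing a quasi-identifier combination.
group_sizes = blended.groupby(quasi_identifiers).size()
k = int(group_sizes.min())
print(f"k-anonymity of the blended file: k = {k}")

# Flag combinations below an (illustrative) release threshold of k >= 3;
# these cells would need suppression or coarsening before release.
print(group_sizes[group_sizes < 3])
```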

Enabling Data-Driven Innovation: Learning from Korea’s Data Policies and Practices for Harnessing AI


Report by the World Bank: “Over the past few decades, the Republic of Korea has consciously undertaken initiatives to transform its economy into a competitive, data-driven system. The primary objectives of this transition were to stimulate economic growth and job creation, enhance the nation’s capacity to withstand adversities such as the aftermath of COVID-19, and position it favorably to capitalize on emerging technologies, particularly artificial intelligence (AI). The Korean government has endeavored to accomplish these objectives by establishing a dependable digital data infrastructure and a comprehensive set of national data policies. This policy note aims to present a comprehensive synopsis of Korea’s extensive efforts to establish a robust digital data infrastructure and utilize data as a key driver for innovation and economic growth. The note additionally addresses the fundamental elements required to realize the benefits of data, including data policies, data governance, and data infrastructure. Furthermore, the note highlights some key results of Korea’s data policies, including the expansion of public data opening, the development of big data platforms, and the growth of the AI Hub. It also mentions the characteristics and success factors of Korea’s data policy, such as government support and the reorganization of institutional infrastructures. However, it acknowledges that there are still challenges to overcome, such as improving data collection and utilization and transitioning from a government-led to a market-friendly data policy. The note concludes by providing developing countries and emerging economies with specific insights derived from Korea’s forward-thinking policymaking that can assist them in harnessing the potential and benefits of data…(More)”.

Why Machines Learn: The Elegant Maths Behind Modern AI


Book by Anil Ananthaswamy: “Machine-learning systems are making life-altering decisions for us: approving mortgage loans, determining whether a tumour is cancerous, or deciding whether someone gets bail. They now influence discoveries in chemistry, biology and physics – the study of genomes, extra-solar planets, even the intricacies of quantum systems.

We are living through a revolution in artificial intelligence that is not slowing down. This major shift is based on simple mathematics, some of which goes back centuries: linear algebra and calculus, the stuff of eighteenth-century mathematics. Indeed, by the mid-1850s, a lot of the groundwork had already been done. It took the development of computer science and the kindling of 1990s computer chips designed for video games to ignite the explosion of AI that we see all around us today. In this enlightening book, Anil Ananthaswamy explains the fundamental maths behind AI and suggests that the basics of natural and artificial intelligence might follow the same mathematical rules…(More)”.
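The book’s worked examples are not part of this excerpt; as a generic illustration of how far that older mathematics reaches, the sketch below fits a line to noisy data by gradient descent, using nothing beyond a dot product (linear algebra) and a derivative (calculus). The data, learning rate, and iteration count are arbitrary choices.

```python
import numpy as np

# Toy data: y is roughly 2*x + 1 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)

# Design matrix with a bias column (linear algebra).
X = np.column_stack([x, np.ones_like(x)])
w = np.zeros(2)  # parameters to learn: slope and intercept

learning_rate = 0.01
for _ in range(5000):
    residuals = X @ w - y                      # model error on the data
    gradient = 2 * X.T @ residuals / len(y)    # derivative of mean squared error (calculus)
    w -= learning_rate * gradient              # step downhill

print(f"learned slope ~ {w[0]:.2f}, intercept ~ {w[1]:.2f}")  # should be close to 2 and 1
```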

Do disappearing data repositories pose a threat to open science and the scholarly record?


Article by Dorothea Strecker, Heinz Pampel, Rouven Schabinger and Nina Leonie Weisweiler: “Research data repositories, such as Zenodo or the UK Data Archive, are specialised information infrastructures that focus on the curation and dissemination of research data. One of a repository’s main tasks is maintaining its collections long-term; see, for example, the TRUST Principles or the requirements of the certification organization CoreTrustSeal. Long-term preservation is also a prerequisite for several data practices that are getting increasing attention, such as data reuse and data citation.

For data to remain usable, the infrastructures that host them also have to be kept operational. However, the long-term operation of research data repositories is challenging, and sometimes, for varying reasons and despite best efforts, they are shut down….

In a recent study we therefore set out to take an infrastructure perspective on the long-term preservation of research data by investigating repositories across disciplines and types that were shut down. We also tried to estimate the impact of repository shutdown on data availability…

We found that repository shutdown was not rare: 6.2% of all repositories listed in re3data were shut down. Since the launch of the registry in 2012, at least one repository has been shut down each year (see Fig. 1). The median age of a repository at shutdown was 12 years…(More)”.

How Much of the World Is It Possible to Model?


Article by Dan Rockmore: “…Modelling, in general, is now routine. We model everything, from elections to economics, from the climate to the coronavirus. Like model cars, model airplanes, and model trains, mathematical models aren’t the real thing—they’re simplified representations that get the salient parts right. Like fashion models, model citizens, and model children, they’re also idealized versions of reality. But idealization and abstraction can be forms of strength. In an old mathematical-modelling joke, a group of experts is hired to improve milk production on a dairy farm. One of them, a physicist, suggests, “Consider a spherical cow.” Cows aren’t spheres any more than brains are jiggly sponges, but the point of modelling—in some ways, the joy of it—is to see how far you can get by using only general scientific principles, translated into mathematics, to describe messy reality.

To be successful, a model needs to replicate the known while generalizing into the unknown. This means that, as more becomes known, a model has to be improved to stay relevant. Sometimes new developments in math or computing enable progress. In other cases, modellers have to look at reality in a fresh way. For centuries, a predilection for perfect circles, mixed with a bit of religious dogma, produced models that described the motion of the sun, moon, and planets in an Earth-centered universe; these models worked, to some degree, but never perfectly. Eventually, more data, combined with more expansive thinking, ushered in a better model—a heliocentric solar system based on elliptical orbits. This model, in turn, helped kick-start the development of calculus, reveal the law of gravitational attraction, and fill out our map of the solar system. New knowledge pushes models forward, and better models help us learn.

Predictions about the universe are scientifically interesting. But it’s when models make predictions about worldly matters that people really pay attention. We anxiously await the outputs of models run by the Weather Channel, the Fed, and fivethirtyeight.com. Models of the stock market guide how our pension funds are invested; models of consumer demand drive production schedules; models of energy use determine when power is generated and where it flows. Insurers model our fates and charge us commensurately. Advertisers (and propagandists) rely on A.I. models that deliver targeted information (or disinformation) based on predictions of our reactions.

But it’s easy to get carried away…(More)”.
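A small, hedged illustration of Rockmore’s point that a successful model must replicate the known while generalizing into the unknown (not drawn from the essay; the data and both candidate models are invented): two models fit the same handful of observations about equally well, yet extrapolate very differently.

```python
import numpy as np

# "The known": a handful of noisy observations of an underlying linear trend.
rng = np.random.default_rng(1)
t_known = np.arange(8)
y_known = 3.0 + 0.5 * t_known + rng.normal(scale=0.4, size=t_known.size)

# Two candidate models: a simple line and a flexible degree-5 polynomial.
simple = np.polynomial.Polynomial.fit(t_known, y_known, deg=1)
flexible = np.polynomial.Polynomial.fit(t_known, y_known, deg=5)

# Both replicate the known reasonably well...
print("in-sample error, simple:  ", float(np.mean((simple(t_known) - y_known) ** 2)))
print("in-sample error, flexible:", float(np.mean((flexible(t_known) - y_known) ** 2)))

# ...but generalizing into the unknown (t = 12) is where they diverge.
t_future = 12
print("forecast at t=12, simple:  ", float(simple(t_future)))    # stays near the underlying trend (about 9)
print("forecast at t=12, flexible:", float(flexible(t_future)))  # may swing far from it
```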

Missing Evidence: Tracking Academic Data Use around the World


Report by the World Bank: “Data-driven research on a country is key to producing evidence-based public policies. Yet little is known about where data-driven research is lacking and how it could be expanded. This paper proposes a method for tracking academic data use by country of subject, applying natural language processing to open-access research papers. The model’s predictions produce country-level estimates of the number of articles using data; these estimates are highly correlated with a human-coded approach, with a correlation of 0.99. Analyzing more than 1 million academic articles, the paper finds that the number of articles on a country is strongly correlated with its gross domestic product per capita, population, and the quality of its national statistical system. The paper identifies data sources that are strongly associated with data-driven research and finds that availability of subnational data appears to be particularly important. Finally, the paper classifies countries into groups based on whether they could most benefit from increasing their supply of or demand for data. The findings show that the former applies to many low- and lower-middle-income countries, while the latter applies to many upper-middle- and high-income countries…(More)”.
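The paper’s actual model is not described in this summary, so the sketch below is only a schematic stand-in for the general shape of the task: classify open-access abstracts as data-driven or not, then attribute them to a country of subject. The training snippets, the country list, and the classifier choice (a TF-IDF pipeline from scikit-learn) are invented for illustration and are not the authors’ method.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training snippets labelled for "uses data" (1) vs. "does not" (0).
abstracts = [
    "We analyse household survey microdata to estimate poverty rates.",
    "Administrative tax records are linked to census data for the analysis.",
    "This essay reviews the theoretical literature on institutions.",
    "We offer a conceptual framework without empirical analysis.",
]
labels = [1, 1, 0, 0]

# A simple text classifier standing in for the paper's NLP model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(abstracts, labels)

# Naive country attribution by name matching; a real system would need far more care.
countries = ["Kenya", "Viet Nam", "Peru"]

def tag_article(text):
    """Return (uses_data, countries_mentioned) for one abstract."""
    uses_data = bool(model.predict([text])[0])
    mentioned = [c for c in countries if c.lower() in text.lower()]
    return uses_data, mentioned

example = "Using labour force survey data from Kenya, we study informal employment."
print(tag_article(example))  # likely: (True, ['Kenya'])
```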

Ground Truths Are Human Constructions


Article by Florian Jaton: “Artificial intelligence algorithms are human-made, cultural constructs, something I saw first-hand as a scholar and technician embedded with AI teams for 30 months. Among the many concrete practices and materials these algorithms need in order to come into existence are sets of numerical values that enable machine learning. These referential repositories are often called “ground truths,” and when computer scientists construct or use these datasets to design new algorithms and attest to their efficiency, the process is called “ground-truthing.”

Understanding how ground-truthing works can reveal inherent limitations of algorithms—how they enable the spread of false information, pass biased judgments, or otherwise erode society’s agency—and this could also catalyze more thoughtful regulation. As long as ground-truthing remains clouded and abstract, society will struggle to prevent algorithms from causing harm and to optimize algorithms for the greater good.

Ground-truth datasets define AI algorithms’ fundamental goal of reliably predicting and generating a specific output—say, an image with requested specifications that resembles other input, such as web-crawled images. In other words, ground-truth datasets are deliberately constructed. As such, they, along with their resultant algorithms, are limited and arbitrary and bear the sociocultural fingerprints of the teams that made them…(More)”.
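Jaton’s account is ethnographic, but the mechanics of ground-truthing he describes can be sketched in a few lines: a team constructs a labelled reference set, trains against part of it, and then attests to the algorithm’s efficiency against the held-out rest, so any reported accuracy is relative to the team’s own construction. The texts, labels, and model below are invented for illustration.

```python
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# The "ground truth": texts plus labels the team itself decided on.
# A different team could label borderline items differently.
texts = [
    "free prize, click now", "cheap loans, act today", "meeting moved to 3pm",
    "lunch tomorrow?", "win money fast", "project report attached",
    "claim your reward", "see you at the seminar",
]
labels = ["spam", "spam", "ham", "ham", "spam", "ham", "spam", "ham"]

# Ground-truthing in miniature: split the constructed dataset, train on one part,
# then "attest to efficiency" by scoring against the held-out part of the same construct.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)
vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(X_train), y_train)
predictions = model.predict(vectorizer.transform(X_test))

# The reported accuracy is accuracy relative to this ground truth, not to the world.
print("accuracy against the constructed ground truth:", accuracy_score(y_test, predictions))
```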