How many yottabytes in a quettabyte? Extreme numbers get new names


Article by Elizabeth Gibney: “By the 2030s, the world will generate around a yottabyte of data per year — that’s 10²⁴ bytes, or the amount that would fit on DVDs stacked all the way to Mars. Now, the booming growth of the data sphere has prompted the governors of the metric system to agree on new prefixes beyond that magnitude, to describe the outrageously big and small.

Representatives from governments worldwide, meeting at the General Conference on Weights and Measures (CGPM) outside Paris on 18 November, voted to introduce four new prefixes to the International System of Units (SI) with immediate effect. The prefixes ronna and quetta represent 10²⁷ and 10³⁰, and ronto and quecto signify 10⁻²⁷ and 10⁻³⁰. Earth weighs around one ronnagram, and an electron’s mass is about one rontogram.

This is the first update to the prefix system since 1991, when the organization added zetta (10²¹), zepto (10⁻²¹), yotta (10²⁴) and yocto (10⁻²⁴). In that case, metrologists were adapting to fit the needs of chemists, who wanted a way to express SI units on the scale of Avogadro’s number — the 6 × 10²³ units in a mole, a measure of the quantity of substances. The more familiar prefixes peta and exa were added in 1975 (see ‘Extreme figures’).
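The stacked-DVDs figure is easy to sanity-check. A short back-of-envelope script (the disc capacity, thickness, and Earth–Mars distances are our own assumed round numbers, not from the article):

```python
# Back-of-envelope check of the DVDs-to-Mars claim.
# Assumptions (not from the article): 4.7 GB single-layer DVDs,
# 1.2 mm per disc, Earth-Mars distance roughly 55-400 million km.
YOTTABYTE = 10**24        # bytes
DVD_BYTES = 4.7e9         # capacity of a single-layer DVD
DVD_THICKNESS_M = 1.2e-3  # metres per disc

discs = YOTTABYTE / DVD_BYTES
stack_km = discs * DVD_THICKNESS_M / 1000
print(f"{discs:.2e} discs, stack of about {stack_km / 1e6:.0f} million km")
```

At roughly 255 million km, the stack does indeed fall within the Earth–Mars distance range for much of the planets’ orbits, so the claim holds up.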

Extreme figures

Advances in scientific fields have led to an increasing need for prefixes to describe very large and very small numbers.

Factor   Name     Symbol   Adopted
10³⁰     quetta   Q        2022
10²⁷     ronna    R        2022
10²⁴     yotta    Y        1991
10²¹     zetta    Z        1991
10¹⁸     exa      E        1975
10¹⁵     peta     P        1975
10⁻¹⁵    femto    f        1964
10⁻¹⁸    atto     a        1964
10⁻²¹    zepto    z        1991
10⁻²⁴    yocto    y        1991
10⁻²⁷    ronto    r        2022
10⁻³⁰    quecto   q        2022

Prefixes are agreed at the General Conference on Weights and Measures.

Today, the driver is data science, says Richard Brown, a metrologist at the UK National Physical Laboratory in Teddington. He has been working on plans to introduce the latest prefixes for five years, and presented the proposal to the CGPM on 17 November. With the annual volume of data generated globally having already hit zettabytes, informal suggestions for 10²⁷ — including ‘hella’ and ‘bronto’ — were starting to take hold, he says. Google’s unit converter, for example, already tells users that 1,000 yottabytes is 1 hellabyte, and at least one UK government website quotes brontobyte as the correct term…(More)”.
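The prefix table lends itself to a simple lookup. A minimal sketch (our own illustration, not an official SI library) that expresses a quantity in engineering notation using the extreme prefixes adopted since 1964:

```python
import math

# Prefixes from the 'Extreme figures' table: exponent -> (name, symbol).
SI_PREFIXES = {
    30: ("quetta", "Q"), 27: ("ronna", "R"), 24: ("yotta", "Y"),
    21: ("zetta", "Z"), 18: ("exa", "E"), 15: ("peta", "P"),
    -15: ("femto", "f"), -18: ("atto", "a"), -21: ("zepto", "z"),
    -24: ("yocto", "y"), -27: ("ronto", "r"), -30: ("quecto", "q"),
}

def with_prefix(value: float) -> str:
    """Express a value using the nearest prefix from the table above."""
    exp = math.floor(math.log10(abs(value)))
    exp3 = max(min((exp // 3) * 3, 30), -30)  # snap to a multiple of 3
    if exp3 not in SI_PREFIXES:
        return f"{value:g}"  # mid-range value: no entry in this table
    name, symbol = SI_PREFIXES[exp3]
    return f"{value / 10**exp3:g} {symbol} ({name})"

print(with_prefix(5.97e27))   # Earth's mass in grams -> ronnagrams
print(with_prefix(9.1e-28))   # electron's mass in grams -> quectograms
```

The sketch covers only the extremes listed in the table; a full implementation would also include the everyday prefixes (kilo, mega, milli, and so on).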

Is Facebook’s advertising data accurate enough for use in social science research? Insights from a cross-national online survey


Paper by André Grow et al: “Social scientists increasingly use Facebook’s advertising platform for research, either in the form of conducting digital censuses of the general population, or for recruiting participants for survey research. Both approaches depend on the accuracy of the data that Facebook provides about its users, but little is known about how accurate these data are. We address this gap in a large-scale, cross-national online survey (N = 137,224), in which we compare self-reported and Facebook-classified demographic information (sex, age and region of residence). Our results suggest that Facebook’s advertising platform can be fruitfully used for conducting social science research if additional steps are taken to assess the accuracy of the characteristics under consideration…(More)”.

Humanizing Science and Engineering for the Twenty-First Century


Essay by Kaye Husbands Fealing, Aubrey Deveny Incorvaia and Richard Utz: “Solving complex problems is never a purely technical or scientific matter. When science or technology advances, insights and innovations must be carefully communicated to policymakers and the public. Moreover, scientists, engineers, and technologists must draw on subject matter expertise in other domains to understand the full magnitude of the problems they seek to solve. And interdisciplinary awareness is essential to ensure that taxpayer-funded policy and research are efficient and equitable and are accountable to citizens at large—including members of traditionally marginalized communities…(More)”.

Science and the World Cup: how big data is transforming football


Essay by David Adam: “The scowl on Cristiano Ronaldo’s face made international headlines last month when the Portuguese superstar was pulled from a match between Manchester United and Newcastle with 18 minutes left to play. But he’s not alone in his sentiment. Few footballers agree with a manager’s decision to substitute them in favour of a fresh replacement.

During the upcoming football World Cup tournament in Qatar, players will have a more evidence-based way to argue for time on the pitch. Within minutes of the final whistle, tournament organizers will send each player a detailed breakdown of their performance. Strikers will be able to show how often they made a run and were ignored. Defenders will have data on how much they hassled and harried the opposing team when it had possession.

It’s the latest incursion of numbers into the beautiful game. Data analysis now helps to steer everything from player transfers and the intensity of training, to targeting opponents and recommending the best direction to kick the ball at any point on the pitch.

Meanwhile, footballers face the kind of data scrutiny more often associated with an astronaut. Wearable vests and straps can now sense motion, track position with GPS and count the number of shots taken with each foot. Cameras at multiple angles capture everything from headers won to how long players keep the ball. And to make sense of this information, most elite football teams now employ data analysts, including mathematicians, data scientists and physicists plucked from top companies and labs such as computing giant Microsoft and CERN, Europe’s particle-physics laboratory near Geneva, Switzerland….(More)”.

The network science of collective intelligence


Article by Damon Centola: “In the last few years, breakthroughs in computational and experimental techniques have produced several key discoveries in the science of networks and human collective intelligence. This review presents the latest scientific findings from two key fields of research: collective problem-solving and the wisdom of the crowd. I demonstrate the core theoretical tensions separating these research traditions and show how recent findings offer a new synthesis for understanding how network dynamics alter collective intelligence, both positively and negatively. I conclude by highlighting current theoretical problems at the forefront of research on networked collective intelligence, as well as vital public policy challenges that require new research efforts…(More)”.

Meaningful public engagement in the context of open science: reflections from early and mid-career academics


Paper by Wouter Boon et al: “How is public engagement perceived to contribute to open science? This commentary highlights common reflections on this question from interviews with 12 public engagement fellows in Utrecht University’s Open Science Programme in the Netherlands. We identify four reasons why public engagement is an essential enabler of open science. Interaction between academics and society can: (1) better align science with the needs of society; (2) secure a relationship of trust between science and society; (3) increase the quality and impact of science; and (4) support the impact of open access and FAIR data practices (data which meet principles of findability, accessibility, interoperability and reusability). To be successful and sustainable, such public engagement requires support in skills training and a form of institutionalisation in a university-wide system, but, most of all, the fellows express the importance of a formal and informal recognition and rewards system. Our findings suggest that in order to make public engagement an integral part of open science, universities should invest in institutional support, create awareness, and stimulate dialogue among staff members on how to ‘do’ good public engagement….(More)”.

Data Structures the Fun Way


Book by Jeremy Kubica: “This accessible and entertaining book provides an in-depth introduction to computational thinking through the lens of data structures — a critical component in any programming endeavor. Through diagrams, pseudocode, and humorous analogies, you’ll learn how the structure of data drives algorithmic operations, gaining insight into not just how to build data structures, but precisely how and when to use them. 

This book will give you a strong background in implementing and working with more than 15 key data structures, from stacks, queues, and caches to Bloom filters, skip lists, and graphs. Master linked lists by standing in line at a cafe, hash tables by cataloging the history of the summer Olympics, and quadtrees by neatly organizing your kitchen cabinets. Along with basic computer science concepts like recursion and iteration, you’ll learn: 

  • The complexity and power of pointers
  • The branching logic of tree-based data structures
  • How different data structures insert and delete data in memory
  • Why mathematical mappings and randomization are useful
  • How to make tradeoffs between speed, flexibility, and memory usage

Data Structures the Fun Way shows how to efficiently apply these ideas to real-world problems—a surprising number of which focus on procuring a decent cup of coffee. At any level, fully understanding data structures will teach you core skills that apply across multiple programming languages, taking your career to the next level….(More)”.
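Given the book’s running coffee theme, a Bloom filter is a fitting structure to sketch. A minimal, self-contained Python illustration (our own toy example, not code from the book):

```python
import hashlib

class BloomFilter:
    """A minimal Bloom filter: a bit array plus k salted hashes.
    Can report false positives, but never false negatives."""

    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _indexes(self, item: str):
        # Derive k indexes by salting one hash function with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.bits[idx] = True

    def might_contain(self, item: str) -> bool:
        # False is definitive; True means "probably present".
        return all(self.bits[idx] for idx in self._indexes(item))

menu = BloomFilter()
menu.add("espresso")
menu.add("flat white")
print(menu.might_contain("espresso"))   # True: added items always hit
print(menu.might_contain("drip"))       # almost certainly False
```

The trade-off is exactly the kind the book highlights: a few kilobits of memory buy constant-time membership checks, at the price of a small, tunable false-positive rate.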

Everything dies, including information


Article by Erik Sherman: “Everything dies: people, machines, civilizations. Perhaps we can find some solace in knowing that all the meaningful things we’ve learned along the way will survive. But even knowledge has a life span. Documents fade. Art goes missing. Entire libraries and collections can face quick and unexpected destruction. 

Surely, we’re at a stage technologically where we might devise ways to make knowledge available and accessible forever. After all, the density of data storage is already incomprehensibly high. In the ever-growing museum of the internet, one can move smoothly from images from the James Webb Space Telescope through diagrams explaining Pythagoras’s philosophy on the music of the spheres to a YouTube tutorial on blues guitar soloing. What more could you want?

Quite a bit, according to the experts. For one thing, what we think is permanent isn’t. Digital storage systems can become unreadable in as little as three to five years. Librarians and archivists race to copy things over to newer formats. But entropy is always there, waiting in the wings. “Our professions and our people often try to extend the normal life span as far as possible through a variety of techniques, but it’s still holding back the tide,” says Joseph Janes, an associate professor at the University of Washington Information School. 

To complicate matters, archivists are now grappling with an unprecedented deluge of information. In the past, materials were scarce and storage space limited. “Now we have the opposite problem,” Janes says. “Everything is being recorded all the time.”…(More)”.

Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty


Paper by Nate Breznau et al: “This study explores how researchers’ analytical choices affect the reliability of scientific findings. Most discussions of reliability problems in science focus on systematic biases. We broaden the lens to include conscious and unconscious decisions that researchers make during data analysis and that may lead to diverging results. We coordinated 161 researchers in 73 research teams and observed their research decisions as they used the same data to independently test the same prominent social science hypothesis: that greater immigration reduces support for social policies among the public. In this typical case of research based on secondary data, we find that research teams reported widely diverging numerical findings and substantive conclusions despite identical start conditions. Researchers’ expertise, prior beliefs, and expectations barely predicted the wide variation in research outcomes. More than 90% of the total variance in numerical results remained unexplained even after accounting for research decisions identified via qualitative coding of each team’s workflow. This reveals a universe of uncertainty that is hidden when considering a single study in isolation. The idiosyncratic nature of how researchers’ results and conclusions varied is a new explanation for why many scientific hypotheses remain contested. It calls for greater humility and clarity in reporting scientific findings…(More)”.

Public Access to Advance Equity


Essay by Alondra Nelson, Christopher Marcum and Jedidah Isler: “Open science began in the scientific community as a movement committed to making all aspects of research freely available to all members of society. 

As a member of the Organisation for Economic Co-operation and Development (OECD), the United States is committed to promoting open science, which the OECD defines as “unhindered access to scientific articles, access to data from public research, and collaborative research enabled by information and communication technology tools and incentives.”

At the White House Office of Science and Technology Policy (OSTP), we have been inspired by the movement to push for openness in research by community activists, researchers, publishers, higher-education leaders, policymakers, patient advocates, scholarly associations, librarians, open-government proponents, philanthropic organizations, and the public. 

Open science is an essential part of the Biden-Harris administration’s broader commitment to providing public access to data, publications, and the other important products of the nation’s taxpayer-supported research and innovation enterprise. We look to the lessons, methods, and products of open science to deliver on this commitment to policy that advances equity, accelerates discovery and innovation, provides opportunities for all to participate in research, promotes public trust, and is evidence-based. Here, we detail some of the ways OSTP is working to expand the American public’s access to the federal research and development ecosystem, and to ensure it is open, equitable, and secure…(More)”.