How Integrated Data Can Support COVID-19 Crisis and Recovery


Blog by Actionable Intelligence for Social Policy (AISP): “…State and local leaders are called upon to respond to the immediate harms of COVID-19. Yet with a looming recession threatening to undo gains among marginalized groups — particularly the Black middle class — tools to understand and disrupt long-term impacts on economic mobility and well-being are also urgently needed.

Administrative data — the information collected during the course of routine service delivery, program administration, and business operations — provide an essential tool to help policymakers, community leaders, and researchers understand short- and long-term impacts of the pandemic. Several jurisdictions now have the capacity to link administrative data across programs in order to better understand how individuals interact with multiple systems, study longitudinal outcomes, and identify vulnerable subpopulations. As the COVID-19 crisis reveals weaknesses in the U.S. social safety net, states and localities with integrated administrative data infrastructure can use their capacity to identify populations and needs otherwise overlooked. Youth who “age out” of the child welfare system or individuals experiencing chronic homelessness often remain invisible when using traditional methods, aggregate data, or administrative records from a single source.

This blogpost demonstrates how nimble state and local data integration efforts have leveraged their capacity to quickly respond to and understand the impacts of COVID-19, while also reflecting on what can be done to mitigate harm and shift thinking about social welfare and the safety net….(More)”.

Personal data, public data, privacy & power: GDPR & company data


OpenCorporates: “…there are three other aspects which are relevant when talking about access to EU company data.

Cargo-culting GDPR

The first is a tendency to take the complex and subtle legislation that is GDPR and use a poorly understood version of it in other legislation and regulation, even when that regulation is already covered by GDPR. This undermines the GDPR regime, prevents it from working effectively, and should be strongly resisted. In the tech world, such approaches are called ‘cargo-culting’.

Similarly, GDPR is often used as an excuse for not releasing company information as open data, even when the same data is being sold to third parties apparently without concern — if one is covered by GDPR, the other certainly should be.

Widened power asymmetries

The second issue is the unintended consequences of GDPR, specifically the way it increases asymmetries of power and agency. For example, something like the so-called Right To Be Forgotten takes very significant resources to implement, and so actually strengthens the position of the giant tech companies — for such companies, investing millions in large teams to decide who should and should not be given the Right To Be Forgotten is just a relatively small cost of doing business.

Another issue is the growth of a whole new industry dedicated to removing traces of people’s past from the internet, which is also increasing the asymmetries of power. The vast majority of people are not directors of companies, or beneficial owners, and it is only the relatively rich and powerful (including politicians and criminals) who can afford lawyers to stifle free speech, or to scrub parts of their past they would rather forget, from business failures to associations with criminals.

OpenCorporates, for example, was threatened with a lawsuit by a member of one of the wealthiest families in Europe for reproducing a notice from the Luxembourg official gazette (a publication that contains public notices). We refused to back down, believing we had a good case in law and in the public interest, and the other side gave up. But such so-called SLAPP suits (strategic lawsuits against public participation) are becoming increasingly common, and unlike many US states, the EU currently has no defences in place to resist them, despite pressure from civil society to address this….

At the same time, the automatic assumption that all Personally Identifiable Information (PII), someone’s name for example, is private is highly problematic, confusing both citizens and policymakers, and further undermining democracies and fair societies. As an obvious case, it’s critical that we know the names of our elected representatives, and those in positions of power; otherwise we would have an opaque society where decisions are made by nameless individuals with hidden agendas and personal interests — such as a leader awarding a contract to their brother’s company, for example.

As a diagram in the original post illustrates, there is some personally identifiable information that it’s strongly in the public interest to know. Take the director or beneficial owner of a company, for example: of course their details are PII, but clearly you need to know their name (and other information too), otherwise what do you actually know about them, or the company (only that some unnamed individual has been given special protection under law to be shielded from the company’s debts and actions, and yet can benefit from its profits)?

On the other hand, much of the data which is truly about our privacy — the profiles, inferences and scores that companies store on us — is explicitly outside GDPR, if it doesn’t contain PII.


Hopefully, as awareness of these issues increases, we will develop a more nuanced, deeper understanding of privacy, such that case law around GDPR, and successors to this legislation, begins to rebalance matters and bring clarity to the ambiguities of the GDPR….(More)”.

The EU is launching a market for personal data. Here’s what that means for privacy.


Anna Artyushina at MIT Technology Review: “The European Union has long been a trendsetter in privacy regulation. Its General Data Protection Regulation (GDPR) and stringent antitrust laws have inspired new legislation around the world. For decades, the EU has codified protections on personal data and fought against what it viewed as commercial exploitation of private information, proudly positioning its regulations in contrast to the light-touch privacy policies in the United States.

The new European data governance strategy (pdf) takes a fundamentally different approach. With it, the EU will become an active player in facilitating the use and monetization of its citizens’ personal data. Unveiled by the European Commission in February 2020, the strategy outlines policy measures and investments to be rolled out in the next five years.

This new strategy represents a radical shift in the EU’s focus, from protecting individual privacy to promoting data sharing as a civic duty. Specifically, it will create a pan-European market for personal data through a mechanism called a data trust. A data trust is a steward that manages people’s data on their behalf and has fiduciary duties toward its clients.

The EU’s new plan considers personal data to be a key asset for Europe. However, this approach raises some questions. First, the EU’s intent to profit from the personal data it collects puts European governments in a weak position to regulate the industry. Second, the improper use of data trusts can actually deprive citizens of their rights to their own data.

The Trusts Project, the first initiative put forth by the new EU policies, will be implemented by 2022. With a €7 million budget, it will set up a pan-European pool of personal and nonpersonal information that should become a one-stop shop for businesses and governments looking to access citizens’ information.

Global technology companies will not be allowed to store or move Europeans’ data. Instead, they will be required to access it via the trusts. Citizens will collect “data dividends,” which haven’t been clearly defined but could include monetary or nonmonetary payments from companies that use their personal data. With the EU’s roughly 500 million citizens poised to become data sources, the trusts will create the world’s largest data market.

For citizens, this means the data created by them and about them will be held in public servers and managed by data trusts. The European Commission envisions the trusts as a way to help European businesses and governments reuse and extract value from the massive amounts of data produced across the region, and to help European citizens benefit from their information. The project documentation, however, does not specify how individuals will be compensated.

Data trusts were first proposed by internet pioneer Sir Tim Berners-Lee in 2018, and the concept has drawn considerable interest since then. Just like the trusts used to manage one’s property, data trusts may serve different purposes: they can be for-profit enterprises, they can be set up for data storage and protection, or they can work for a charitable cause.

IBM and Mastercard have built a data trust to manage the financial information of their European clients in Ireland; the UK and Canada have employed data trusts to stimulate the growth of the AI industries there; and recently, India announced plans to establish its own public data trust to spur the growth of technology companies.

The new EU project is modeled on Austria’s digital system, which keeps track of information produced by and about its citizens by assigning them unique identifiers and storing the data in public repositories.

Unfortunately, data trusts do not guarantee more transparency. The trust is governed by a charter created by the trust’s settlor, and its rules can be made to prioritize someone’s interests. The trust is run by a board of directors, which means a party that has more seats gains significant control.

The Trusts Project is bound to face some governance issues of its own. Public and private actors often do not see eye to eye when it comes to running critical infrastructure or managing valuable assets. Technology companies tend to favor policies that create opportunity for their own products and services. Caught in a conflict of interest, Europe may overlook the question of privacy….(More)”.

The Risks and Rewards of Data Sharing for Smart Cities


Study by Massimo Russo and Tian Feng: “…To develop innovative solutions to problems old and new, many cities are aggregating and sharing more and more data, establishing platforms to facilitate private-sector participation, and holding “hackathons” and other digital events to invite public help. But digital solutions carry their own complications. Technology-led innovation often depends on access to data from a wide variety of sources to derive correlations and insights. Questions regarding data ownership, amalgamation, compensation, and privacy can be flashing red lights.

Smart cities are on the leading edge of the trend toward greater data sharing. They are also complex generators and users of data. Companies, industries, governments, and others are following in their wake, sharing more data in order to foster innovation and address such macro-level challenges as public health and welfare and climate change. Smart cities thus provide a constructive laboratory for studying the challenges and benefits of data sharing.

WHY CITIES SHARE DATA

BCG examined some 75 smart-city applications that use data from a variety of sources, including connected equipment (that is, the Internet of Things, or IoT). Nearly half the applications require data sourced from multiple industries or platforms. (See Exhibit 1.) For example, a parking reservation app assembles garage occupancy data, historical traffic data, current weather data, and information on upcoming public events to determine real-time parking costs. We also looked at a broader set of potential future applications and found that an additional 40% will likewise require cross-industry data aggregation.
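
To make that cross-industry aggregation concrete, here is a minimal sketch of how such a parking app might fold several feeds into one real-time price. Everything in it (the source names, weights, and pricing formula) is hypothetical, invented for illustration rather than taken from the BCG study.

```python
# Hypothetical sketch: combining multi-source city data into a dynamic
# parking price. All sources, weights, and the formula are invented.
from dataclasses import dataclass

@dataclass
class CitySignals:
    occupancy_rate: float  # from connected garage sensors (IoT), 0.0-1.0
    traffic_index: float   # historical congestion score, 0.0-1.0
    rain_expected: bool    # from a weather data feed
    nearby_events: int     # upcoming public events near the garage

def realtime_price(base_hourly_rate: float, s: CitySignals) -> float:
    """Scale a base rate upward as demand signals from each source rise."""
    multiplier = 1.0
    multiplier += 0.50 * s.occupancy_rate           # scarcity raises price
    multiplier += 0.20 * s.traffic_index            # congestion raises demand
    multiplier += 0.10 if s.rain_expected else 0.0  # rain shifts people to cars
    multiplier += 0.15 * min(s.nearby_events, 3)    # event effect capped at 3
    return round(base_hourly_rate * multiplier, 2)

# A busy, rainy evening with two events nearby roughly doubles the price.
print(realtime_price(2.00, CitySignals(0.9, 0.6, True, 2)))  # -> 3.94
```

The point is less the formula than the plumbing: four of the five inputs come from different industries, which is exactly the aggregation hurdle the study describes.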

Because today’s smart solutions are often sponsored by individual municipal departments, many IoT-enabled applications rely on limited, siloed data. But given the potential value of applications that require aggregation across sources, it’s no surprise that many cities are pursuing partnerships with tech providers to develop platforms and other initiatives that integrate data from multiple sources….(More)”.

What privacy preserving techniques make possible: for transport authorities


Blog by Georgina Bourke: “The Mayor of London listed cycling and walking as key population health indicators in the London Health Inequalities Strategy. The pandemic has only amplified the need for people to use cycling as a safer and healthier mode of transport. Yet as the majority of cyclists are white, Black communities are less likely to get the health benefits that cycling provides. Groups like Transport for London (TfL) should monitor how different communities cycle and who is excluded. Organisations like the London Office of Technology and Innovation (LOTI) could help boroughs procure privacy preserving technology to help their efforts.

But at the moment, it’s difficult for public organisations to access mobility data held by private companies. One reason is that mobility data is sensitive. Even if you remove identifiers like name and address, there’s still a risk of reidentifying someone by linking different data sets together. This means you could track how an individual moved around a city. I wrote more about the privacy risks with mobility data in a previous blog post. The industry’s awareness of privacy issues in using and sharing mobility data is rising. In the case of the Los Angeles Department of Transportation’s (LADOT) Mobility Data Specification, Uber is concerned about sharing anonymised data because of the privacy risk. Both organisations are now involved in a legal battle over who has the rights to the data. This might have been avoided if Uber had applied privacy preserving techniques….
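
To see the linkage risk in miniature, here is a toy sketch with entirely invented records, showing how one piece of outside knowledge can reidentify a pseudonymous rider in “anonymised” trip data:

```python
# Toy "anonymised" trip data: names stripped, but each rider keeps a
# stable pseudonym. All records here are invented for illustration.
trips = [
    {"rider": "a91f", "start": "Elm St",    "end": "City Hall",   "hour": 8},
    {"rider": "a91f", "start": "City Hall", "end": "Rose Clinic", "hour": 17},
    {"rider": "c37b", "start": "Oak Ave",   "end": "Harbourside", "hour": 9},
]

# Outside knowledge: the target is known to cycle from Elm St to City
# Hall around 8am. If exactly one pseudonym matches, it must be them.
matching = {t["rider"] for t in trips
            if (t["start"], t["end"], t["hour"]) == ("Elm St", "City Hall", 8)}

if len(matching) == 1:
    pseudonym = matching.pop()
    # Every trip filed under that pseudonym is now exposed, including
    # the sensitive evening journey to a clinic.
    print([t for t in trips if t["rider"] == pseudonym])
```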

Privacy preserving techniques can help mobility providers share important insights with authorities without compromising people’s privacy.

Instead of requiring access to all customer trip data, authorities could ask specific questions like: where are the least popular places to cycle? If mobility providers apply techniques like randomised response, an individual’s identity is obscured by the noise added to the data. This means it’s highly unlikely that someone could be reidentified later on. And because this technique requires authorities to ask very specific questions – for randomised response to work, the answer has to be binary, i.e. Yes or No – authorities will also be practising data minimisation by default.
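
As an illustration, here is a minimal sketch of the classic coin-flip form of randomised response (the simulation size and the 30% “true” rate are invented; real deployments tune the noise probabilities to the level of privacy required):

```python
import random

def randomised_response(true_answer: bool) -> bool:
    """Answer a sensitive yes/no question with plausible deniability:
    with probability 1/2 tell the truth, otherwise answer at random."""
    if random.random() < 0.5:
        return true_answer
    return random.random() < 0.5

def estimate_true_rate(noisy_answers: list) -> float:
    """Recover the population 'yes' rate from noisy answers.
    P(observed yes) = 0.5 * p + 0.25, so p = 2 * P(observed yes) - 0.5."""
    observed = sum(noisy_answers) / len(noisy_answers)
    return 2 * observed - 0.5

# Simulate 10,000 cyclists, 30% of whom would answer "yes" to a single
# binary question (e.g. "Did you avoid cycling on route X last month?").
population = [random.random() < 0.3 for _ in range(10_000)]
answers = [randomised_response(a) for a in population]
print(f"Estimated yes-rate: {estimate_true_rate(answers):.3f}")  # ~0.300
```

No single noisy answer says anything reliable about one cyclist, yet the aggregate estimate converges on the true rate as responses accumulate, which is what lets an authority learn where people avoid cycling without ever holding trip-level data.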

It’s easy to imagine transport authorities like TfL combining privacy-preserved mobility data from multiple mobility providers to compare insights and measure service provision. They could cross-reference the privacy-preserved bike trip data with demographic data in the local area to learn how different communities cycle. The first step to addressing inequality is being able to measure it….(More)”.

The co-ops that electrified Depression-era farms are now building rural internet


Nicolás Rivero at Quartz: “In 2017, Mark McKinney decided enough was enough. The head of the Jackson County Rural Electric Membership Corporation in southern Indiana, a co-op that provides electricity to a rural community of 20,000 members, McKinney was still living without a reliable internet connection. No internet service provider would build the infrastructure to get him or his neighbors online.

“We realized no one was interested due to the capital expense and limited number of members per mile,” says McKinney, “so the board made the decision to go at it on our own.”

The coronavirus pandemic quickly proved the wisdom of their decision: Thanks to their new fiber optic connection, McKinney and his wife were able to self-quarantine without missing work after they were exposed to the virus. Their son finished the spring semester at home after his university shut down in March. “We could not have done that without this connection,” he said.

Across the rural US, more than 100 cooperatives, first launched to provide electric and telephone services as far back as the 1930s, are now laying miles of fiber optic cable to connect their members to high speed internet. Many started building their own networks after failing to convince established internet service providers to cover their communities.

But while rural fiber optic networks have spread swiftly over the past five years, their progress has been uneven. In North Dakota, for example, fiber optic co-ops cover 82% of the state’s landmass, while Nevada has just one co-op. And in the states where the utilities do exist, they tend to serve the whitest communities….(More)”.

The Truth Is Paywalled But The Lies Are Free


Essay by Nathan J. Robinson: “…This means that a lot of the most vital information will end up locked behind the paywall. And while I am not much of a New Yorker fan either, it’s concerning that the Hoover Institution will freely give you Richard Epstein’s infamous article downplaying the threat of coronavirus, but Isaac Chotiner’s interview demolishing Epstein requires a monthly subscription, meaning that the lie is more accessible than its refutation. Eric Levitz of New York is one of the best and most prolific left political commentators we have. But unless you’re a subscriber of New York, you won’t get to hear much of what he has to say each month. 

Possibly even worse is the fact that so much academic writing is kept behind vastly more costly paywalls. A white supremacist on YouTube will tell you all about race and IQ, but if you want to read a careful scholarly refutation, obtaining a legal PDF from the journal publisher would cost you $14.95, a price nobody in their right mind would pay for one article if they can’t get institutional access. (I recently gave up on trying to access a scholarly article because I could not find a way to get it for less than $39.95, though in that case the article was garbage rather than gold.) Academic publishing is a nightmarish patchwork, with lots of articles advertised at exorbitant fees on one site, and then for free on another, or accessible only through certain databases, which your university or public library may or may not have access to. (Libraries have to budget carefully because subscription prices are often nuts. A library subscription to the Journal of Coordination Chemistry, for instance, costs $11,367 annually.) 

Of course, people can find their ways around paywalls. SciHub is a completely illegal but extremely convenient means of obtaining academic research for free. (I am purely describing it, not advocating it.) You can find a free version of the article debunking race and IQ myths on ResearchGate, a site that has engaged in mass copyright infringement in order to make research accessible. Often, because journal publishers tightly control access to their copyrighted work in order to charge those exorbitant fees for PDFs, the versions of articles that you can get for free are drafts that have not yet gone through peer review, and have thus been subjected to less scrutiny. This means that the more reliable an article is, the less accessible it is. On the other hand, pseudo-scholarship is easy to find. Right-wing think tanks like the Cato Institute, the Foundation for Economic Education, the Hoover Institution, the Mackinac Center, the American Enterprise Institute, and the Heritage Foundation pump out slickly-produced policy documents on every subject under the sun. They are utterly untrustworthy—the conclusion is always going to be “let the free market handle the problem,” no matter what the problem or what the facts of the case. But it is often dressed up to look sober-minded and non-ideological. 

It’s not easy or cheap to be an “independent researcher.” When I was writing my first book, Superpredator, I wanted to look through newspaper, magazine, and journal archives to find everything I could about Bill Clinton’s record on race. I was lucky I had a university affiliation, because this gave me access to databases like LexisNexis. If I hadn’t, the cost of finding out what I wanted to find out would likely have run into the thousands of dollars.  

A problem beyond cost, though, is convenience. I find that even when I am doing research through databases and my university library, it is often an absolute mess: the sites are clunky and constantly demanding login credentials. The amount of time wasted in figuring out how to obtain a piece of research material is a massive cost on top of the actual pricing. The federal court document database, PACER, for instance, charges 10 cents a page for access to records, which adds up quickly since legal research often involves looking through thousands of pages. They offer an exemption if you are a researcher or can’t afford it, but to get the exemption you have to fill out a three-page form and provide an explanation of both why you need each document and why you deserve the exemption. This is a waste of time that inhibits people’s productivity and limits their access to knowledge.

In fact, to see just how much human potential is being squandered by having knowledge dispensed by the “free market,” let us briefly picture what “totally democratic and accessible knowledge” would look like…(More)”.

A Time for More Democracy Not Less


Graham Smith at Involve: “As part of the “A democratic response to COVID-19” project, we have been scanning print and social media to get a sense of how arguments for participation and deliberation are resonating in public debates….

Researchers from the Institute for Development Studies point to learning from previous pandemics. Drawing from their experience of working on the Ebola epidemic in West Africa, they argue that pandemics are not just technical problems to be solved, but are social in character. They call for more deliberation and participation to ensure that decisions reflect not only the diversity of expert opinion, but also respond to the experiential knowledge of the most vulnerable….

A number of these proposals call for citizens’ assemblies, perhaps to the detriment of other participatory and deliberative processes. The Carnegie Trust offers a broader agenda, reminding us of the pressing contemporary significance of their pre-COVID-19 calls for co-design and co-production. 

The Nuffield Council offers some simple guidance to government about how to act:

  • Show us (the public) what it is doing and thinking across the range of issues of concern
  • Set out the ethical considerations that inform(ed) its judgements
  • Explain how it has arrived at decisions (including taking advice from e.g. SAGE, MEAG), rather than claiming that it is just ‘following the science’
  • Invite a broad range of perspectives into the room, including wider public representation 
  • Think ahead – consult and engage other civic interests

We have found only a small number of examples of specific initiatives taking a participatory or deliberative approach to bringing in a broader range of voices in response to the pandemic. Our Covid Voices is gathering written statements of the experience of COVID-19 from those with health conditions or disabilities. The thinktank Demos is running a ‘People’s Commission’, inviting stories of lockdown life. It is not only reflections or stories. The Scottish Government invited ideas on how to tackle the virus, receiving and synthesising 4,000 suggestions. The West Midlands Combined Authority has established a citizens’ panel to guide its recovery work. The UK Citizens’ Assembly (and the French Convention) produced recommendations on how commitments to reach net zero carbon emissions need to be central to a post-COVID-19 recovery. We are sure that these examples only touch the surface of activity and that there will be many more initiatives that we are yet to hear about.

Of course, in one area, citizens have already taken matters into their own hands, with the huge growth in mutual-aid groups to ensure people’s emergency needs are met. The New Local Government Network has considered how public authorities could best support and work with such groups, and Danny Kruger MP was invited by the Prime Minister to investigate how to build on this community-level response.

The call for a more participatory and deliberative approach to governance needs to be more than a niche concern. As the Financial Times recognises, we need a “new civic contract” between government and the people….(More)”.

Resetting the state for the post-covid digital age


Blog by Carlos Santiso: “The COVID-19 crisis is putting our global digital resilience to the test. It has revealed the importance of a country’s digital infrastructure as the backbone of the economy, not just as an enabler of the tech economy. Digitally advanced governments, such as Estonia, have been able to put their entire bureaucracies in remote mode in a matter of days, without major disruption. And some early evidence even suggests that their productivity increased during lockdown.

With the crisis, the costs of not going digital have largely surpassed the risks of doing so. Countries and cities lagging behind have realised the necessity to boost their digital resilience and accelerate their digital transformation. Spain, for example, adopted an ambitious plan to inject €70 billion into its digital transformation over the next five years, with a Digital Spain 2025 agenda comprising 10 priorities and 48 measures. In the case of Brazil, the country was already taking steps towards the digital transformation of its public sector before the COVID-19 crisis hit. The crisis is accelerating this transformation.

The great accelerator

Long before the crisis hit, the data-driven digital revolution had been challenging governments to modernise and become more agile, open and responsive. Progress has nevertheless been uneven, hindered by a variety of factors, from political resistance to budget constraints. Going digital requires the sort of whole-of-government reforms that need political muscle and long-term vision to break up traditional data silos within bureaucracies that are jealous to preserve their power. In bureaucracies, information is power. Now information has become ubiquitous, and governing data has become a critical challenge.

Cutting red tape will be central to the recovery. Many governments are fast-tracking regulatory simplification and administrative streamlining to reboot hard-hit economic sectors. Digitalisation is resetting the relationship between states and citizens, a Copernican revolution for our rule-based bureaucracies….(More)“.

Data for Policy: Junk-Food Diet or Technological Frontier?


Blog by Ed Humpherson at Data & Policy: “At the Office for Statistics Regulation, thinking about these questions is our day job. We set the standards for Government statistics and data through our Code of Practice for Statistics. And we review how Government departments are living up to these standards when they publish data and statistics. We routinely look at how Government statistics are used in public debate.

Based on this, I would propose four factors that ensure that new data sources and tools serve the public good. They do so when:

1. When data quality is properly tested and understood:

As my colleague Penny Babb wrote recently in a blog: “Don’t trust the data. If you’ve found something interesting, something has probably gone wrong!” People who work routinely with data develop a sort of innate scepticism, which Penny’s blog captures neatly. Understanding the limitations of both the data and the inferences you make from the data is the starting point for any appropriate role for data in policy. Accepting results and insights from new data at face value is a mistake. Much better to test the quality, explore the risks of mistakes, and only then share findings and conclusions.

2. When the risks of misleadingness are considered:

At OSR, we have an approach to misleadingness that focuses on whether a misuse of data might lead a listener to a wrong conclusion. In fact, by “wrong” we don’t mean in some absolute sense of objective truth; more that if they received the data presented in a different and more faithful way, they would change their mind. Here’s a really simple example: someone might hear that, of two neighbouring countries, one has a much lower fatality rate, when comparing deaths to positive tests for Covid-19. …
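
To see why that comparison can mislead, here is a worked example with invented numbers: two countries with identical epidemics report very different deaths-per-positive-test ratios purely because they test at different rates.

```python
# Invented numbers: both countries have the same epidemic (100,000
# infections, 1,000 deaths) but very different testing coverage.
def deaths_per_positive_test(deaths: int, positive_tests: int) -> float:
    return deaths / positive_tests

deaths = 1_000
positives_a = int(100_000 * 0.8)  # Country A finds 80% of infections
positives_b = int(100_000 * 0.2)  # Country B finds only 20%

print(f"Country A: {deaths_per_positive_test(deaths, positives_a):.2%}")  # 1.25%
print(f"Country B: {deaths_per_positive_test(deaths, positives_b):.2%}")  # 5.00%
```

A listener who heard only the two headline rates would reasonably conclude that Country B’s outbreak is four times deadlier, and would change their mind on seeing the testing figures — exactly the test of misleadingness described above.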

3. When the data fill gaps

Data gaps come in several forms. One gap, highlighted by the interest in real-time economic indicators, is timing. Economic statistics don’t really tell us what’s going on right now. Figures like GDP, trade and inflation tell us about some point in the (admittedly quite) recent past. This is the attraction of the real-time economic indicators, which the Bank of England have drawn on in their decisions during the pandemic. They give policymakers a much more real-time feel by filling in this timing gap.

Other gaps are not about time but about coverage….

4. When the data are available

Perhaps the most important thing for data and policy is to democratise the notion of who the data are for. Data (and policy itself) are not just for decision-making elites. They are a tool to help people make sense of their world, what is going on in their community, helping frame and guide the choices they make.

For this reason, I often instinctively recoil at narratives of data that focus on the usefulness of data to decision-makers. Of course, we are all decision-makers of one kind or another, and data can help us all. But I always suspect that the “data for decision-makers” narrative harbours an assumption that decisions are made by senior, central, expert people, who make decisions on behalf of society; people who are, in the words of the musical Hamilton, in the room where it happens. It’s this implication that I find uncomfortable.

That’s why, during the pandemic, our work at the Office for Statistics Regulation has repeatedly argued that data should be made available. We have published a statement that any management information referred to by a decision maker should be published clearly and openly. We call this equality of access.

We fight for equality of access. We have secured the publication of lots of data — on positive Covid-19 cases in England’s Local Authorities, on Covid-19 in prisons, on antibody testing in Scotland… and several others.

Data and policy are a powerful mix. They offer huge benefits to society in terms of defining, understanding and solving problems, and thereby in improving lives. We should be pleased that the coming together of data and policy is being sped up by the pandemic.

But to secure these benefits, we need to focus on four things: quality, misleadingness, gaps, and public availability….(More)”