Falling in love with the problem, not the solution


Blog by Kyle Novak: “Fall in love with the problem, not your solution.” It’s a maxim that I first heard spoken a few years ago by USAID’s former Chief Innovation Officer Ann Mei Chang. I’ve found myself frequently reflecting on those words as I’ve been thinking about the challenges of implementing public policy. I spent the past year on Capitol Hill in Washington, D.C., working as a legislative fellow, funded through a grant that brings scientists into the federal government to improve evidence-based policymaking. I spent much of the year trying to better understand how legislation and oversight work together in the context of policy and politics. To learn what makes public policy good, I wanted to understand how to implement it better. Naturally, I took a course in Problem Driven Iterative Adaptation (PDIA), a framework for managing risk in complex policy challenges by embracing experimentation and “learning through doing.”

Congress primarily uses legislation and the budget to control and implement policy initiatives through the federal agencies. Legislation is drafted and introduced by lawmakers with input from constituents, interest groups, and agencies; the Congressional budget is explicitly planned out each year based on input from the agencies; and accountability is built into the process through oversight mechanisms. Congress largely provides the planning and lock-in of “plan and control” management, grounded in majority-party control and congruence with the Administration’s policy priorities. But it is difficult to successfully implement a plan-and-control approach when political, social, or economic circumstances are changing.

Take the problem of data privacy and protection. A person’s identity is becoming largely digital. Every day each of us produces almost a gigabyte of information—our location is shared by our mobile phones, our preferences and interpersonal connections are tagged on social media, our purchases are analyzed, and our actions are recorded on increasingly ubiquitous surveillance cameras. Monetization of this information, often bought and sold through data brokers, enables an invasive and oppressive system that affects all aspects of our lives. Algorithms mine our data to make decisions about our employment, healthcare, education, credit, and policing. Machine learning and digital redlining skirt protections that prohibit discrimination on the basis of race, gender, and religion. Targeted and automated disinformation campaigns suppress fundamental rights of speech and expression. And digital technologies magnify existing inequities. While misuse of personal data has the potential to do incredible harm, responsible use of that data has the power to do incredible good. The challenge of data privacy and protection is one that impacts all of us, our civil liberties, and the foundations of a democratic society.

The success of members of Congress is often measured by the solutions they propose, not the problems they identify….(More)”

Has COVID-19 been the making of Open Science?


Article by Lonni Besançon, Corentin Segalas and Clémence Leyrat: “Although many concepts fall under the umbrella of Open Science, some of its key components are Open Access, Open Data, Open Source, and Open Peer Review. How far these four principles were embraced by researchers during the pandemic, and where there is room for improvement, is what we, as early career researchers, set out to assess by looking at data on scientific articles published during the Covid-19 pandemic….Open Source and Open Data practices consist of making all the data and materials used to gather or analyse data available on relevant repositories. While we can find incredibly useful datasets shared publicly on COVID-19 (for instance, those provided by the European Centre for Disease Control), they remain the exception rather than the norm. A spectacular example of this was the set of papers utilising data from the company Surgisphere, which led to retractions in The Lancet and The New England Journal of Medicine. In our paper, we highlight four papers that could have been retracted much earlier (and perhaps would never have been accepted) had the data been made accessible from the time of publication. As we argue in our paper, this presents a clear case for making open data and open source the default, with exceptions for privacy and safety. While some journals already have such policies, we go further in asking that, when data cannot be shared publicly, editors/publishers and authors/institutions should agree on a third party to check the existence and reliability/validity of the data and the results presented. This would not only strengthen the review process but also enhance the reproducibility of research and further accelerate the production of new knowledge through data and code sharing…(More)”.
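
The reuse the authors advocate is easiest to picture with a concrete snippet. Below is a minimal Python sketch of loading an openly published case dataset and aggregating it by week; the URL and column names are hypothetical placeholders for illustration, not a real ECDC endpoint.

```python
# Minimal sketch of open-data reuse. The URL and column names are
# hypothetical placeholders, not a real endpoint.
import pandas as pd

DATA_URL = "https://example.org/open-data/covid19-cases.csv"  # hypothetical

df = pd.read_csv(DATA_URL, parse_dates=["date"])
weekly_cases = (
    df.groupby([pd.Grouper(key="date", freq="W"), "country"])["cases"]
      .sum()
      .reset_index()
)
print(weekly_cases.head())
```

When a dataset cannot be shared publicly, the same analysis could instead be run by the trusted third party the authors propose, which is what makes their verification arrangement workable in principle.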

The Uselessness of Useful Knowledge


Essay by Robbert Dijkgraaf for Quanta Magazine: “Is artificial intelligence the new alchemy? That is, are the powerful algorithms that control so much of our lives — from internet searches to social media feeds — the modern equivalent of turning lead into gold? Moreover: Would that be such a bad thing?

According to the prominent AI researcher Ali Rahimi and others, today’s fashionable neural networks and deep learning techniques are based on a collection of tricks, topped with a good dash of optimism, rather than systematic analysis. Modern engineers, the thinking goes, assemble their codes with the same wishful thinking and misunderstanding that the ancient alchemists had when mixing their magic potions.

It’s true that we have little fundamental understanding of the inner workings of self-learning algorithms, or of the limits of their applications. These new forms of AI are very different from traditional computer codes that can be understood line by line. Instead, they operate within a black box, seemingly unknowable to humans and even to the machines themselves.

This discussion within the AI community has consequences for all the sciences. With deep learning impacting so many branches of current research — from drug discovery to the design of smart materials to the analysis of particle collisions — science itself may be at risk of being swallowed by a conceptual black box. It would be hard to have a computer program teach chemistry or physics classes. By deferring so much to machines, are we discarding the scientific method that has proved so successful, and reverting to the dark practices of alchemy?

Not so fast, says Yann LeCun, co-recipient of the 2018 Turing Award for his pioneering work on neural networks. He argues that the current state of AI research is nothing new in the history of science. It is just a necessary adolescent phase that many fields have experienced, characterized by trial and error, confusion, overconfidence and a lack of overall understanding. We have nothing to fear and much to gain from embracing this approach. It’s simply that we’re more familiar with its opposite.

After all, it’s easy to imagine knowledge flowing downstream, from the source of an abstract idea, through the twists and turns of experimentation, to a broad delta of practical applications. This is the famous “usefulness of useless knowledge,” advanced by Abraham Flexner in his seminal 1939 essay (itself a play on the very American concept of “useful knowledge” that emerged during the Enlightenment).

A canonical illustration of this flow is Albert Einstein’s general theory of relativity. It all began with the fundamental idea that the laws of physics should hold for all observers, independent of their movements. He then translated this concept into the mathematical language of curved space-time and applied it to the force of gravity and the evolution of the cosmos. Without Einstein’s theory, the GPS in our smartphones would drift off course by about 7 miles a day…(More)”.
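
The 7-miles-a-day figure checks out with back-of-the-envelope arithmetic. A minimal sketch, using the standard textbook clock corrections of roughly +45.9 microseconds per day (weaker gravity at orbital altitude) and -7.2 microseconds per day (orbital speed); the exact values are approximations:

```python
# Back-of-the-envelope check of the "7 miles a day" GPS claim.
# Uncorrected satellite clocks drift by the net of two relativistic
# effects, and that time offset accumulates as ranging error at the
# speed of light.
C = 299_792_458      # speed of light, m/s
GR_GAIN_US = 45.9    # gravitational blueshift, microseconds/day (approx.)
SR_LOSS_US = 7.2     # velocity time dilation, microseconds/day (approx.)

net_offset_s = (GR_GAIN_US - SR_LOSS_US) * 1e-6  # ~38.7 microseconds/day
error_m = C * net_offset_s                       # position error per day
print(f"{error_m / 1000:.1f} km/day, ~{error_m / 1609.34:.1f} miles/day")
# prints roughly 11.6 km/day, i.e. about 7 miles a day
```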

Nonprofit Websites Are Riddled With Ad Trackers


Article by Alfred Ng and Maddy Varner: “Last year, nearly 200 million people visited the website of Planned Parenthood, a nonprofit that many people turn to for very private matters like sex education, access to contraceptives, and access to abortions. What those visitors may not have known is that as soon as they opened plannedparenthood.org, some two dozen ad trackers embedded in the site alerted a slew of companies whose business is not reproductive freedom but gathering, selling, and using browsing data.

The Markup ran Planned Parenthood’s website through our Blacklight tool and found 28 ad trackers and 40 third-party cookies tracking visitors, in addition to so-called “session recorders” that could be capturing the mouse movements and keystrokes of people visiting the homepage in search of things like information on contraceptives and abortions. The site also contained trackers that tell Facebook and Google if users visited the site.

The Markup’s scan found Planned Parenthood’s site communicating with companies like Oracle, Verizon, LiveRamp, TowerData, and Quantcast—some of which have made a business of assembling and selling access to masses of digital data about people’s habits.

Katie Skibinski, vice president for digital products at Planned Parenthood, said the data collected on its website is “used only for internal purposes by Planned Parenthood and our affiliates,” and the company doesn’t “sell” data to third parties.

“While we aim to use data to learn how we can be most impactful, at Planned Parenthood, data-driven learning is always thoughtfully executed with respect for patient and user privacy,” Skibinski said. “This means using analytics platforms to collect aggregate data to gather insights and identify trends that help us improve our digital programs.”

Skibinski did not dispute that the organization shares data with third parties, including data brokers.

A Blacklight scan of Planned Parenthood Gulf Coast—a localized website specifically for people in the Gulf region, including Texas, where abortion has been essentially outlawed—churned up similar results.

Planned Parenthood is not alone among nonprofits, some of which operate in sensitive areas like mental health and addiction, in gathering and sharing data on website visitors.

Using our Blacklight tool, The Markup scanned more than 23,000 websites of nonprofit organizations, including those belonging to abortion providers and nonprofit addiction treatment centers. The Markup used the IRS’s nonprofit master file to identify nonprofits that have filed a tax return since 2019 and that the agency categorizes as focusing on areas like mental health and crisis intervention, civil rights, and medical research. We then examined each nonprofit’s website as publicly listed in GuideStar. We found that about 86 percent of them had third-party cookies or tracking network requests. By comparison, when The Markup did a survey of the top 80,000 websites in 2020, we found 87 percent used some type of third-party tracking.
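
Blacklight is an open-source tool from The Markup, and its core measurement, counting requests a page makes to hosts other than the one being visited, can be sketched in a few lines. What follows is a naive Python illustration using Playwright, not The Markup’s actual methodology: real scanners classify trackers against curated lists rather than by bare hostname comparison.

```python
# Naive sketch of third-party request detection. Real tracker scanners
# like Blacklight match requests against curated tracker lists; this
# only compares hostnames against the page's own host.
from urllib.parse import urlparse
from playwright.sync_api import sync_playwright

def third_party_hosts(url: str) -> set[str]:
    first_party = urlparse(url).hostname
    hosts: set[str] = set()
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Record the hostname of every network request the page triggers.
        page.on("request", lambda req: hosts.add(urlparse(req.url).hostname))
        page.goto(url, wait_until="networkidle")
        browser.close()
    return {h for h in hosts if h and h != first_party}

if __name__ == "__main__":
    for host in sorted(third_party_hosts("https://example.org")):
        print(host)
```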

About 11 percent of the 23,856 nonprofit websites we scanned had a Facebook pixel embedded, while 18 percent used the Google Analytics “Remarketing Audiences” feature.

The Markup found that 439 of the nonprofit websites loaded scripts called session recorders, which can monitor visitors’ clicks and keystrokes. Eighty-nine of those were for websites that belonged to nonprofits that the IRS categorizes as primarily focusing on mental health and crisis intervention issues…(More)”.

What Do Teachers Know About Student Privacy? Not Enough, Researchers Say


Nadia Tamez-Robledo at EdSurge: “What should teachers be expected to know about student data privacy and ethics?

Considering that so much of their jobs now revolves around student data, it’s a simple enough question—and one that researcher Ellen B. Mandinach and a colleague were tasked with answering. More specifically, they wanted to know what state guidelines had to say on the matter. Was that information included in codes of education ethics? Or perhaps in curriculum requirements for teacher training programs?

“The answer is, ‘Not really,’” says Mandinach, a senior research scientist at the nonprofit WestEd. “Very few state standards have anything about protecting privacy, or even much about data,” she says, aside from policies touching on FERPA or disposing of data properly.

While it seems to Mandinach that institutions have historically played hot potato over who is responsible for teaching educators about data privacy, the pandemic and its supercharged push to digital learning have brought new awareness to the issue.

The application of data ethics has real consequences for students, says Mandinach, like the Atlanta sixth grader who was accused of “Zoombombing” based on his computer’s IP address, or the Dartmouth students who were cleared of cheating accusations.

“There are many examples coming up as we’re in this uncharted territory, particularly as we’re virtual,” Mandinach says. “Our goal is to provide resources and awareness building to the education community and professional organization…so [these tools] can be broadly used to help better prepare educators, both current and future.”

This week, Mandinach and her partners at the Future of Privacy Forum released two training resources for K-12 teachers: the Student Privacy Primer and a guide to working through data ethics scenarios. The curriculum is based on their report examining how much data privacy and ethics preparation teachers receive while in college….(More)”.

A crowdsourced spreadsheet is the latest tool in Chinese tech worker organizing


Article by JS: “This week, thousands of Chinese tech workers are sharing information about their working schedules in an online spreadsheet. Their goal is to inform each other and new employees about overtime practices at different companies. 

This initiative for work-schedule transparency, titled Working Time, has gone viral. As of Friday—just three days after the project launched—the spreadsheet has already drawn millions of views and over 6,000 entries. The creators also set up group chats on the Tencent-owned messaging platform QQ to invite discussion about the project—over 10,000 people have joined as participants.

This initiative comes after the explosive 996.ICU campaign of 2019, in which hundreds of thousands of tech workers in the country joined an online effort to demand an end to the 72-hour work week—9am to 9pm, 6 days a week.

This year, multiple tech companies—with encouragement from the government—have ended overtime practices that forced employees to work on Saturdays (or, in some cases, alternating Saturdays). This has effectively ended 996, which was illegal to begin with. While an improvement, the data collected in this online spreadsheet shows that most tech workers still work long hours, either “1095” or “11105” (10am to 9pm or 11am to 10pm, 5 days a week). The spreadsheet also shows a non-negligible number of workers still working 6 days a week.
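
For readers unfamiliar with the shorthand, the codes read as start hour, end hour (p.m.), and days per week, so the weekly totals are simple arithmetic; a quick sketch with the schedules mentioned above:

```python
# Weekly hours implied by the schedule shorthand: (start, end, days),
# with end hours on a 24-hour clock.
schedules = {
    "996":   (9, 21, 6),   # 9am-9pm, 6 days: the now-outlawed standard
    "1095":  (10, 21, 5),  # 10am-9pm, 5 days
    "11105": (11, 22, 5),  # 11am-10pm, 5 days
}

for code, (start, end, days) in schedules.items():
    print(f"{code}: {(end - start) * days} hours/week")
# 996: 72, 1095: 55, 11105: 55; still long weeks after the end of 996
```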

As with the 996.ICU campaign, the creators of this spreadsheet are using GitHub to circulate and share information about the project. The first commit was made on Tuesday, October 12th. Only a few days later, the repo has been starred over 9,500 times….(More)”.

False Positivism


Essay by Peter Polack: “During the pandemic, the everyday significance of modeling — data-driven representations of reality designed to inform planning — became inescapable. We viewed our plans, fears, and desires through the lens of statistical aggregates: Infection-rate graphs became representations not only of the virus’s spread but also of shattered plans, anxieties about lockdowns, concern for the fate of our communities. 
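
For readers who have never looked inside one of these models, the textbook structure behind many infection-rate curves fits in a few lines. A minimal SIR (susceptible-infected-recovered) sketch, with arbitrary illustrative parameters that are not fitted to any real outbreak:

```python
# Minimal SIR epidemic model, the textbook structure behind many
# infection-rate curves. Parameters are arbitrary, for illustration only.
def sir(beta=0.3, gamma=0.1, n=1_000_000, i0=100, days=180):
    s, i, r = n - i0, i0, 0
    infected_curve = []
    for _ in range(days):
        new_infections = beta * s * i / n   # contacts that transmit
        new_recoveries = gamma * i          # infections that resolve
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected_curve.append(i)
    return infected_curve

print(f"peak simultaneous infections: {max(sir()):,.0f}")
```

Everything consequential, as the essay argues, lies not in the curve itself but in how it is construed and mobilized.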

But as epidemiological models became more influential, their implications were revealed as anything but absolute. One model, the Recidiviz Covid-19 Model for Incarceration, predicted high infection rates in prisons and consequently overburdened hospitals. While these predictions were used as the basis to release some prisoners early, the model has also been cited by those seeking to incorporate more data-driven surveillance technologies into prison management — a trend new AI startups like Blue Prism and Staqu are eager to get in on. Thus the same model supports both the call to downsize prisons and the demand to expand their operations, even as both can claim a focus on flattening the curve. …

The ethics and effects of interventions depend not only on facts in themselves, but also on how facts are construed — and on what patterns of organization, existing or speculative, they are mobilized to justify. Yet the idea persists that data collection and fact finding should override concerns about surveillance, and not only in the most technocratic circles and policy think tanks. It also has defenders in the world of design theory and political philosophy. Benjamin Bratton, known for his theory of global geopolitics as an arrangement of computational technologies he calls “the Stack,” sees in data-driven modeling the only political rationality capable of responding to difficult social and environmental problems like pandemics and climate change. In his latest book, The Revenge of the Real: Politics for a Post-Pandemic World, he argues that expansive models — enabled by what he theorizes as “planetary-scale computation” — can transcend individualistic perspectives and politics and thereby inaugurate a more inclusive and objective regime of governance. Against a politically fragmented world of polarized opinions and subjective beliefs, these models, Bratton claims, would unite politics and logistics under a common representation of the world. In his view, this makes longstanding social concerns about personal privacy and freedom comparatively irrelevant, and those who continue to raise them irrational…(More)”.

Building the Behavior Change Toolkit: Designing and Testing a Nudge and a Boost


Blog by Henrico van Roekel, Joanne Reinhard, and Stephan Grimmelikhuijsen: “Changing behavior is challenging, so behavioral scientists and designers better have a large toolkit. Nudges—subtle changes to the choice environment that don’t remove options or offer a financial incentive—are perhaps the most widely used tool. But they’re not the only tool.

More recently, researchers have advocated a different type of behavioral intervention: boosting. In contrast to nudges, which aim to change behavior through changing the environment, boosts aim to empower individuals to better exert their own agency.

Underpinning each approach are different perspectives on how humans deal with bounded rationality—the idea that we don’t always behave in a way that aligns with our intentions because our decision-making is subject to biases and flaws.

A nudge approach generally assumes that bounded rationality is a constant, a fact of life. Therefore, to change behavior, we had best change the decision environment (the so-called choice architecture) to gently guide people in the desired direction. Boosting holds that bounded rationality is malleable and that people can learn to overcome their cognitive pitfalls. Therefore, to change behavior, we must focus on the decision maker and increase their agency.

In practice, a nudge and a boost can look quite similar, as we describe below. But their theoretical distinctions are important and useful for behavioral scientists and designers working on behavior change interventions, as each approach has pros and cons. For instance, one criticism of nudging is the paternalism half of Thaler and Sunstein’s “libertarian paternalism,” as some worry that nudges remove the autonomy of decision makers (though the extent to which nudges are paternalistic, and the extent to which this is solvable, are debated). Additionally, if the goal of an intervention isn’t just to change behavior but to change the cognitive process of the individual, then nudges aren’t likely to be the best tool. Boosts, in contrast, require some motivation and teachability on the part of the boostee, so there may well be contexts unfit for boosting interventions where nudges come in handy….(More)”.

Exponential Tech Doesn’t Serve Social Good


Essay by Douglas Rushkoff: “We all want to do good. Well, a great many of us want to do good. We recognize that the climate is in peril, corporations are dangerously extractive, wealth disparity is at all-time highs, our kids are self-destructively addicted to social media, politics has descended into a reality TV show with paranoid features, and that civilization itself has only about another 20 years before some combination of the above threats makes life unrecognizable or even unsustainable.

The good news, at least according to the majority of invitations I get to participate in conferences and with organizations, is that there’s an app or technology or network or platform or, most often these days, a token that can fix some or all of this. The latest frame around the technosolutionist frenzy is called web 3.0, which has come to mean the decentralized sort of internet characterized by TOR networks (basically, Napster and BitTorrent where everything is hosted everywhere) and the blockchain (a new form of automated ledger).
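
For readers new to the term, the “automated ledger” at the heart of a blockchain is a simple data structure: a list of records in which each entry commits to its predecessor by cryptographic hash, so retroactive edits are detectable. A toy sketch, with no consensus, networking, or tokens:

```python
# Toy hash-chained ledger, the core data structure behind a blockchain.
# Each block commits to its predecessor's hash, so altering any earlier
# record is detectable. No consensus, mining, or tokens here.
import hashlib
import json

def block_hash(data, prev_hash: str) -> str:
    payload = json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def make_block(data, prev_hash: str) -> dict:
    return {"data": data, "prev_hash": prev_hash,
            "hash": block_hash(data, prev_hash)}

def verify(chain: list[dict]) -> bool:
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block["data"], block["prev_hash"]):
            return False   # a record was altered after the fact
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False   # the chain linkage is broken
    return True

genesis = make_block("genesis", "0" * 64)
chain = [genesis,
         make_block({"from": "a", "to": "b", "amount": 5}, genesis["hash"])]
print(verify(chain))           # True
chain[0]["data"] = "tampered"
print(verify(chain))           # False: tampering is detectable
```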

This generation of distributed technology, it is hoped, will engender and facilitate a new era of environmental stewardship, economic equality, racial justice, democratic values, and civic harmony. Even better, the tokens at the heart of these blockchains for global good will make people rich. Sure, those who get in early will do the best, but the magic of smart contracts will somehow raise all boats, letting us feel great about doing well by doing good — secure in the knowledge that the returns we’re seeing on all this “impact investing” are the deserved surplus fruits of our having dedicated our careers and portfolios to public service.

But getting such efforts off the ground is always the hard part. There are so many terrific people and organizations out there, each building platforms and apps and networks and blockchains, too. How do we break through the noise and bring all those many decentralized organizations together into the one, single centralized decentralized network? And once we do, how do we prove that ours really is the one that will solve the problems? Because without enough buy-in, figuratively and literally, the token won’t be worth anything and all that good won’t get done….(More)”.

Towards Better Governance of Urban Data: Concrete Examples of Success


Blogpost by Naysan Saran: “Since the Sumerians of the fourth millennium BCE, governments have kept records. These records have, of course, evolved from a few hundred cuneiform symbols engraved on clay tablets to terabytes of data hosted on cloud servers. However, their primary goal remains the same: to improve land management.

That being said, the six thousand years of civilization separating us from the Sumerians have seen the birth of democracy, and with that birth a second goal has been grafted onto the first: cities must now earn the trust of their citizens with respect to how they manage those citizens’ data. This goal cannot be achieved without good data governance, which defines strategies for the efficient and transparent use and distribution of information.

To learn more about the state of the art in municipal data management, both internally and externally, we went to meet two experts who agreed to share their experiences and best practices: François Robitaille, business architect for the city of Laval; and Adrienne Schmoeker, former Deputy Chief Analytics Officer for the City of New York and Director at the New York City Mayor’s Office of Data Analytics, where she managed the Open Data Program for three years….(More)”