Can data die?


Article by Jennifer Ding: “…To me, the crux of the Lenna story is how little power we have over our data and how it is used and abused. This threat seems disproportionately higher for women who are often overrepresented in internet content, but underrepresented in internet company leadership and decision making. Given this reality, engineering and product decisions will continue to consciously (and unconsciously) exclude our needs and concerns.

While social norms are turning against non-consensual data collection and data exploitation, digital norms seem to be moving in the opposite direction. Advancements in machine learning algorithms and data storage capabilities are only making data misuse easier. Whether the outcome is revenge porn or targeted ads, surveillance or discriminatory AI, if we want a world where our data can retire when it’s outlived its time, or when it’s directly harming our lives, we must create the tools and policies that empower data subjects to have a say in what happens to their data… including allowing their data to die…(More)”

Nonprofit Websites Are Riddled With Ad Trackers


Article by Alfred Ng and Maddy Varner: “Last year, nearly 200 million people visited the website of Planned Parenthood, a nonprofit that many people turn to for very private matters like sex education, access to contraceptives, and access to abortions. What those visitors may not have known is that as soon as they opened plannedparenthood.org, some two dozen ad trackers embedded in the site alerted a slew of companies whose business is not reproductive freedom but gathering, selling, and using browsing data.

The Markup ran Planned Parenthood’s website through our Blacklight tool and found 28 ad trackers and 40 third-party cookies tracking visitors, in addition to so-called “session recorders” that could be capturing the mouse movements and keystrokes of people visiting the homepage in search of things like information on contraceptives and abortions. The site also contained trackers that tell Facebook and Google if users visited the site.

The Markup’s scan found Planned Parenthood’s site communicating with companies like Oracle, Verizon, LiveRamp, TowerData, and Quantcast—some of which have made a business of assembling and selling access to masses of digital data about people’s habits.

Katie Skibinski, vice president for digital products at Planned Parenthood, said the data collected on its website is “used only for internal purposes by Planned Parenthood and our affiliates,” and the company doesn’t “sell” data to third parties.

“While we aim to use data to learn how we can be most impactful, at Planned Parenthood, data-driven learning is always thoughtfully executed with respect for patient and user privacy,” Skibinski said. “This means using analytics platforms to collect aggregate data to gather insights and identify trends that help us improve our digital programs.”

Skibinski did not dispute that the organization shares data with third parties, including data brokers.

A Blacklight scan of Planned Parenthood Gulf Coast—a localized website specifically for people in the Gulf region, including Texas, where abortion has been essentially outlawed—churned up similar results.

Planned Parenthood is not alone when it comes to nonprofits, some operating in sensitive areas like mental health and addiction, gathering and sharing data on website visitors.

Using our Blacklight tool, The Markup scanned more than 23,000 websites of nonprofit organizations, including those belonging to abortion providers and nonprofit addiction treatment centers. The Markup used the IRS’s nonprofit master file to identify nonprofits that have filed a tax return since 2019 and that the agency categorizes as focusing on areas like mental health and crisis intervention, civil rights, and medical research. We then examined each nonprofit’s website as publicly listed in GuideStar. We found that about 86 percent of them had third-party cookies or tracking network requests. By comparison, when The Markup did a survey of the top 80,000 websites in 2020, we found 87 percent used some type of third-party tracking.

About 11 percent of the 23,856 nonprofit websites we scanned had a Facebook pixel embedded, while 18 percent used the Google Analytics “Remarketing Audiences” feature.

The Markup found that 439 of the nonprofit websites loaded scripts called session recorders, which can monitor visitors’ clicks and keystrokes. Eighty-nine of those were for websites that belonged to nonprofits that the IRS categorizes as primarily focusing on mental health and crisis intervention issues…(More)”.
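Blacklight itself does far more than this (it drives a headless browser, inspects cookies, and matches requests against known-tracker lists), but the core idea behind flagging third-party trackers can be illustrated with a stdlib-only sketch. Everything below is hypothetical example code, not The Markup’s actual methodology: it simply parses a page’s HTML and collects the hosts of scripts, images, and iframes that load from domains other than the site’s own.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ThirdPartyResourceFinder(HTMLParser):
    """Collect hosts of scripts/images/iframes loaded from other domains."""
    def __init__(self, first_party_host):
        super().__init__()
        self.first_party = first_party_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "img", "iframe"):
            return
        src = dict(attrs).get("src")
        if not src:
            return
        host = urlparse(src).netloc
        # Naive suffix check: a real tool would use a public-suffix list
        # and a database of known tracking domains.
        if host and not host.endswith(self.first_party):
            self.third_party_hosts.add(host)

def find_third_party_hosts(html, first_party_host):
    finder = ThirdPartyResourceFinder(first_party_host)
    finder.feed(html)
    return sorted(finder.third_party_hosts)

# Hypothetical page for a site served from example.org
page = """
<html><body>
  <script src="https://example.org/app.js"></script>
  <script src="https://connect.facebook.net/en_US/fbevents.js"></script>
  <img src="https://www.google-analytics.com/collect?v=1">
</body></html>
"""
print(find_third_party_hosts(page, "example.org"))
# → ['connect.facebook.net', 'www.google-analytics.com']
```

Run over many homepages, counts like these are what let a scan distinguish a site with a couple of analytics scripts from one embedding dozens of ad trackers and session recorders.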

PrivaSeer


About: “PrivaSeer is an evolving privacy policy search engine. It aims to make privacy policies transparent, discoverable and searchable. Various faceted search features aim to help users get novel insights into the nature of privacy policies. PrivaSeer can be used to search for privacy policy text or URLs.

PrivaSeer currently has over 1.4 million privacy policies indexed and we are always looking to add more. We crawled privacy policies based on URLs obtained from Common Crawl and the Free Company Dataset.

We are working to add faceted search features like readability, sector of activity, personal information type etc. These will help users refine their search results….(More)”.
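Faceted search of the kind PrivaSeer describes combines a filter over indexed documents with per-facet counts that tell users how the remaining results break down. As a rough, hypothetical sketch (the field names, records, and URLs below are invented, and PrivaSeer’s real index covers over a million policies):

```python
from collections import Counter

# Hypothetical mini-index standing in for a corpus of crawled policies.
policies = [
    {"url": "https://a.example/privacy", "sector": "retail",  "readability": "hard"},
    {"url": "https://b.example/privacy", "sector": "finance", "readability": "hard"},
    {"url": "https://c.example/privacy", "sector": "retail",  "readability": "easy"},
]

def faceted_search(docs, **filters):
    """Return matching docs plus facet counts for drilling down further."""
    hits = [d for d in docs if all(d.get(k) == v for k, v in filters.items())]
    facets = {
        field: Counter(d[field] for d in hits)
        for field in ("sector", "readability")
    }
    return hits, facets

hits, facets = faceted_search(policies, readability="hard")
print(len(hits), dict(facets["sector"]))
# → 2 {'retail': 1, 'finance': 1}
```

The facet counts are what make the interface exploratory: after filtering to hard-to-read policies, a user immediately sees which sectors those policies come from before narrowing further.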

Data for Children Collaborative Designs Responsible Data Solutions for Cross-Sector Services


Impact story by data.org: “That is the question that the Collaborative set out to answer: how do we define and support strong data ethics in a way that ensures it is no longer an afterthought? How do we empower organizations to make it their priority?…

Fassio, Data for Children Collaborative Director Alex Hutchison, and the rest of their five-person team set out to create a roadmap for data responsibility. They started with their own experiences and followed the lifecycle of a non-profit project from conception to communicating results.

The journey begins – for project leaders and for the Collaborative – with an ethical assessment before any research or intervention has been conducted. The assessment calls on project teams to reflect on their motivations and ethical issues at the start, midpoint, and results stages of a project, ensuring that the priority stakeholder remains at the center. Some of the elements are directly tied to data, like data collection, security, and anonymization, but the assessment goes beyond the hard data and into its applications and analysis, including understanding stakeholder landscape and even the appropriate language to use when communicating outputs.

For the Collaborative, that priority is children. But they’ve designed the assessment, which maps across to UNICEF’s Responsible Data for Children (RD4C) toolkit, and other responsible innovation resources to be adaptable for other sectors.

“We wanted to make it really accessible for people with no background in ethics or data. We wanted anyone to be able to approach it,” Fassio said. “Because it is data-focused, there’s actually a very wide application. A lot of the questions we ask are very transferable to other groups.”

The same is true for their youth participation workbook – another resource in the toolkit. The team engaged young people to help co-create the process, staying open to revisions and iterations based on people’s experiences and feedback….(More)”

What Do Teachers Know About Student Privacy? Not Enough, Researchers Say


Nadia Tamez-Robledo at EdTech: “What should teachers be expected to know about student data privacy and ethics?

Considering so much of their jobs now revolve around student data, it’s a simple enough question—and one that researcher Ellen B. Mandinach and a colleague were tasked with answering. More specifically, they wanted to know what state guidelines had to say on the matter. Was that information included in codes of education ethics? Or perhaps in curriculum requirements for teacher training programs?

“The answer is, ‘Not really,’” says Mandinach, a senior research scientist at the nonprofit WestEd. “Very few state standards have anything about protecting privacy, or even much about data,” she says, aside from policies touching on FERPA or disposing of data properly.

While it seems to Mandinach that institutions have historically played hot potato over who is responsible for teaching educators about data privacy, the pandemic and its supercharged push to digital learning have brought new awareness to the issue.

The application of data ethics has real consequences for students, says Mandinach, like an Atlanta sixth grader who was accused of “Zoombombing” based on his computer’s IP address or the Dartmouth students who were exonerated from cheating accusations.

“There are many examples coming up as we’re in this uncharted territory, particularly as we’re virtual,” Mandinach says. “Our goal is to provide resources and awareness building to the education community and professional organization…so [these tools] can be broadly used to help better prepare educators, both current and future.”

This week, Mandinach and her partners at the Future of Privacy Forum released two training resources for K-12 teachers: the Student Privacy Primer and a guide to working through data ethics scenarios. The curriculum is based on their report examining how much data privacy and ethics preparation teachers receive while in college….(More)”.

False Positivism


Essay by Peter Polack: “During the pandemic, the everyday significance of modeling — data-driven representations of reality designed to inform planning — became inescapable. We viewed our plans, fears, and desires through the lens of statistical aggregates: Infection-rate graphs became representations not only of the virus’s spread but also of shattered plans, anxieties about lockdowns, concern for the fate of our communities. 

But as epidemiological models became more influential, their implications were revealed as anything but absolute. One model, the Recidiviz Covid-19 Model for Incarceration, predicted high infection rates in prisons and consequently overburdened hospitals. While these predictions were used as the basis to release some prisoners early, the model has also been cited by those seeking to incorporate more data-driven surveillance technologies into prison management — a trend new AI startups like Blue Prism and Staqu are eager to get in on. Thus the same model supports both the call to downsize prisons and the demand to expand their operations, even as both can claim a focus on flattening the curve. …

The ethics and effects of interventions depend not only on facts in themselves, but also on how facts are construed — and on what patterns of organization, existing or speculative, they are mobilized to justify. Yet the idea persists that data collection and fact finding should override concerns about surveillance, and not only in the most technocratic circles and policy think tanks. It also has defenders in the world of design theory and political philosophy. Benjamin Bratton, known for his theory of global geopolitics as an arrangement of computational technologies he calls “the Stack,” sees in data-driven modeling the only political rationality capable of responding to difficult social and environmental problems like pandemics and climate change. In his latest book, The Revenge of the Real: Politics for a Post-Pandemic World, he argues that expansive models — enabled by what he theorizes as “planetary-scale computation” — can transcend individualistic perspectives and politics and thereby inaugurate a more inclusive and objective regime of governance. Against a politically fragmented world of polarized opinions and subjective beliefs, these models, Bratton claims, would unite politics and logistics under a common representation of the world. In his view, this makes longstanding social concerns about personal privacy and freedom comparatively irrelevant and those who continue to raise them irrational…(More)”.

Data Stewardship Re-Imagined — Capacities and Competencies


Blog and presentation by Stefaan Verhulst: “In ways both large and small, COVID-19 has forced us to re-examine every aspect of our political, social, and economic systems. Among the many lessons, policymakers have learned is that existing methods for using data are often insufficient for our most pressing challenges. In particular, we need to find new, innovative ways of tapping into the potential of privately held and siloed datasets that nonetheless contain tremendous public good potential, including complementing and extending official statistics. Data collaboratives are an emerging set of methods for accessing and reusing data that offer tremendous opportunities in this regard. In the last five years, we have studied and initiated numerous data collaboratives, in the process assembling a collection of over 200 example case studies to better understand their possibilities.

Among our key findings is the vital importance and essential role that needs to be played by Data Stewards.

Data stewards do not represent an entirely new profession; rather, their role could be understood as an extension and re-definition of existing organizational positions that manage and interact with data. Traditionally, the role of a data officer was limited either to data integrity or the narrow context of internal data governance and management, with a strong emphasis on technical competencies. This narrow conception is no longer sufficient, especially given the proliferation of data and the increasing potential of data sharing and collaboration. As such, we call for a re-imagination of data stewardship to encompass a wider range of functions and responsibilities, directed at leveraging data assets toward addressing societal challenges and improving people’s lives.

DATA STEWARDSHIP: functions and competencies to enable access to and re-use of data for public benefit in a systematic, sustainable, and responsible way.

In our vision, data stewards are professionals empowered to create public value (including official statistics) by re-using data and data expertise, identifying opportunities for productive cross-sectoral collaboration, and proactively requesting or enabling functional access to data, insights, and expertise. Data stewards are active in both the public and private sectors, promoting trust within and outside their organizations. They are essential to data collaboratives by providing functional access to unlock the potential of siloed data sets. In short, data stewards form a new — and essential — link in the data value chain….(More)”.

How do we ensure anonymisation is effective?


Chapter by the Information Commissioner’s Office (UK): “Effective anonymisation reduces identifiability risk to a sufficiently remote level.
• Identifiability is about whether someone is “identified or identifiable”. This doesn’t just concern someone’s name, but other information and factors that can distinguish them from someone else.
• Identifiability exists on a spectrum, where the status of information can change depending on the circumstances of its processing.
• When assessing whether someone is identifiable, you need to take account of the “means reasonably likely to be used”. You should base this on objective factors such as the costs and time required to identify, the available technologies, and the state of technological development over time.
• However, you do not need to take into account any purely hypothetical or theoretical chance of identifiability. The key is what is reasonably likely relative to the circumstances, not what is conceivably likely in absolute.
• You also need to consider both the information itself as well as the environment in which it is processed. This will be impacted by the type of data release (to the public, to a defined group, etc) and the status of the information in the other party’s hands.
• When considering releasing anonymous information to the world at large, you may have to implement more robust techniques to achieve effective anonymisation than when releasing to particular groups or individual organisations.
• There are likely to be many borderline cases where you need to use careful judgement based on the specific circumstances of the case.
• Applying a “motivated intruder” test is a good starting point to consider identifiability risk.
• You should review your risk assessments and decision-making processes at appropriate intervals. The appropriate time for, and frequency of, any reviews depends on the circumstances…(More)”.
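The ICO deliberately stops short of prescribing a single metric, but one widely used quantitative proxy for the “singling out” risk a motivated intruder exploits is k-anonymity: the size of the smallest group of records that share the same quasi-identifiers. A minimal sketch, using invented records (the field names and values below are hypothetical, not from the guidance):

```python
from collections import Counter

def min_group_size(records, quasi_identifiers):
    """Smallest equivalence class over the quasi-identifiers: the k in
    k-anonymity. A record in a group of size 1 is unique on those fields
    and is the easiest for a motivated intruder to single out."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(groups.values())

# Hypothetical released dataset
people = [
    {"age_band": "30-39", "postcode_area": "EH1", "condition": "x"},
    {"age_band": "30-39", "postcode_area": "EH1", "condition": "y"},
    {"age_band": "40-49", "postcode_area": "G2",  "condition": "z"},
]

k = min_group_size(people, ["age_band", "postcode_area"])
print(k)  # → 1: the single 40-49/G2 record is unique on these fields
```

A low k is a red flag, not a verdict: as the chapter stresses, the acceptable level depends on the release environment — a dataset that is adequately protected when shared with one vetted organisation may need coarser age bands or postcode areas before publication to the world at large.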

Using location data responsibly in cities and local government


Article by Ben Hawes: “City and local governments increasingly recognise the power of location data to help them deliver more and better services, exactly where and when they are needed. The use of this data is going to grow, with more pressure to manage resources and emerging challenges including responding to extreme weather events and other climate impacts.

But using location data to target and manage local services comes with risks to the equitable delivery of services, privacy and accountability. To make the best use of these growing data resources, city leaders and their teams need to understand those risks and address them, and to be able to explain their uses of data to citizens.

The Locus Charter, launched earlier this year, is a set of common principles to promote responsible practice when using location data. The Charter could be very valuable to local governments, to help them navigate the challenges and realise the rewards offered by data about the places they manage….

Compared to private companies, local public bodies already have special responsibilities to ensure transparency and fairness. New sources of data can help, but can also generate new questions. Local governments have generally been able to improve services as they learned more about the people they served. Now they must manage the risks of knowing too much about people, and acting intrusively. They can also risk distorting service provision because their data about people in places is uneven or unrepresentative.

Many city and local governments fully recognise that data-driven delivery comes with risks, and are developing specific local data ethics frameworks to guide their work. Some of these, like Kansas City’s, are specifically aimed at managing data privacy. Others cover broader uses of data, like Greater Manchester’s Declaration for Intelligent and Responsible Data Practice. Another example is DTPR, an open source communication standard that helps people understand how data is being used in public places.

London is engaging citizens on an Emerging Technology Charter, to explore new and ethically charged questions around data. Govlab supports an AI Localism repository of actions taken by local decision-makers to address the use of AI within a city or community. The EU Sherpa programme (Shaping the Ethical Dimensions of Smart Information Systems) includes a smart cities strand, and has published a case-study on the Ethics of Using Smart City AI and Big Data.

Smart city applications make it potentially possible to collect data in many ways, for many purposes, but the technologies cannot answer questions about what is appropriate. In The Smart Enough City: Putting Technology in its Place to Reclaim Our Urban Future (2019), author Ben Green describes examples when some cities have failed and others succeeded in judging what smart applications should be used.

Attention to what constitutes ethical practice with location data can give additional help to leaders making that kind of judgement….(More)”

Building Consumer Confidence Through Transparency and Control


Cisco 2021 Consumer Privacy Survey: “Protecting privacy continues to be a critical issue for individuals, organizations, and governments around the world. Eighteen months into the COVID-19 pandemic, our health information and vaccination status are needed more than ever to understand the virus, control the spread, and enable safer environments for work, learning, recreation, and other activities. Nonetheless, people want privacy protections to be maintained, and they expect organizations and governments to keep their data safe and used only for pandemic response. Individuals are also increasingly taking action to protect themselves and their data. This report, our third annual review of consumer privacy, explores current trends, challenges, and opportunities in privacy for consumers.

The report draws upon data gathered from a June 2021 survey where respondents were not informed of who was conducting the study and respondents were anonymous to the researchers. Respondents included 2,600 adults (over the age of 18) in 12 countries (5 Europe, 4 Asia Pacific, and 3 Americas). Participants were asked about their attitudes and activities regarding companies’ use of their personal data, level of support for COVID-19 related information sharing, awareness and reaction to privacy legislation, and attitudes regarding artificial intelligence (AI) and automated decision making.

The findings from this research demonstrate the growing importance of privacy to individuals and its implications for the businesses and governments that serve them. Key highlights of this report:

  1. Consumers want transparency and control with respect to business data practices – an increasing number will act to protect their data
  2. Privacy laws are viewed very positively around the world, but awareness of these laws remains low
  3. Despite the ongoing pandemic, most consumers want little or no reduction in privacy protections, while still supporting public health and safety efforts
  4. Consumers are very concerned about the use of their personal information in AI, and abuse has eroded trust…(More)”.