#Kremlin: Using Hashtags to Analyze Russian Disinformation Strategy and Dissemination on Twitter


Paper by Sarah Oates and John Gray: “Reports of Russian interference in U.S. elections have raised grave concerns about the spread of foreign disinformation on social media sites, but there is little detailed analysis that links traditional political communication theory to social media analytics. As a result, it is difficult for researchers and analysts to gauge the nature or level of the threat that is disseminated via social media. This paper leverages both social science and data science by using traditional content analysis and Twitter analytics to trace how key aspects of Russian strategic narratives were distributed via #skripal, #mh17, #Donetsk, and #russophobia in late 2018.

This work will define how key Russian international communicative goals are expressed through strategic narratives, describe how to find hashtags that reflect those narratives, and analyze user activity around the hashtags. This tests both how Twitter amplifies specific information goals of the Russians as well as the relative success (or failure) of particular hashtags to spread those messages effectively. This research uses Mentionmapp, a system co-developed by one of the authors (Gray) that employs network analytics and machine intelligence to identify the behavior of Twitter users as well as generate profiles of users via posting history and connections. This study demonstrates how political communication theory can be used to frame the study of social media; how to relate knowledge of Russian strategic priorities to labels on social media such as Twitter hashtags; and to test this approach by examining a set of Russian propaganda narratives as they are represented by hashtags. Our research finds that some Twitter users are consistently active across multiple Kremlin-linked hashtags, suggesting that knowledge of these hashtags is an important way to identify Russian propaganda online influencers. More broadly, we suggest that Twitter dichotomies such as bot/human or troll/citizen should be used with caution and analysis should instead address the nuances in Twitter use that reflect varying levels of engagement or even awareness in spreading foreign disinformation online….(More)”.
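
The finding that some users are active across several tracked hashtags can be illustrated with a minimal sketch. It is not the authors' Mentionmapp system: the function name `users_across_hashtags`, the threshold `min_hashtags`, and the sample observations are all hypothetical, and real analysis would need actual tweet data and far richer profiling.

```python
from collections import defaultdict


def users_across_hashtags(posts, min_hashtags=2):
    """Map each user to the tracked hashtags they used, keeping only
    users seen under at least `min_hashtags` distinct hashtags."""
    seen = defaultdict(set)
    for user, hashtag in posts:
        # Hashtags are case-insensitive, so normalize before counting.
        seen[user].add(hashtag.lower())
    return {
        user: sorted(tags)
        for user, tags in seen.items()
        if len(tags) >= min_hashtags
    }


# Hypothetical (user, hashtag) observations, invented for illustration.
posts = [
    ("user_a", "#skripal"),
    ("user_a", "#MH17"),
    ("user_a", "#russophobia"),
    ("user_b", "#Donetsk"),
    ("user_c", "#mh17"),
    ("user_c", "#mh17"),  # repeat posts under one hashtag do not count twice
]

recurring = users_across_hashtags(posts)
print(recurring)
```

Counting distinct hashtags per user, rather than labeling accounts as bot or human, is in keeping with the paper's caution about such dichotomies: the output is a degree of engagement, not a verdict.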

The personification of big data


Paper by Stevenson, Phillip Douglas and Mattson, Christopher Andrew: “Organizations all over the world, both national and international, gather demographic data so that the progress of nations and peoples can be tracked. This data is often made available to the public in the form of aggregated national level data or individual responses (microdata). Product designers likewise conduct surveys to better understand their customer and create personas. Personas are archetypes of the individuals who will use, maintain, sell or otherwise be affected by the products created by designers. Personas help designers better understand the person the product is designed for. Unfortunately, the process of collecting customer information and creating personas is often a slow and expensive process.

In this paper, we introduce a new method of creating personas, leveraging publicly available databanks of both aggregated national level and information on individuals in the population. A computational persona generator is introduced that creates a population of personas that mirrors a real population in terms of size and statistics. Realistic individual personas are filtered from this population for use in product development…(More)”.
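
A rough sketch of the general idea, under loud assumptions: the attribute names, the shares, and the function `generate_personas` are hypothetical, and each attribute is sampled independently, which ignores the correlations a real generator built on microdata would have to preserve. It is not the authors' method.

```python
import random


def generate_personas(marginals, size, seed=0):
    """Draw `size` synthetic personas from aggregated category shares.

    `marginals` maps an attribute name to {category: share}.
    Attributes are sampled independently (a simplifying assumption).
    """
    rng = random.Random(seed)
    personas = []
    for _ in range(size):
        persona = {}
        for attribute, shares in marginals.items():
            categories = list(shares)
            weights = list(shares.values())
            persona[attribute] = rng.choices(categories, weights=weights, k=1)[0]
        personas.append(persona)
    return personas


# Hypothetical aggregated shares, invented for illustration.
marginals = {
    "age_band": {"18-34": 0.3, "35-64": 0.5, "65+": 0.2},
    "setting": {"urban": 0.6, "rural": 0.4},
}

personas = generate_personas(marginals, size=1000, seed=42)

# Filter the synthetic population down to the personas a designer cares about.
rural_personas = [p for p in personas if p["setting"] == "rural"]
print(len(personas), len(rural_personas))
```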

Artificial Intelligence and Digital Repression: Global Challenges to Governance


Paper by Steven Feldstein: “Across the world, artificial intelligence (AI) is showing its potential for abetting repressive regimes and upending the relationship between citizen and state, thereby exacerbating a global resurgence of authoritarianism. AI is a component in a broader ecosystem of digital repression, but it is relevant to several different techniques, including surveillance, censorship, disinformation, and cyber attacks. AI offers three distinct advantages to autocratic leaders: it helps solve principal-agent loyalty problems, it offers substantial cost-efficiencies over traditional means of surveillance, and it is particularly effective against external regime challenges. China is a key proliferator of AI technology to authoritarian and illiberal regimes; such proliferation is an important component of Chinese geopolitical strategy. To counter the spread of high-tech repression abroad, as well as potential abuses at home, policy makers in democratic states must think seriously about how to mitigate harms and to shape better practices….(More)”

Whose Commons? Data Protection as a Legal Limit of Open Science


Mark Phillips and Bartha M. Knoppers in the Journal of Law, Medicine and Ethics: “Open science has recently gained traction as establishment institutions have come on-side and thrown their weight behind the movement and initiatives aimed at creation of information commons. At the same time, the movement’s traditional insistence on unrestricted dissemination and reuse of all information of scientific value has been challenged by the movement to strengthen protection of personal data. This article assesses tensions between open science and data protection, with a focus on the GDPR.

Powerful institutions across the globe have recently joined the ranks of those making substantive commitments to “open science.” For example, the European Commission and the NIH National Cancer Institute are supporting large-scale collaborations, such as the Cancer Genome Collaboratory, the European Open Science Cloud, and the Genomic Data Commons, with the aim of making giant stores of genomic and other data readily available for analysis by researchers. In the field of neuroscience, the Montreal Neurological Institute is midway through a novel five-year project through which it plans to adopt open science across the full spectrum of its research. The commitment is “to make publicly available all positive and negative data by the date of first publication, to open its biobank to registered researchers and, perhaps most significantly, to withdraw its support of patenting on any direct research outputs.” The resources and influence of these institutions seem to be tipping the scales, transforming open science from a longstanding aspirational ideal into an existing reality.

Although open science lacks any standard, accepted definition, one widely-cited model proposed by the Austria-based advocacy effort openscienceASAP describes it by reference to six principles: open methodology, open source, open data, open access, open peer review, and open educational resources. The overarching principle is “the idea that scientific knowledge of all kinds should be openly shared as early as is practical in the discovery process.” This article adopts this principle as a working definition of open science, with a particular emphasis on open sharing of human data.

As noted above, many of the institutions committed to open science use the word “commons” to describe their initiatives, and the two concepts are closely related. “Medical information commons” refers to “a networked environment in which diverse sources of health, medical, and genomic information on large populations become widely shared resources.” Commentators explicitly link the success of information commons and progress in the research and clinical realms to open science-based design principles such as data access and transparent analysis (i.e., sharing of information about methods and other metadata together with medical or health data).

But what legal, as well as ethical and social, factors will ultimately shape the contours of open science? Should all restrictions be fought, or should some be allowed to persist, and if so, in what form? Given that a commons is not a free-for-all, in that its governing rules shape its outcomes, how might we tailor law and policy to channel open science to fulfill its highest aspirations, such as universalizing practical access to scientific knowledge and its benefits, and avoid potential pitfalls? This article primarily concerns research data, although passing reference is also made to the approach to the terms under which academic publications are available, which are subject to similar debates….(More)”.

The Market for Data Privacy


Paper by Tarun Ramadorai, Antoine Uettwiller and Ansgar Walther: “We scrape a comprehensive set of US firms’ privacy policies to facilitate research on the supply of data privacy. We analyze these data with the help of expert legal evaluations, and also acquire data on firms’ web tracking activities. We find considerable and systematic variation in privacy policies along multiple dimensions including ease of access, length, readability, and quality, both within and between industries. Motivated by a simple theory of big data acquisition and usage, we analyze the relationship between firm size, knowledge capital intensity, and privacy supply. We find that large firms with intermediate data intensity have longer, legally watertight policies, but are more likely to share user data with third parties….(More)”.

Privacy-Preserved Data Sharing for Evidence-Based Policy Decisions: A Demonstration Project Using Human Services Administrative Records for Evidence-Building Activities


Paper by the Bipartisan Policy Center: “Emerging privacy-preserving technologies and approaches hold considerable promise for improving data privacy and confidentiality in the 21st century. At the same time, more information is becoming accessible to support evidence-based policymaking.

In 2017, the U.S. Commission on Evidence-Based Policymaking unanimously recommended that further attention be given to the deployment of privacy-preserving data-sharing applications. If these types of applications can be tested and scaled in the near-term, they could vastly improve insights about important policy problems by using disparate datasets. At the same time, the approaches could promote substantial gains in privacy for the American public.

There are numerous ways to engage in privacy-preserving data sharing. This paper primarily focuses on secure computation, which allows information to be accessed securely, guarantees privacy, and permits analysis without making private information available. Three key issues motivated the launch of a domestic secure computation demonstration project using real government-collected data:

  • Using new privacy-preserving approaches addresses pressing needs in society. Current widely accepted approaches to managing privacy risks—like preventing the identification of individuals or organizations in public datasets—will become less effective over time. While there are many practices currently in use to keep government-collected data confidential, they do not often incorporate modern developments in computer science, mathematics, and statistics in a timely way. New approaches can enable researchers to combine datasets to improve the capability for insights, without being impeded by traditional concerns about bringing large, identifiable datasets together. In fact, if successful, traditional approaches to combining data for analysis may not be as necessary.
  • There are emerging technical applications to deploy certain privacy-preserving approaches in targeted settings. These emerging procedures are increasingly enabling larger-scale testing of privacy-preserving approaches across a variety of policy domains, governmental jurisdictions, and agency settings to demonstrate the privacy guarantees that accompany data access and use.
  • Widespread adoption and use by public administrators will only follow meaningful and successful demonstration projects. For example, secure computation approaches are complex and can be difficult to understand for those unfamiliar with their potential. Implementing new privacy-preserving approaches will require thoughtful attention to public policy implications, public opinions, legal restrictions, and other administrative limitations that vary by agency and governmental entity.

This project used real-world government data to illustrate the applicability of secure computation compared to the classic data infrastructure available to some local governments. The project took place in a domestic, non-intelligence setting to increase the salience of potential lessons for public agencies….(More)”.
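
To make the idea of analysis without exposing private inputs more concrete, here is a minimal sketch of additive secret sharing, one common building block of secure computation. It is an illustration only: the excerpt does not say which protocol the project used, and the agency names, values, and `MODULUS` below are assumptions.

```python
import secrets

# All arithmetic is done modulo a large prime (an assumed parameter).
MODULUS = 2**61 - 1


def share(value, parties=3):
    """Split `value` into random shares that sum to it modulo MODULUS.
    Any subset of fewer than `parties` shares reveals nothing about it."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine a full set of shares into the underlying value."""
    return sum(shares) % MODULUS


# Two hypothetical agencies each hold a private count.
agency_a_shares = share(1200)
agency_b_shares = share(345)

# Each computing party adds the two shares it holds, never seeing the inputs.
summed_shares = [
    (a + b) % MODULUS for a, b in zip(agency_a_shares, agency_b_shares)
]

# Only the combined total is revealed when the summed shares are recombined.
total = reconstruct(summed_shares)
print(total)
```

Only sums are supported in this sketch; joining or filtering records across datasets, as a real demonstration project would likely need, requires more elaborate protocols.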

Our data, our society, our health: a vision for inclusive and transparent health data science in the UK and Beyond


Paper by Elizabeth Ford et al in Learning Health Systems: “The last six years have seen sustained investment in health data science in the UK and beyond, which should result in a data science community that is inclusive of all stakeholders, working together to use data to benefit society through the improvement of public health and wellbeing.

However, opportunities made possible through the innovative use of data are still not being fully realised, resulting in research inefficiencies and avoidable health harms. In this paper we identify the most important barriers to achieving higher productivity in health data science. We then draw on previous research, domain expertise, and theory, to outline how to go about overcoming these barriers, applying our core values of inclusivity and transparency.

We believe a step-change can be achieved through meaningful stakeholder involvement at every stage of research planning, design and execution; team-based data science; as well as harnessing novel and secure data technologies. Applying these values to health data science will safeguard a social license for health data research, and ensure transparent and secure data usage for public benefit….(More)”.

Big Data and Dahl’s Challenge of Democratic Governance


Alex Ingrams in the Review of Policy Research: “Big data applications have been acclaimed as potentially transformative for the public sector. But, despite this acclaim, most theory of big data is narrowly focused around technocratic goals. The conceptual frameworks that situate big data within democratic governance systems recognizing the role of citizens are still missing. This paper explores the democratic governance impacts of big data in three policy areas using Robert Dahl’s dimensions of control and autonomy. Key impacts and potential tensions are highlighted. There is evidence of impacts on both dimensions, but the dimensions conflict as well as align in notable ways and focused policy efforts will be needed to find a balance….(More)”.

The Future of Civic Engagement


Report by Hollie Russon Gilman: “The 2018 mid-term voter turnout was the highest in 50 years. While vital, voting can’t sustain civic engagement in the long term. So, how do we channel near-term activism into long-term civic engagement? In her essay, Gilman paints a picture of how new institutional structures, enabled by new technologies, could lead to a new “civic layer” in society that results in “a more responsive, participatory, collaborative, and adaptive future for civic engagement in governance decision making.”

Creating a New “Civic Layer.” The longer-term future presents an opportunity to set up institutionalized structures for engagement across local, state, and federal levels of government—creating a “civic layer.” Its precise form will evolve, but the basic concept is to establish a centralized interface within a community to engage residents in governance decision making that interweaves digital and in-person engagement. People will earn “civic points” for engagement across a variety of activities—including every time they sign a petition, report a pothole, or volunteer in their local community.

While creating a civic layer will require new institutional approaches, emerging technologies such as the Internet of Things (IoT), artificial intelligence (AI), and distributed ledger (e.g., blockchain) will also play a critical enabling role. These technologies will allow new institutional models to expand the concept of citizen coproduction of services in building a more responsive, connected, and engaged citizenry.

The following examples show different collaborative governance and technology components that will comprise the civic layer. Each could be expanded and become interwoven into the fabric of civic life.

Use Collaborative Policymaking Models to Build a Civic Layer. While we currently think of elections as a primary mode of citizen engagement with government, in the medium- to long-range future we could see collaborative policy models that become the de facto way people engage to supplement elections. Several of these engagement models are on the local level. However, with the formation of a civic layer these forms of engagement could become integrated into a federated structure enabling more scale, scope, and impact. Following are two promising models.

  • Participatory Budgeting can be broadly defined as the participation of citizens in the decision-making process of how to allocate their community’s budget among different priorities and in the monitoring of public spending. The process first came to the United States in 2009 through the work of the nonprofit Participatory Budgeting Project. Unlike traditional budget consultations held by some governments—which often amount to “selective listening” exercises—with participatory budgeting, citizens have an actual say in how a portion of a government’s investment budget is spent, with more money often allocated to poorer communities. Experts estimate that up to 2,500 local governments around the world have implemented participatory budgeting,
  • Citizens’ Jury is another promising collaborative policymaking engagement model, pioneered in the 1980s and currently advocated by the nonprofit Jefferson Center in Minnesota. Three counties in rural Minnesota use this method as a foundation for Rural Climate Dialogues—regular gatherings where local residents hear from rural experts, work directly with their neighbors to design actionable community and policy recommendations, and share their feedback with public officials at a statewide meeting of rural Minnesota citizens, state agency representatives, and nonprofit organizations….(More)”.

The Datafication of Employment


Report by Sam Adler-Bell and Michelle Miller at the Century Foundation: “We live in a surveillance society. Our every preference, inquiry, whim, desire, relationship, and fear can be seen, recorded, and monetized by thousands of prying corporate eyes. Researchers and policymakers are only just beginning to map the contours of this new economy—and reckon with its implications for equity, democracy, freedom, power, and autonomy.

For consumers, the digital age presents a devil’s bargain: in exchange for basically unfettered access to our personal data, massive corporations like Amazon, Google, and Facebook give us unprecedented connectivity, convenience, personalization, and innovation. Scholars have exposed the dangers and illusions of this bargain: the corrosion of personal liberty, the accumulation of monopoly power, the threat of digital redlining, predatory ad-targeting, and the reification of class and racial stratification. But less well understood is the way data—its collection, aggregation, and use—is changing the balance of power in the workplace.

This report offers some preliminary research and observations on what we call the “datafication of employment.” Our thesis is that data-mining techniques innovated in the consumer realm have moved into the workplace. Firms who’ve made a fortune selling and speculating on data acquired from consumers in the digital economy are now increasingly doing the same with data generated by workers. Not only does this corporate surveillance enable a pernicious form of rent-seeking—in which companies generate huge profits by packaging and selling worker data in a marketplace hidden from workers’ eyes—but also, it opens the door to an extreme informational asymmetry in the workplace that threatens to give employers nearly total control over every aspect of employment.

The report begins with an explanation of how a regime of ubiquitous consumer surveillance came about, and how it morphed into worker surveillance and the datafication of employment. The report then offers principles for action for policymakers and advocates seeking to respond to the harmful effects of this new surveillance economy. The final section concludes with a look forward at where the surveillance economy is going, and how researchers, labor organizers, and privacy advocates should prepare for this changing landscape….(More)”