To Secure Knowledge: Social Science Partnerships for the Common Good


Social Science Research Council: “For decades, the social sciences have generated knowledge vital to guiding public policy, informing business, and understanding and improving the human condition. But today, the social sciences face serious threats. From dwindling federal funding to public mistrust in institutions to widespread skepticism about data, the infrastructure supporting the social sciences is shifting in ways that threaten to undercut research and knowledge production.

How can we secure social knowledge for future generations?

This question has guided the Social Science Research Council’s Task Force. Following eighteen months of consultation with key players as well as internal deliberation, we have identified long-term developments and present threats that have created challenges for the social sciences but have also opened unique opportunities. And we have generated recommendations to address these issues.

Our core finding focuses on the urgent need for new partnerships and collaborations among several key players: the federal government, academic institutions, donor organizations, and the private sector. Several decades ago, these institutions had clear zones of responsibility in producing social knowledge, with the federal government providing the largest portion of funding for basic research. Today, private companies represent an increasingly large share not just of research and funding, but also of the production of data that informs the social sciences, from smartphone usage to social media patterns.

In addition, today’s social scientists face unprecedented demands for accountability, speedy publication, and generation of novel results. These pressures have emerged from the fragmented institutional foundation that undergirds research. That foundation needs a redesign in order for the social sciences to continue helping our communities address problems ranging from income inequality to education reform.

To build a better future, we identify five areas of action: Funding, Data, Ethics, Research Quality, and Research Training. In each area, our recommendations range from enlarging corporate-academic pilot programs to improving social science training in digital literacy.

A consistent theme is that none of the measures, if taken unilaterally, can generate optimal outcomes. Instead, we have issued a call to forge a new research compact to harness the potential of the social sciences for improving human lives. That compact depends on partnerships, and we urge the key players in the construction of social science knowledge—including universities, government, foundations, and corporations—to act swiftly. With the right realignments, the security of social knowledge lies within our reach….(More)”

Ethics and Data Science


(Open) Ebook by Mike Loukides, Hilary Mason, and DJ Patil: “As the impact of data science on society continues to grow, there is an increased need to discuss how data is appropriately used and how to address misuse. Yet ethical principles for working with data have been available for decades. The real issue today is how to put those principles into action. With this report, authors Mike Loukides, Hilary Mason, and DJ Patil examine practical ways for making ethical data standards part of your work every day.

To help you consider all of the possible ramifications of your work on data projects, this report includes:

  • A sample checklist that you can adapt for your own procedures
  • Five framing guidelines (the Five C’s) for building data products: consent, clarity, consistency, control, and consequences
  • Suggestions for building ethics into your data-driven culture

Now is the time to invest in a deliberate practice of data ethics, for better products, better teams, and better outcomes….(More)”.

The Promise and Peril of the Digital Knowledge Loop


Excerpt from Albert Wenger’s draft book World After Capital: “The zero marginal cost and universality of digital technologies are already impacting the three phases of learning, creating, and sharing, giving rise to a Digital Knowledge Loop. This Digital Knowledge Loop holds both amazing promise and great peril, as can be seen in the example of YouTube.

YouTube has experienced astounding growth since its release in beta form in 2005. People around the world now upload over 100 hours of video content to YouTube every minute. It is difficult to grasp just how much content that is. If you were to spend 100 years watching YouTube twenty-four hours a day, you still wouldn’t be able to watch all the video that people upload in the course of a single week. YouTube contains amazing educational content on topics as diverse as gardening and theoretical math. Many of those videos show the promise of the Digital Knowledge Loop. Consider, for example, Destin Sandlin, the creator of the Smarter Every Day series of videos. Destin is interested in all things science. When he learns something new, such as the make-up of butterfly wings, he creates an engaging video that shares it with the world. But the peril of the Digital Knowledge Loop is right there as well: YouTube is also full of videos that peddle conspiracies, spread misinformation, and even incite outright hate.
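A quick back-of-the-envelope check of that claim bears it out; in the sketch below, the 100-hours-per-minute figure is the excerpt’s own, and everything else is plain arithmetic:

```python
# Back-of-the-envelope check of the YouTube claim above,
# using the excerpt's figure of 100 hours uploaded per minute.
UPLOAD_HOURS_PER_MINUTE = 100

minutes_per_week = 60 * 24 * 7                                  # 10,080 minutes
uploaded_per_week = UPLOAD_HOURS_PER_MINUTE * minutes_per_week  # 1,008,000 hours

watchable_in_century = 100 * 365 * 24                           # 876,000 hours

print(f"Uploaded in one week:   {uploaded_per_week:,} hours")
print(f"Watchable in 100 years: {watchable_in_century:,} hours")
assert watchable_in_century < uploaded_per_week  # a century is not enough
```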

Both the promise and the peril are made possible by the same characteristics of YouTube: All of the videos are available for free to anyone in the world (except in those countries where YouTube is blocked). They are also available 24×7. And they become available globally the second someone publishes a new one. Anybody can publish a video. All you need to access these videos is an Internet connection and a smartphone—you don’t even need a laptop or other traditional computer. That means that already today two to three billion people, almost half of the world’s population, have access to YouTube and can participate in the Digital Knowledge Loop, for good and for bad.

These characteristics, which draw on the underlying capabilities of digital technology, are also found in other systems that similarly show the promise and peril of the Digital Knowledge Loop.

Wikipedia, the collectively produced online encyclopedia, is another great example. Here is how it works at its most promising: Someone reads an entry and learns the method used by Archimedes to approximate the number pi. They then go off and create an animation that illustrates this method. Finally, they share the animation by publishing it back to Wikipedia, thus making it easier for more people to learn. Wikipedia entries result from a large collaboration and ongoing revision process, with only a single entry per topic visible at any given time (although you can examine both the history of the page and the conversations about it). What makes this possible is a piece of software known as a wiki that keeps track of all the historical edits [58]. When that process works well, it raises the quality of entries over time. But when there is a coordinated effort at manipulation, or insufficient editing resources, Wikipedia too can spread misinformation instantly and globally.
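The revision-tracking idea at the heart of a wiki can be sketched in a few lines. This is a toy illustration with invented class and method names, not how MediaWiki is actually implemented; real engines also record authors, timestamps, diffs, and talk pages:

```python
# Toy sketch of wiki-style revision tracking (invented names).
class WikiPage:
    def __init__(self, title, text):
        self.title = title
        self.history = [text]          # every revision kept, oldest first

    @property
    def current(self):                 # the single visible version
        return self.history[-1]

    def edit(self, new_text):
        self.history.append(new_text)  # append, never overwrite

    def revert(self, revision):
        # Reverting is itself a new revision, so nothing is ever lost.
        self.history.append(self.history[revision])

page = WikiPage("Pi", "Pi is approximately 3.14159.")
page.edit("Pi is exactly 3!")          # a bad-faith edit
page.revert(0)                         # restore the earlier, correct text
print(page.current)                    # "Pi is approximately 3.14159."
print(len(page.history))               # 3 revisions retained
```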

Wikipedia illustrates another important aspect of the Digital Knowledge Loop: it allows individuals to participate in extremely small ways. If you wish, you can contribute to Wikipedia by fixing a single typo. In fact, the minimal contribution unit is just one letter! I have not yet contributed anything of length to Wikipedia, but I have fixed probably a dozen or so typos. That doesn’t sound like much, but if you get ten thousand people to fix a typo every day, that’s 3.65 million typos a year. Assuming a single person takes two minutes on average to discover and fix a typo, it would take nearly fifty people working full time for a year (2,500 hours each) to fix 3.65 million typos.
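Spelling that arithmetic out as a short sketch (every figure below is the excerpt’s own assumption):

```python
# Checking the typo arithmetic above (all figures from the excerpt).
typos_per_day = 10_000
typos_per_year = typos_per_day * 365                   # 3,650,000 typos
minutes_per_fix = 2
total_hours = typos_per_year * minutes_per_fix / 60    # ~121,667 hours
full_time_year = 2_500                                 # hours per person-year
print(total_hours / full_time_year)                    # ~48.7, "nearly fifty"
```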

Small contributions by many people that add up are possible only in the Digital Knowledge Loop. The Wikipedia spelling-correction example shows the power of such contributions. Their peril can be seen in systems such as Twitter and Facebook, where the smallest contributions are Likes and Retweets or Reposts to one’s friends or followers. While these tiny actions can amplify high-quality content, they can just as easily spread mistakes, rumors, and propaganda. The impact of these information cascades ranges from viral jokes to swaying the outcomes of elections, and has even led to major outbreaks of violence.

Some platforms even make it possible for people to passively contribute to the Digital Knowledge Loop. The app Waze is a good example. …The promise of the Digital Knowledge Loop is broad access to a rapidly improving body of knowledge. The peril is a fragmented post-truth society constantly in conflict. Both of these possibilities are enabled by the same fundamental characteristics of digital technologies. And once again we see clearly that technology by itself does not determine the future…(More).

Citizen Innovations


Introduction by Jean-Claude Ruano-Borbalan and Bertrand Bocquet to a Special Issue of Technology and Innovation (French): “The last half century has seen considerable development of the institutional interfaces participating in the “great standardization” of science and innovation systems. The limitations of this model have become apparent for many economic, political, and cultural reasons. Strong developments are appearing within the context of a deliberative democracy that impacts scientific and technical institutions and production, and therefore the nature and policies of innovation. The open question about this emerging “technical democracy” is whether it will become a long-term movement. We dedicate this issue to citizen participatory innovations, more or less related to technical and scientific questions. It highlights the various scales and focal points of “social and citizen innovation,” domains illustrated through examples of ongoing transformations…. (More)

Reflecting the Past, Shaping the Future: Making AI Work for International Development


USAID Report: “We are in the midst of an unprecedented surge of interest in machine learning (ML) and artificial intelligence (AI) technologies. These tools, which allow computers to make data-derived predictions and automate decisions, have become part of daily life for billions of people. Ubiquitous digital services such as interactive maps, tailored advertisements, and voice-activated personal assistants are likely only the beginning. Some AI advocates even claim that AI’s impact will be as profound as that of “electricity or fire” and that it will revolutionize nearly every field of human activity. This enthusiasm has reached international development as well. Emerging ML/AI applications promise to reshape healthcare, agriculture, and democracy in the developing world. ML and AI show tremendous potential for helping to achieve sustainable development objectives globally. They can improve efficiency by automating labor-intensive tasks, or offer new insights by finding patterns in large, complex datasets. A recent report suggests that AI advances could double economic growth rates and increase labor productivity by 40% by 2035. At the same time, the very nature of these tools — their ability to codify and reproduce patterns they detect — introduces significant concerns alongside promise.

In developed countries, ML tools have sometimes been found to automate racial profiling, to foster surveillance, and to perpetuate racial stereotypes. Algorithms may be used, either intentionally or unintentionally, in ways that result in disparate or unfair outcomes between minority and majority populations. Complex models can make it difficult to establish accountability or seek redress when models make mistakes. These shortcomings are not restricted to developed countries. They can manifest in any setting, especially in places with histories of ethnic conflict or inequality. As the development community adopts tools enabled by ML and AI, we need a clear-eyed understanding of how to ensure that their application is effective, inclusive, and fair. This requires knowing when ML and AI offer a suitable solution to the challenge at hand. It also requires appreciating that these technologies can do harm — and committing to addressing and mitigating these harms.

ML and AI applications may sometimes seem like science fiction, and the technical intricacies of ML and AI can be off-putting for those who haven’t been formally trained in the field. However, there is a critical role for development actors to play as we begin to lean on these tools more and more in our work. Even without technical training in ML, development professionals have the ability — and the responsibility — to meaningfully influence how these technologies impact people.

You don’t need to be an ML or AI expert to shape the development and use of these tools. All of us can learn to ask the hard questions that will keep solutions working for, and not against, the development challenges we care about. Development practitioners already have deep expertise in their respective sectors or regions. They bring necessary experience in engaging local stakeholders, working with complex social systems, and identifying structural inequities that undermine inclusive progress. Unless this expert perspective informs the construction and adoption of ML/AI technologies, ML and AI will fail to reach their transformative potential in development.

This document aims to inform and empower those who may have limited technical experience as they navigate an emerging ML/AI landscape in developing countries. Donors, implementers, and other development partners should expect to come away with a basic grasp of common ML techniques and the problems ML is uniquely well-suited to solve. We will also explore some of the ways in which ML/AI may fail or be ill-suited for deployment in developing-country contexts. Awareness of these risks, and acknowledgement of our role in perpetuating or minimizing them, will help us work together to protect against harmful outcomes and ensure that AI and ML are contributing to a fair, equitable, and empowering future…(More)”.

European science funders ban grantees from publishing in paywalled journals


Martin Enserink at Science: “Frustrated with the slow transition toward open access (OA) in scientific publishing, 11 national funding organizations in Europe turned up the pressure today. As of 2020, the group, which jointly spends about €7.6 billion on research annually, will require every paper it funds to be freely available from the moment of publication. In a statement, the group said it will no longer allow the 6- or 12-month delays that many subscription journals now require before a paper is made OA, and it won’t allow publication in so-called hybrid journals, which charge subscriptions but also make individual papers OA for an extra fee.

The move means grantees from these 11 funders—which include the national funding agencies in the United Kingdom, the Netherlands, and France as well as Italy’s National Institute for Nuclear Physics—will have to forgo publishing in thousands of journals, including high-profile ones such as Nature, Science, Cell, and The Lancet, unless those journals change their business model. “We think this could create a tipping point,” says Marc Schiltz, president of Science Europe, the Brussels-based association of science organizations that helped coordinate the plan. “Really the idea was to make a big, decisive step—not to come up with another statement or an expression of intent.”

The announcement delighted many OA advocates. “This will put increased pressure on publishers and on the consciousness of individual researchers that an ecosystem change is possible,” says Ralf Schimmer, head of Scientific Information Provision at the Max Planck Digital Library in Munich, Germany. Peter Suber, director of the Harvard Library Office for Scholarly Communication, calls the plan “admirably strong.” Many other funders support OA, but only the Bill & Melinda Gates Foundation applies similarly stringent requirements for “immediate OA,” Suber says. The European Commission and the European Research Council support the plan; although they haven’t adopted similar requirements for the research they fund, a statement by EU Commissioner for Research, Science and Innovation Carlos Moedas suggests they may do so in the future and urges the European Parliament and the European Council to endorse the approach….(More)”.

Keeping Democracy Alive in Cities


Myung J. Lee at the Stanford Social Innovation Review: “It seems everywhere I go these days, people are talking and writing and podcasting about America’s lack of trust—how people don’t trust government and don’t trust each other. President Trump discourages us from trusting anything, especially the media. Even nonprofit organizations, which comprise the heart of civil society, are not exempt: A recent study found that trust in NGOs dropped by nine percent between 2017 and 2018. This fundamental lack of trust is eroding the shared public space where progress and even governance can happen, putting democracy at risk.

How did we get here? Perhaps it’s because Americans have taken our democratic way of life for granted. Perhaps it’s because people’s individual and collective beliefs are more polarized—and more out in the open—than ever before. Perhaps we’ve stopped believing we can solve problems together.

There are, however, opportunities to rebuild and fortify our sense of trust. This is especially true at the local level, where citizens can engage directly with elected leaders, nonprofit organizations, and each other.

As French political scientist Alexis de Tocqueville observed in Democracy in America, “Municipal institutions constitute the strength of free nations. Town meetings are to liberty what primary schools are to science; they bring it within the people’s reach; they teach men how to use and how to enjoy it.” Through town halls and other means, cities are where citizens, elected leaders, and nonprofit organizations can most easily connect and work together to improve their communities.

Research shows that, while trust in government is low everywhere, it is highest in local government. This is likely because people can see that their votes influence issues they care about, and they can directly interact with their mayors and city council members. Unlike with members of Congress, citizens can form real relationships with local leaders through events like “walks with the mayor” and neighborhood cleanups. Some mayors do even more to connect with their constituents. In Detroit, for example, Mayor Michael Duggan meets with residents in their homes to help them solve problems and answer questions in person. Many mayors also join in neighborhood projects. San Jose Mayor Sam Liccardo, for example, participates in a different community cleanup almost every week. Engaged citizens who participate in these activities are more likely to feel that their participation in democratic society is valuable and effective.

The role of nonprofit and community-based organizations, then, is partly to sustain democracy by being the bridge between city governments and citizens, helping them work together to solve concrete problems. It’s hard and important work. Time and again, this kind of relationship- and trust-building through action creates ripple effects that grow over time.

In my work with Cities of Service, which helps mayors and other city leaders effectively engage their citizens to solve problems, I’ve learned that local government works better when it is open to the ideas and talents of citizens. Citizen collaboration can take many forms, including defining and prioritizing problems, generating solutions, and volunteering time, creativity, and expertise to set positive change in motion. Citizens can leverage their own deep expertise about what’s best for their families and communities to deliver better services and solve public problems….(More)”.

Following Fenno: Learning from Senate Candidates in the Age of Social Media and Party Polarization


David C.W. Parker at The Forum: “Nearly 40 years ago, Richard Fenno published Home Style, a seminal volume explaining how members of Congress think about and engage in the process of representation. To accomplish his task, he observed members of Congress as they crafted and communicated their representational styles to the folks back home in their districts. The book, and Fenno’s ensuing research agenda, served as a clarion call to move beyond sophisticated quantitative analyses of roll call voting and elite interviews in Washington, D.C. to comprehend congressional representation. Instead, Fenno argued, political scientists are better served by going home with members of Congress where “their perceptions of their constituencies are shaped, sharpened, or altered” (Fenno 1978, p. xiii). These perceptions of constituencies fundamentally shape what members of Congress do at home and in Washington. If members of Congress are single-minded seekers of reelection, as we often assume, then political scientists must begin with the constituent relationship essential to winning reelection. Go home, Fenno says, to understand Congress.

There are many ways constituency relationships can be understood and uncovered; the preferred method for Fenno is participant observation, which he variously terms “soaking and poking” or “just hanging around.” Although it sounds easy enough to sit and watch, good participant observation requires many considerations (as Fenno details in a thorough appendix to Home Style). In this appendix, and in another series of essays, Fenno grapples forthrightly with the tough choices researchers must consider when watching and learning from politicians.

In this essay, I respond to Fenno’s thought-provoking methodological treatise in Home Style and the ensuing collection of musings he published as Watching Politicians: Essays on Participant Observation. I do so for three reasons: First, I wish to reinforce Fenno’s call to action. As the study of political science has matured, it has moved away from engaging with politicians in the field across the various sub-fields, favoring statistical analyses. “Everyone cites Fenno, but no one does Fenno,” I recently opined, echoing another scholar commenting on Fenno’s work (Fenno 2013, p. 2; Parker 2015, p. 246). Unfortunately, that sentiment is supported by data (Grimmer 2013, pp. 13–19; Curry 2017). Although quantitative and formal analyses have led to important insights into the study of political behavior and institutions, politics is as important to our discipline as science. And in politics, the motives and concerns of people are important to witness, not just because they add complexity and richness to our stories, but because they aid in theory generation.¹ Fenno’s study was exploratory, but is full of key theoretical insights relevant to explaining how members of Congress understand their constituencies and the ensuing political choices they make.

Second, to “do” participant observation requires understanding the choices the methodology imposes. This necessitates that those who practice this method of discovery document and share their experiences (Lin 2000). The more the prospective participant observer can understand the size of the choice set she faces and the potential consequences at each decision point in advance, the better her odds of avoiding unanticipated consequences with both immediate and long-term research ramifications. I hope that adding my cumulative experiences to this ongoing methodological conversation will assist in minimizing both unexpected and undesirable consequences for those who follow into the field. Fenno is open about his own choices, and the difficult decisions he faced as a participant observer. Encouraging scholars to engage in participant observation is only half the battle. The other half is to encourage interested scholars to think about those same choices and methodological considerations, while acknowledging that context precludes a one-size fits all approach. Fenno’s choices may not be your choices – and that might be just fine depending upon your circumstances. Fenno would wholeheartedly agree.

Finally, Congress and American politics have changed considerably from when Fenno embarked on his research in Home Style. At the end of his introduction, Fenno writes that “this book is about the early to mid-1970s only. These years were characterized by the steady decline of strong national party attachments and strong local party organizations. … Had these conditions been different, House members might have behaved differently in their constituencies” (xv). Developments since Fenno put down his pen include political parties polarizing to an almost unprecedented degree, partisan attachments strengthening among voters, and technology emerging to change fundamentally how politicians engage with constituents. In light of this evolution of political culture in Washington and at home, it is worth considering the consequences for the participant-observation research approach. Many have asked me if it is still possible to do such work in the current political environment, and if so, what are the challenges facing political scientists going into the field? This essay provides some answers.

I proceed as follows: First, I briefly discuss my own foray into the world of participant observation, which occurred during the 2012 Senate race in Montana. Second, I consider two important methodological considerations raised by Fenno: access and participation as an observer. Third, I relate these two issues to a final consideration: the development of social media and the consequences of this for the participant observation enterprise. Finally, I show the perils of social science divorced from context, as demonstrated by the recent Stanford-Dartmouth mailer scandal. I conclude not just with a plea for us to pick up where Fenno left off, but also by suggesting that more thinking like a participant observer would benefit the discipline as a whole by reminding us of our ethical obligations as researchers to each other, and to the political community that we study…(More)”.

Protecting the Confidentiality of America’s Statistics: Adopting Modern Disclosure Avoidance Methods at the Census Bureau


John Abowd at US Census: “…Throughout our history, we have been leaders in statistical data protection, which we call disclosure avoidance. Other statistical agencies use the terms “disclosure limitation” and “disclosure control.” These terms are all synonymous. Disclosure avoidance methods have evolved since the censuses of the early 1800s, when the only protection used was simply removing names. Executive orders and a series of laws modified the legal basis for these protections, which were finally codified in the 1954 Census Act (13 U.S.C. Sections 8(b) and 9). We have continually added better and stronger protections to keep the data we publish anonymous and the underlying records confidential.

However, historical methods cannot completely defend against the threats posed by today’s technology. Growth in computing power, advances in mathematics, and easy access to large, public databases pose a significant threat to confidentiality. These forces have made it possible for sophisticated users to ferret out common data points between databases using only our published statistics. If left unchecked, those users might be able to stitch together these common threads to identify the people or businesses behind the statistics as was done in the case of the Netflix Challenge.
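The kind of re-identification described here can be sketched in a few lines of Python: a published table stripped of names but keeping quasi-identifiers is joined against an outside database that still carries names. All data and column names below are invented for illustration; the actual Netflix attack used a more sophisticated matching of sparse movie ratings:

```python
import pandas as pd

# Toy illustration of a linkage attack (all data invented).
# A "de-identified" release keeps quasi-identifiers but drops names.
published = pd.DataFrame({
    "zip":        ["59715", "59715", "20001"],
    "birth_year": [1980, 1955, 1980],
    "sex":        ["F", "M", "F"],
    "diagnosis":  ["flu", "diabetes", "asthma"],
})

# An outside public database (say, a voter roll) carries the same
# quasi-identifiers alongside names.
voter_roll = pd.DataFrame({
    "name":       ["Alice Smith", "Bob Jones"],
    "zip":        ["59715", "59715"],
    "birth_year": [1980, 1955],
    "sex":        ["F", "M"],
})

# Joining on the shared columns re-attaches names to "anonymous" rows.
reidentified = published.merge(voter_roll, on=["zip", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```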

The Census Bureau has been addressing these issues from every feasible angle and changing rapidly with the times to ensure that we protect the data our census and survey respondents provide us. We are doing this by moving to a new, advanced, and far more powerful confidentiality protection system, which uses a rigorous mathematical process that protects respondents’ information and identity in all of our publications.

The new tool is based on the concept known in scientific and academic circles as “differential privacy.” It is also called “formal privacy” because it provides provable mathematical guarantees, similar to those found in modern cryptography, about the confidentiality protections that can be independently verified without compromising the underlying protections.

“Differential privacy” is based on the cryptographic principle that an attacker should not be able to learn any more about you from the statistics we publish using your data than from statistics that did not use your data. After tabulating the data, we apply carefully constructed algorithms to modify the statistics in a way that protects individuals while continuing to yield accurate results. We assume that everyone’s data are vulnerable and provide the same strong, state-of-the-art protection to every record in our database.
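For a concrete sense of how such an algorithm works, here is a minimal sketch of the Laplace mechanism, the textbook differentially private way to release a count. The function and figures are illustrative only, not the Census Bureau’s production code:

```python
import numpy as np

def laplace_count(data, epsilon):
    """Release a count with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices.
    """
    true_count = sum(data)
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Toy example: how many respondents on a block share some attribute.
respondents = [True] * 42 + [False] * 58
print(laplace_count(respondents, epsilon=0.5))  # close to 42, but noisy
```

The guarantee is exactly the principle stated above: with or without any single respondent’s record, the distribution of published counts is nearly the same, so the output reveals almost nothing about any individual.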

The Census Bureau did not invent the science behind differential privacy. However, we were the first organization anywhere to use it when we incorporated differential privacy into the OnTheMap application in 2008, where it protects block-level residential population data. Recently, Google, Apple, Microsoft, and Uber have all followed the Census Bureau’s lead, adopting differentially private systems as the standard for protecting user data confidentiality inside their browsers (Chrome), products (iPhones), operating systems (Windows 10), and apps (Uber)….(More)”.

Origin Privacy: Protecting Privacy in the Big-Data Era


Paper by Helen Nissenbaum, Sebastian Benthall, Anupam Datta, Michael Carl Tschantz, and Piotr Mardziel: “Machine learning over big data poses challenges for our conceptualization of privacy. Such techniques can discover surprising and counterintuitive associations that take innocent-looking data and turn it into important inferences about a person. For example, buying carbon monoxide monitors has been linked to paying credit card bills, while buying chrome-skull car accessories predicts not doing so. Also, Target may have used the buying of scent-free hand lotion and vitamins as a sign that the buyer is pregnant. If we take pregnancy status to be private and assume that we should prohibit the sharing of information that can reveal that fact, then we have created an unworkable notion of privacy, one in which sharing any scrap of data may violate privacy.

Prior technical specifications of privacy depend on the classification of certain types of information as private or sensitive; privacy policies in these frameworks limit access to data that allow inference of this sensitive information. As the above examples show, today’s data-rich world creates a new kind of problem: it is difficult if not impossible to guarantee that information does not allow inference of sensitive topics. This makes information flow rules based on information topic unstable.

We address the problem of providing a workable definition of private data that takes into account emerging threats to privacy from large-scale data collection systems. We build on Contextual Integrity and its claim that privacy is appropriate information flow, or flow according to socially or legally specified rules.

As in other adaptations of Contextual Integrity (CI) to computer science, the parameterization of social norms in CI is translated into a logical specification. In this work, we depart from CI by considering rules that restrict information flow based on its origin and provenance, instead of on its type, topic, or subject.
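The contrast might be sketched as a toy policy that keys on where a datum came from rather than what it is about. All names and rules below are invented for illustration and are not drawn from the paper:

```python
from dataclasses import dataclass

# Toy sketch of an origin-based flow rule (all names invented).
# Each datum carries its provenance; the policy mentions origins
# and destinations, never topics like "pregnancy."
@dataclass(frozen=True)
class Datum:
    value: str
    origin: str  # e.g., "purchase_history", "user_survey"

FORBIDDEN_FLOWS = {("purchase_history", "advertiser")}

def may_flow(datum, destination):
    return (datum.origin, destination) not in FORBIDDEN_FLOWS

lotion = Datum("bought scent-free lotion", origin="purchase_history")
print(may_flow(lotion, "advertiser"))  # False: blocked by origin alone
print(may_flow(lotion, "researcher"))  # True: this flow is permitted
```

Because such a rule never inspects a datum’s topic, it stays stable even when new inferences (such as pregnancy status) are later discovered in old data.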

We call this concept of privacy as adherence to origin-based rules Origin Privacy. Origin Privacy rules can be found in some existing data protection laws. This motivates the computational implementation of origin-based rules for the simple purpose of compliance engineering. We also formally model origin privacy to determine what security properties it guarantees relative to the concerns that motivate it….(More)”.