Blockchain: Unpacking the disruptive potential of blockchain technology for human development.


IDRC white paper: “In the scramble to harness new technologies to propel innovation around the world, artificial intelligence, robotics, machine learning, and blockchain technologies are being explored and deployed in a wide variety of contexts globally.

Although blockchain is one of the most hyped of these new technologies, it is also perhaps the least understood. Blockchain is the distributed ledger — a database that is shared across multiple sites or institutions to furnish a secure and transparent record of events occurring during the provision of a service or contract — that supports cryptocurrencies (digital assets designed to work as mediums of exchange).
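The ledger mechanics described above can be sketched in a few lines of Python: each block stores the hash of its predecessor, so altering any past record invalidates every later hash. This is an illustrative toy (the field names and `verify` helper are invented for the example), not how any production blockchain is implemented:

```python
import hashlib
import json

def block_hash(prev_hash, payload):
    """Hash the previous block's hash together with this block's payload."""
    record = json.dumps({"prev": prev_hash, "data": payload}, sort_keys=True)
    return hashlib.sha256(record.encode()).hexdigest()

def append_block(chain, payload):
    """Link a new record to the chain via the last block's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"prev": prev_hash, "data": payload,
                  "hash": block_hash(prev_hash, payload)})

def verify(chain):
    """Recompute every hash; any altered record breaks the chain."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash or block["hash"] != block_hash(prev_hash, block["data"]):
            return False
        prev_hash = block["hash"]
    return True

ledger = []
append_block(ledger, {"event": "parcel 17 registered to A"})
append_block(ledger, {"event": "parcel 17 transferred to B"})
print(verify(ledger))  # True: chain is intact

ledger[0]["data"]["event"] = "parcel 17 registered to C"  # tamper with history
print(verify(ledger))  # False: the recomputed hash no longer matches
```

In a real deployment the ledger is replicated across many institutions, so tampering would also require overpowering the network's consensus, not just rewriting one copy.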

Blockchain now underpins applications such as land registries and identity services, but as its popularity grows, it is critical to unpack its relevance in addressing socio-economic gaps and supporting development targets like the globally recognized UN Sustainable Development Goals. Moreover, for countries in the global South that want to be more than just end users or consumers, the complex infrastructure requirements and operating costs of blockchain could prove challenging. For the purposes of real development, we need to understand not only whether blockchain is workable, but also who is able to harness it to foster social inclusion and promote democratic governance.

This white paper explores the potential of blockchain technology to support human development. It provides a non-technical overview, illustrates a range of applications, and offers a series of conclusions and recommendations for additional research and potential development programming….(More)”.

Stewardship in the “Age of Algorithms”


Clifford Lynch at First Monday: “This paper explores pragmatic approaches that might be employed to document the behavior of large, complex socio-technical systems (often today shorthanded as “algorithms”) that centrally involve some mixture of personalization, opaque rules, and machine learning components. Thinking rooted in traditional archival methodology — focusing on the preservation of physical and digital objects, and perhaps the accompanying preservation of their environments to permit subsequent interpretation or performance of the objects — falls short here for many reasons, and we must address this problem.

The approaches presented here are clearly imperfect, unproven, labor-intensive, and sensitive to the often hidden factors that the target systems use for decision-making (including personalization of results, where relevant); but they are a place to begin, and their limitations are at least outlined.

Numerous research questions must be explored before we can fully understand the strengths and limitations of what is proposed here. But it represents a way forward. This is essentially the first paper I am aware of that tries to make effective progress on the stewardship challenges facing our society in the so-called “Age of Algorithms;” the paper concludes with some discussion of the failure to address these challenges to date, and the implications for the roles of archivists as opposed to other players in the broader enterprise of stewardship — that is, the capture of a record of the present and the transmission of this record, and the records bequeathed by the past, into the future. It may well be that we see the emergence of a new group of creators of documentation, perhaps predominantly social scientists and humanists, taking the front lines in dealing with the “Age of Algorithms,” with their materials then destined for our memory organizations to be cared for into the future…(More)”.

Solving Public Problems with Data


Dinorah Cantú-Pedraza and Sam DeJohn at The GovLab: “….To serve the goal of more data-driven and evidence-based governing, The GovLab at NYU Tandon School of Engineering this week launched “Solving Public Problems with Data,” a new online course developed with support from the Laura and John Arnold Foundation.

This online lecture series helps those working for the public sector, or simply in the public interest, learn to use data to improve decision-making. Through real-world examples and case studies — captured in 10 video lectures from leading experts in the field — the new course outlines the fundamental principles of data science and explores ways practitioners can develop a data analytical mindset. Lectures in the series include:

  1. Introduction to evidence-based decision-making  (Quentin Palfrey, formerly of MIT)
  2. Data analytical thinking and methods, Part I (Julia Lane, NYU)
  3. Machine learning (Gideon Mann, Bloomberg LP)
  4. Discovering and collecting data (Carter Hewgley, Johns Hopkins University)
  5. Platforms and where to store data (Arnaud Sahuguet, Cornell Tech)
  6. Data analytical thinking and methods, Part II (Daniel Goroff, Alfred P. Sloan Foundation)
  7. Barriers to building a data practice (Beth Blauer, Johns Hopkins University and GovEx)
  8. Data collaboratives (Stefaan G. Verhulst, The GovLab)
  9. Strengthening a data analytic culture (Amen Ra Mashariki, ESRI)
  10. Data governance and sharing (Beth Simone Noveck, NYU Tandon/The GovLab)

The goal of the lecture series is to enable participants to define and leverage the value of data to achieve improved outcomes and greater equity, reduced costs, and increased efficiency in how public policies and services are created. No prior experience with computer science or statistics is necessary or assumed. In fact, the course is designed precisely to serve public professionals seeking an introduction to data science….(More)”.

SAM, the first A.I. politician on Messenger


At Digital Trends: “It’s said that all politicians are the same, but it seems safe to assume that you’ve never seen a politician quite like this. Meet SAM, heralded as the politician of the future. Unfortunately, you can’t exactly shake this politician’s hand, or have her kiss your baby. Rather, SAM is the world’s first Virtual Politician (and a female presence at that), “driven by the desire to close the gap between what voters want and what politicians promise, and what they actually achieve.”

The artificially intelligent chatbot is currently live on Facebook Messenger, though she is probably most helpful to those in New Zealand. After all, the bot’s website notes, “SAM’s goal is to act as a representative for all New Zealanders, and evolves based on voter input.” Capable of being reached by anyone at just about any time from anywhere, this may just be the single most accessible politician we’ve ever seen. But more importantly, SAM purports to be a true representative, claiming to analyze “everyone’s views [and] opinions, and impact of potential decisions.” This, the bot notes, could make for better policy for everyone….(More)”.

Nearly All of Wikipedia Is Written By Just 1 Percent of Its Editors


Daniel Oberhaus at Motherboard: “…Sixteen years later, the free encyclopedia and fifth most popular website in the world is well on its way to this goal. Today, Wikipedia is home to 43 million articles in 285 languages, and all of these articles are written and edited by an autonomous group of international volunteers.

Although the non-profit Wikimedia Foundation diligently keeps track of how editors and users interact with the site, until recently it was unclear how content production on Wikipedia was distributed among editors. According to the results of a recent study that looked at the 250 million edits made on Wikipedia during its first ten years, only about 1 percent of Wikipedia’s editors have generated 77 percent of the site’s content.

“Wikipedia is both an organization and a social movement,” Sorin Matei, the director of the Purdue University Data Storytelling Network and lead author of the study, told me on the phone. “The assumption is that it’s a creation of the crowd, but this couldn’t be further from the truth. Wikipedia wouldn’t have been possible without a dedicated leadership.”

At the time of writing, there are roughly 132,000 registered editors who have been active on Wikipedia in the last month (there are also an unknown number of unregistered Wikipedians who contribute to the site). So statistically speaking, only about 1,300 people are creating over three-quarters of the 600 new articles posted to Wikipedia every day.
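The back-of-envelope arithmetic behind these figures is easy to check, using only the numbers quoted in the article itself:

```python
# Figures taken directly from the article's text.
active_editors = 132_000          # registered editors active in the past month
top_share = 0.01                  # the "1 percenters"
content_share = 0.77              # fraction of content they produced
new_articles_per_day = 600

top_editors = active_editors * top_share
articles_by_top = new_articles_per_day * content_share

print(round(top_editors))      # ~1,320 editors, the "about 1,300" in the text
print(round(articles_by_top))  # ~462 of the 600 daily articles, i.e. over three-quarters
```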

Of course, these “1 percenters” have changed over the last decade and a half. According to Matei, roughly 40 percent of the top 1 percent of editors bow out about every five weeks. In the early days, when there were only a few hundred thousand people collaborating on Wikipedia, Matei said the content production was significantly more equitable. But as the encyclopedia grew, and the number of collaborators grew with it, a cadre of die-hard editors emerged that have accounted for the bulk of Wikipedia’s growth ever since.

Matei and his colleague Brian Britt, an assistant professor of journalism at South Dakota State University, used a machine learning algorithm to crawl the quarter of a billion publicly available edit logs from Wikipedia’s first decade of existence. The results of this research, published in September as a book, suggest that for all of Wikipedia’s pretensions to being a site produced by a network of freely collaborating peers, “some peers are more equal than others,” according to Matei.

Matei and Britt argue that rather than being a decentralized, spontaneously evolving organization, Wikipedia is better described as an “adhocracy”—a stable hierarchical power structure which nevertheless allows for a high degree of individual mobility within that hierarchy….(More)”.

More Machine Learning About Congress’ Priorities


ProPublica: “We keep training machine learning models on Congress. Find out what this one learned about lawmakers’ top issues…

Speaker of the House Paul Ryan is a tax wonk ― and most observers of Congress know that. But knowing what interests the other 434 members of Congress is harder.

To make it easier to know what issues each lawmaker really focuses on, we’re launching a new feature in our Represent database called Policy Priorities. We had two goals in creating it: to help researchers and journalists understand what drives particular members of Congress, and to enable regular citizens to compare their representatives’ priorities to their own and their communities’.

We created Policy Priorities using some sophisticated computer algorithms (more on this in a second) to calculate interest based on what each congressperson talks ― and brags ― about in their press releases.

Voting and drafting legislation aren’t the only things members of Congress do with their time, but they’re often the main way we analyze congressional data, in part because they’re easily measured. But the job of a member of Congress goes well past voting. They go to committee meetings, discuss policy on the floor and in caucuses, raise funds and ― important for our purposes ― communicate with their constituents and journalists back home. They use press releases to talk about what they’ve accomplished and to demonstrate their commitment to their political ideals.

We’ve been gathering these press releases for a few years, and have a body of some 86,000 releases that we used for a kind of analysis called machine learning….(More)”.
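ProPublica doesn't spell out its algorithm in this excerpt, but the general idea — scoring a lawmaker's interests from the language of their press releases — can be sketched with a simple keyword tally. The issue lexicon and sample releases below are invented for illustration; the real Policy Priorities feature uses a more sophisticated model trained on the full 86,000-release corpus:

```python
from collections import Counter
import re

# Hypothetical issue lexicon: each policy area maps to a few telltale terms.
ISSUE_TERMS = {
    "taxes": {"tax", "taxes", "irs", "deduction"},
    "health": {"health", "medicare", "medicaid", "insurance"},
    "veterans": {"veteran", "veterans", "va"},
}

def issue_counts(press_releases):
    """Tally how often each issue's terms appear across a member's releases."""
    counts = Counter()
    for text in press_releases:
        words = re.findall(r"[a-z]+", text.lower())
        for issue, terms in ISSUE_TERMS.items():
            counts[issue] += sum(1 for w in words if w in terms)
    return counts

# Invented snippets standing in for one member's press releases.
releases = [
    "Today the Speaker unveiled a tax reform plan cutting the deduction cap.",
    "The new tax bill simplifies IRS filing for working families.",
    "Statement on expanding Medicare coverage for rural hospitals.",
]
print(issue_counts(releases).most_common(1))  # taxes ranks first for this member
```

A production system would replace the hand-built lexicon with learned topic weights, but the output has the same shape: a ranked list of issues per lawmaker.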

Leveraging the disruptive power of artificial intelligence for fairer opportunities


Makada Henry-Nickie at Brookings: “According to President Obama’s Council of Economic Advisers (CEA), approximately 3.1 million jobs will be rendered obsolete or permanently altered as a consequence of artificial intelligence technologies. Artificial intelligence (AI) will, for the foreseeable future, have a significant disruptive impact on jobs. That said, this disruption can create new opportunities if policymakers choose to harness them—including some with the potential to help address long-standing social inequities. Investing in quality training programs that deliver premium skills, such as computational analysis and cognitive thinking, provides a real opportunity to leverage AI’s disruptive power.

AI’s disruption presents a clear challenge: competition to traditional skilled workers arising from the cross-relevance of data scientists and code engineers, who can adapt quickly to new contexts. Data analytics has become an indispensable feature of successful companies across all industries….

Investing in high-quality education and training programs is one way that policymakers proactively attempt to address the workforce challenges presented by artificial intelligence. It is essential that we make affirmative, inclusive choices to ensure that marginalized communities participate equitably in these opportunities.

Policymakers should prioritize understanding the demographics of those most likely to lose jobs in the short-run. As opposed to obsessively assembling case studies, we need to proactively identify policy entrepreneurs who can conceive of training policies that equip workers with technical skills of “long-game” relevance. As IBM points out, “[d]ata democratization impacts every career path, so academia must strive to make data literacy an option, if not a requirement, for every student in any field of study.”

Machines are an equal opportunity displacer, blind to color and socioeconomic status. Effective policy responses require collaborative data collection and coordination among key stakeholders—policymakers, employers, and educational institutions—to identify at-risk worker groups and to inform workforce development strategies. Machine substitution is purely an efficiency game in which workers overwhelmingly lose. Nevertheless, we can blunt these effects by identifying critical leverage points….

Policymakers can choose to harness AI’s disruptive power to address workforce challenges and redesign fair access to opportunity simultaneously. We should train our collective energies on identifying practical policies that update our current agrarian-based education model, which unfairly disadvantages children from economically segregated neighborhoods…(More)”

When Data Science Destabilizes Democracy and Facilitates Genocide


Rachel Thomas in Fast.AI, on “What is the ethical responsibility of data scientists?”: “…What we’re talking about is a cataclysmic change… What we’re talking about is a major foreign power with sophistication and ability to involve themselves in a presidential election and sow conflict and discontent all over this country… You bear this responsibility. You’ve created these platforms. And now they are being misused,” Senator Feinstein said this week in a Senate hearing. Who has created a cataclysmic change? Who bears this large responsibility? She was talking to executives at tech companies and referring to the work of data scientists.

Data science can have a devastating impact on our world, as illustrated by inflammatory Russian propaganda being shown on Facebook to 126 million Americans leading up to the 2016 election (and the subject of the Senate hearing described above) or by lies spread via Facebook that are fueling ethnic cleansing in Myanmar. Over half a million Rohingya have been driven from their homes by systematic murder, rape, and burning. Data science is foundational to Facebook’s newsfeed, determining what content is prioritized and who sees what….

The examples of bias in data science are myriad.

You can do awesome and meaningful things with data science (such as diagnosing cancer, stopping deforestation, increasing farm yields, and helping patients with Parkinson’s disease), and you can (often unintentionally) enable terrible things with data science, as the examples in this post illustrate. Being a data scientist entails both great opportunity and great responsibility: to use our skills so that we do not make the world a worse place. Ultimately, doing data science is about humans, not just the users of our products, but everyone who will be impacted by our work. (More)”.

The Challenge of VR to the Liberal Democratic Order


Paper by Edward Castronova: “The rapid expansion of virtual reality (VR) technology in the years 2016-2021 awakens a significant constitutional issue. In a liberal democratic order, rule is by consent of the governed. In the medium-term future, many of the governed will be immersed fully within VR environments, environments which, we are told, will provide entertainment of extraordinary power. These people will be happy. Happy people do not demand change. Yet there surely will be a change as VR takes hold: The quality of life will erode. People fully immersed in VR will come to be isolated, sedentary, and unhealthy. Objectively speaking, this is nothing to be desired. Subjectively, however, it will seem to be wonderful. The people themselves will be happy, and they will resist interference. At the moment this matter concerns only a few thousand nerds, but trends in technology and entertainment point to a future in which many people will be happily living awful, VR-dominated lives. How then will the liberal democratic order promote human well-being while remaining a liberal and democratic order?…(More)”

Understanding Corporate Data Sharing Decisions: Practices, Challenges, and Opportunities for Sharing Corporate Data with Researchers


Leslie Harris at the Future of Privacy Forum: “Data has become the currency of the modern economy. A recent study projects the global volume of data to grow from about 0.8 zettabytes (ZB) in 2009 to more than 35 ZB in 2020, most of it generated within the last two years and held by the corporate sector.
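Taken at face value, the study's figures imply a steep compound annual growth rate, which a quick calculation makes concrete (the formula is the standard CAGR definition applied to the numbers quoted above):

```python
# Implied compound annual growth rate from the study's figures.
start_zb, end_zb = 0.8, 35.0      # global data volume in zettabytes, 2009 and 2020
years = 2020 - 2009

cagr = (end_zb / start_zb) ** (1 / years) - 1
print(f"{cagr:.0%}")  # roughly 41% per year
```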

As the cost of data collection and storage becomes cheaper and computing power increases, so does the value of data to the corporate bottom line. Powerful data science techniques, including machine learning and deep learning, make it possible to search, extract and analyze enormous sets of data from many sources in order to uncover novel insights and engage in predictive analysis. Breakthrough computational techniques allow complex analysis of encrypted data, making it possible for researchers to protect individual privacy, while extracting valuable insights.

At the same time, these newfound data sources hold significant promise for advancing scholarship, supporting evidence-based policymaking and more robust government statistics, and shaping more impactful social interventions. But because most of this data is held by the private sector, it is rarely available for these purposes, posing what many have argued is a serious impediment to scientific progress.

A variety of reasons have been posited for the reluctance of the corporate sector to share data for academic research. Some have suggested that the private sector doesn’t realize the value of their data for broader social and scientific advancement. Others suggest that companies have no “chief mission” or public obligation to share. But most observers describe the challenge as complex and multifaceted. Companies face a variety of commercial, legal, ethical, and reputational risks that serve as disincentives to sharing data for academic research, with privacy – particularly the risk of reidentification – an intractable concern. For companies, striking the right balance between the commercial and societal value of their data, the privacy interests of their customers, and the interests of academics presents a formidable dilemma.

To be sure, there is evidence that some companies are beginning to share for academic research. For example, a number of pharmaceutical companies are now sharing clinical trial data with researchers, and a number of individual companies have taken steps to make data available as well. What is more, companies are also increasingly providing open or shared data for other important “public good” activities, including international development, humanitarian assistance and better public decision-making. Some are contributing to data collaboratives that pool data from different sources to address societal concerns. Yet, it is still not clear whether and to what extent this “new era of data openness” will accelerate data sharing for academic research.

Today, the Future of Privacy Forum released a new study, Understanding Corporate Data Sharing Decisions: Practices, Challenges, and Opportunities for Sharing Corporate Data with Researchers. In this report, we aim to contribute to the literature by seeking the “ground truth” from the corporate sector about the challenges they encounter when they consider making data available for academic research. We hope that the impressions and insights gained from this first look at the issue will help formulate further research questions, inform the dialogue between key stakeholders, and identify constructive next steps and areas for further action and investment….(More)”.