Stefaan Verhulst

Matthew L. Williams at Data Driven Journalism: “Collecting and publishing data collected from social media sites such as Twitter are everyday practices for the data journalist. Recent findings from Cardiff University’s Social Data Science Lab question the practice of publishing Twitter content without seeking some form of informed consent from users beforehand. Researchers found that tweets collected around certain topics, such as those related to terrorism, political votes, changes in the law and health problems, create datasets that might contain sensitive content, such as extreme political opinion, grossly offensive comments, overly personal revelations and threats to life (both to oneself and to others). Handling these data in the process of analysis (such as classifying content as hateful and potentially illegal) and reporting has brought the ethics of using social media in social research and journalism into sharp focus.

Ethics is an issue that is becoming increasingly salient in research and journalism using social media data. The digital revolution has outpaced parallel developments in research governance and agreed good practice. Codes of ethical conduct that were written in the mid twentieth century are being relied upon to guide the collection, analysis and representation of digital data in the twenty-first century. Social media is particularly ethically challenging because of the open availability of the data (particularly from Twitter). Many platforms’ terms of service specifically state users’ data that are public will be made available to third parties, and by accepting these terms users legally consent to this. However, researchers and data journalists must interpret and engage with these commercially motivated terms of service through a more reflexive lens, which implies a context sensitive approach, rather than focusing on the legally permissible uses of these data.

Social media researchers and data journalists have experimented with data from a range of sources, including Facebook, YouTube, Flickr, Tumblr and Twitter to name a few. Twitter is by far the most studied of all these networks. This is because Twitter differs from other networks, such as Facebook, that are organised around groups of ‘friends’, in that it is more ‘open’ and the data (in part) are freely available to researchers. This makes Twitter a more public digital space that promotes the free exchange of opinions and ideas. Twitter has become the primary space for online citizens to publicly express their reaction to events of national significance, and also the primary source of data for social science research into digital publics.

The Twitter streaming API provides three levels of data access: the free random 1% that provides ~5M tweets daily and the random 10% and 100% (chargeable or free to academic researchers upon request). Datasets on social interactions of this scale, speed and ease of access have been hitherto unrealisable in the social sciences and journalism, and have led to a flood of journal articles and news pieces, many of which include tweets with full text content and author identity without informed consent. This is presumably because of Twitter’s ‘open’ nature, which leads to the assumption that ‘these are public data’ and using it does not require the rigor and scrutiny of an ethical oversight. Even when these data are scrutinised, journalists don’t need to be convinced by the ‘public data’ argument, due to the lack of a framework to evaluate the potential harms to users. The Social Data Science Lab takes a more ethically reflexive approach to the use of social media data in social research, and carefully considers users’ perceptions, online context and the role of algorithms in estimating potentially sensitive user characteristics.

A recent Lab survey conducted into users’ perceptions of the use of their social media posts found the following:

94% were aware that social media companies had Terms of Service
65% had read the Terms of Service in whole or in part
76% knew that when accepting Terms of Service they were giving permission for some of their information to be accessed by third parties
80% agreed that if their social media information is used in a publication they would expect to be asked for consent
90% agreed that if their tweets were used without their consent they should be anonymized…(More)”.

Data journalism and the ethics of publishing Twitter data

James Shulman at the Mellon Foundation: “In 2001, when hundreds of individual colleges and universities were scrambling to scan their slide libraries, The Andrew W. Mellon Foundation created a new organization, Artstor, to assemble a massive library of digital images from disparate sources to support teaching and research in the arts and humanities.

Rather than encouraging—or paying for—each school to scan its own slide of the Mona Lisa, the Mellon Foundation created an intermediary organization that would balance the interests of those who created, photographed and cared for art works, such as artists and museums, and those who wanted to use such images for the admirable calling of teaching and studying history and culture. This organization would reach across the gap that separated these two communities and would respect and balance the interests of both sides, while helping each accomplish their missions. At the same time that Napster was using technology to facilitate the un-balanced transfer of digital content from creators to users, the Mellon Foundation set up a new institution aimed at respecting the interests of one side of the market and supporting the socially desirable work of the other.

As the internet has enabled the sharing of data across the world, new intermediaries have emerged as entire platforms. A networked world needs such bridges—think Etsy or Ebay sitting between sellers and buyers, or Facebook sitting between advertisers and users. While intermediaries that match sellers and buyers of things provide a marketplace to bridge from one side or the other, aggregators of data work in admittedly more shadowy territories.

In the many realms that market forces won’t support, however, a great deal of public good can be done by aggregating and managing access to datasets that might otherwise continue to live in isolation. Whether due to institutional sociology that favors local solutions, the technical challenges associated with merging heterogeneous databases built with different data models, intellectual property limitations, or privacy concerns, datasets are built and maintained by independent groups that—if networked—could be used to further each other’s work.

Think of those studying coral reefs, or those studying labor practices in developing markets, or child welfare offices seeking to call upon court records in different states, or medical researchers working in different sub-disciplines but on essentially the same disease. What intermediary invests in joining these datasets? Many people assume that computers can simply “talk” to each other and share data intuitively, but without targeted investment in connecting them, they can’t. Unlike modern databases that are now often designed with the cloud in mind, decades of locally created databases churn away in isolation, at great opportunity cost to us all.

Art history research is an unusually vivid example. Most people can understand that if you want to study Caravaggio, you don’t want to hunt and peck across hundreds of museums, books, photo archives, libraries, churches, and private collections. You want all that content in one place—exactly what Mellon sought to achieve by creating Artstor.

What did we learn in creating Artstor that might be distilled as lessons for others taking on an aggregation project to serve the public good?….(More)”.

Spanning Today’s Chasms: Seven Steps to Building Trusted Data Intermediaries

Michael Walzer at Foreign Affairs: “All governments, all political parties, and all politicians keep secrets and tell lies. Some lie more than others, and those differences are important, but the practice is general. And some lies and secrets may be justified, whereas others may not. Citizens, therefore, need to know the difference between just and unjust secrets and between just and unjust deception before they can decide when it may be justifiable for someone to reveal the secrets or expose the lies—when leaking confidential information, releasing classified documents, or blowing the whistle on misconduct may be in the public interest or, better, in the interest of democratic government.

Revealing official secrets and lies involves a form of moral risk-taking: whistleblowers may act out of a sense of duty or conscience, but the morality of their actions can be judged only by their fellow citizens, and only after the fact. This is often a difficult judgment to make—and has probably become more difficult in the Trump era.

LIES AND DAMNED LIES

A quick word about language: “leaker” and “whistleblower” are overlapping terms, but they aren’t synonyms. A leaker, in this context, anonymously reveals information that might embarrass officials or open up the government’s internal workings to unwanted public scrutiny. In Washington, good reporters cultivate sources inside every presidential administration and every Congress and hope for leaks. A whistleblower reveals what she believes to be immoral or illegal official conduct to her bureaucratic superiors or to the public. Certain sorts of whistle-blowing, relating chiefly to mismanagement and corruption, are protected by law; leakers are not protected, nor are whistleblowers who reveal state secrets…(More)”.

Just and Unjust Leaks: When to Spill Secrets

Nancy Scola at Politico: “Facebook CEO Mark Zuckerberg is quietly cracking open his company’s vast trove of user data for a study on economic inequality in the U.S. — the latest sign of his efforts to reckon with divisions in American society that the social network is accused of making worse.

The study, which hasn’t previously been reported, is mining the social connections among Facebook’s American users to shed light on the growing income disparity in the U.S., where the top 1 percent of households is said to control 40 percent of the country’s wealth. Facebook is an incomparably rich source of information for that kind of research: By one estimate, about three of five American adults use the social network….

Facebook confirmed the broad contours of its partnership with Chetty but declined to elaborate on the substance of the study. Chetty, in a brief interview following a January speech in Washington, said he and his collaborators — who include researchers from Stanford and New York University — have been working on the inequality study for at least six months.

“We’re using social networks, and measuring interactions there, to understand the role of social capital much better than we’ve been able to,” he said.

Researchers say they see Facebook’s enormous cache of data as a remarkable resource, offering an unprecedentedly detailed and sweeping look at American society. That store of information contains both details that a user might tell Facebook — their age, hometown, schooling, family relationships — and insights that the company has picked up along the way, such as the interest groups they’ve joined and geographic distribution of who they call a “friend.”

It’s all the more significant, researchers say, when you consider that Facebook’s user base — about 239 million monthly users in the U.S. and Canada at last count — cuts across just about every demographic group.

And all that information, say researchers, lets them take guesses about users’ wealth. Facebook itself recently patented a way of figuring out someone’s socioeconomic status using factors ranging from their stated hobbies to how many internet-connected devices they own.

A Facebook spokesman addressed the potential privacy implications of the study’s access to user data, saying, “We conduct research at Facebook responsibly, which includes making sure we protect people’s information.” The spokesman added that Facebook follows an “enhanced” review process for research projects, adopted in 2014 after a controversy over a study that manipulated some people’s news feeds to see if it made them happier or sadder.

According to a Stanford University source familiar with Chetty’s study, the Facebook account data used in the research has been stripped of any details that could be used to identify users. The source added that academics involved in the study have gone through security screenings that include background checks, and can access the Facebook data only in secure facilities….(More)”.

Facebook’s next project: American inequality

David M. Pritchard and Lyn Carson in The Conversation: “Today elected representatives take the tough decisions about public finances behind closed doors. In doing so, democratic politicians rely on the advice of financial bureaucrats, who, often, cater to the political needs of the elected government. Politicians rarely ask voters what they think of budget options. They are no better at explaining the reasons for a budget. Explanations are usually no more than vacuous phrases, such as “jobs and growth” or “on the move”. They never explain the difficult trade-offs that go into a budget nor their overall financial reasoning.

This reluctance to explain public finances was all too evident during the global financial crisis.

In Australia, Britain and France, centre-left governments borrowed huge sums in order to maintain private demand and, in one case, to support private banks. In each country these policies helped a lot to minimise the crisis’s human costs.

Yet, in the elections that followed the centre-left politicians that had introduced these policies refused properly to justify them. They feared that voters would not tolerate robust discussion about public finances. Without a justification for their generally good policies each of these government was defeated by centre-right opponents.

In most democracies there is the same underlying problem: elected representatives do not believe that voters can tolerate the financial truth. They assume that democracy is not good at managing public finances. For them it can only balance the budget by leaving voters in the dark.

For decades, we, independently, have studied democracy today and in the ancient past. We have learned that this assumption is dead wrong. There are more and more examples of how involving ordinary voters results in better budgets.

In 1989, councils in poor Brazilian towns began to involve residents in setting budgets. This participatory budgeting soon spread throughout South America. It has now been successfully tried in Germany, Spain, Italy, Portugal, Sweden, the United States, Poland and Australia, and some pilot projects were set up in France too. Participatory budgeting is based on the clear principle that those who will be most affected by a tough budget should be involved in setting it.

In spite of such successful democratic experiments, elected representatives still shy away from involving ordinary voters in setting budgets. This is very different from what happened in ancient Athens 2,500 years ago….

In Athenian democracy ordinary citizens actually set the budget. This ancient Greek state had a solid budget, in spite of, or, we would say, because of the involvement of the citizens in taking tough budget decisions….(More)”.

When citizens set the budget: lessons from ancient Greece

Ben Wiegand at Scientific American: “The process of medical research has been likened to searching for a needle in a haystack. With the continued acceleration of novel science and health care technologies in areas like artificial intelligence, digital therapeutics and the human microbiome we have tremendous opportunity to search the haystack in new and exciting ways. Applying these high-tech advances to today’s most pressing health issues increases our ability to address the root cause of disease, intervene earlier and change the trajectory of human health.

Global crowdsourcing forums, like the Johnson & Johnson Innovation QuickFire Challenges, can be incredibly valuable tools for searching the “haystack.” An initiative of JLABS—the no-strings-attached incubators of Johnson & Johnson Innovation—these contests spur scientific diversity through crowdsourcing, inspiring and attracting fresh thinking. They seek to stimulate the global innovation ecosystem through funding, mentorship and access to resources that can kick-start breakthrough ideas.

Our most recent challenge, the Next-Gen Baby Box QuickFire Challenge, focused on updating the 80-year-old “Finnish baby box,” a free, government-issued maternity supply kit for new parents containing such essentials as baby clothing, bath and sleep supplies packaged in a sleep-safe cardboard box. Since it first launched, the baby box has, together with increased use of maternal healthcare services early in pregnancy, helped to significantly reduce the Finnish infant mortality rate from 65 in every 1,000 live births in the 1930s to 2.5 per 1,000 today—one of the lowest rates in the world.

Partnering with Finnish innovation and government groups, we set out to see if updating this popular early parenting tool with the power of personalized health technology might one day impact Finland’s unparalleled high rate of type 1 diabetes. We issued the call globally to help create “the Baby Box of the future” as part of the Janssen and Johnson & Johnson Innovation vision to create a world without disease by accelerating science and delivering novel solutions to prevent, intercept and cure disease. The contest brought together entrepreneurs, researchers and innovators to focus on ideas with the potential to promote child health, detect childhood disease earlier and facilitate healthy parenting.

Incentive challenges like this award participants who have most effectively met a predefined objective or task. It’s a concept that emerged well before our time—as far back as the 18th century—from Napoleon’s Food Preservation Prize, meant to find a way to keep troops fed during battle, to the Longitude Prize for improved marine navigation.

Research shows that prize-based challenges that attract talent across a wide range of disciplines can generate greater risk-taking and yield more dramatic solutions….(More)”.

Can Crowdsourcing and Collaboration Improve the Future of Human Health?

George Soros at Project Syndicate: “It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. And there is a real chance that, once lost, those who grow up in the digital age – in which the power to command and shape people’s attention is increasingly concentrated in the hands of a few companies – will have difficulty regaining it.

The current moment in world history is a painful one. Open societies are in crisis, and various forms of dictatorships and mafia states, exemplified by Vladimir Putin’s Russia, are on the rise. In the United States, President Donald Trump would like to establish his own mafia-style state but cannot, because the Constitution, other institutions, and a vibrant civil society won’t allow it….

The rise and monopolistic behavior of the giant American Internet platform companies is contributing mightily to the US government’s impotence. These companies have often played an innovative and liberating role. But as Facebook and Google have grown ever more powerful, they have become obstacles to innovation, and have caused a variety of problems of which we are only now beginning to become aware…

Social media companies’ true customers are their advertisers. But a new business model is gradually emerging, based not only on advertising but also on selling products and services directly to users. They exploit the data they control, bundle the services they offer, and use discriminatory pricing to keep more of the benefits that they would otherwise have to share with consumers. This enhances their profitability even further, but the bundling of services and discriminatory pricing undermine the efficiency of the market economy.

Social media companies deceive their users by manipulating their attention, directing it toward their own commercial purposes, and deliberately engineering addiction to the services they provide. This can be very harmful, particularly for adolescents.

There is a similarity between Internet platforms and gambling companies. Casinos have developed techniques to hook customers to the point that they gamble away all of their money, even money they don’t have.

Something similar – and potentially irreversible – is happening to human attention in our digital age. This is not a matter of mere distraction or addiction; social media companies are actually inducing people to surrender their autonomy. And this power to shape people’s attention is increasingly concentrated in the hands of a few companies.

It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. Once lost, those who grow up in the digital age may have difficulty regaining it.

This would have far-reaching political consequences. People without the freedom of mind can be easily manipulated. This danger does not loom only in the future; it already played an important role in the 2016 US presidential election.

There is an even more alarming prospect on the horizon: an alliance between authoritarian states and large, data-rich IT monopolies, bringing together nascent systems of corporate surveillance with already-developed systems of state-sponsored surveillance. This may well result in a web of totalitarian control the likes of which not even George Orwell could have imagined….(More)”.

The Social Media Threat to Society and Security

Book by Paul R. Rosenbaum: “In the daily news and the scientific literature, we are faced with conflicting claims about the effects caused by some treatments, behaviors, and policies. A daily glass of wine prolongs life, or so we are told. Yet we are also told that alcohol can cause life-threatening cancer and that pregnant women should abstain from drinking. Some say that raising the minimum wage decreases inequality while others say it increases unemployment. Investigators once confidently claimed that hormone replacement therapy reduces the risk of heart disease but today investigators confidently claim it raises that risk. How should we study such questions?

Observation and Experiment is an introduction to causal inference from one of the field’s leading scholars. Using minimal mathematics and statistics, Paul Rosenbaum explains key concepts and methods through scientific examples that make complex ideas concrete and abstract principles accessible.

Some causal questions can be studied in randomized trials in which coin flips assign individuals to treatments. But because randomized trials are not always practical or ethical, many causal questions are investigated in nonrandomized observational studies. To illustrate, Rosenbaum draws examples from clinical medicine, economics, public health, epidemiology, clinical psychology, and psychiatry. Readers gain an understanding of the design and interpretation of randomized trials, the ways they differ from observational studies, and the techniques used to remove, investigate, and appraise bias in observational studies. Observation and Experiment is a valuable resource for anyone with a serious interest in the empirical study of human health, behavior, and well-being….(More)”.

Observation and Experiment: An Introduction to Causal Inference

Alexandra Borchardt at Project Syndicate: “In a democracy, the rights of the many cannot come at the expense of the rights of the few. In the age of algorithms, government must, more than ever, ensure the protection of vulnerable voices, even erring on victims’ side at times.

Germany’s Network Enforcement Act – according to which social-media platforms like Facebook and YouTube could be fined €50 million ($63 million) for every “obviously illegal” post within 24 hours of receiving a notification – has been controversial from the start. After it entered fully into effect in January, there was a tremendous outcry, with critics from all over the political map arguing that it was an enticement to censorship. Government was relinquishing its powers to private interests, they protested.

So, is this the beginning of the end of free speech in Germany?

Of course not. To be sure, Germany’s Netzwerkdurchsetzungsgesetz (or NetzDG) is the strictest regulation of its kind in a Europe that is growing increasingly annoyed with America’s powerful social-media companies. And critics do have some valid points about the law’s weaknesses. But the possibilities for free expression will remain abundant, even if some posts are deleted mistakenly.

The truth is that the law sends an important message: democracies won’t stay silent while their citizens are exposed to hateful and violent speech and images – content that, as we know, can spur real-life hate and violence. Refusing to protect the public, especially the most vulnerable, from dangerous content in the name of “free speech” actually serves the interests of those who are already privileged, beginning with the powerful companies that drive the dissemination of information.

Speech has always been filtered. In democratic societies, everyone has the right to express themselves within the boundaries of the law, but no one has ever been guaranteed an audience. To have an impact, citizens have always needed to appeal to – or bypass – the “gatekeepers” who decide which causes and ideas are relevant and worth amplifying, whether through the media, political institutions, or protest.

The same is true today, except that the gatekeepers are the algorithms that automatically filter and rank all contributions. Of course, algorithms can be programmed any way companies like, meaning that they may place a premium on qualities shared by professional journalists: credibility, intelligence, and coherence.

But today’s social-media platforms are far more likely to prioritize potential for advertising revenue above all else. So the noisiest are often rewarded with a megaphone, while less polarizing, less privileged voices are drowned out, even if they are providing the smart and nuanced perspectives that can truly enrich public discussions….(More)”.

Free Speech in the Filter Age

Financial Conduct Authority (UK): “The sandbox allows firms to test innovative products, services or business models in a live market environment, while ensuring that appropriate protections are in place. It was established to support the FCA’s objective of promoting effective competition in the interests of consumers and opened for applications in June 2016.

The sandbox has supported 50 firms from 146 applications received across the first two cohorts. This report sets out the sandbox’s overall impact on the market including the adoption of new technologies, increasing access and improving experiences for vulnerable consumers as well as lessons learnt from individual tests that have been, or are being, conducted as part of the sandbox.

Early indications suggest the sandbox is providing the benefits it set out to achieve with evidence of the sandbox enabling new products to be tested, reducing time and cost of getting innovative ideas to market, improving access to finance for innovators, and ensuring appropriate safeguards are built into new products and services.

We will be using these learnings to inform any future sandbox developments as well as our ongoing policymaking and supervision work….(More)”.

Regulatory sandbox lessons learned report

Stefaan Verhulst

Get the latest news right in you inbox