Stefaan Verhulst

Gabriel M Leung and Kathy Leung at The Lancet: “Coronavirus disease 2019 (COVID-19) has spread with unprecedented speed and scale since the first zoonotic event that introduced the causative virus—severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)—into humans, probably during November, 2019, according to phylogenetic analyses suggesting the most recent common ancestor of the sequenced genomes emerged between Oct 23, and Dec 16, 2019. The reported cumulative number of confirmed patients worldwide already exceeds 70 000 in almost 30 countries and territories as of Feb 19, 2020, although the actual number of infections is likely to far outnumber this case count.

During any novel emerging epidemic, let alone one with such magnitude and speed of global spread, a first task is to put together a line list of suspected, probable, and confirmed individuals on the basis of working criteria of the respective case definitions. This line list would allow for quick preliminary assessment of epidemic growth and potential for spread, evidence-based determination of the period of quarantine and isolation, and monitoring of efficiency of detection of potential cases. Frequent refreshing of the line list would further enable real-time updates as more clinical, epidemiological, and virological (including genetic) knowledge becomes available as the outbreak progresses….

We surveyed different and varied sources of possible line lists for COVID-19 (appendix pp 1–4). A bottleneck remains in carefully collating as much relevant data as possible, sifting through and verifying these data, extracting intelligence to forecast and inform outbreak strategies, and thereafter repeating this process in iterative cycles to monitor and evaluate progress. A possible methodological breakthrough would be to develop and validate algorithms for automated bots to search through cyberspaces of all sorts, by text mining and natural language processing (in languages not limited to English) to expedite these processes. In this era of smartphones and their accompanying applications, the authorities are required to combat not only the epidemic per se, but perhaps an even more sinister outbreak of fake news and false rumours, a so-called infodemic…(More)”.

Crowdsourcing data to mitigate epidemics

Press Release: “The European Data Portal publishes its study “The Economic Impact of Open Data: Opportunities for value creation in Europe”. It researches the value created by open data in Europe. It is the second study by the European Data Portal, following the 2015 report. The open data market size is estimated at €184 billion and forecast to reach between €199.51 and €334.21 billion in 2025. The report additionally considers how this market size is distributed along different sectors and how many people are employed due to open data. The efficiency gains from open data, such as potential lives saved, time saved, environmental benefits, and improvement of language services, as well as associated potential cost savings are explored and quantified where possible. Finally, the report also considers examples and insights from open data re-use in organisations. The key findings of the report are summarised below:

The specification and implementation of high-value datasets as part of the new Open Data Directive is a promising opportunity to address quality & quantity demands of open data.
Addressing quality & quantity demands is important, yet not enough to reach the full potential of open data.
Open data re-users have to be aware and capable of understanding and leveraging the potential.
Open data value creation is part of the wider challenge of skill and process transformation: a lengthy process whose change and impact are not always easy to observe and measure.
Sector-specific initiatives and collaboration in and across private and public sector foster value creation.
Combining open data with personal, shared, or crowdsourced data is vital for the realisation of further growth of the open data market.
For different challenges, we must explore and improve multiple approaches of data re-use that are ethical, sustainable, and fit-for-purpose….(More)”.

The Economic Impact of Open Data: Opportunities for value creation in Europe

Michael Mandiberg at The Atlantic: “Wikipedia matters. In a time of extreme political polarization, algorithmically enforced filter bubbles, and fact patterns dismissed as fake news, Wikipedia has become one of the few places where we can meet to write a shared reality. We treat it like a utility, and the U.S. and U.K. trust it about as much as the news.

But we know very little about who is writing the world’s encyclopedia. We do know that just because anyone can edit, doesn’t mean that everyone does: The site’s editors are disproportionately cis white men from the global North. We also know that, as with most of the internet, a small number of the editors do a large amount of the editing. But that’s basically it: In the interest of improving retention, the Wikimedia Foundation’s own research focuses on the motivations of people who do edit, not on those who don’t. The media, meanwhile, frequently focus on Wikipedia’s personality stories, even when covering the bigger questions. And Wikipedia’s own culture pushes back against granular data harvesting: The Wikimedia Foundation’s strong data-privacy rules guarantee users’ anonymity and limit the modes and duration of their own use of editor data.

But as part of my research in producing Print Wikipedia, I discovered a data set that can offer an entry point into the geography of Wikipedia’s contributors. Every time anyone edits Wikipedia, the software records the text added or removed, the time of the edit, and the username of the editor. (This edit history is part of Wikipedia’s ethos of radical transparency: Everyone is anonymous, and you can see what everyone is doing.) When an editor isn’t logged in with a username, the software records that user’s IP address. I parsed all of the 884 million edits to English Wikipedia to collect and geolocate the 43 million IP addresses that have edited English Wikipedia. I also counted 8.6 million username editors who have made at least one edit to an article.

The result is a set of maps that offer, for the first time, insight into where the millions of volunteer editors who build and maintain English Wikipedia’s 5 million pages are—and, maybe more important, where they aren’t….

Like the Enlightenment itself, the modern encyclopedia has a history entwined with colonialism. Encyclopédie aimed to collect and disseminate all the world’s knowledge—but in the end, it could not escape the biases of its colonial context. Likewise, Napoleon’s Description de l’Égypte augmented an imperial military campaign with a purportedly objective study of the nation, which was itself an additional form of conquest. If Wikipedia wants to break from the past and truly live up to its goal to compile the sum of all human knowledge, it requires the whole world’s participation….(More)”.

Mapping Wikipedia
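
The excerpt above describes separating edits made under a username from edits made while logged out, where the software records an IP address instead. A minimal sketch of that separation step follows, assuming each edit record is a hypothetical (editor, timestamp) pair; the function names and sample data are illustrative and not taken from the article, and the geolocation step is not shown.

```python
import ipaddress


def editor_kind(editor: str) -> str:
    """Return 'ip' if the editor field parses as an IPv4/IPv6 address, else 'username'."""
    try:
        ipaddress.ip_address(editor)
        return "ip"
    except ValueError:
        return "username"


def summarize(edits):
    """Count unique IP editors and unique username editors.

    `edits` is an iterable of (editor, timestamp) pairs; this record
    layout is an assumption made for the sketch.
    """
    ips, users = set(), set()
    for editor, _timestamp in edits:
        if editor_kind(editor) == "ip":
            ips.add(editor)
        else:
            users.add(editor)
    return {"ip_editors": len(ips), "username_editors": len(users)}


# Hypothetical sample records (documentation-reserved addresses).
sample_edits = [
    ("192.0.2.17", "2019-01-01T00:00:00Z"),
    ("ExampleUser", "2019-01-01T00:05:00Z"),
    ("2001:db8::1", "2019-01-02T10:00:00Z"),
    ("192.0.2.17", "2019-01-03T12:30:00Z"),
]

print(summarize(sample_edits))
```

Collecting editors into sets means a repeat edit from the same address or account is counted once, which matches the article's framing of counting editors rather than edits.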

Paper by Ethan R. Mollick and Ramana Nanda: “In fields as diverse as technology entrepreneurship and the arts, crowds of interested stakeholders are increasingly responsible for deciding which innovations to fund, a privilege that was previously reserved for a few experts, such as venture capitalists and grant‐making bodies. Little is known about the degree to which the crowd differs from experts in judging which ideas to fund, and, indeed, whether the crowd is even rational in making funding decisions. Drawing on a panel of national experts and comprehensive data from the largest crowdfunding site, we examine funding decisions for proposed theater projects, a category where expert and crowd preferences might be expected to differ greatly.

We instead find significant agreement between the funding decisions of crowds and experts. Where crowds and experts disagree, it is far more likely to be a case where the crowd is willing to fund projects that experts may not. Examining the outcomes of these projects, we find no quantitative or qualitative differences between projects funded by the crowd alone, and those that were selected by both the crowd and experts. Our findings suggest that crowdfunding can play an important role in complementing expert decisions, particularly in sectors where the crowds are end users, by allowing projects the option to receive multiple evaluations and thereby lowering the incidence of “false negatives.”…(More)”.

Wisdom or Madness? Comparing Crowds with Expert Evaluation in Funding the Arts

Essay by Douglas Schuler: “The utopian optimism about democracy and the internet has given way to disillusionment. At the same time, given the complexity of today’s wicked problems, the need for democracy is critical. Unfortunately democracy is under attack around the world, and there are ominous signs of its retreat.

How does democracy fare when digital technology is added to the picture? Weaving technology and democracy together is risky, and technologists who begin any digital project with the conviction that technology can and will solve “problems” of democracy are likely to be disappointed. Technology can be a boon to democracy if it is informed technology.

The goal in writing this essay was to encourage people to help develop and cultivate a rich democratic sphere. Democracy has great potential that it rarely achieves. It is radical, critical, complex, and fragile. It takes different forms in different contexts. These forms are complex and the solutionism promoted by the computer industry and others is not appropriate in the case of democracies. The primary aim of technology in the service of democracy is not merely to make it easier or more convenient but to improve society’s civic intelligence, its ability to address the problems it faces effectively and equitably….(More)”.

Can Technology Support Democracy?

Paper by Virgilio Galdo, Yue Li and Martin Rama: “This paper proposes a methodology for identifying urban areas that combines subjective assessments with machine learning, and applies it to India, a country where several studies see the official urbanization rate as an under-estimate. For a representative sample of cities, towns and villages, as administratively defined, human judgment of Google images is used to determine whether they are urban or rural in practice. Judgments are collected across four groups of assessors, differing in their familiarity with India and with urban issues, following two different protocols. The judgment-based classification is then combined with data from the population census and from satellite imagery to predict the urban status of the sample.

The Logit model, and LASSO and random forests methods, are applied. These approaches are then used to decide whether each of the out-of-sample administrative units in India is urban or rural in practice. The analysis does not find that India is substantially more urban than officially claimed. However, there are important differences at more disaggregated levels, with “other towns” and “census towns” being more rural, and some southern states more urban, than is officially claimed. The consistency of human judgment across assessors and protocols, the easy availability of crowd-sourcing, and the stability of predictions across approaches, suggest that the proposed methodology is a promising avenue for studying urban issues….(More)”.

Identifying Urban Areas by Combining Human Judgment and Machine Learning: An Application to India

Special Report by The Economist: “The data economy is a work in progress. Its economics still have to be worked out; its infrastructure and its businesses need to be fully built; geopolitical arrangements must be found. But there is one final major tension: between the wealth the data economy will create and how it will be distributed. The data economy—or the “second economy”, as Brian Arthur of the Santa Fe Institute terms it—will make the world a more productive place no matter what, he predicts. But who gets what and how is less clear. “We will move from an economy where the main challenge is to produce more and more efficiently,” says Mr Arthur, “to one where distribution of the wealth produced becomes the biggest issue.”

The data economy as it exists today is already very unequal. It is dominated by a few big platforms. In the most recent quarter, Amazon, Apple, Alphabet, Microsoft and Facebook made a combined profit of $55bn, more than the next five most valuable American tech firms over the past 12 months. This corporate inequality is largely the result of network effects—economic forces that mean size begets size. A firm that can collect a lot of data, for instance, can make better use of artificial intelligence and attract more users, who in turn supply more data. Such firms can also recruit the best data scientists and have the cash to buy the best AI startups.

It is also becoming clear that, as the data economy expands, these sorts of dynamics will increasingly apply to non-tech companies and even countries. In many sectors, the race to become a dominant data platform is on. This is the mission of Compass, a startup, in residential property. It is one goal of Tesla in self-driving cars. And Apple and Google hope to repeat the trick in health care. As for countries, America and China account for 90% of the market capitalisation of the world’s 70 largest platforms (see chart), Africa and Latin America for just 1%. Economies on both continents risk “becoming mere providers of raw data…while having to pay for the digital intelligence produced,” the United Nations Conference on Trade and Development recently warned.

Yet it is the skewed distribution of income between capital and labour that may turn out to be the most pressing problem of the data economy. As it grows, more labour will migrate into the mirror worlds, just as other economic activity will. It is not only that people will do more digitally, but they will perform actual “data work”: generating the digital information needed to train and improve AI services. This can mean simply moving about online and providing feedback, as most people already do. But it will increasingly include more active tasks, such as labelling pictures, driving data-gathering vehicles and perhaps, one day, putting one’s digital twin through its paces. This is the reason why some say AI should actually be called “collective intelligence”: it takes in a lot of human input—something big tech firms hate to admit….(More)”.

Who will benefit most from the data economy?

Paper by Molly K. Land and Rebecca J. Hamilton: “The current preoccupation with ‘fake news’ has spurred a renewed emphasis in popular discourse on the potential harms of speech. In the world of international law, however, ‘fake news’ is far from new. Propaganda of various sorts is a well-worn tactic of governments, and in its most insidious form, it has played an instrumental role in inciting and enabling some of the worst atrocities of our time. Yet as familiar as propaganda might be in theory, it is raising new issues as it has migrated to the digital realm. Technological developments have largely outpaced existing legal and political tools for responding to the use of mass communications devices to instigate or perpetrate human rights violations.

This chapter evaluates the current practices of social media companies for responding to online hate, arguing that they are inevitably both overbroad and under-inclusive. Using the example of the role played by Facebook in the recent genocide against the minority Muslim Rohingya population in Myanmar, the chapter illustrates the failure of platform hate speech policies to address pervasive and coordinated online speech, often state-sponsored or state-aligned, denigrating a particular group that is used to justify or foster impunity for violence against that group. Addressing this “conditioning speech” requires a more tailored response that includes remedies other than content removal and account suspensions. The chapter concludes by surveying a range of innovative responses to harmful online content that would give social media platforms the flexibility to intervene earlier, but with a much lighter touch….(More)”.

Beyond Takedown: Expanding the Toolkit for Responding to Online Hate

Article by Geoff Shullenberger on “How fears of mind control went from paranoid delusion to conventional wisdom”: “In early 2017, after the double shock of Brexit and the election of Donald Trump, the British data-mining firm Cambridge Analytica gained sudden notoriety. The previously little-known company, reporters claimed, had used behavioral influencing techniques to turn out social media users to vote in both elections. By its own account, Cambridge Analytica had worked with both campaigns to produce customized propaganda for targeting individuals on Facebook likely to be swept up in the tide of anti-immigrant populism. Its methods, some news sources suggested, might have sent enough previously disengaged voters to the polls to have tipped the scales in favor of the surprise victors. To a certain segment of the public, this story seemed to answer the question raised by both upsets: How was it possible that the seemingly solid establishment consensus had been rejected? What’s more, the explanation confirmed everything that seemed creepy about the Internet, evoking a sci-fi vision of social media users turned into an army of political zombies, mobilized through subliminal manipulation.

Cambridge Analytica’s violations of Facebook users’ privacy have made it an enduring symbol of the dark side of social media. However, the more dramatic claims about the extent of the company’s political impact collapse under closer scrutiny, mainly because its much-hyped “psychographic targeting” methods probably don’t work. As former Facebook product manager Antonio García Martínez noted in a 2018 Wired article, “the public, with no small help from the media sniffing a great story, is ready to believe in the supernatural powers of a mostly unproven targeting strategy,” but “most ad insiders express skepticism about Cambridge Analytica’s claims of having influenced the election, and stress the real-world difficulty of changing anyone’s mind about anything with mere Facebook ads, least of all deeply ingrained political views.” According to García, the entire affair merely confirms a well-established truth: “In the ads world, just because a product doesn’t work doesn’t mean you can’t sell it….(More)”.

We All Wear Tinfoil Hats Now

Chapter by Vikramsinh Amarsinh Patil: “This chapter examines the theoretical underpinnings of nudge theory and makes a case for incorporating nudging into the decision-making process in corporate contexts. Nudging and more broadly behavioural economics have become buzzwords on account of the seminal work that has been done by economists and highly publicized interventions employed by governments to support national priorities. Firms are not to be left behind, however. What follows is extensive documentation of such firms that have successfully employed nudging techniques. The examples are segmented by the nudge recipient, namely – managers, employees, and consumers. Firms can guide managers to become better leaders, employees to become more productive, and consumers to stay loyal. However, nudging is not without its pitfalls. It can be used towards nefarious ends and be notoriously difficult to implement and execute. Therefore, nudges should be rigorously tested via experimentation and should be ethically sound….(More)”.

Nudge Theory and Decision Making: Enabling People to Make Better Choices
