Mapping Wikipedia


Michael Mandiberg at The Atlantic: “Wikipedia matters. In a time of extreme political polarization, algorithmically enforced filter bubbles, and fact patterns dismissed as fake news, Wikipedia has become one of the few places where we can meet to write a shared reality. We treat it like a utility, and the U.S. and U.K. trust it about as much as the news.

But we know very little about who is writing the world’s encyclopedia. We do know that just because anyone can edit, doesn’t mean that everyone does: The site’s editors are disproportionately cis white men from the global North. We also know that, as with most of the internet, a small number of the editors do a large amount of the editing. But that’s basically it: In the interest of improving retention, the Wikimedia Foundation’s own research focuses on the motivations of people who do edit, not on those who don’t. The media, meanwhile, frequently focus on Wikipedia’s personality stories, even when covering the bigger questions. And Wikipedia’s own culture pushes back against granular data harvesting: The Wikimedia Foundation’s strong data-privacy rules guarantee users’ anonymity and limit the modes and duration of their own use of editor data.

But as part of my research in producing Print Wikipedia, I discovered a data set that can offer an entry point into the geography of Wikipedia’s contributors. Every time anyone edits Wikipedia, the software records the text added or removed, the time of the edit, and the username of the editor. (This edit history is part of Wikipedia’s ethos of radical transparency: Everyone is anonymous, and you can see what everyone is doing.) When an editor isn’t logged in with a username, the software records that user’s IP address. I parsed all of the 884 million edits to English Wikipedia to collect and geolocate the 43 million IP addresses that have edited English Wikipedia. I also counted 8.6 million username editors who have made at least one edit to an article.

The result is a set of maps that offer, for the first time, insight into where the millions of volunteer editors who build and maintain English Wikipedia’s 5 million pages are—and, maybe more important, where they aren’t….

Like the Enlightenment itself, the modern encyclopedia has a history entwined with colonialism. Encyclopédie aimed to collect and disseminate all the world’s knowledge—but in the end, it could not escape the biases of its colonial context. Likewise, Napoleon’s Description de l’Égypte augmented an imperial military campaign with a purportedly objective study of the nation, which was itself an additional form of conquest. If Wikipedia wants to break from the past and truly live up to its goal to compile the sum of all human knowledge, it requires the whole world’s participation….(More)”.

Identifying Urban Areas by Combining Human Judgment and Machine Learning: An Application to India


Paper by Virgilio Galdo, Yue Li and Martin Rama: “This paper proposes a methodology for identifying urban areas that combines subjective assessments with machine learning, and applies it to India, a country where several studies see the official urbanization rate as an under-estimate. For a representative sample of cities, towns and villages, as administratively defined, human judgment of Google images is used to determine whether they are urban or rural in practice. Judgments are collected across four groups of assessors, differing in their familiarity with India and with urban issues, following two different protocols. The judgment-based classification is then combined with data from the population census and from satellite imagery to predict the urban status of the sample.

The Logit model, and LASSO and random forests methods, are applied. These approaches are then used to decide whether each of the out-of-sample administrative units in India is urban or rural in practice. The analysis does not find that India is substantially more urban than officially claimed. However, there are important differences at more disaggregated levels, with ?other towns? and ?census towns? being more rural, and some southern states more urban, than is officially claimed. The consistency of human judgment across assessors and protocols, the easy availability of crowd-sourcing, and the stability of predictions across approaches, suggest that the proposed methodology is a promising avenue for studying urban issues….(More)”.

We All Wear Tinfoil Hats Now


Article by Geoff Shullenberger on “How fears of mind control went from paranoid delusion to conventional wisdom”: “In early 2017, after the double shock of Brexit and the election of Donald Trump, the British data-mining firm Cambridge Analytica gained sudden notoriety. The previously little-known company, reporters claimed, had used behavioral influencing techniques to turn out social media users to vote in both elections. By its own account, Cambridge Analytica had worked with both campaigns to produce customized propaganda for targeting individuals on Facebook likely to be swept up in the tide of anti-immigrant populism. Its methods, some news sources suggested, might have sent enough previously disengaged voters to the polls to have tipped the scales in favor of the surprise victors. To a certain segment of the public, this story seemed to answer the question raised by both upsets: How was it possible that the seemingly solid establishment consensus had been rejected? What’s more, the explanation confirmed everything that seemed creepy about the Internet, evoking a sci-fi vision of social media users turned into an army of political zombies, mobilized through subliminal manipulation.

Cambridge Analytica’s violations of Facebook users’ privacy have made it an enduring symbol of the dark side of social media. However, the more dramatic claims about the extent of the company’s political impact collapse under closer scrutiny, mainly because its much-hyped “psychographic targeting” methods probably don’t work. As former Facebook product manager Antonio García Martínez noted in a 2018 Wired article, “the public, with no small help from the media sniffing a great story, is ready to believe in the supernatural powers of a mostly unproven targeting strategy,” but “most ad insiders express skepticism about Cambridge Analytica’s claims of having influenced the election, and stress the real-world difficulty of changing anyone’s mind about anything with mere Facebook ads, least of all deeply ingrained political views.” According to García, the entire affair merely confirms a well-established truth: “In the ads world, just because a product doesn’t work doesn’t mean you can’t sell it….(More)”.

Experts say privately held data available in the European Union should be used better and more


European Commission: “Data can solve problems from traffic jams to disaster relief, but European countries are not yet using this data to its full potential, experts say in a report released today. More secure and regular data sharing across the EU could help public administrations use private sector data for the public good.

In order to increase Business-to-Government (B2G) data sharing, the experts advise to make data sharing in the EU easier by taking policy, legal and investment measures in three main areas:

  1. Governance of B2G data sharing across the EU: such as putting in place national governance structures, setting up a recognised function (‘data stewards’) in public and private organisations, and exploring the creation of a cross-EU regulatory framework.
  2. Transparency, citizen engagement and ethics: such as making B2G data sharing more citizen-centric, developing ethical guidelines, and investing in training and education.
  3. Operational models, structures and technical tools: such as creating incentives for companies to share data, carrying out studies on the benefits of B2G data sharing, and providing support to develop the technical infrastructure through the Horizon Europe and Digital Europe programmes.

They also revised the principles on private sector data sharing in B2G contexts and included new principles on accountability and on fair and ethical data use, which should guide B2G data sharing for the public interest. Examples of successful B2G data sharing partnerships in the EU include an open forest data system in Finland to help manage the ecosystem, mapping of EU fishing activities using ship tracking data, and genome sequencing data of breast cancer patients to identify new personalised treatments. …

The High-Level Expert Group on Business-to-Government Data Sharing was set up in autumn 2018 and includes members from a broad range of interests and sectors. The recommendations presented today in its final report feed into the European strategy for data and can be used as input for other possible future Commission initiatives on Business-to-Government data sharing….(More)”.

Government by Algorithm: Artificial Intelligence in Federal Administrative Agencies


The Administrative Conference of the United States: “Artificial intelligence (AI) promises to transform how government agencies do their work. Rapid developments in AI have the potential to reduce the cost of core governance functions, improve the quality of decisions, and unleash the power of administrative data, thereby making government performance more efficient and effective. Agencies that use AI to realize these gains will also confront important questions about the proper design of algorithms and user interfaces, the respective scope of human and machine decision-making, the boundaries between public actions and private contracting, their own capacity to learn over time using AI, and whether the use of AI is even permitted.

These are important issues for public debate and academic inquiry. Yet little is known about how agencies are currently using AI systems beyond a few headlinegrabbing examples or surface-level descriptions. Moreover, even amidst growing public and  scholarly discussion about how society might regulate government use of AI, little attention has been devoted to how agencies acquire such tools in the first place or oversee their use. In an effort to fill these gaps, the Administrative Conference of the United States (ACUS) commissioned this report from researchers at Stanford University and New York University. The research team included a diverse set of lawyers, law students, computer scientists, and social scientists with the capacity to analyze these cutting-edge issues from technical, legal, and policy angles. The resulting report offers three cuts at federal agency use of AI:

  • a rigorous canvass of AI use at the 142 most significant federal departments, agencies, and sub-agencies (Part I)
  • a series of in-depth but accessible case studies of specific AI applications at seven leading agencies covering a range of governance tasks (Part II); and
  • a set of cross-cutting analyses of the institutional, legal, and policy challenges raised by agency use of AI (Part III)….(More)”

The Future of Minds and Machines


Report by Aleksandra Berditchevskaia and Peter Baek: “When it comes to artificial intelligence (AI), the dominant media narratives often end up taking one of two opposing stances: AI is the saviour or the villain. Whether it is presented as the technology responsible for killer robots and mass job displacement or the one curing all disease and halting the climate crisis, it seems clear that AI will be a defining feature of our future society. However, these visions leave little room for nuance and informed public debate. They also help propel the typical trajectory followed by emerging technologies; with inevitable regularity we observe the ascent of new technologies to the peak of inflated expectations they will not be able to fulfil, before dooming them to a period languishing in the trough of disillusionment.[1]

There is an alternative vision for the future of AI development. By starting with people first, we can introduce new technologies into our lives in a more deliberate and less disruptive way. Clearly defining the problems we want to address and focusing on solutions that result in the most collective benefit can lead us towards a better relationship between machine and human intelligence. By considering AI in the context of large-scale participatory projects across areas such as citizen science, crowdsourcing and participatory digital democracy, we can both amplify what it is possible to achieve through collective effort and shape the future trajectory of machine intelligence. We call this 21st-century collective intelligence (CI).

In The Future of Minds and Machines we introduce an emerging framework for thinking about how groups of people interface with AI and map out the different ways that AI can add value to collective human intelligence and vice versa. The framework has, in large part, been developed through analysis of inspiring projects and organisations that are testing out opportunities for combining AI & CI in areas ranging from farming to monitoring human rights violations. Bringing together these two fields is not easy. The design tensions identified through our research highlight the challenges of navigating this opportunity and selecting the criteria that public sector decision-makers should consider in order to make the most of solving problems with both minds and machines….(More)”.

Novel forms of governance with high levels of civic self-reliance


Thesis by Hiska Ubels: “Enduring depopulation and ageing have affected the liveability of many of the smaller villages in the more peripheral rural municipalities of the Netherlands. Combined with a general climate of austerity and structural public budget cuts, this has led to the search of both communities and local governments for solutions in which citizens take and obtain more responsibilities and higher levels of local autonomy in dealing with local liveability challenges.

This PhD-thesis explores how novel forms of governance with high levels of civic self-reliance can be understood from the perspectives of the involved residents, local governments and the supposed beneficiaries. It also discusses the dynamics, potentials and limitations that come to the fore. To achieve this, firstly, it focusses on the development of role shifts of responsibilities and decision-making power between local governments and citizens in experimental governance initiatives over time and the main factors that enhance and obstruct higher levels of civic autonomy. Then it investigates the influence of government involvement on a civic initiatives’ organisation structure and governance process, and by doing so on the key conditions of its civic self-steering capacity. In addition, it examines how novel governance forms with citizens in the lead are experienced by the community members to whose community liveability they are supposed to contribute. Lastly, it explores the reasons why citizens do not engage in such initiatives….(More)”.

NGOs embrace GDPR, but will it be used against them?


Report by Vera Franz et al: “When the world’s most comprehensive digital privacy law – the EU General Data Protection Regulation (GDPR) – took effect in May 2018, media and tech experts focused much of their attention on how corporations, who hold massive amounts of data, would be affected by the law.

This focus was understandable, but it left some important questions under-examined–specifically about non-profit organizations that operate in the public’s interest. How would non-governmental organizations (NGOs) be impacted? What does GDPR compliance mean in very practical terms for NGOs? What are the challenges they are facing? Could the GDPR be ‘weaponized’ against NGOs and if so, how? What good compliance practices can be shared among non-profits?

Ben Hayes and Lucy Hannah from Data Protection Support & Management and I have examined these questions in detail and released our findings in this report.

Our key takeaway: GDPR compliance is an integral part of organisational resilience, and it requires resources and attention from NGO leaders, foundations and regulators to defend their organisations against attempts by governments and corporations to misuse the GDPR against them.

In a political climate where human rights and social justice groups are under increasing pressure, GDPR compliance needs to be given the attention it deserves by NGO leaders and funders. Lack of compliance will attract enforcement action by data protection regulators and create opportunities for retaliation by civil society adversaries.

At the same time, since the law came into force, we recognise that some NGOs have over-complied with the law, possibly diverting scarce resources and hampering operations.

For example, during our research, we discovered a small NGO that undertook an advanced and resource-intensive compliance process (a Data Protection Impact Assessment or DPIA) for all processing operations. DPIAs are only required for large-scale and high-risk processing of personal data. Yet this NGO, which holds very limited personal data and undertakes no marketing or outreach activities, engaged in this complex and time-consuming assessment because the organization was under enormous pressure from their government. They told us they “wanted to do everything possible to avoid attracting attention.”…

Our research also found that private companies, individuals and governments who oppose the work of an organisation have used GDPR to try to keep NGOs from publishing their work. To date, NGOs have successfully fought against this misuse of the law….(More)“.

Smarter government or data-driven disaster: the algorithms helping control local communities


Release by MuckRock: “What is the chance you, or your neighbor, will commit a crime? Should the government change a child’s bus route? Add more police to a neighborhood or take some away?

Every day government decisions from bus routes to policing used to be based on limited information and human judgment. Governments now use the ability to collect and analyze hundreds of data points everyday to automate many of their decisions.

Does handing government decisions over to algorithms save time and money? Can algorithms be fairer or less biased than human decision making? Do they make us safer? Automation and artificial intelligence could improve the notorious inefficiencies of government, and it could exacerbate existing errors in the data being used to power it.

MuckRock and the Rutgers Institute for Information Policy & Law (RIIPL) have compiled a collection of algorithms used in communities across the country to automate government decision-making.

Go right to the database.

We have also compiled policies and other guiding documents local governments use to make room for the future use of algorithms. You can find those as a project on DocumentCloud.

View policies on smart cities and technologies

These collections are a living resource and attempt to communally collect records and known instances of automated decision making in government….(More)”.

Digital democracy: Is the future of civic engagement online?


Paper by Gianluca Sgueo: “Digital innovation is radically transforming democratic decision-making. Public administrations are experimenting with mobile applications(apps) to provide citizens with real-time information, using online platforms to crowdsource ideas, and testing algorithms to engage communities in day today administration. The key question is what technology breakthrough means for governance systems created long before digital disruption. On the one hand, policy-makers are hoping that technology can be used to legitimise the public sector, re-engage citizens in politics and combat civic apathy. Scholars, on the other hand, point out that, if the digitalisation of democracy is left unquestioned, the danger is that the building blocks of democracy itself will be eroded.

This briefing examines three key global trends that are driving the on-going digitalisation of democratic decision-making. First are demographic patterns. These highlight growing global inequalities. Ten years from now, in the West the differentials of power among social groups will be on the rise, whereas in Eastern countries democratic freedoms will be at risk of further decline.

Second, a more urbanised global population will make cities ideal settings for innovative approaches to democratic decision-making. Current instances of digital democracy being used at local level include blockchain technology for voting and online crowdsourcing platforms.

Third, technological advancements will cut the costs of civic mobilisation and pose new challenges for democratic systems. Going forward, democratic decision-makers will be required to bridge digital literacy gaps, secure public structures from hacking, and to protect citizens’ privacy….(More)”.