Technology is making the world more unequal. Only technology can fix this


Here’s the good news: technology – specifically, networked technology – makes it easier for opposition movements to form and mobilise, even under conditions of surveillance, and to topple badly run, corrupt states.

Inequality creates instability, and not just because of the resentments the increasingly poor majority harbours against the increasingly rich minority. Everyone has a mix of good ideas and terrible ones, but for most of us, the harm from our terrible ideas is capped by our lack of political power and the checks that others – including the state – impose on us.

As rich people get richer, however, their wealth translates into political influence, and their ideas – especially their terrible ideas – take on outsized importance….

After all, there comes a point when the bill for guarding your wealth exceeds the cost of redistributing some of it, so you won’t need so many guards.

But that’s where technology comes in: surveillance technology makes guarding the elites much cheaper than it’s ever been. GCHQ and the NSA have managed to put the entire planet under continuous surveillance. Less technologically advanced countries can play along: Ethiopia was one of the world’s first “turnkey surveillance states”, a country with a manifestly terrible, looting elite class that has kept guillotines and firing squads at bay through buying in sophisticated spying technology from European suppliers, and using this to figure out which dissidents, opposition politicians and journalists represent a threat, so it can subject them to arbitrary detention, torture and, in some cases, execution….

That’s the bad news.

Now the good news: technology makes forming groups cheaper and easier than it’s ever been. Forming and coordinating groups is the hard problem of the human condition; the reason we have religions and corporations and criminal undergrounds and political parties. Doing work together means doing more than one person could do on their own, but it also means compromising, subjecting yourself to policies or orders from above. It’s costly and difficult, and the less money and time you have, the harder it is to form a group and mobilise it.

This is where networks shine. Modern insurgent groups substitute software for hierarchy, networks for bosses. They are able to come together without agreeing to a crisp agenda that you have to submit to in order to be part of the movement. When it costs less to form a group, it doesn’t matter so much that you aren’t all there for the same reason, and thus are doomed to fall apart. Even a small amount of work done together amounts to more than the tiny cost of admission…

The future is never so normal as we think it will be. The only sure thing about self-driving cars, for instance, is that whether or not they deliver fortunes to oligarchic transport barons, that’s not where it will end. Changing the way we travel has implications for mobility (both literal and social), the environment, surveillance, protest, sabotage, terrorism, parenting …

Long before the internet radically transformed the way we organise ourselves, theorists were predicting we’d use computers to achieve ambitious goals without traditional hierarchies – but it was a rare pundit who predicted that the first really successful example of this would be an operating system (GNU/Linux), and then an encyclopedia (Wikipedia).

The future will see a monotonic increase in the ambitions that loose-knit groups can achieve. My new novel, Walkaway, tries to signpost a territory in our future in which the catastrophes of the super-rich are transformed into something like triumphs by bohemian, anti-authoritarian “walkaways” who build housing and space programmes the way we make encyclopedias today: substituting (sometimes acrimonious) discussion and (sometimes vulnerable) networks for submission to the authority of the ruling elites….(More).

Facebook Disaster Maps


Molly Jackman et al at Facebook: “After a natural disaster, humanitarian organizations need to know where affected people are located, what resources are needed, and who is safe. This information is extremely difficult and often impossible to capture through conventional data collection methods in a timely manner. As more people connect and share on Facebook, our data is able to provide insights in near-real time to help humanitarian organizations coordinate their work and fill crucial gaps in information during disasters. This morning we announced a Facebook disaster map initiative to help organizations address the critical gap in information they often face when responding to natural disasters.

Facebook disaster maps provide information about where populations are located, how they are moving, and where they are checking in safe during a natural disaster. All data is de-identified and aggregated to a 360 square meter tile or local administrative boundaries (e.g. census boundaries). [1]
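
In rough outline, that kind of tile-level aggregation can be pictured as: snap each de-identified location ping to a fixed-size grid cell, count pings per cell, and suppress cells with very few people. The sketch below is illustrative only, not Facebook's actual pipeline; the equirectangular grid, the suppression threshold, and the field names are all assumptions.

```python
import math
from collections import Counter

TILE_METERS = 360            # tile size taken from the post's description
MIN_COUNT = 10               # assumed suppression threshold for small counts
METERS_PER_DEG_LAT = 111_320.0

def tile_id(lat, lon, tile_m=TILE_METERS):
    """Snap a (lat, lon) point to a fixed-size grid cell (equirectangular approximation)."""
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(lat))
    return (int(lat * METERS_PER_DEG_LAT // tile_m),
            int(lon * meters_per_deg_lon // tile_m))

def aggregate(pings):
    """Count de-identified pings per tile; drop tiles below the suppression threshold."""
    counts = Counter(tile_id(lat, lon) for lat, lon in pings)
    return {tile: n for tile, n in counts.items() if n >= MIN_COUNT}

# Toy example: a cluster of pings around one location plus a single isolated ping,
# which gets suppressed because its tile falls below the threshold.
pings = [(37.7749 + i * 1e-4, -122.4194) for i in range(25)] + [(40.0, -75.0)]
print(aggregate(pings))
```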

This blog describes the disaster maps datasets, how insights are calculated, and the steps taken to ensure that we’re preserving privacy….(More)”.

Europol introduces crowdsourcing to catch child abusers


LeakofNations: “The criminal intelligence branch of the European Union, known as Europol, has started a campaign called #TraceAnObject, which uses social media crowdsourcing to detect potentially identifying objects in material that depicts child abuse….

Investigative crowdsourcing has gained traction in academic and journalistic circles in recent years, but this represents the first case of government bureaus relying on social media people-power to conduct more effective analysis.

Journalists are increasingly relying on a combination of high-end computing to organise terabytes of data and internet cloud hubs that allow a consortium of journalists from around the world to share their analysis of the material. In the Panama Papers scoop, the Australian software Nuix was used to analyse, extract, and index documents into an encrypted central hub in which thousands of journalists from 80 countries were able to post their workings and assist others in a forum-type setting. This model was remarkably efficient: over 11.5 million documents, dating back to the 1970s, were analysed in less than a year.

The website Zooniverse has achieved huge success in creating public participation in academic projects, producing the pioneering game Foldit, where participants play with digital models of proteins. The Oxford University-based organisation has now engaged over 1 million volunteers, and has had significant successes in astronomy, ecology, cell biology, humanities, and climate science.

The most complex investigations still require thousands of hours of straightforward tasks that cannot be computerised. The citizen science website Planet Four studies conditions on Mars, and needs volunteers to compare photographs and detect blotches on Mars’ surface – enabling anyone to feel like Elon Musk, regardless of their educational background.

Child abuse is something that incites anger in most people. Crowdsourcing is an opportunity to take the donkey-work away from slow bureaucratic offices and allow ordinary citizens, many of whom felt powerless to protect children from these vile crimes, to genuinely progress cases that will make children safer.

Zooniverse proves that the public are hungry for this kind of work; the ICIJ project model of a central cloud forum shows that crowdsourcing across international borders allows data to be interpreted more efficiently. Europol’s latest idea could well be a huge success.

Even the most basic object could potentially provide vital clues to the culprit’s identity. The most significant items released so far include a school uniform complete with ID card necktie, and a group of snow-covered lodges….(More) (see also #TraceAnObject).

Big data allows India to map its fight against human trafficking


Nita Bhalla for Reuters: “An Indian charity is using big data to pinpoint human trafficking hot spots in a bid to prevent vulnerable women and girls vanishing from high-risk villages into the sex trade.

My Choices Foundation uses specially designed technology to identify those villages that are most at risk of modern slavery, then launches local campaigns to sound the alarm….

The analytics tool – developed by Australian firm Quantium – uses a range of factors to identify the most dangerous villages. It draws on India’s census, education and health data and factors such as drought risk, poverty levels, education and job opportunities to identify vulnerable areas….
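
Quantium’s model itself isn’t public, but the general idea of combining village-level indicators into a single vulnerability ranking can be illustrated with a simple weighted score over normalized factors. In the sketch below, the indicator names, weights, and normalization are all assumptions for illustration, not the actual tool.

```python
# Illustrative vulnerability score: a weighted sum of min-max normalized
# village indicators. Names and weights are assumed, not Quantium's model.
WEIGHTS = {
    "drought_risk": 0.3,
    "poverty_rate": 0.3,
    "school_dropout_rate": 0.2,
    "distance_to_jobs_km": 0.2,
}

def normalize(values):
    """Min-max scale a list of raw indicator values to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def risk_scores(villages):
    """Return one score per village dict; higher means more vulnerable."""
    scaled = {name: normalize([v[name] for v in villages]) for name in WEIGHTS}
    return [sum(WEIGHTS[name] * scaled[name][i] for name in WEIGHTS)
            for i in range(len(villages))]

villages = [
    {"drought_risk": 0.8, "poverty_rate": 0.6, "school_dropout_rate": 0.4, "distance_to_jobs_km": 55},
    {"drought_risk": 0.2, "poverty_rate": 0.3, "school_dropout_rate": 0.1, "distance_to_jobs_km": 10},
]
print(risk_scores(villages))  # the first village scores higher, flagging it for awareness campaigns
```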

There are an estimated 46 million people enslaved worldwide, with more than 18 million living in India, according to the 2016 Global Slavery Index. The Index was compiled by the Walk Free Foundation, a global organisation seeking to end modern slavery. Many are villagers lured by traffickers with the promise of a good job and an advance payment, only to find themselves or their children forced to work in fields or brick kilns, enslaved in brothels and sold into sexual slavery.

Almost 20,000 women and children were victims of human trafficking in India in 2016, a rise of nearly 25 percent from the previous year, according to government data. While India has strengthened its anti-trafficking policy in recent years, activists say a lack of public awareness remains one of the biggest impediments…(More)”.

Expanding Training on Data and Technology to Improve Communities


Kathryn Pettit at the National Neighborhood Indicators Partnership (NNIP): “Local government and nonprofit staff need data and technology skills to regularly monitor local conditions and design programs that achieve more effective outcomes. Tailored training is essential to help them gain the knowledge and confidence to leverage these indispensable tools. A recent survey of organizations that provide data and technology training documented current practices and how such training should be expanded. Four recommendations are provided to assist government agencies, elected leaders, nonprofit executives, and local funders in empowering workers with the necessary training to use data and technology to benefit their communities. Specifically, community stakeholders should collectively work to

  • expand the training available to government and nonprofit staff;
  • foster opportunities for sharing training materials and lessons;
  • identify allies who can enhance and support local training efforts;
  • and assess the local landscape of data and technology training.

Project Products

  • Brief: A summary of the current training landscape and key action steps for various sectors to ensure that local government and nonprofit staff have the data and technology skills needed for their civic missions.
  • Guide: A document for organizations interested in providing community data and technology training, including advice on how to assess local needs, develop training content, and fund these efforts.
  • Catalog: Example training descriptions and related materials collected from various cities for local adaptation.
  • Fact sheet: A summary of results from a survey on current training content and practices….(More)”

How Data Mining Facebook Messages Can Reveal Substance Abusers


Emerging Technology from the arXiv: “…Substance abuse is a serious concern. Around one in 10 Americans are sufferers, which is why it costs the American economy more than $700 billion a year in lost productivity, crime, and health-care costs. So a better way to identify people suffering from the disorder, and those at risk of succumbing to it, would be hugely useful.

Bickel and co say they have developed just such a technique, which allows them to spot sufferers simply by looking at their social media messages such as Facebook posts. The technique even provides new insights into the way abuse of different substances influences people’s social media messages.

The new technique comes from the analysis of data collected between 2007 and 2012 as part of a project that ran on Facebook called myPersonality. Users who signed up were offered various psychometric tests and given feedback on their scores. Many also agreed to allow the data to be used for research purposes.

One of these tests asked over 13,000 users with an average age of 23 about the substances they used. In particular, it asked how often they used tobacco, alcohol, or other drugs, and assessed each participant’s level of use. The users were then divided into groups according to their level of substance abuse.

This data set is important because it acts as a kind of ground truth, recording the exact level of substance use for each person.

The team next gathered two other Facebook-related data sets. The first was 22 million status updates posted by more than 150,000 Facebook users. The other was even larger: the “like” data associated with 11 million Facebook users.

Finally, the team worked out how these data sets overlapped. They found almost 1,000 users who were in all the data sets, just over 1,000 who were in the substance abuse and status update data sets, and 3,500 who were in the substance abuse and likes data sets.

These users with overlapping data sets provide rich pickings for data miners. If people with substance use disorders have certain unique patterns of behavior, it may be possible to spot these in their Facebook status updates or in their patterns of likes.

So Bickel and co got to work first by text mining most of the Facebook status updates and then data mining most of the likes data set. Any patterns they found, they then tested by looking for people with similar patterns in the remaining data and seeing if they also had the same level of substance use.
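
The paper’s exact features and models aren’t reproduced here, but the general approach – turning each user’s status-update text into features and training a classifier against the ground-truth substance-use labels – can be sketched along these lines. TF-IDF plus logistic regression is an assumption for illustration, not necessarily what Bickel and co used, and the toy texts and labels are invented stand-ins.

```python
# Illustrative sketch: predict a 0/1 substance-use label from status-update
# text. TF-IDF + logistic regression are assumptions for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the overlapping users: concatenated status updates plus
# a ground-truth label from the myPersonality-style survey.
texts = [
    "heading out for drinks again tonight",
    "study group then gym, early night",
    "need a smoke break before this shift",
    "finished the 10k run, feeling great",
    "another round at the bar with the crew",
    "baking bread and reading all weekend",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = reported substance use

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word and bigram features from posts
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# In the study, held-out users with overlapping data served as the test set;
# here we just score the training examples to show the output format.
print(model.predict_proba(texts)[:, 1])   # probability of substance use per user
```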

The results make for interesting reading. The team says its technique was hugely successful. “Our best models achieved 86% for predicting tobacco use, 81% for alcohol use and 84% for drug use, all of which significantly outperformed existing methods,” say Bickel and co…. (More) (Full Paper: arxiv.org/abs/1705.05633: Social Media-based Substance Use Prediction).

How Twitter Is Being Gamed to Feed Misinformation


The New York Times: “…the biggest problem with Twitter’s place in the news is its role in the production and dissemination of propaganda and misinformation. It keeps pushing conspiracy theories — and because lots of people in the media, not to mention many news consumers, don’t quite understand how it works, the precise mechanism is worth digging into…. Here’s how.

The guts of the news business.

One way to think of today’s disinformation ecosystem is to picture it as a kind of gastrointestinal tract…. Twitter often acts as the small bowel of digital news. It’s where political messaging and disinformation get digested, packaged and widely picked up for mass distribution to cable, Facebook and the rest of the world.

This role for Twitter has seemed to grow more intense during (and since) the 2016 campaign. Twitter now functions as a clubhouse for much of the news. It’s where journalists pick up stories, meet sources, promote their work, criticize competitors’ work and workshop takes. In a more subtle way, Twitter has become a place where many journalists unconsciously build and gut-check a worldview — where they develop a sense of what’s important and merits coverage, and what doesn’t.

This makes Twitter a prime target for manipulators: If you can get something big on Twitter, you’re almost guaranteed coverage everywhere….

Twitter is clogged with fake people.

For determined media manipulators, getting something big on Twitter isn’t all that difficult. Unlike Facebook, which requires people to use their real names, Twitter offers users essentially full anonymity, and it makes many of its functions accessible to outside programmers, allowing people to automate their actions on the service.

As a result, numerous cheap and easy-to-use online tools let people quickly create thousands of Twitter bots — accounts that look real, but that are controlled by a puppet master.

Twitter’s design also promotes a slavish devotion to metrics: Every tweet comes with a counter of Likes and Retweets, and users come to internalize these metrics as proxies for real-world popularity….

They may ruin democracy.

…. the more I spoke to experts, the more convinced I became that propaganda bots on Twitter might be a growing and terrifying scourge on democracy. Research suggests that bots are ubiquitous on Twitter. Emilio Ferrara and Alessandro Bessi, researchers at the University of Southern California, found that about a fifth of the election-related conversation on Twitter last year was generated by bots. Most users were blind to them; they treated the bots the same way they treated other users….

In a more pernicious way, bots give us an easy way to doubt everything we see online. In the same way that the rise of “fake news” gives the president cover to label everything “fake news,” the rise of bots might soon allow us to dismiss any online enthusiasm as driven by automation. Anyone you don’t like could be a bot; any highly retweeted post could be puffed up by bots….(More)”.

Data Journalism: How Not To Be Wrong


Winny de Jong: “At the intersection of data and journalism, lots can go wrong. Merely taking precautions might not be enough….

Half True Is False

But this approach is not totally foolproof.

“In data journalism, we cannot settle for ‘half-true.’ Anything short of true is wrong – and we cannot afford to be wrong.” Unlike fact-checking websites such as Politifact, which invented ‘scales’ for truthfulness, from false to true and everything in between, data journalism should always be true.

No Pants on Fire: Politifact’s Truth-O-Meter.

True but Wrong

But even when your story is true, Gebeloff said, you could still be wrong. “You can do the math correctly, but get the context wrong, fail to acknowledge uncertainties or not describe your findings correctly.”

Fancy Math

When working on a story, journalists should consider whether they use “fancy math” – think statistics – or “standard math.” “Using fancy math you can explore complex relationships, but at the same time your story will be harder to explain.”…

Targets as a Source

…To make sure you’re not going to be wrong, you should share your findings. “Don’t just share findings with experts, share them with hostile experts too,” Gebeloff advises. “Use your targets as a source. If there’s a blowback, you want to know before publication – and include the blowback in the publication.”

How Not To Be Wrong Checklist

Here’s why you want to use this checklist, which is based on Gebeloff’s presentation: a half truth is false, and data journalism should always be true. But just being true is not enough. Your story can be mathematically true but wrong in context or explanation. You should want your stories to be true and not wrong.

  1. Check your data carefully (a minimal sketch of these checks follows this list):
    •  Pay attention to dates.
    •  Check for spelling and duplicates.
    •  Identify outliers.
    •  Statistical significance alone is not news.
    •  Prevent base year abuse: if something is a trend, it should be true in general, not just if you cherry-pick a base year.
    •  Make sure your data represents reality.
  2. As you work, keep a data diary that records what you’ve done and how you’ve done it. You should be able to reproduce your calculations.
  3. Make sure you explain the methods you used – your audience should be able to understand how you find a story.
  4. Play offense and defense simultaneously. Go for the maximum possible story, but at all times think of why you might be wrong, or what your target would say in response.
  5. Use your targets as a source to find blowbacks before publication.
  6. As part of the proofing process, create a footnotes file. Identify each fact and give it a number. Then, for each fact, list which document it came from, how you know it and the proof. Fix what needs to be fixed.
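
To make item 1 concrete, here is a minimal pandas sketch of the mechanical checks the list describes. The column names (“date”, “category”, “amount”) and the input file are placeholders, and the three-standard-deviation outlier rule is just one common convention, not something prescribed by Gebeloff.

```python
# Minimal data-checking sketch for checklist item 1. Column names and the
# input file are placeholders; the 3-sigma outlier rule is one convention.
import pandas as pd

df = pd.read_csv("story_data.csv")  # hypothetical input file

# Dates: parse them and flag rows that failed or fall outside the expected period.
df["date"] = pd.to_datetime(df["date"], errors="coerce")
print("unparseable dates:", df["date"].isna().sum())
print("date range:", df["date"].min(), "to", df["date"].max())

# Duplicates and spelling: exact duplicate rows, plus category labels that only
# differ by case or stray whitespace (a common source of miscounts).
print("duplicate rows:", df.duplicated().sum())
print(df["category"].str.strip().str.lower().value_counts())

# Outliers: values far from the mean deserve a second look before they drive the story.
z = (df["amount"] - df["amount"].mean()) / df["amount"].std()
print(df[z.abs() > 3])

# Base-year abuse: a trend should hold in general, not only from a cherry-picked start.
yearly = df.groupby(df["date"].dt.year)["amount"].sum()
print(yearly.pct_change())  # year-over-year change, not just first year vs. last
```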

Additional links of interest: the slides of Robert Gebeloff’s how not to be wrong presentation, and the methodology notes and data from the “Race Behind Bars” series….(More)”

Our path to better science in less time using open data science tools


Julia S. Stewart Lowndes et al in Nature: “Reproducibility has long been a tenet of science but has been challenging to achieve—we learned this the hard way when our old approaches proved inadequate to efficiently reproduce our own work. Here we describe how several free software tools have fundamentally upgraded our approach to collaborative research, making our entire workflow more transparent and streamlined. By describing specific tools and how we incrementally began using them for the Ocean Health Index project, we hope to encourage others in the scientific community to do the same—so we can all produce better science in less time.

Figure 1: Better science in less time, illustrated by the Ocean Health Index project.

Every year since 2012 we have repeated Ocean Health Index (OHI) methods to track change in global ocean health [36,37]. Increased reproducibility and collaboration have reduced the amount of time required to repeat methods (size of bubbles) with updated data annually, allowing us to focus on improving methods each year (text labels show the biggest innovations). The original assessment in 2012 focused solely on scientific methods (for example, obtaining and analysing data, developing models, calculating, and presenting results; dark shading). In 2013, by necessity we gave more focus to data science (for example, data organization and wrangling, coding, versioning, and documentation; light shading), using open data science tools. We established R as the main language for all data preparation and modelling (using RStudio), which drastically decreased the time involved to complete the assessment. In 2014, we adopted Git and GitHub for version control, project management, and collaboration. This further decreased the time required to repeat the assessment. We also created the OHI Toolbox, which includes our R package ohicore for core analytical operations used in all OHI assessments. In subsequent years we have continued (and plan to continue) this trajectory towards better science in less time by improving code with principles of tidy data [33]; standardizing file and data structure; and focusing more on communication, in part by creating websites with the same open data science tools and workflow. See text and Table 1 for more details….(More)”

Citizen Science and Alien Species in Europe


European Commission: “Citizen Science programs aim at creating a bridge between science and the general public, actively involving citizens in research projects. In this way, citizen scientists can work side by side with experts, contributing to the increase of scientific knowledge, addressing local, national and international issues that need scientific support and having the potential to influence policy-making.
The EU Regulation 1143/2014 on Invasive Alien Species (IAS) acknowledges the important role that public awareness and the active involvement of citizens play in the successful implementation of the Regulation. Citizen Science could therefore make an important contribution to the early detection and monitoring of invasive alien species: to adopt efficient control measures, it is necessary to know the presence and distribution of these species as soon as possible.
With the new website section we want to disseminate information about how citizens can get involved in activities aimed at protecting European biodiversity, raise awareness, and share news, examples and developments from the emerging field of Citizen Science.
If you are interested in becoming a citizen scientist and want to help monitor invasive alien species (IAS) in your region, you can use our App “Invasive Alien Species Europe” to report the 37 IAS of Union Concern.
Furthermore, we have compiled a list of European Citizen Science projects dealing with alien species. The list is not exhaustive and is open for improvement …(More)”.