Europol introduces crowdsourcing to catch child abusers


LeakofNations: “The criminal intelligence branch of the European Union, known as Europol, have started a campaign called #TraceAnObject which uses social media crowdsourcing to detect potentially identifying objects in material that depicts child abuse….

Investigative crowdsourcing has gained traction in academic and journalistic circles in recent years, but this represents the first case of government bureaus relying on social media people-power to conduct more effective analysis.

Journalists are increasingly relying on a combination of high-end computing to organise terabytes of data and internet cloud hubs that allow a consortium of journalists from around the world to share their analysis of the material. In the Panama Papers scoop, the Australian software Nuix was used to analyse, extract, and index documents into an encrypted central hub in which thousands of journalists from 80 countries were able to post their workings and assist others in a forum-type setting. This model was remarkably efficient; over 11.5 million documents, dating back to the 1970s, were analysed in less than a year.

The website Zooniverse has achieved huge success in creating public participation in academic projects, in the mould of the pioneering game Foldit, where participants play with digital models of proteins. The Oxford University-based organisation has now engaged over 1 million volunteers, and has had significant successes in astronomy, ecology, cell biology, humanities, and climate science.

The most complex investigations still require thousands of hours of straightforward tasks that cannot be computerised. The citizen science website Planet Four studies conditions on Mars, and needs volunteers to compare photographs and detect blotches on Mars’ surface – enabling anyone to feel like Elon Musk, regardless of their educational background.

Child abuse is something that incites anger in most people. Crowdsourcing is an opportunity to take the donkey-work away from slow bureaucratic offices and allow ordinary citizens, many of whom felt powerless to protect children from these vile crimes, to genuinely progress cases that will make children safer.

Zooniverse proves that the public are hungry for this kind of work; the ICIJ project model of a central cloud forum shows that crowdsourcing across international borders allows data to be interpreted more efficiently. Europol’s latest idea could well be a huge success.

Even the most basic object could potentially provide vital clues to the culprit’s identity. The most significant items released so far include a school uniform complete with an ID card and necktie, and a group of snow-covered lodges….(More) (see also #TraceAnObject).

South Sudan: Satellite Images Used to Track Food Insecurity


Salem Solomon at VOA news: “The world is watching closely as food shortages grip parts of Africa and the Middle East. As humanitarian groups respond to the crisis, they have to solve a major problem: how to track food security in areas that are simply too remote or too dangerous to access.

The Famine Early Warning Systems Network (FEWSNET) has come up with an innovative answer. The U.S.-funded organization is working with DigitalGlobe, a Colorado satellite company, to crowdsource analysis of satellite imagery of South Sudan.

The effort will rely on thousands of volunteers — normal people with no subject matter expertise — to scour satellite images looking for things like livestock herds, temporary dwellings and permanent dwellings. The group has selected an area of 18,000 square kilometers across five counties in South Sudan to analyze.

“The crowd can identify settlement imagery, they can identify roads, hospitals, airplanes, you name it. It allows us to tap into this network of folks around the world, not necessarily in country, but they are folks who are interested and compelled by whatever the campaign is,” said Rhiannan Price, senior manager of the Seeing a Better World Program at DigitalGlobe….(More)”.

The Internet Doesn’t Have to Be Bad for Democracy


Tom Simonite at MIT Technology Review: “Accusations that the Internet and social media sow political division have flown thick and fast since recent contentious elections in the United States, the United Kingdom, and France. Facebook founder and CEO Mark Zuckerberg has even pledged to start working on technology that will turn the energy of online interactions into a more positive force (see “We Need More Alternatives to Facebook”).

Tiny, largely self-funded U.S. startup Pol.is has been working on a similar project for longer than Zuckerberg and already has some promising results. The company’s interactive, crowdsourced survey tool can be used to generate maps of public opinion that help citizens, governments, and legislators discover the nuances of agreement and disagreement that exist on contentious issues. In 2016, that information helped the government of Taiwan break a six-year deadlock over how to regulate online alcohol sales, caused by entrenched, opposing views among citizens on what rules should apply.

“It allowed different sides to gradually see that they share the same underlying concern despite superficial disagreements,” says Audrey Tang, Taiwan’s digital minister. The island’s government now routinely sends out Pol.is surveys via Facebook ads and to special-interest groups. It has also used the system to help thrash out what rules should apply to Airbnb rentals and mobile ride-hailing services such as Uber.

Pol.is’s open-source software is designed to serve up interactive online surveys around a particular issue. People are shown a series of short statements about aspects of a broader issue—for example, “Uber drivers should need the same licenses cab drivers do”—and asked to click to signal that they agree or disagree. People can contribute new statements of their own for others to respond to. The tangle of crisscrossing responses is used to automatically generate charts that map out different clusters of opinion, making it easy to see the points on which people tend to overlap or disagree.
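The clustering step described above can be illustrated with a toy sketch. Pol.is actually uses dimensionality reduction and k-means over the vote matrix; the simplified, hypothetical version below just groups participants whose agree/disagree patterns overlap heavily, which is enough to show how opinion clusters emerge from raw votes:

```python
# Hypothetical sketch of opinion clustering over a vote matrix.
# (The real Pol.is pipeline uses PCA + k-means; this greedy grouping
# is only an illustration of the underlying idea.)

def similarity(a, b):
    """Fraction of statements on which two participants cast the same vote."""
    shared = [(x, y) for x, y in zip(a, b) if x != 0 and y != 0]
    if not shared:
        return 0.0
    return sum(1 for x, y in shared if x == y) / len(shared)

def cluster(votes, threshold=0.6):
    """Greedily group participants whose vote patterns overlap heavily."""
    clusters = []
    for name, row in votes.items():
        for group in clusters:
            rep = votes[group[0]]  # compare against the group's first member
            if similarity(row, rep) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters

# Votes on four statements: +1 agree, -1 disagree, 0 no vote.
votes = {
    "ann":  [+1, +1, -1,  0],
    "bo":   [+1, +1, -1, -1],
    "carl": [-1, -1, +1, +1],
    "dee":  [-1,  0, +1, +1],
}
print(cluster(votes))  # → [['ann', 'bo'], ['carl', 'dee']]
```

Charting how many participants fall in each cluster, and which statements each cluster agrees on, yields the kind of opinion map the article describes.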

Alternativet, a progressive Danish political party with nine members of parliament, is piloting Pol.is as a way to give its members a more direct role in formulating policy. Jon Skjerning-Rasmussen, a senior process coordinator with the party, says the way Pol.is visualizations are shared with people as they participate in a survey—letting them see how their opinions compare with those of others—helps people engage with the tool….(More).

Big data allows India to map its fight against human trafficking


Nita Bhalla for Reuters: “An Indian charity is using big data to pinpoint human trafficking hot spots in a bid to prevent vulnerable women and girls vanishing from high-risk villages into the sex trade.

My Choices Foundation uses specially designed technology to identify those villages that are most at risk of modern slavery, then launches local campaigns to sound the alarm….

The analytics tool – developed by Australian firm Quantium – uses a range of factors to identify the most dangerous villages. It draws on India’s census, education and health data and factors such as drought risk, poverty levels, education and job opportunities to identify vulnerable areas….
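The article does not disclose Quantium's actual model, but the general approach it describes can be sketched as a weighted risk score: normalise each village-level factor to a common scale, combine them with weights, and rank villages by the result. Everything below (the factor names, the weights, the data) is hypothetical illustration:

```python
# Hypothetical weighted-risk-score sketch; weights and factors are
# invented for illustration, not Quantium's real model.
RISK_WEIGHTS = {
    "drought_risk": 0.3,
    "poverty_rate": 0.3,
    "school_dropout_rate": 0.2,
    "unemployment_rate": 0.2,
}

def risk_score(village):
    """Weighted sum of factors, each expressed on a 0..1 scale."""
    return round(sum(w * village[f] for f, w in RISK_WEIGHTS.items()), 3)

villages = {
    "A": {"drought_risk": 0.9, "poverty_rate": 0.8,
          "school_dropout_rate": 0.7, "unemployment_rate": 0.6},
    "B": {"drought_risk": 0.2, "poverty_rate": 0.3,
          "school_dropout_rate": 0.1, "unemployment_rate": 0.2},
}
ranked = sorted(villages, key=lambda v: risk_score(villages[v]), reverse=True)
print(ranked)  # → ['A', 'B']
```

The highest-scoring villages would then be the ones where awareness campaigns are launched first.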

There are an estimated 46 million people enslaved worldwide, with more than 18 million living in India, according to the 2016 Global Slavery Index. The Index was compiled by the Walk Free Foundation, a global organisation seeking to end modern slavery. Many are villagers lured by traffickers with the promise of a good job and an advance payment, only to find themselves or their children forced to work in fields or brick kilns, enslaved in brothels and sold into sexual slavery.

Almost 20,000 women and children were victims of human trafficking in India in 2016, a rise of nearly 25 percent from the previous year, according to government data. While India has strengthened its anti-trafficking policy in recent years, activists say a lack of public awareness remains one of the biggest impediments…(More)”.

Expanding Training on Data and Technology to Improve Communities


Kathryn Pettit at the National Neighborhood Indicators Partnership (NNIP): “Local government and nonprofit staff need data and technology skills to regularly monitor local conditions and design programs that achieve more effective outcomes. Tailored training is essential to help them gain the knowledge and confidence to leverage these indispensable tools. A recent survey of organizations that provide data and technology training documented current practices and how such training should be expanded. Four recommendations are provided to assist government agencies, elected leaders, nonprofit executives, and local funders in empowering workers with the necessary training to use data and technology to benefit their communities. Specifically, community stakeholders should collectively work to

  • expand the training available to government and nonprofit staff;
  • foster opportunities for sharing training materials and lessons;
  • identify allies who can enhance and support local training efforts;
  • and assess the local landscape of data and technology training.

Project Products

  • Brief: A summary of the current training landscape and key action steps for various sectors to ensure that local government and nonprofit staff have the data and technology skills needed for their civic missions.
  • Guide: A document for organizations interested in providing community data and technology training, including advice on how to assess local needs, develop training content, and fund these efforts.
  • Catalog: Example training descriptions and related materials collected from various cities for local adaptation.
  • Fact sheet: A summary of results from a survey on current training content and practices….(More)”

Braindates


Greg Oates at Skift: “C2 has evolved into a global benchmark for event design since launching in 2012, but the innovation extends beyond the creative mojo and significant resources invested into the physical layout.

The show works with Montreal-based E-180 to provide a digital matchmaking platform called Braindate for attendees to find other people with like-minded interests. It’s always a challenge at conferences to meet someone who might be a potential business partner or learning source, or whatever, especially when you don’t know who those people are.

In recent years, dozens of emerging event technology companies have attempted to develop cloud-based personal connectivity platforms to help conference participants cull through their attendee lists. Few, however, have scaled to provide an effective solution for crowdsourced learning and networking at events.

Braindate is …designed to be integrated into an event app where attendees can post a one-sentence question or goal explaining their particular interest. Those posts can be tagged with any number of phrases, such as “event UX” or “smart city” or “experiential marketing” to help users streamline their search.

When someone sees a Braindate suggestion that looks appealing, he or she clicks on it to show a calendar with concurrent open slots in both people’s schedules. Then that person sends a request with a specific time to meet to discuss their shared interests. It’s also important to note that C2 registrants could schedule meetings pre-conference, although the bulk of activity kicked in on opening day.
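The slot-matching step described above amounts to intersecting two attendees' open calendar slots while preserving order. A minimal sketch, assuming each calendar is simply a list of open slot labels (the real Braindate scheduling logic is not public):

```python
# Minimal sketch of matching open calendar slots between two attendees.
# Calendars are assumed to be ordered lists of open-slot labels; this is
# an illustration, not Braindate's actual implementation.

def common_slots(a, b):
    """Return time slots present in both calendars, in calendar order."""
    open_b = set(b)
    return [slot for slot in a if slot in open_b]

alice = ["10:00", "11:30", "14:00", "16:00"]
bob   = ["09:00", "11:30", "16:00"]
print(common_slots(alice, bob))  # → ['11:30', '16:00']
```

The meeting request then proposes one of the returned slots to both parties.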

Other conferences have incorporated the Braindate platform into their event design and programming, including TED Women, Airbnb Open, Salesforce’s Dreamforce, and re:publica.

New this year at C2 Montreal, E-180 launched a Group Braindate scenario where a participant could schedule and host a meeting focusing on a specific topic with four other people. Interested parties would reserve one of the four open spaces, and then everyone participating could chat online within the specific Braindate page to curate the conversation ahead of time….

This year, Montreal-based PixMob developed a Web-based event platform for C2 called “Klik” — with the Braindate platform embedded in the user interface — instead of the typical, downloadable iOS/Android event app traditionally used for large tech/media/marketing conferences.

The obvious benefit of a Web-based event platform is that people don’t have to download another dedicated app to their phone solely for the duration of the live experience. The trade-off, however, was lag: there was a delay whenever attendees tried to access Klik, the site had to reload every time you moved from page to page, and the portal required users to log back in many times during the day.

That’s a big pain point when you’re running around an event floor trying to make quick decisions while paging through app sections on the fly, along with thousands of other people doing the same thing inside a highly congested space.

Attendees also couldn’t add personal meetings into the Klik event agenda, requiring them to shift constantly between the Klik schedule and their email schedule, and the content describing each session and activation was often minimal to the point of being useless….(More)”.

How Data Mining Facebook Messages Can Reveal Substance Abusers


Emerging Technology from the arXiv: “…Substance abuse is a serious concern. Around one in 10 Americans is a sufferer, and the disorder costs the American economy more than $700 billion a year in lost productivity, crime, and health-care costs. So a better way to identify people suffering from the disorder, and those at risk of succumbing to it, would be hugely useful.

Bickel and co say they have developed just such a technique, which allows them to spot sufferers simply by looking at their social media messages such as Facebook posts. The technique even provides new insights into the way abuse of different substances influences people’s social media messages.

The new technique comes from the analysis of data collected between 2007 and 2012 as part of a project that ran on Facebook called myPersonality. Users who signed up were offered various psychometric tests and given feedback on their scores. Many also agreed to allow the data to be used for research purposes.

One of these tests asked over 13,000 users with an average age of 23 about the substances they used. In particular, it asked how often they used tobacco, alcohol, or other drugs, and assessed each participant’s level of use. The users were then divided into groups according to their level of substance abuse.

This data set is important because it acts as a kind of ground truth, recording the exact level of substance use for each person.

The team next gathered two other Facebook-related data sets. The first was 22 million status updates posted by more than 150,000 Facebook users. The other was even larger: the “like” data associated with 11 million Facebook users.

Finally, the team worked out how these data sets overlapped. They found almost 1,000 users who were in all the data sets, just over 1,000 who were in the substance abuse and status update data sets, and 3,500 who were in the substance abuse and likes data sets.

These users with overlapping data sets provide rich pickings for data miners. If people with substance use disorders have certain unique patterns of behavior, it may be possible to spot these in their Facebook status updates or in their patterns of likes.

So Bickel and co got to work first by text mining most of the Facebook status updates and then data mining most of the likes data set. Any patterns they found, they then tested by looking for people with similar patterns in the remaining data and seeing if they also had the same level of substance use.
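The train-then-test workflow described above can be sketched in miniature: learn which "likes" skew toward self-reported users in the labelled overlap, then score held-out profiles with the learned weights. The study itself used far richer statistical models; the tiny difference-of-counts classifier and all the data below are invented purely to illustrate the workflow:

```python
# Simplified, hypothetical sketch of the train-then-test approach:
# weight each "like" by how strongly it skews toward labelled substance
# users, then classify held-out profiles by the sign of the summed weights.
from collections import Counter

def train(labelled):
    """Weight each like by its skew toward positive (user) labels."""
    pos, neg = Counter(), Counter()
    for likes, is_user in labelled:
        for like in likes:
            (pos if is_user else neg)[like] += 1
    return {like: pos[like] - neg[like] for like in pos | neg}

def predict(weights, likes):
    """Classify a held-out profile by the sign of its summed like-weights."""
    return sum(weights.get(like, 0) for like in likes) > 0

train_set = [  # invented toy data: (set of likes, self-reported user?)
    ({"late-night parties", "bar crawls"}, True),
    ({"bar crawls", "energy drinks"}, True),
    ({"yoga", "hiking"}, False),
    ({"hiking", "book club"}, False),
]
weights = train(train_set)
print(predict(weights, {"bar crawls", "book club"}))  # → True
```

The key point is the evaluation discipline: patterns are learned on one slice of the overlapping users and validated on the remainder, which is what lets the team quote predictive accuracy rather than mere correlation.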

The results make for interesting reading. The team says its technique was hugely successful. “Our best models achieved 86% for predicting tobacco use, 81% for alcohol use and 84% for drug use, all of which significantly outperformed existing methods,” say Bickel and co…. (More) (Full Paper: arxiv.org/abs/1705.05633: Social Media-based Substance Use Prediction).

How Twitter Is Being Gamed to Feed Misinformation


The New York Times: “…the biggest problem with Twitter’s place in the news is its role in the production and dissemination of propaganda and misinformation. It keeps pushing conspiracy theories — and because lots of people in the media, not to mention many news consumers, don’t quite understand how it works, the precise mechanism is worth digging into….Here’s how.

The guts of the news business.

One way to think of today’s disinformation ecosystem is to picture it as a kind of gastrointestinal tract…. Twitter often acts as the small bowel of digital news. It’s where political messaging and disinformation get digested, packaged and widely picked up for mass distribution to cable, Facebook and the rest of the world.

This role for Twitter has seemed to grow more intense during (and since) the 2016 campaign. Twitter now functions as a clubhouse for much of the news. It’s where journalists pick up stories, meet sources, promote their work, criticize competitors’ work and workshop takes. In a more subtle way, Twitter has become a place where many journalists unconsciously build and gut-check a worldview — where they develop a sense of what’s important and merits coverage, and what doesn’t.

This makes Twitter a prime target for manipulators: If you can get something big on Twitter, you’re almost guaranteed coverage everywhere….

Twitter is clogged with fake people.

For determined media manipulators, getting something big on Twitter isn’t all that difficult. Unlike Facebook, which requires people to use their real names, Twitter offers users essentially full anonymity, and it makes many of its functions accessible to outside programmers, allowing people to automate their actions on the service.

As a result, numerous cheap and easy-to-use online tools let people quickly create thousands of Twitter bots — accounts that look real, but that are controlled by a puppet master.

Twitter’s design also promotes a slavish devotion to metrics: Every tweet comes with a counter of Likes and Retweets, and users come to internalize these metrics as proxies for real-world popularity….

They may ruin democracy.

…. the more I spoke to experts, the more convinced I became that propaganda bots on Twitter might be a growing and terrifying scourge on democracy. Research suggests that bots are ubiquitous on Twitter. Emilio Ferrara and Alessandro Bessi, researchers at the University of Southern California, found that about a fifth of the election-related conversation on Twitter last year was generated by bots. Most users were blind to them; they treated the bots the same way they treated other users….

In a more pernicious way, bots give us an easy way to doubt everything we see online. In the same way that the rise of “fake news” gives the president cover to label everything “fake news,” the rise of bots might soon allow us to dismiss any online enthusiasm as driven by automation. Anyone you don’t like could be a bot; any highly retweeted post could be puffed up by bots….(More)”.

More professionalism, less populism: How voting makes us stupid, and what to do about it


Paper by Benjamin Wittes and Jonathan Rauch: “For several generations, political reform and rhetoric have been entirely one-directional: always more direct democracy, never less. The general belief holds that more public involvement will produce more representative and thus more effective and legitimate governance. But does increasing popular involvement in politics remedy the ills of our government culture? Is it the chicken soup of political reforms?

In a new report, “More professionalism, less populism: How voting makes us stupid, and what to do about it,” Brookings Senior Fellows Jonathan Rauch and Benjamin Wittes argue that the best way forward is to rebalance the reform agenda away from direct participation and toward intermediation and institutions. As the authors write, “Neither theory nor practice supports the idea that more participation will produce better policy outcomes, or will improve the public’s approbation of government, or is even attainable in an environment dominated by extreme partisans and narrow interest groups.”

Populism cannot solve our problems, Rauch and Wittes claim, because its core premises and reforms are self-defeating. Research has shown that voters are “irrationally biased and rationally ignorant,” and do not possess the specialized knowledge necessary to make complex policy judgments. Further, elections provide little by way of substantive guidance for policymakers and, even on its own terms, direct democracy is often unrepresentative. In the words of the authors, “By itself, building more direct input from the public into the functions of government is likely to lead to more fragmentation, more stalemate, more flawed policies—and, paradoxically, less effective representation.”

The authors are not advocating complacency about voter participation, much less for restricting or limiting voting: “We are arguing that participation is not enough, and that overinvesting in it neglects other, more promising paths.”

To truly repair American democracy, Rauch and Wittes endorse a resurgence of political institutions, such as political parties, and substantive professionals, such as career politicians and experts. Drawing on examples like the intelligence oversight community, the authors assert that these intermediaries actually make democracy more inclusive and more representative than direct participation can do by itself. “In complex policy spaces,” the authors write, “properly designed intermediary institutions can act more decisively and responsively on behalf of the public than an army of ‘the people’ could do on its own behalf, [and are] less likely to be paralyzed by factional disputes and distorted by special-interest manipulation.”…(More) (Read the full paper here).

Data Journalism: How Not To Be Wrong


Winny de Jong: “At the intersection of data and journalism, lots can go wrong. Merely taking precautions might not be enough….

Half True Is False

But this approach is not totally foolproof.

“In data journalism, we cannot settle for ‘half-true.’ Anything short of true is wrong – and we cannot afford to be wrong.” Unlike fact-checking websites such as Politifact, which invented ‘scales’ for truthfulness, from false to true and everything in between, data journalism should always be true.

No Pants on Fire: Politifact’s Truth-O-Meter.

True but Wrong

But even when your story is true, Gebeloff said, you could still be wrong. “You can do the math correctly, but get the context wrong, fail to acknowledge uncertainties or not describe your findings correctly.”

Fancy Math

When working on a story, journalists should consider whether they use “fancy math” – think statistics – or “standard math.” “Using fancy math you can explore complex relationships, but at the same time your story will be harder to explain.”…

Targets as a Source

…To make sure you’re not going to be wrong, you should share your findings. “Don’t just share findings with experts, share them with hostile experts too,” Gebeloff advises. “Use your targets as a source. If there’s a blowback, you want to know before publication – and include the blowback in the publication.”

How Not To Be Wrong Checklist

Here’s why you want to use this checklist, which is based on Gebeloff’s presentation: a half truth is false, and data journalism should always be true. But just being true is not enough. Your story can be mathematically true but wrong in context or explanation. You should want your stories to be true and not wrong.

  1. Check your data carefully:
    •  Pay attention to dates.
    •  Check for spelling and duplicates.
    •  Identify outliers.
    •  Statistical significance alone is not news.
    •  Prevent base year abuse: if something is a trend, it should be true in general, not just if you cherry-pick a base year.
    •  Make sure your data represents reality.
  2. As you work, keep a data diary that records what you’ve done and how you’ve done it. You should be able to reproduce your calculations.
  3. Make sure you explain the methods you used – your audience should be able to understand how you found the story.
  4. Play offense and defense simultaneously. Go for the maximum possible story, but at all times think of why you might be wrong, or what your target would say in response.
  5. Use your targets as a source to find blowbacks before publication.
  6. As part of the proofing process, create a footnotes file. Identify each fact and give it a number. Then, for each fact, list which document it came from, how you know it and the proof. Fix what needs to be fixed.
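The base-year check in step 1 is easy to automate: a claimed trend should hold from most candidate starting points, not just one. A minimal sketch (the series below is invented for illustration):

```python
# Minimal sketch of a "base year abuse" check: compute the percent change
# to the final year from every candidate base year. A robust trend holds
# from most base years; a cherry-picked one holds from only one.

def change_from_each_base(series):
    """Percent change from every candidate base year to the final year."""
    final = series[-1][1]
    return {year: round(100 * (final - value) / value, 1)
            for year, value in series[:-1]}

# Invented example data: (year, incident count).
crime = [(2010, 120), (2011, 90), (2012, 100), (2013, 105), (2014, 110)]
print(change_from_each_base(crime))
# → {2010: -8.3, 2011: 22.2, 2012: 10.0, 2013: 4.8}
# "Crime up 22%!" is only true if you pick 2011 as the base year.
```

If the sign or magnitude of the change swings wildly depending on the base year, the "trend" belongs in the footnotes file with a caveat, not in the headline.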

Additional links of interest: the slides of Robert Gebeloff’s how not to be wrong presentation, and the methodology notes and data from the “Race Behind Bars” series….(More)”