Facebook Is Making a Map of Everyone in the World


Robinson Meyer at The Atlantic: “Americans inhabit an intricately mapped world. Type “Burger King” into an online box, and Google will cough up a dozen nearby options, each keyed to a precise latitude and longitude.

But throughout much of the world, local knowledge stays local. While countries might conduct censuses, the data doesn’t go much deeper than the county or province level.

Take population data, for instance: More than 7.4 billion humans sprawl across this planet of ours. They live in dense urban centers, in small towns linked by farms, and alone on the outskirts of jungles. But no one’s sure where, exactly, many of them live.

Now, Facebook says it has mapped almost 2 billion people better than any previous project. The company’s Connectivity Labs announced this week that it created new, high-resolution population-distribution maps of 20 countries, most of which are developing. It won’t release most of the maps until later this year, but if they’re accurate, they will be the best-quality population maps ever made for most of those places.

The maps will be notable for another reason, too: If they’re accurate, they’ll signal the arrival of a new, AI-aided age of cartography.

In the rich world, reliable population information is taken for granted. But elsewhere, population-distribution maps have dozens of applications in different fields. Urban planners need to estimate city density so they can place and improve roads. Epidemiologists and public-health workers use them to track outbreaks or analyze access to health care. And after a disaster, population maps can be used (along with crisis mapping) to prioritize where emergency aid gets sent….(More)

#BuildHereNow


Crowdsourcing Campaign by Strong Towns: “Nearly every urban neighborhood in this country — whether small town or big city — has properties that could use a little love. This week at Strong Towns we’re talking about the federal rules that have made that love difficult to find, tilting the playing field so that capital and expertise flow away from walkable, mixed-use neighborhoods. Over eighty years of this distortion has created a lot of opportunity for Americans to make good, high-returning investments in our core cities and neighborhoods.

WE NEED YOUR HELP TO SHOW JUST HOW MUCH POTENTIAL IS OUT THERE.

We all know that empty lot, that underutilized building, that is just waiting for the right person to come along and knit it back into the fabric of the neighborhood. Imagine that right person could actually get the financing — that the rules weren’t rigged against them — and all they needed was your encouragement. This week, let’s provide that encouragement.

Let’s shine a huge spotlight on these spaces. They don’t need expensive utilities, a new road or a tax subsidy. They just need a fair shake.

HOW CAN I PARTICIPATE?

  • Get outside and take pictures of the vacant or underutilized properties in your town.
  • Upload your photos to Twitter or Instagram with the hashtag #BuildHereNow
  • Bonus points if you include the location and a suggestion of what you would like to see built there. (Note that turning on location services will also greatly aid us in mapping out these posts all over the country.)…(More)”

A Tale of Four Algorithms


At Slate: “Algorithms don’t just power search results and news feeds, shaping our experience of Google, Facebook, Amazon, Spotify, and Tinder. Algorithms are widely—and largely invisibly—integrated into American political life, policymaking, and program administration.

Algorithms can terminate your Medicaid benefits, exclude you from air travel, purge you from voter rolls, or predict if you are likely to commit a crime in the future. They make decisions about who has access to public services, who undergoes extra scrutiny, and where we target scarce resources.

But are all algorithms created equal? Does the kind of algorithm used by government agencies have anything to do with who it is aimed at?

Bias can enter algorithmic processes through many doors. Discriminatory data collection can mean extra scrutiny for whole communities, creating a feedback cycle of “garbage in, garbage out.” For example, much of the initial data that populated CalGang, an intelligence database used to target and track suspected gang members, was collected by the notorious Community Resources Against Street Hoodlums units of the LAPD, including in the scandal-ridden Rampart division. Algorithms can also mirror and reinforce entrenched cultural assumptions. For example, as Wendy Hui Kyong Chun has written, Googling “Asian + woman” a decade ago turned up more porn sites in the first 10 hits than a search for “pornography.”

But can automated policy decisions be class-biased? Let’s look at four algorithmic systems dedicated to one purpose—identifying and decreasing fraud, waste, and abuse in federal programs—each aimed at a different economic class. We’ll investigate the algorithms in terms of their effectiveness at protecting key American political values—efficacy, transparency, fairness, and accountability—and see which ones make the grade.


Below, I’ve scored each of the four policy algorithms on a scale of 1 to 5, 1 being very low and 5 being high…

Of course this ad hoc survey is merely suggestive, not conclusive. But it indicates a reality that those of us who talk about data-driven policy rarely address: All algorithms are not created equal. Policymakers and programmers make inferences about their targets that get baked into the code of both legislation and high-tech administrative tools—that SNAP recipients are sneakier than other people and deserve less due process protection, for example….(More)

Public-Private Partnerships for Statistics: Lessons Learned, Future Steps


Report by Nicholas Robin, Thilo Klein and Johannes Jütting for Paris 21: “Non-official sources of data, big data in particular, are currently attracting enormous interest in the world of official statistics. An impressive body of work focuses on how different types of big data (telecom data, social media, sensors, etc.) can be used to fill specific data gaps, especially with regard to the post-2015 agenda and the associated technology challenges. The focus of this paper is on a different aspect, but one that is of crucial importance: what are the perspectives of the commercial operations and national statistical offices which respectively produce and might use this data and which incentives, business models and protocols are needed in order to leverage non-official data sources within the official statistics community?

Public-private partnerships (PPPs) offer significant opportunities such as cost effectiveness, timeliness, granularity and new indicators, but also present a range of challenges that need to be surmounted. These comprise technical difficulties, risks related to data confidentiality as well as a lack of incentives.

Nevertheless, a number of collaborative projects have already emerged and can be classified into four ideal types: namely the in-house production of statistics by the data provider, the transfer of private data sets to the end user, the transfer of private data sets to a trusted third party for processing and/or analysis, and the outsourcing of national statistical office functions (the only model which is not centred around a data-sharing dimension). In developing countries, a severe lack of resources and particular statistical needs (to adopt a system-wide approach within national statistical systems and fill statistical gaps which are relevant to national development plans) highlight the importance of harnessing the private sector’s resources and point to the most holistic models (in-house and third party) in which the private sector contributes to the processing and analysis of data. The following key lessons are drawn from four case studies….(More)”

Open Data Button


Open Access Button: “Hidden data is hindering research, and we’re tired of it. Next week we’ll release the Open Data Button beta as part of Open Data Day. The Open Data Button will help people find, release, and share the data behind papers. We need your support to share, test, and improve the Open Data Button. Today, we’re going to provide some in depth info about the tool.

You’ll be able to download the free Open Data Button on the 29th of February. Follow the launch conversation on Twitter and at #opendatabutton.

How the Open Data Button works

You will be able to download the Open Data Button on Chrome, and later on Firefox. When you need the data supporting a paper (even if it’s behind a paywall), push the Button. If the data has already been made available through the Open Data Button, we’ll give you a link. If it hasn’t, you’ll be able to start a request for the data. Eventually, we want to search a variety of other sources for it – but can’t yet (read on, we need your help with that).

The request will be sent to the author. We know sharing data can be hard and there’s sometimes good reasons not to. The author will be able to respond to it by saying how long it’ll take to share the data – or if they can’t. If the data is already available, the author can simply share a URL to the dataset. If it isn’t, they can attach files to a response for us to make available. Files shared with us will be deposited in the Open Science Framework for identification and archiving. The Open Science Framework supports data sharing for all disciplines. As much metadata as possible will be obtained from the paper, the rest we’ll ask the author for.
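The request lifecycle described above — check whether the data has already been shared, otherwise open a request that supporters can join until the author responds with a link — can be sketched as a toy model. All names here (`push_button`, `author_responds`, the in-memory dictionaries) are hypothetical illustrations of the workflow as described, not the Open Data Button’s actual code:

```python
from dataclasses import dataclass, field

@dataclass
class DataRequest:
    """One open request for the data behind a paper."""
    paper_id: str
    supporters: list = field(default_factory=list)  # users awaiting the data
    data_url: str = None
    status: str = "open"

KNOWN_DATA = {}   # paper_id -> URL of data already shared via the Button
REQUESTS = {}     # paper_id -> the (single) DataRequest for that paper

def push_button(paper_id, user):
    """Pushing the Button: return a link if the data is known,
    otherwise open (or join) a request sent to the author."""
    if paper_id in KNOWN_DATA:
        return KNOWN_DATA[paper_id]
    req = REQUESTS.setdefault(paper_id, DataRequest(paper_id))
    req.supporters.append(user)
    return req

def author_responds(paper_id, url):
    """Author shares a URL (or files deposited in an archive):
    close the request and notify every supporter with the link."""
    req = REQUESTS[paper_id]
    req.data_url = url
    req.status = "fulfilled"
    KNOWN_DATA[paper_id] = url
    return [(user, url) for user in req.supporters]
```

For example, two users pressing the Button on the same paper join one request, and once the author responds both receive the link; any later press returns the URL immediately.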

The progress of this request is tracked through our new “request” pages. On request pages others can support a request and be sent a copy of the data when it’s available. We’ll map requests, and stories will be searchable – both will now be embeddable objects.

Once available, we’ll send data to people who’ve requested it. You can award an Open Data Badge to the author if there’s enough supporting information to reproduce the data’s results.

At first we’ll only have a Chrome add-on, but support for Firefox will be available from Firefox 46. Support for a bookmarklet will also be provided, but we don’t have a release date yet….(More)”

 

The Geography of Cultural Ties and Human Mobility: Big Data in Urban Contexts


Wenjie Wu, Jianghao Wang & Tianshi Dai in Annals of the American Association of Geographers: “A largely unexplored big data application in urban contexts is how cultural ties affect human mobility patterns. This article explores China’s intercity human mobility patterns from social media data to contribute to our understanding of this question. Exposure to human mobility patterns is measured by big data computational strategy for identifying hundreds of millions of individuals’ space–time footprint trajectories. Linguistic data are coded as a proxy for cultural ties from a unique geographically coded atlas of dialect distributions. We find that cultural ties are associated with human mobility flows between city pairs, contingent on commuting costs and geographical distances. Such effects are not distributed evenly over time and space, however. These findings present useful insights in support of the cultural mechanism that can account for the rise, decline, and dynamics of human mobility between regions….(More)”

Hoaxmap: Debunking false rumours about refugee ‘crimes’


Teo Kermeliotis at AlJazeera: “Back in the summer of 2015, at the height of the ongoing refugee crisis, Karolin Schwarz started noticing a disturbing pattern.

Just as refugee arrivals in her town of Leipzig, eastern Germany, began to rise, so did the frequency of rumours over supposed crimes committed by those men, women and children who had fled war and hardship to reach Europe.

As months passed by, the allegations became even more common, increasingly popping up in social media feeds and often reproduced by mainstream news outlets.

The online map featured some 240 incidents in its first week [Source: Hoaxmap/Al Jazeera]

 

“The stories seemed to be [orchestrated] by far-right parties and organisations and I wanted to try to find some way to help organise this – maybe find patterns and give people a tool to look up these stories [when] they were being confronted with new ones.”

And so she did.

Along with 35-year-old developer Lutz Helm, Schwarz launched last week Hoaxmap, an online platform that allows people to separate fact from fiction by debunking false rumours about supposed crimes committed by refugees.

Using an interactive system of popping dots, the map documents and categorises where those “crimes” allegedly took place. It then counters that false information with official statements from the police and local authorities, as well as news reports in which the allegations have been disproved. The debunked cases marked on the map range from thefts and assaults to manslaughter – but one of the most common topics is rape, Schwarz said….(More)”

Data Could Help Scholars Persuade, If Only They Were Willing to Use It


Paul Basken at the Chronicle of Higher Education: “Thanks to what they’ve learned from university research, consultants like Matthew Kalmans have become experts in modern political persuasion. A co-founder of Applecart, a New York data firm, Mr. Kalmans specializes in shaping societal attitudes by using advanced analytical techniques to discover and exploit personal connections and friendships. His is one of a fast-growing collection of similar companies now raising millions of dollars, fattening businesses, and aiding political campaigns with computerized records of Facebook exchanges, high-school yearbooks, even neighborhood gossip.

Applecart uses that data to try to persuade people on a range of topics by finding voices they trust to deliver endorsements. “You can use this sort of technology to get people to purchase insurance at higher rates, get people to purchase a product, get people to do all sorts of other things that they might otherwise not be inclined to do,” said Mr. Kalmans, a 2014 graduate of the University of Pennsylvania. And in building such a valuable service, he’s found that the intellectual underpinnings are often free. “We are constantly reading academic papers to get ideas on how to do things better,” Mr. Kalmans said. That’s because scholars conduct the field experiments and subsequent tests that Mr. Kalmans needs to build and refine his models. “They do a lot of the infrastructural work that, frankly, a lot of commercial companies don’t have the in-house expertise to do,” he said of university researchers. Yet the story of Applecart stands in contrast to the dominant attitude and approach among university researchers themselves. Universities are full of researchers who intensively study major global problems such as environmental destruction and societal violence, then stop short when their conclusions point to the need for significant change in public behavior.

Some in academe consider that boundary a matter of principle rather than a systematic failure or oversight. “The one thing that we have to do is not be political,” Michael M. Crow, the usually paradigm-breaking president of Arizona State University, said this summer at a conference on academic engagement in public discourse. “Politics is a process that we are informing. We don’t have to be political to inform politicians or political actors.” But other academics contemplate that stance and see a missed opportunity to help convert the millions of taxpayer dollars spent on research into meaningful societal benefit. They include Dan M. Kahan, a professor of law and of psychology at Yale University who has been trying to help Florida officials cope with climate change. Mr. Kahan works with the four-county Southeast Florida Regional Climate Change Compact, which wants to redesign roads, expand public transit, and build pumping stations to prepare for harsher weather.

But Mr. Kahan says he and his Florida partners have had trouble getting enough policy makers to seriously consider the scale of the problem and the necessary solutions. It’s frustrating, Mr. Kahan said, to see so much university research devoted to work inside laboratories on problems like climate, and comparatively little spent on real-world needs such as sophisticated messaging strategies. “There really is a kind of deficit in the research relating to actually operationalizing the kinds of insights that people have developed from research,” he said.

That deficit appears to stem from academic culture, said Utpal M. Dholakia, a professor of marketing at Rice University whose work involves testing people’s self-control in areas such as eating and shopping. He then draws conclusions about whether regulations or taxes aimed at changing behaviors will be effective. Companies find advanced personal behavioral data highly useful, said Mr. Dholakia, who works on the side to help retailers devise sales strategies. But his university, he said, appears more interested in seeing him publish his findings than take the time to help policy makers make real-world use of them. “My dean gets very worried if I don’t publish a lot.” …(More)

Donating Your Selfies to Science


Linda Poon at CityLab: “It’s not only your friends and family who follow your online selfies and group photos. Scientists are starting to look at them, too, though they’re more interested in what’s around you. In bulk, photos can reveal weather patterns across multiple locations, air quality of a place over time, the dynamics of a neighborhood—all sorts of information that helps researchers study cities.

At the Nanyang Technological University in Singapore, a research group is using crowdsourced photos to create a low-cost alternative to air-pollution sensors. Called AirTick, the smartphone app they’ve designed will collect photos from users and analyze how hazy the environment looks. It’ll then check each image against official air quality data, and through machine-learning the app will eventually be able to predict pollution levels based on an image alone.
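The pipeline AirTick describes — extract a haziness score from each photo, calibrate those scores against official air-quality readings, then predict pollution from an image alone — can be illustrated with a deliberately tiny sketch. The haze feature and the data here are invented for illustration; this is a stand-in for the idea, not AirTick’s actual algorithm, which uses machine learning on real images:

```python
from statistics import mean

def haze_score(pixels):
    """Crude haze proxy: hazy scenes tend to be bright and low-contrast,
    so score = mean brightness minus mean absolute spread (0-255 pixels)."""
    m = mean(pixels)
    spread = mean(abs(p - m) for p in pixels)
    return m - spread

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (the 'calibration' step)."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Toy "crowdsourced" training data: pixel samples from photos, paired with
# the official air-quality index reported when each photo was taken.
photos = [[40, 200, 90, 180], [120, 140, 130, 150], [160, 170, 165, 175]]
official_aqi = [30, 90, 150]

scores = [haze_score(p) for p in photos]
a, b = fit_line(scores, official_aqi)

# Predict pollution for a new photo from its haze score alone.
predicted_aqi = a * haze_score([150, 160, 155, 165]) + b
```

The real system would replace the one-number haze proxy with learned image features and the straight line with a trained model, but the structure — crowdsourced images in, official readings as labels, predictions out — is the same.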

AirTick creator Pan Zhengziang said in a promotional video last month that the growing concern among the public over air quality can make programs like this a success—especially in Southeast Asia, where smog has gotten so bad that governments have had to shut down schools and suspend outdoor activities.  “In Singapore’s recent haze episode, around 250,000 people [have] shared their concerns via Twitter,” he said. “This has made crowdsourcing-based air quality monitoring a possibility.”…(More)”

The Point of Collection


Essay by Mimi Onuoha: “The conceptual, practical, and ethical issues surrounding “big data” and data in general begin at the very moment of data collection. Particularly when the data concern people, not enough attention is paid to the realities entangled within that significant moment and spreading out from it.

I try to do some disentangling here, through five theses around data collection — points that are worth remembering, communicating, thinking about, dwelling on, and keeping in mind, if you have anything to do with data on a daily basis (read: all of us) and want to do data responsibly.

1. Data sets are the results of their means of collection.

It’s easy to forget that the people collecting a data set, and how they choose to do it, directly determines the data set….

2. As we collect more data, we prioritize things that fit patterns of collection.

Or as Rob Kitchin and Martin Dodge say in Code/Space, “The effect of abstracting the world is that the world starts to structure itself in the image of the capta and the code.” Data emerges from a world that is increasingly software-mediated, and software thrives on abstraction. It flattens out individual variations in favor of types and models….

3. Data sets outlive the rationale for their collection.

Spotify can come up with a list of reasons why having access to users’ photos, locations, microphones, and contact lists can improve the music streaming experience. But the reasons why they decide these forms of data might be useful can be less important than the fact that they have the data itself. This is because the needs or desires influencing the decisions to collect some type of data often eventually disappear, while the data produced as a result of those decisions have the potential to live for much longer. The data are capable of shifting and changing according to specific cultural contexts and of playing different roles than those they might initially have been intended for….

4. Corollary: Especially combined, data sets reveal far more than intended.

We sometimes fail to realize that data sets, both on their own and combined with others, can be used to do far more than what they were originally intended for. You can make inferences from one data set that result in conclusions in completely different realms. Facebook, by having huge amounts of data on people and their networks, could make reasonable hypotheses regarding people’s sexual orientations….

5. Data collection is a transaction that is the result of an invisible relationship.

This is a frame — connected to my first point — useful for understanding how to think about data collection on the whole:

Every data set involving people implies subjects and objects, those who collect and those who make up the collected. It is imperative to remember that on both sides we have human beings….(More)”