How to See Gentrification Coming


Nathan Collins at Pacific Standard: “Depending on whom you ask, gentrification is either damaging, not so bad, or maybe even good for the low-income people who live in what we euphemistically call up-and-coming neighborhoods. Either way, it’d be nice for everybody to know which neighborhoods are going to get revitalized/eviscerated next. Now, computer scientists think they’ve found a way to do exactly that: Using Twitter and Foursquare, map the places visited by the most socially diverse crowds. Those, it turns out, are the most likely to gentrify.

Led by University of Cambridge graduate student Desislava Hristova, the researchers began their study by mapping out the social network of 37,722 Londoners who posted Foursquare check-ins via Twitter. Two people were presumed to be friends—connected on the social network—if they followed each other’s Twitter feeds. Next, Hristova and her colleagues built a geographical network of 42,080 restaurants, clubs, shops, apartments, and so on. Quaint though it may seem, the researchers treated two places as neighbors in the geographical network if they were, in fact, physically near each other. The team then linked the social and geographical networks using 549,797 Foursquare check-ins, each of which ties a person in the social network to a place in the geographical one.
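
A minimal sketch of that construction, with illustrative names and data rather than anything from the study, might look like this in Python:

```python
import networkx as nx

# Social layer: an edge means two users follow each other on Twitter.
social = nx.Graph()
social.add_edge("user_a", "user_b")

# Geographic layer: an edge means two venues are physically close.
geographic = nx.Graph()
geographic.add_edge("cafe_1", "gallery_2")

# Check-ins link the layers: each one ties a user in the social
# network to a venue in the geographic network.
checkins = [("user_a", "cafe_1"), ("user_b", "cafe_1"),
            ("user_b", "gallery_2")]
linking = nx.Graph()
linking.add_edges_from(checkins)
```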

Gentrification doesn’t start when outsiders move in; it starts when outsiders come to visit.

Using the network data, the team next constructed several measures of the social diversity of places, each of which helps distinguish between places that bring together friends versus strangers, and to distinguish between spots that attract socially diverse crowds versus a steady group of regulars. Among other things, those measures showed that places in the outer boroughs of London brought together more socially homogenous groups of people—in terms of their Foursquare check-ins, at least—compared with boroughs closer to the core.
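
The paper's actual metrics are more elaborate, but the two intuitions named above (friends versus strangers, a steady group of regulars versus a diverse crowd) can be sketched simply; the function and variable names below are illustrative, not the authors':

```python
from itertools import combinations
import math

def stranger_ratio(visitors, friendships):
    """Fraction of visitor pairs who are NOT friends: high values
    mean a place brings together strangers rather than friends."""
    pairs = list(combinations(set(visitors), 2))
    if not pairs:
        return 0.0
    strangers = sum(1 for u, v in pairs
                    if frozenset((u, v)) not in friendships)
    return strangers / len(pairs)

def visitor_entropy(checkin_counts):
    """Shannon entropy of check-ins across distinct visitors: low for
    a steady group of regulars, high for a socially diverse crowd."""
    total = sum(checkin_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in checkin_counts.values())

friendships = {frozenset(("ana", "ben"))}
print(stranger_ratio(["ana", "ben", "caro"], friendships))  # ~0.67
print(visitor_entropy({"ana": 8, "ben": 1, "caro": 1}))     # ~0.92 bits
```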

But the real question is what social diversity has to do with gentrification. To measure that, the team used the United Kingdom’s Index of Multiple Deprivation, which takes into account income, education, environmental factors such as air quality, and more to quantify the socioeconomic state of affairs in localities across the U.K., including each of London’s 32 boroughs.

The rough pattern, according to the analysis: The most socially diverse places in London were also the most deprived. This is roughly the opposite of what you’d expect from studies of social networks in isolation from geography, which indicate that, generally, the people with the most diverse social networks are the most prosperous….(More)”

A ‘design-thinking’ approach to governing the future


Bronwyn van der Merwe at The Hill: “…Government organizations are starting to realize the benefits of digital transformation to reinvent the citizen experience in the form of digital services tailored to individual needs. However, public service leaders are finding that as they move further into the digital age, they need to re-orient their internal organizations around this paradigm shift, or their investments in digital are likely to fail. This is where Design Thinking comes into play.

Design Thinking has become a proven approach to reimagining complex service or organizational issues in the private sector. This approach of user research, rapid prototyping, constant feedback and experimentation is starting to take hold in leading businesses like Citrix Systems, eBay and Google, and is slowly spilling over into government bodies.

Challenges to Adopting a Design-Led Approach

Success in implementing Design Thinking depends on disrupting embedded organizational beliefs and practices, including cultural shifts, changing attitudes toward risk and failure, and encouraging openness and collaboration. Specifically, government bodies need to consider:

  • Top to bottom support – any change as wide-ranging as the shift to Design Thinking requires support from the top. Those at the top of design-led organizations need to be experimenters, improvisers and networkers who lead by example and set the tone for change on the ground.
  • Design skills gap – talent to execute innovation is in short supply and few governments are in a financial position to outbid private sector firms on pay. But the public sector does have something to offer that private companies most often do not: the ability to do meaningful work for the public good. Public sector bodies also need to upskill their current employees – at times partnering with outside design experts.
  • No risk, no reward – for government agencies, it can be challenging to embrace a culture of trial and error. But Design Thinking is useless without Design Doing. Agencies need to recognize the benefits of agile prototyping, iterating and optimizing processes, and that failing early can save millions while costing little.

What Can Government Bodies Do to Change?

Digital has paved the way for governments and the private sector to occasionally partner to solve thorny challenges. For instance, the White House brought together the U.N. Refugee Agency and crowdfunding platform Kickstarter to raise money for the Syrian relief effort. The weeklong partnership raised nearly $1.8 million for more than 7,000 people in need.

But to effectively communicate with today’s digitally-enabled citizens, there are several key principles government bodies must follow:

  • Plain and simple – use simple language focused on content, structure, navigation, grouping and completion. Strip away bureaucratic government-speak and be transparent.
  • Take an outside-in design approach – by considering the entire ecosystem and using research to uncover insights, service design reveals an outside-in view of the people within it.
  • Be sensitive – too many government services, tools and processes are opaque and cumbersome when dealing with sensitive issues, such as immigration, making a tax submission, or adopting a child. Fjord recently took a human-centered design framework to the State of Michigan by designing a system that allowed caseworkers to convey the fairness of a child support order, while delivering excellent customer service and increasing transparency and accuracy to families in the midst of an emotionally-charged separation.
  • Work to digitize processes and services across departments – Governments should look to organize their digital services around the needs of the people – whether they are starting a business, retiring or having a child – rather than around their own departmental structures.
  • Address privacy concerns – The assurance of privacy and security is a critical step to encourage adoption of digital channels….(More)”

What Should We Do About Big Data Leaks?


Paul Ford at the New Republic: “I have a great fondness for government data, and the government has a great fondness for making more of it. Federal elections financial data, for example, with every contribution identified, connected to a name and address. Or the results of the census. I don’t know if you’ve ever had the experience of downloading census data but it’s pretty exciting. You can hold America on your hard drive! Meditate on the miracles of zip codes, the way the country is held together and addressable by arbitrary sets of digits.

You can download whole books, in PDF format, about the foreign policy of the Reagan Administration as it related to Russia. Negotiations over which door the Soviet ambassador would use to enter a building. Gigabytes and gigabytes of pure joy for the ephemeralist. The government is the greatest creator of ephemera ever.

Consider the Financial Crisis Inquiry Commission, or FCIC, created in 2009 to figure out exactly how the global economic pooch was screwed. The FCIC has made so much data, and has done an admirable job (caveats noted below) of arranging it. So much stuff. There are reams of treasure on a single FCIC web site, hosted at Stanford Law School: Hundreds of MP3 files, for example, with interviews with Jamie Dimon of JPMorgan Chase and Lloyd Blankfein of Goldman Sachs. I am desperate to find time to write some code that automatically extracts random audio snippets from each and puts them on top of a slow ambient drone with plenty of reverb, so that I can relax to the dulcet tones of the financial industry explaining away its failings. (There’s a Paul Krugman interview that I assume is more critical.)

The recordings are just the beginning. They’ve released so many documents, and with the documents, a finding aid that you can download in handy PDF format, which will tell you where to, well, find things, pointing to thousands of documents. That aid alone is 1,439 pages.

Look, it is excellent that this exists, in public, on the web. But it also presents a very contemporary problem: What is transparency in the age of massive database drops? The data is available, but locked in MP3s and PDFs and other documents; it’s not searchable in the way a web page is searchable, not easy to comment on or share.

Consider the WikiLeaks release of State Department cables. They were exhausting, there were so many of them, they were in all caps. Or the trove of data Edward Snowden gathered on a USB drive, or Chelsea Manning on CD. And the Ashley Madison leak, spread across database files and logs of credit card receipts. The massive and sprawling Sony leak, complete with whole email inboxes. And with the just-released Panama Papers, we see two exciting new developments: First, the consortium of media organizations that managed the leak actually came together and collectively, well, branded the papers, down to a hashtag (#panamapapers), informational website, etc. Second, the size of the leak itself—2.5 terabytes!—became a talking point, even though the exact description of what was contained within those terabytes was harder to understand. This, said the consortium of journalists that notably did not include The New York Times, The Washington Post, etc., is the big one. Stay tuned. And we are. But the fact remains: These artifacts are not accessible to any but the most assiduous amateur conspiracist; they’re the domain of professionals with the time and money to deal with them. Who else could be bothered?

If you watched the movie Spotlight, you saw journalists at work, pawing through reams of documents, going through, essentially, phone books. I am an inveterate downloader of such things. I love what they represent. And I’m also comfortable with many-gigabyte corpora spread across web sites. I know how to fetch data, how to consolidate it, and how to search it. I share this skill set with many data journalists, and these capacities have, in some ways, become the sole province of the media. Organs of journalism are among the only remaining cultural institutions that can fund investigations of this size and tease the data apart, identifying linkages and thus constructing informational webs that can, with great effort, be turned into narratives, yielding something like what we call “a story” or “the truth.” 

Spotlight was set around 2001, and it features a lot of people looking at things on paper. The problem has changed greatly since then: The data is everywhere. The media has been forced into a new cultural role, that of the arbiter of the giant and semi-legal database. ProPublica, a nonprofit that does a great deal of data gathering and data journalism and then shares its findings with other media outlets, is one example; it funded a project called DocumentCloud with other media organizations that simplifies the process of searching through giant piles of PDFs (e.g., court records, or the results of Freedom of Information Act requests).

At some level the sheer boredom and drudgery of managing these large data leaks make them immune to casual interest; even the Ashley Madison leak, which I downloaded, was basically an opaque pile of data and really quite boring unless you had some motive to poke around.

If this is the age of the citizen journalist, or at least the citizen opinion columnist, it’s also the age of the data journalist, with the news media acting as product managers of data leaks, making the information usable, browsable, attractive. There is an uneasy partnership between leakers and the media, just as there is an uneasy partnership between the press and the government, which would like some credit for its efforts, thank you very much, and wouldn’t mind if you gave it some points for transparency while you’re at it.

Pause for a second. There’s a glut of data, but most of it comes to us in ugly formats. What would happen if the things released in the interest of transparency were released in actual transparent formats?…(More)”

Can Data Literacy Protect Us from Misleading Political Ads?


Walter Frick at Harvard Business Review: “It’s campaign season in the U.S., and politicians have no compunction about twisting facts and figures, as a quick skim of the fact-checking website Politifact illustrates.

Can data literacy guard against the worst of these offenses? Maybe, according to research.

There is substantial evidence that numeracy can aid critical thinking, and some reason to think it can help in the political realm, within limits. But there is also evidence that numbers can mislead even data-savvy people when it’s in service of those people’s politics.

In a study published at the end of last year, Vittorio Merola of Ohio State University and Matthew Hitt of Louisiana State examined how numeracy might guard against partisan messaging. They showed participants information comparing the costs of probation and prison, and then asked whether participants agreed with the statement, “Probation should be used as an alternative form of punishment, instead of prison, for felons.”

Some of the participants were shown highly relevant numeric information arguing for the benefits of probation: that it costs less and has a better cost-benefit ratio, and that the cost of U.S. prisons has been rising. Another group was shown weaker, less-relevant numeric information. This message didn’t contain anything about the costs or benefits of probation, and instead compared prison costs to transportation spending, with no mention of why these might be at all related. The experiment also varied whether the information was supposedly from a study commissioned by Democrats or Republicans.

The researchers scored participants’ numeracy by asking questions like, “The chance of getting a viral infection is 0.0005. Out of 10,000 people, about how many of them are expected to get infected?”
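
(The expected count is just the rate times the population: 0.0005 × 10,000 = 5 infections.)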

For participants who scored low in numeracy, their support depended more on the political party making the argument than on the strength of the data. When the information came from those participants’ own party, they were more likely to agree with it, no matter whether it was weak or strong.

By contrast, participants who scored higher in numeracy were persuaded by the stronger numeric information, even when it came from the other party. The results held up even after accounting for participants’ education, among other variables….

In 2013, Dan Kahan of Yale and several colleagues conducted a study in which they asked participants to draw conclusions from data. In one group, the data was about a treatment for skin rashes, a nonpolitical topic. Another group was asked to evaluate data on gun control, comparing crime rates for cities that have banned concealed weapons to cities that haven’t.

Additionally, in the skin rash group some participants were shown data indicating that the use of skin cream correlated with rashes getting better, while some were shown the opposite. Similarly, some in the gun control group were shown less crime in cities that have banned concealed weapons, while some were shown the reverse…. They found that highly numerate people did better than less-numerate ones in drawing the correct inference in the skin rash case. But comfort with numbers didn’t seem to help when it came to gun control. In fact, highly numerate participants were more polarized over the gun control data than less-numerate ones. The reason seemed to be that the numerate participants used their skill with data selectively, employing it only when doing so helped them reach a conclusion that fit with their political ideology.

Two other lines of research are relevant here.

First, work by Philip Tetlock and Barbara Mellers of the University of Pennsylvania suggests that numerate people tend to make better forecasts, including about geopolitical events. They’ve also documented that even very basic training in probabilistic thinking can improve one’s forecasting accuracy. And this approach works best, Tetlock argues, when it’s part of a whole style of thinking that emphasizes multiple points of view.

Second, two papers, one from the University of Texas at Austin and one from Princeton, found that partisan bias can be diminished with incentives: People are more likely to report factually correct beliefs about the economy when money is on the line….(More)”

New Orleans Gamifies the City Budget


Kelsey E. Thomas at Next City: “New Orleanians can try their hand at being “mayor for a day” with a new interactive website released by the Committee for a Better New Orleans Wednesday.

The Big Easy Budget Game uses open data from the city to allow players to create their own version of an operating budget. Players are given a digital $602 million and have to balance the budget — keeping in mind the government’s responsibilities, the previous year’s spending and their personal priorities.

Each department in the game has a minimum funding level (players can’t just quit funding public schools if they feel like it), and restricted funding, such as state or federal dollars, is off limits.
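
The article doesn’t publish the game’s code, but the rules it describes (departmental floors, restricted dollars excluded from play, and a total that must balance) are easy to picture. A hypothetical sketch, with all names and figures invented:

```python
def check_budget(allocations, floors, discretionary_total=602_000_000):
    """Validate a player's budget against rules like those described
    in the article (illustrative, not the game's actual logic)."""
    errors = []
    for dept, floor in floors.items():
        if allocations.get(dept, 0) < floor:
            errors.append(f"{dept} is below its minimum funding level")
    if sum(allocations.values()) != discretionary_total:
        errors.append("allocations do not balance to the available total")
    return errors  # an empty list means the budget is playable

# Example: public schools carry a floor the player cannot cut below.
print(check_budget({"schools": 50_000_000, "police": 552_000_000},
                   floors={"schools": 40_000_000}))  # []
```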

CBNO hopes to attract 600 players this year, and plans to compile the data from each player into a crowdsourced meta-budget called “The People’s Budget.” Next fall, the People’s Budget will be released along with the city’s proposed 2017 budget.

Along with the budgeting game, CBNO released a more detailed website, also using the city’s open data, that breaks down the city’s budgeted versus actual spending from 2007 to now and is filterable. The goal is to allow users without big data experience to easily research funding relevant to their neighborhoods.

Many cities have been releasing interactive websites to make their data more accessible to residents. Checkbook NYC updates more than $70 billion in city expenses daily and breaks them down by transaction. Fiscal Focus Pittsburgh is an online visualization tool that outlines revenues and expenses in the city’s budget….(More)”

Website Seeks to Make Government Data Easier to Sift Through


Steve Lohr at the New York Times: “For years, the federal government, states and some cities have enthusiastically made vast troves of data open to the public. Acres of paper records on demographics, public health, traffic patterns, energy consumption, family incomes and many other topics have been digitized and posted on the web.

This abundance of data can be a gold mine for discovery and insights, but finding the nuggets can be arduous, requiring special skills.

A project coming out of the M.I.T. Media Lab on Monday seeks to ease that challenge and to make the value of government data available to a wider audience. The project, called Data USA, bills itself as “the most comprehensive visualization of U.S. public data.” It is free, and its software code is open source, meaning that developers can build custom applications by adding other data.

Cesar A. Hidalgo, an assistant professor of media arts and sciences at the M.I.T. Media Lab who led the development of Data USA, said the website was devised to “transform data into stories.” Those stories are typically presented as graphics, charts and written summaries….Type “New York” into the Data USA search box, and a drop-down menu presents choices — the city, the metropolitan area, the state and other options. Select the city, and the page displays an aerial shot of Manhattan with three basic statistics: population (8.49 million), median household income ($52,996) and median age (35.8).

Lower on the page are six icons for related subject categories, including economy, demographics and education. If you click on demographics, one of the so-called data stories appears, based largely on data from the American Community Survey of the United States Census Bureau.

Using colorful graphics and short sentences, it shows the median age of foreign-born residents of New York (44.7) and of residents born in the United States (28.6); the most common countries of origin for immigrants (the Dominican Republic, China and Mexico); and the percentage of residents who are American citizens (82.8 percent, compared with a national average of 93 percent).

Data USA features a selection of data results on its home page. They include the gender wage gap in Connecticut; the racial breakdown of poverty in Flint, Mich.; the wages of physicians and surgeons across the United States; and the institutions that award the most computer science degrees….(More)

Evaluating e-Participation: Frameworks, Practice, Evidence


Book edited by Georg Aichholzer, Herbert Kubicek and Lourdes Torres: “There is a widely acknowledged evaluation gap in the field of e-participation practice and research, a lack of systematic evaluation with regard to process organization, outcome and impacts. This book addresses the state of the art of e-participation research and the existing evaluation gap by reviewing various evaluation approaches and providing a multidisciplinary concept for evaluating the output, outcome and impact of citizen participation via the Internet as well as via traditional media. It offers new knowledge based on empirical results of its application (tailored to different forms and levels of e-participation) in an international comparative perspective. The book will advance the academic study and practical application of e-participation through fresh insights, largely drawing on theoretical arguments and empirical research results gained in the European collaborative project “e2democracy”. It applies the same research instruments to a set of similar citizen participation processes in seven local communities in three countries (Austria, Germany and Spain). The generic evaluation framework has been tailored to a tested toolset, and the presentation and discussion of related evaluation results aims at clarifying to what extent these tools can be applied to other consultation and collaboration processes, making the book of interest to policymakers and scholars alike….(More)”

The Curious Journalist’s Guide to Data


New book by The Tow Center: “This is a book about the principles behind data journalism. Not what visualization software to use and how to scrape a website, but the fundamental ideas that underlie the human use of data. This isn’t “how to use data” but “how data works.”

This gets into some of the mathy parts of statistics, but also the difficulty of taking a census of race and the cognitive psychology of probabilities. It traces where data comes from, what journalists do with it, and where it goes after—and tries to understand the possibilities and limitations. Data journalism is as interdisciplinary as it gets, which can make it difficult to assemble all the pieces you need. This is one attempt. This is a technical book, and uses standard technical language, but all mathematical concepts are explained through pictures and examples rather than formulas.

The life of data has three parts: quantification, analysis, and communication. Quantification is the process that creates data. Analysis involves rearranging the data or combining it with other information to produce new knowledge. And none of this is useful without communicating the result.

Quantification is a problem without a home. Although physicists study measurement extensively, physical theory doesn’t say much about how to quantify things like “educational attainment” or even “unemployment.” There are deep philosophical issues here, but the most useful question to a journalist is simply, how was this data created? Data is useful because it represents the world, but we can only understand data if we correctly understand how it came to be. Representation through data is never perfect: all data has error. Randomly sampled surveys are both a powerful quantification technique and the prototype for all measurement error, so this report explains where the margin of error comes from and what it means – from first principles, using pictures.
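
The report does this with pictures; the standard formula behind those pictures (a textbook result, not quoted from the report) says that for a simple random sample of size n estimating a proportion p, the 95 percent margin of error is

$$\text{MoE} = 1.96\sqrt{\frac{p(1-p)}{n}} \le \frac{0.98}{\sqrt{n}},$$

taking the worst case p = 0.5. That is why a poll of about 1,000 people is routinely reported with a margin of roughly ±3 percentage points.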

All data analysis is really data interpretation, which requires much more than math. Data needs context to mean anything at all: Imagine if someone gave you a spreadsheet with no column names. Each data set could be the source of many different stories, and there is no objective theory that tells us which true stories are the best. But the stories still have to be true, which is where data journalism relies on established statistical principles. The theory of statistics solves several problems: accounting for the possibility that the pattern you see in the data was purely a fluke, reasoning from incomplete and conflicting information, and attempting to isolate causes. Stats has been taught as something mysterious, but it’s not. The analysis chapter centers on a single problem – asking if an earlier bar closing time really did reduce assaults in a downtown neighborhood – and traces through the entire process of analysis by explaining the statistical principles invoked at each step, building up to the state-of-the-art methods of Bayesian inference and causal graphs.
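
A toy version of that before-and-after question, with made-up assault counts and a simple grid posterior rather than the book’s actual data or methods, might look like this:

```python
import numpy as np
from scipy import stats

# Invented monthly assault counts before and after the earlier
# closing time took effect.
before = np.array([42, 38, 45, 40, 44, 39])
after = np.array([33, 35, 30, 36, 31, 34])

# Grid posterior for each Poisson rate under a flat prior:
# p(rate | counts) is proportional to the product of Poisson likelihoods.
grid = np.linspace(10, 70, 2001)

def posterior(counts):
    log_lik = stats.poisson.logpmf(counts[:, None], grid).sum(axis=0)
    p = np.exp(log_lik - log_lik.max())
    return p / p.sum()

# Draw from both posteriors and ask how often the post-change rate is
# lower: a rough posterior probability that assaults really fell.
rng = np.random.default_rng(0)
draws_before = rng.choice(grid, size=10_000, p=posterior(before))
draws_after = rng.choice(grid, size=10_000, p=posterior(after))
print("P(rate fell) ≈", (draws_after < draws_before).mean())
```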

A story isn’t finished until you’ve communicated your results. Data visualization works because it relies on the biology of human visual perception, just as all data communication relies on human cognitive processing. People tend to overestimate small risks and underestimate large risks; examples leave a much stronger impression than statistics; and data about some will, unconsciously, come to represent all, no matter how well you warn that your sample doesn’t generalize. If you’re not aware of these issues you can leave people with skewed impressions or reinforce harmful stereotypes. The journalist isn’t only responsible for what they put in the story, but what ends up in the mind of the audience.

This report brings together many fields to explore where data comes from, how to analyze it, and how to communicate your results. It uses examples from journalism to explain everything from Bayesian statistics to the neurobiology of data visualization, all in plain language with lots of illustrations. Some of these ideas are thousands of years old, some were developed only a decade ago, and all of them have come together to create the 21st century practice of data journalism….(More)”

Crowdsourcing Human Rights


Faisal Al Mutar at The World Post: “The Internet has also allowed activists to access information as never before. I recently joined the Movements.org team, a part of the New York-based organization, Advancing Human Rights. This new platform allows activists from closed societies to connect directly with people around the world with skills to help them. In the first month of its launch, thousands of activists from 92 countries have come to Movements.org to defend human rights.

Movements.org is a promising example of how technology can be utilized by activists to change the world. Dissidents from some of the most repressive dictatorships — Russia, Iran, Syria and China — are connecting with individuals from around the globe who have unique skills to aid them.

Here are just a few of the recent success stories:

  • A leading Saudi expert on combatting state-sponsored incitement in textbooks posted a request to speak with members of the German government due to their strict anti-hate-speech laws. A former foundation executive connected him with senior German officials.
  • A secular Syrian group posted a request for PR aid to explain to Americans that the opposition is not comprised solely of radical elements. The founder of a strategic communication firm based in Los Angeles responded and offered help.
  • A Yemeni dissident asked for help creating a radio station focused on youth empowerment. He was contacted by a Syrian dissident who set up Syrian radio programs to offer advice.
  • Journalists from leading newspapers offered to tell human rights stories and connected with activists from dictatorships.
  • A request was created for a song to commemorate the life of Sergei Magnitsky, a Russian tax lawyer who died in prison. An NYC-based songwriter created a beautiful song, and activists from Russia (including a member of Pussy Riot) filmed a music video of it.
  • North Korean defectors posted requests to get information in and out of their country and technologists posted offers to help with radio and satellite communication systems.
  • A former Iranian political prisoner posted a request to help sustain his radio station which broadcasts into Iran and helps keep information flowing to Iranians.

There are more and more cases every day….(More)

Accountable machines: bureaucratic cybernetics?


Alison Powell at LSE Media Policy Project Blog: “Algorithms are everywhere, or so we are told, and the black boxes of algorithmic decision-making make oversight of processes that regulators and activists argue ought to be transparent more difficult than in the past. But when, and where, and which machines do we wish to make accountable, and for what purpose? In this post I discuss how algorithms discussed by scholars are most commonly those at work on media platforms whose main products are the social networks and attention of individuals. Algorithms, in this case, construct individual identities through patterns of behaviour, and provide the opportunity for finely targeted products and services. While there are serious concerns about, for instance, price discrimination, algorithmic systems for communicating and consuming are, in my view, less inherently problematic than processes that impact on our collective participation and belonging as citizenship. In this second sphere, algorithmic processes – especially machine learning – combine with processes of governance that focus on individual identity performance to profoundly transform how citizenship is understood and undertaken.

Communicating and consuming

In the communications sphere, algorithms are what makes it possible to make money from the web, for example through advertising brokerage platforms that help companies bid for ads on major newspaper websites. IP address monitoring, which tracks clicks and web activity, creates detailed consumer profiles and transforms the everyday experience of communication into a constantly-updated production of consumer information. This process of personal profiling is at the heart of many of the concerns about algorithmic accountability. The perpetual production of data by individuals, combined with the increasing capacity to analyse it even when it doesn’t appear to be related, has certainly revolutionised advertising by allowing more precise targeting, but what has it done for areas of public interest?

John Cheney-Lippold identifies how the categories of identity are now developed algorithmically, since a category like gender is not based on self-disclosure, but instead on patterns of behaviour that fit with expectations set by previous alignment to a norm. In assessing ‘algorithmic identities’, he notes that these produce identity profiles which are narrower and more behaviour-based than the identities that we perform. This is a result of the fact that many of the systems that inspired the design of algorithmic systems were based on using behaviour and other markers to optimise consumption. Algorithmic identity construction has spread from the world of marketing to the broader world of citizenship – as evidenced by the Citizen Ex experiment shown at the Web We Want Festival in 2015.

Individual consumer-citizens

What’s really at stake is that the expansion of algorithmic assessment of commercially derived big data has extended the frame of the individual consumer into all kinds of other areas of experience. In a supposed ‘age of austerity’ when governments believe it’s important to cut costs, this connects with the view of citizens as primarily consumers of services, and furthermore, with the idea that a citizen is an individual subject whose relation to a state can be disintermediated given enough technology. So, with sensors on your garbage bins you don’t need to even remember to take them out. With pothole reporting platforms like FixMyStreet, a city government can be responsive to an aggregate of individual reports. But what aspects of our citizenship are collective? When, in the algorithmic state, can we expect to be together?

Put another way, is there any algorithmic process to value the long-term education, inclusion, and sustenance of a whole community, for example through library services?…

Seeing algorithms – machine learning in particular – as supporting decision-making for broad collective benefit rather than as part of ever more specific individual targeting and segmentation might make them more accountable. But more importantly, this would help algorithms support society – not just individual consumers….(More)”