How a New Science of Cities Is Emerging from Mobile Phone Data Analysis


MIT Technology Review: “Mobile phones have generated enormous insight into the human condition thanks largely to the study of the data they produce. Mobile phone companies record the time of each call, the caller and receiver IDs, as well as the locations of the cell towers involved, among other things.
The combined data from millions of people produces some fascinating new insights into the nature of our society. Anthropologists have crunched it to reveal human reproductive strategies, a universal law of commuting, and even the distribution of wealth in Africa.
Today, computer scientists have gone one step further by using mobile phone data to map the structure of cities and how people use them throughout the day. “These results point towards the possibility of a new, quantitative classification of cities using high resolution spatio-temporal data,” say Thomas Louail at the Institut de Physique Théorique in Paris and a few pals.
They say their work is part of a new science of cities that aims to objectively measure and understand the nature of large population centers.
These guys begin with a database of mobile phone calls made by people in the 31 Spanish cities that have populations larger than 200,000. The data consists of the number of unique individuals using a given cell tower (whether making a call or not) for each hour of the day over almost two months… The results reveal some fascinating patterns in city structure. For a start, every city undergoes a kind of respiration in which people converge into the center and then withdraw on a daily basis, almost like breathing. And this happens in all cities. This “suggests the existence of a single ‘urban rhythm’ common to all cities,” say Louail and co.
During the week, the number of phone users peaks at about midday and then again at about 6 p.m. During the weekend the numbers peak a little later: at 1 p.m. and 8 p.m. Interestingly, the second peak starts about an hour later in western cities, such as Sevilla and Cordoba.
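The aggregation behind such profiles is simple to sketch. Below is a minimal illustration assuming call-detail records in a pandas DataFrame with hypothetical user_id, tower_id and timestamp columns; it shows the general technique, not the authors’ actual pipeline.

```python
# Minimal sketch of the "urban rhythm" computation described above.
# Assumes a DataFrame of call-detail records with hypothetical columns
# user_id, tower_id, and a datetime64 timestamp; not the authors' code.
import pandas as pd

def hourly_rhythm(cdr: pd.DataFrame) -> pd.Series:
    """Average number of unique users active in each hour of the day."""
    cdr = cdr.assign(date=cdr["timestamp"].dt.date,
                     hour=cdr["timestamp"].dt.hour)
    # Unique individuals seen in the city per (day, hour)...
    per_slot = cdr.groupby(["date", "hour"])["user_id"].nunique()
    # ...averaged over all days to get a typical 24-hour profile.
    return per_slot.groupby("hour").mean()

def activity_peaks(profile: pd.Series) -> list:
    """Hours that are local maxima of the profile (e.g., ~12h and ~18h)."""
    return [profile.index[i] for i in range(1, len(profile) - 1)
            if profile.iloc[i] > profile.iloc[i - 1]
            and profile.iloc[i] > profile.iloc[i + 1]]
```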
The data also reveals that small cities, such as Salamanca and Vitoria, tend to have a single center that becomes busy during the day.
But it also shows that the number of hotspots increases with city size; so-called polycentric cities include Spain’s largest, such as Madrid, Barcelona, and Bilbao.
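Counting hotspots hour by hour then gives a crude polycentricity measure along the same lines. In the sketch below a tower counts as a hotspot when its activity exceeds the city-wide mean for that hour – an illustrative threshold, not necessarily the paper’s exact criterion.

```python
# Hedged sketch of a per-hour hotspot count, reusing the illustrative
# CDR layout above. The mean-activity threshold is an assumption.
import pandas as pd

def hotspot_counts(cdr: pd.DataFrame) -> pd.Series:
    """Average number of towers per hour whose unique-user count
    exceeds the city-wide mean for that (day, hour) slot."""
    cdr = cdr.assign(date=cdr["timestamp"].dt.date,
                     hour=cdr["timestamp"].dt.hour)
    activity = (cdr.groupby(["date", "hour", "tower_id"])["user_id"]
                   .nunique()
                   .rename("users")
                   .reset_index())
    slot_mean = activity.groupby(["date", "hour"])["users"].transform("mean")
    activity["is_hotspot"] = activity["users"] > slot_mean
    # Hotspots per (day, hour), then a typical count for each hour of day.
    return (activity.groupby(["date", "hour"])["is_hotspot"].sum()
                    .groupby("hour").mean())
```

Comparing the maximum of this curve across cities of different sizes would yield the kind of size-dependent hotspot count the authors describe.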
That could turn out to be useful for automatically classifying cities.
There is a growing interest in the nature of cities, the way they evolve and how their residents use them. The goal of this new science is to make better use of these spaces, which more than 50 percent of the planet’s population inhabits. Louail and co show that mobile phone data clearly has an important role to play in this endeavor to better understand these complex giants.
Ref: arxiv.org/abs/1401.4540: From Mobile Phone Data to the Spatial Structure of Cities”

Big Data and the Future of Privacy


John Podesta at the White House blog: “Last Friday, the President spoke to the American people, and the international community, about how to keep us safe from terrorism in a changing world while upholding America’s commitment to liberty and privacy that our values and Constitution require. Our national security challenges are real, but that is surely not the only space where changes in technology are altering the landscape and challenging conceptions of privacy.
That’s why in his speech, the President asked me to lead a comprehensive review of the way that “big data” will affect the way we live and work; the relationship between government and citizens; and how public and private sectors can spur innovation and maximize the opportunities and free flow of this information while minimizing the risks to privacy. I will be joined in this effort by Secretary of Commerce Penny Pritzker, Secretary of Energy Ernie Moniz, the President’s Science Advisor John Holdren, the President’s Economic Advisor Gene Sperling and other senior government officials.
I would like to explain a little bit more about the review, its scope, and what you can expect over the next 90 days.
We are undergoing a revolution in the way that information about our purchases, our conversations, our social networks, our movements, and even our physical identities is collected, stored, analyzed and used. The immense volume, diversity and potential value of data will have profound implications for privacy, the economy, and public policy. The working group will consider all those issues, and specifically how the present and future state of these technologies might motivate changes in our policies across a range of sectors.
When we complete our work, we expect to deliver to the President a report that anticipates future technological trends and frames the key questions that the collection, availability, and use of “big data” raise – both for our government, and the nation as a whole. It will help identify the technological changes to watch, assess whether those changes are addressed by the U.S.’s current policy framework, and highlight where further government action, funding, research and consideration may be required.
This is going to be a collaborative effort. The President’s Council of Advisors on Science and Technology (PCAST) will conduct a study to explore in-depth the technological dimensions of the intersection of big data and privacy, which will feed into this broader effort. Our working group will consult with industry, civil liberties groups, technologists, privacy experts, international partners, and other national and local government officials on the significance of and future for these technologies. Finally, we will be working with a number of think tanks, academic institutions, and other organizations around the country as they convene stakeholders to discuss these very issues and questions. Likewise, many abroad are analyzing and responding to the challenge and seizing the opportunity of big data. These discussions will help to inform our study.
While we don’t expect to answer all these questions, or produce a comprehensive new policy in 90 days, we expect this work to serve as the foundation for a robust and forward-looking plan of action. Check back on this blog for updates on how you can get involved in the debate and for status updates on our progress.”

Don’t believe the hype about behavioral economics


Allison Schrager: “I have a confession to make: I think behavioral economics is overrated. Recently, Nobelist Robert Shiller called on economists to incorporate more psychology into their work. While there are certainly things economists can learn from psychology and other disciplines to enrich their understanding of the economy, this approach is not a revolution in economics. Often the models that incorporate richer aspects of human behavior are the same models economists have always used—they simply rationalize seemingly irrational behavior. Even if we can understand why people don’t always act rationally, it’s not clear that this can lead to better economic policy and regulation.

Mixing behavioral economics and policy raises two questions: should we change behavior, and if so, can we? Sometimes people make bad choices—they under-save, or take on too much debt or risk. These behaviors appear irrational and lead to bad outcomes, which would seem to demand more regulation. But if these choices reflect individuals’ preferences and values, can we justify changing their behavior? Part of a free society is letting people make bad choices, as long as their irrational economic behavior doesn’t impose costs on others. For example, someone who under-saves may wind up dependent on taxpayers for financial support, and high household debt has been associated with a weaker economy.

It’s been argued that irrational economic behavior merits regulation to encourage or force choices that will benefit both the individual and the economy as a whole. But the limits of these policies are apparent in a new OECD report on the application of behavioral economics to policy. The report gives examples of regulations adopted by different OECD countries that draw on insights from behavioral economics. Thus it’s disappointing that, with all that economists have learned from studying behavioral economics over the last ten years, the big changes in regulation seem limited to more transparent fee disclosure, a ban on automatically selling people more goods than they explicitly ask for, and standard disclosures of fees and energy use. These are certainly good policies. But are they a result of behavioral economics (helping consumers overcome behavioral biases that lead to sub-optimal choices), or simply a matter of requiring banks and merchants to be more honest?

Poor risk management and short-term thinking on Wall Street nearly took down the entire financial system. Can what we know about behavioral finance help regulate Wall Street? According to Shiller, markets are inefficient and misprice assets because of behavioral biases (over-confidence, under-reaction to news, home bias). This leads to speculative bubbles. But it’s not clear what financial regulation can do to curb this behavior. According to Gene Fama, Shiller’s co-laureate, who believes markets are rational (disclosure: I used to work at Dimensional Fund Advisors, where Fama is a consultant and shareholder), it’s not possible to systematically separate “irrational” behavior (which distorts prices) from healthy speculation, which aids price discovery. If speculators (who have an enormous financial interest) don’t know better, how can we expect regulators to?…

So far, the most promising use of behavioral economics has been in retirement saving. Automatically enrolling people into a company pension plan and raising their saving rates has been found to increase savings—especially among people not inclined to save. That is probably why the OECD report concedes behavioral economics has had the biggest impact in retirement saving….

The OECD report cites some other new policies based on behavioral science, like the 2009 CARD Act in America. Credit card statements used to list only the minimum required payment, which people may have interpreted as a suggested payment plan, winding up taking years to pay off their balance and incurring large fees. Now, in the US, statements must include how much it will cost to pay off your balance within 36 months, as well as the time and cost required to repay your balance if you pay only the minimum. It’s still too early to see how this will impact behavior, but a 2013 study suggests it will offer modest savings to consumers, perhaps because the bias to undervalue the future still exists.
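The arithmetic behind those disclosures is easy to reproduce. The sketch below assumes a common industry formula for the minimum payment (monthly interest plus 1 percent of the balance, floored at $25); that formula is an assumption for illustration, not the statutory rule.

```python
# Illustrative minimum-payment amortization of the kind the CARD Act
# disclosures expose. The minimum-payment formula (interest plus 1% of
# the balance, floored at $25) is a common convention, assumed here.

def months_to_repay(balance: float, apr: float) -> tuple:
    """Months and total paid when making only minimum payments."""
    monthly_rate = apr / 12
    months, total_paid = 0, 0.0
    while balance > 0:
        interest = balance * monthly_rate
        payment = max(25.0, interest + 0.01 * balance)
        payment = min(payment, balance + interest)  # final payment
        balance = balance + interest - payment
        total_paid += payment
        months += 1
    return months, total_paid

def payment_for_36_months(balance: float, apr: float) -> float:
    """Level monthly payment that clears the balance in 36 months."""
    r = apr / 12
    return balance * r / (1 - (1 + r) ** -36)

# Example: a $3,000 balance at 18% APR. Minimum payments take many
# years and cost far more in interest than the fixed 36-month plan
# the statement must now display alongside them.
m, paid = months_to_repay(3000, 0.18)
print(m, round(paid, 2), round(payment_for_36_months(3000, 0.18), 2))
```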

But what’s striking from the OECD report is, when it comes to behavioral biases that contributed to the financial crisis (speculation on housing, too much housing debt, under-estimating risk), few policies have used what we’ve learned.”

Use big data and crowdsourcing to detect nuclear proliferation, says DSB


FierceGovernmentIT: “A changing set of counter-nuclear proliferation problems requires a paradigm shift in monitoring that should include big data analytics and crowdsourcing, says a report from the Defense Science Board.
Much has changed since the Cold War when it comes to ensuring that nuclear weapons are subject to international controls, meaning that monitoring in support of treaties covering declared capabilities should be only one part of overall U.S. monitoring efforts, says the board in a January report (.pdf).
There are challenges related to covert operations, such as testing calibrated to fall below detection thresholds, and non-traditional technologies that present ambiguous threat signatures. Knowledge about how to make nuclear weapons is widespread and in the hands of actors who will give the United States or its allies limited or no access….
The report recommends using a slew of technologies including radiation sensors, but also exploitation of digital sources of information.
“Data gathered from the cyber domain establishes a rich and exploitable source for determining activities of individuals, groups and organizations needed to participate in either the procurement or development of a nuclear device,” it says.
Big data analytics could be used to take advantage of the proliferation of potential data sources including commercial satellite imaging, social media and other online sources.
The report notes that the proliferation of readily available commercial satellite imagery has created concerns about the introduction of more noise than genuine signal. “On balance, however, it is the judgment from the task force that more information from remote sensing systems, both commercial and dedicated national assets, is better than less information,” it says.
In fact, the ready availability of commercial imagery should be an impetus for government to sharpen its ability to find weak signals “even within the most cluttered and noisy environments.”
Crowdsourcing also holds potential, although the report again notes that nuclear proliferation analysis by non-governmental entities “will constrain the ability of the United States to keep its options open in dealing with potential violations.” The distinction between gathering information and making political judgments “will erode.”
An effort by Georgetown University students (reported in the Washington Post in 2011) to use open-source data to analyze the network of tunnels China uses to hide its missile and nuclear arsenal provides a proof of concept of how crowdsourcing can augment limited analytical capacity, the report says – despite debate over the students’ work, which concluded that China’s arsenal could be many times larger than conventionally accepted…
For more:

  • download the DSB report, “Assessment of Nuclear Monitoring and Verification Technologies” (.pdf)

  • read the WaPo article on the Georgetown University crowdsourcing effort”

Introduction to Computational Social Science: Principles and Applications


New book by Claudio Cioffi-Revilla: “This reader-friendly textbook is the first work of its kind to provide a unified Introduction to Computational Social Science (CSS). Four distinct methodological approaches are examined in detail, namely automated social information extraction, social network analysis, social complexity theory and social simulation modeling. The coverage of these approaches is supported by a discussion of the historical context, as well as by a list of texts for further reading. Features: highlights the main theories of the CSS paradigm as causal explanatory frameworks that shed new light on the nature of human and social dynamics; explains how to distinguish and analyze the different levels of analysis of social complexity using computational approaches; discusses a number of methodological tools; presents the main classes of entities, objects and relations common to the computational analysis of social complexity; examines the interdisciplinary integration of knowledge in the context of social phenomena.”

The Power to Decide


Special Report by Antonio Regalado in MIT Technology Review: “Back in 1956, an engineer and a mathematician, William Fair and Earl Isaac, pooled $800 to start a company. Their idea: a score to handicap whether a borrower would repay a loan.
It was all done with pen and paper. Income, gender, and occupation produced numbers that amounted to a prediction about a person’s behavior. By the 1980s the three-digit scores were calculated on computers and instead took account of a person’s actual credit history. Today, Fair Isaac Corp., or FICO, generates about 10 billion credit scores annually, recalculating scores for many Americans 50 times a year.
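Fair and Isaac’s original method amounted to an additive scorecard. The toy below shows the shape of such a model; the attributes and point values are invented for illustration, and today’s FICO scores are built from credit history rather than demographics.

```python
# Toy additive scorecard in the spirit of Fair and Isaac's original
# pen-and-paper method. Attributes and point values are invented for
# illustration; they are not FICO's actual model.

POINTS = {
    "income_band": {"low": 10, "middle": 25, "high": 40},
    "occupation":  {"unskilled": 5, "skilled": 20, "professional": 35},
    "years_at_address": {"<2": 5, "2-10": 15, ">10": 25},
}

def score(applicant: dict) -> int:
    """Sum the points earned on each attribute."""
    return sum(POINTS[attr][applicant[attr]] for attr in POINTS)

# A score above some cutoff (say 60) would have meant "approve the loan".
print(score({"income_band": "middle",
             "occupation": "skilled",
             "years_at_address": "2-10"}))  # -> 60
```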
This machinery hums in the background of our financial lives, so it’s easy to forget that the choice of whether to lend used to be made by a bank manager who knew a man by his handshake. Fair and Isaac understood that all this could change, and that their company didn’t merely sell numbers. “We sell a radically different way of making decisions that flies in the face of tradition,” Fair once said.
This anecdote suggests a way of understanding the era of “big data”—terabytes of information from sensors or social networks, new computer architectures, and clever software. But even supercharged data needs a job to do, and that job is always about a decision.
In this business report, MIT Technology Review explores a big question: how are data and the analytical tools to manipulate it changing decision making today? On Nasdaq, trading bots exchange a billion shares a day. Online, advertisers bid on hundreds of thousands of keywords a minute, in deals greased by heuristic solutions and optimization models rather than two-martini lunches. The number of variables and the speed and volume of transactions are just too much for human decision makers.
When there’s a person in the loop, technology takes a softer approach (see “Software That Augments Human Thinking”). Think of recommendation engines on the Web that suggest products to buy or friends to catch up with. This works because Internet companies maintain statistical models of each of us, our likes and habits, and use them to decide what we see. In this report, we check in with LinkedIn, which maintains the world’s largest database of résumés—more than 200 million of them. One of its newest offerings is University Pages, which crunches résumé data to offer students predictions about where they’ll end up working depending on what college they go to (see “LinkedIn Offers College Choices by the Numbers”).
These smart systems, and their impact, are prosaic next to what’s planned. Take IBM. The company is pouring $1 billion into its Watson computer system, the one that answered questions correctly on the game show Jeopardy! IBM now imagines computers that can carry on intelligent phone calls with customers, or provide expert recommendations after digesting doctors’ notes. IBM wants to provide “cognitive services”—computers that think, or seem to (see “Facing Doubters, IBM Expands Plans for Watson”).
Andrew Jennings, chief analytics officer for FICO, says automating human decisions is only half the story. Credit scores had another major impact. They gave lenders a new way to measure the state of their portfolios—and to adjust them by balancing riskier loan recipients with safer ones. Now, as other industries get exposed to predictive data, their approach to business strategy is changing, too. In this report, we look at one technique that’s spreading on the Web, called A/B testing. It’s a simple tactic—put up two versions of a Web page and see which one performs better (see “Seeking Edge, Websites Turn to Experiments” and “Startups Embrace a Way to Fail Fast”).
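The mechanics of a basic A/B test fit in a few lines: split traffic, count conversions, and ask whether the difference is bigger than chance would explain. The two-proportion z-test below is one standard choice, assumed here; production systems layer on refinements such as sequential testing.

```python
# Minimal A/B test evaluation: compare conversion rates of two page
# variants with a two-proportion z-test (a standard choice, assumed
# here rather than any particular platform's method).
from math import sqrt, erf

def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple:
    """Return (difference in conversion rates, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF: Phi(z) = (1 + erf(z/sqrt(2)))/2.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_b - p_a, p_value

# Variant B converts 5.8% of 10,000 visitors vs. 5.0% for variant A.
diff, p = ab_test(500, 10000, 580, 10000)
print(f"lift={diff:.3%}, p={p:.3f}")  # small p => unlikely to be chance
```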
Until recently, such optimization was practiced only by the largest Internet companies. Now, nearly any website can do it. Jennings calls this phenomenon “systematic experimentation” and says it will be a feature of the smartest companies. They will have teams constantly probing the world, trying to learn its shifting rules and deciding on strategies to adapt. “Winners and losers in analytic battles will not be determined simply by which organization has access to more data or which organization has more money,” Jennings has said.

Of course, there’s danger in letting the data decide too much. In this report, Duncan Watts, a Microsoft researcher specializing in social networks, outlines an approach to decision making that avoids the dangers of gut instinct as well as the pitfalls of slavishly obeying data. In short, Watts argues, businesses need to adopt the scientific method (see “Scientific Thinking in Business”).
To do that, they have been hiring a highly trained breed of business skeptics called data scientists. These are the people who create the databases, build the models, reveal the trends, and, increasingly, author the products. And their influence is growing in business. This could be why data science has been called “the sexiest job of the 21st century.” It’s not because mathematics or spreadsheets are particularly attractive. It’s because making decisions is powerful…”

How should we analyse our lives?


Gillian Tett in the Financial Times on the challenge of using the new form of data science: “A few years ago, Alex “Sandy” Pentland, a professor of computational social sciences at MIT Media Lab, conducted a curious experiment at a Bank of America call centre in Rhode Island. He fitted 80 employees with biometric devices to track all their movements, physical conversations and email interactions for six weeks, and then used a computer to analyse “some 10 gigabytes of behaviour data”, as he recalls.
The results showed that the workers were isolated from each other, partly because at this call centre, like others of its ilk, the staff took their breaks in rotation so that the phones were constantly manned. In response, Bank of America decided to change its system to enable staff to hang out together over coffee and swap ideas in an unstructured way. Almost immediately there was a dramatic improvement in performance. “The average call-handle time decreased sharply, which means that the employees were much more productive,” Pentland writes in his forthcoming book Social Physics. “[So] the call centre management staff converted the break structure of all their call centres to this new system and forecast a $15m per year productivity increase.”
When I first heard Pentland relate this tale, I was tempted to give a loud cheer on behalf of all long-suffering call centre staff and corporate drones. Pentland’s data essentially give credibility to a point that many people know instinctively: that it is horribly dispiriting – and unproductive – to have to toil in a tiny isolated cubicle by yourself all day. Bank of America deserves credit both for letting Pentland’s team engage in this people-watching – and for changing its coffee-break schedule in response.
But there is a bigger issue at stake here too: namely how academics such as Pentland analyse our lives. We have known for centuries that cultural and social dynamics influence how we behave but until now academics could usually only measure this by looking at micro-level data, which were often subjective. Anthropology (a discipline I know well) is a case in point: anthropologists typically study cultures by painstakingly observing small groups of people and then extrapolating this in a subjective manner.

Pentland and others like him are now convinced that the great academic divide between “hard” and “soft” sciences is set to disappear, since researchers these days can gather massive volumes of data about human behaviour with precision. Sometimes this information is volunteered by individuals, on sites such as Facebook; sometimes it can be gathered from the electronic traces – the “digital breadcrumbs” – that we all deposit (when we use a mobile phone, say) or deliberately collected with biometric devices like the ones used at Bank of America. Either way, it can enable academics to monitor and forecast social interaction in a manner we could never have dreamed of before. “Social physics helps us understand how ideas flow from person to person… and ends up shaping the norms, productivity and creative output of our companies, cities and societies,” writes Pentland. “Just as the goal of traditional physics is to understand how the flow of energy translates into change in motion, social physics seeks to understand how the flow of ideas and information translates into changes in behaviour.”…

But perhaps the most important point is this: whether you love or hate this new form of data science, the genie cannot be put back in the bottle. The experiments that Pentland and many others are conducting at call centres, offices and other institutions across America are simply the leading edge of a trend.

The only question now is whether these powerful new tools will be mostly used for good (to predict traffic queues or flu epidemics) or for more malevolent ends (to enable companies to flog needless goods, say, or for government control). Sadly, “social physics” and data crunching don’t offer any prediction on this issue, even though it is one of the dominant questions of our age.”

Algorithms and the Changing Frontier


A GMU School of Public Policy Research Paper by Hezekiah Agwara, Philip E. Auerswald, and Brian D. Higginbotham: “We first summarize the dominant interpretations of the “frontier” in the United States and predecessor colonies over the past 400 years: agricultural (1610s-1880s), industrial (1890s-1930s), scientific (1940s-1980s), and algorithmic (1990s-present). We describe the difference between the algorithmic frontier and the scientific frontier. We then propose that the recent phenomenon referred to as “globalization” is actually better understood as the progression of the algorithmic frontier, as enabled by standards that in turn have facilitated the interoperability of firm-level production algorithms. We conclude by describing implications of the advance of the algorithmic frontier for scientific discovery and technological innovation.”

MIT Crowdsources the Next Great (free) IQ Test


ThePsychReport: “Raven’s Matrices have long been a gold standard for psychologists needing to measure general intelligence. But the good ones, the ones scientists like to use, are too expensive for most research projects.

Christopher Chabris, associate professor of psychology at Union College, and David Engel, postdoctoral associate at MIT Sloan School of Management, think the public can help. They recently launched a campaign to crowdsource “the next great IQ test.” The Matrix Reasoning Challenge, created through MIT’s Center for Collective Intelligence with Anita Woolley and Tom Malone, calls on the public to design and submit matrix puzzles – 3×3 grids that ask subjects to complete a pattern by filling in a missing square.

Chabris says they aren’t trying to compete with commercially available tests used for diagnostic or clinical purposes, but rather want to provide a trustworthy and free alternative for scientists. Because these types of puzzles are nonverbal, culturally neutral, and objective, they have wide-ranging applications and are particularly useful when conducting research across various demographics. If this project is successful, a lot more scientists could do a lot more research.

[Figure: a simple example of a matrix puzzle. Source: Matrix Reasoning Challenge]
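Such an item is straightforward to represent in code. The sketch below is a toy numeric version invented purely for illustration; actual submissions to the challenge use visual patterns, and nothing here reflects the project’s real infrastructure.

```python
# Toy representation of a 3x3 matrix-reasoning item: each cell follows
# a simple rule and the bottom-right square is hidden. The numeric rule
# is purely illustrative; real items use visual patterns.

def make_puzzle(start: int = 1, step: int = 1):
    """Grid that increases by `step` across cells; last cell is hidden."""
    grid = [[start + step * (3 * r + c) for c in range(3)] for r in range(3)]
    answer = grid[2][2]
    grid[2][2] = None  # the missing square the subject must fill in
    return grid, answer

def check(answer: int, response: int) -> bool:
    """Score a subject's response against the hidden cell."""
    return response == answer

puzzle, answer = make_puzzle()
print(puzzle)            # [[1, 2, 3], [4, 5, 6], [7, 8, None]]
print(check(answer, 9))  # True
```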

“Researchers typically don’t have that much money,” Chabris said. “They can’t afford pay-per-use tests. Sometimes they have no research budgets, or if they do, they’re not large enough for that kind of thing. Our real goal is to create something useful for researchers.”

Through the Matrix Reasoning Challenge, Chabris and Engel also hope to better understand how crowdsourcing can be used to problem-solve in social and cognitive sciences.

Social scientists already widely use crowdsourcing sites like Amazon’s Mechanical Turk to recruit participants for their studies, but the matrix project is different in that it seeks to tap into the public’s expertise to help solve scientific problems. Scientists in computer science and bioinformatics have been able to harness this expertise to yield some incredible results. Using TopCoder.com, NASA was able to find a more efficient way to deploy solar panels on the International Space Station. Harvard Medical School was able to develop better software for analyzing immune-system genes. With The Matrix Reasoning Challenge, Chabris and Engel are beginning to explore crowdsourcing’s potential in the social sciences.”

Needed: A New Generation of Game Changers to Solve Public Problems


Beth Noveck: “In order to change the way we govern, it is important to train and nurture a new generation of problem solvers who possess the multidisciplinary skills to become effective agents of change. That’s why we at the GovLab have launched The GovLab Academy with the support of the Knight Foundation.
In an effort to help people in their own communities become more effective at developing and implementing creative solutions to compelling challenges, The GovLab Academy is offering two new training programs:
1) An online platform with an unbundled and evolving set of topics, modules and instructors on innovations in governance, including themes such as big and open data and crowdsourcing, with forthcoming topics on behavioral economics, prizes and challenges, open contracting, and performance management for governance;
2) Gov 3.0: A curated and sequenced, 14-week mentoring and training program.
While the online platform is always freely available, Gov 3.0 begins on January 29, 2014, and we invite you to participate. Please forward this email to your networks and help us spread the word about the opportunity to participate.
Please consider applying (individuals or teams may apply), if you are:

  • an expert in communications, public policy, law, computer science, engineering, business or design who wants to expand your ability to bring about social change;

  • a public servant who wants to bring innovation to your job;

  • someone with an important idea for positive change but who lacks key skills or resources to realize the vision;

  • interested in joining a network of like-minded, purpose-driven individuals across the country; or

  • someone who is passionate about using technology to solve public problems.

The program includes live instruction and conversation every Wednesday from 5:00–6:30 PM EST for 14 weeks starting Jan 29, 2014. You will be able to participate remotely via Google Hangout.

Gov 3.0 will allow you to apply evolving technology to the design and implementation of effective solutions to public interest challenges. It will give you an overview of the most current approaches to smarter governance and help you improve your skills in collaboration, communication, and developing and presenting innovative ideas.

Over 14 weeks, you will develop a project and a plan for its implementation, including a long and short description, a presentation deck, a persuasive video and a project blog. Last term’s projects covered such diverse issues as post-Fukushima food safety, science literacy for high schoolers and prison reform for the elderly. In every case, the goal was to identify realistic strategies for making a difference quickly. You can read the entire Gov 3.0 syllabus here.

The program will include national experts and instructors in technology and governance both as guests and as mentors to help you design your project. Last term’s mentors included current and former officials from the White House and various state, local and international governments, academics from a variety of fields, and prominent philanthropists.

People who complete the program will have the opportunity to apply for a special fellowship to pursue their projects further.

Previously taught only on campus, Gov 3.0 is now being offered in beta as an online program. This is not a MOOC. It is a mentoring-intensive coaching experience. To maximize the quality of the experience, enrollment is limited.

Please submit your application by January 22, 2014. Accepted applicants (individuals and teams) will be notified on January 24, 2014. We hope to expand the program in the future so please use the same form to let us know if you would like to be kept informed about future opportunities.”