How smartphones are solving one of China’s biggest mysteries


Ana Swanson at the Washington Post: “For decades, China has been engaged in a building boom of a scale that is hard to wrap your mind around. In the last three decades, 260 million people have moved from the countryside to Chinese cities — equivalent to around 80 percent of the population of the U.S. To make room for all of those people, the size of China’s built-up urban areas nearly quintupled between 1984 and 2010.

Much of that development has benefited people’s lives, but some has not. In a breathless rush to boost growth and development, some urban areas have built vast, unused real estate projects — China’s infamous “ghost cities.” These eerie, shining developments are complete except for one thing: people to live in them.

China’s ghost cities have sparked a lot of debate over the last few years. Some argue that the developments are evidence of the waste in top-down planning, or the result of too much cheap funding for businesses. Some blame the lack of other good places for average people to invest their money, or the desire of local officials to make a quick buck — land sales generate a lot of revenue for China’s local governments.

Others say the idea of ghost cities has been overblown. They espouse a “build it and they will come” philosophy, pointing out that, with time, some ghost cities fill up and turn into vibrant communities.

It’s been hard to evaluate these claims, since most of the research on ghost cities has been anecdotal. Even the most rigorous research methods leave a lot to be desired — for example, investment research firms sending poor junior employees out to remote locations to count how many lights are turned on in buildings at night.

Now new research from Baidu, one of China’s biggest technology companies, provides one of the first systematic looks at Chinese ghost cities. Researchers from Baidu’s Big Data Lab and Peking University in Beijing used the kind of location data gathered by mobile phones and GPS receivers to track how people moved in and out of suspected ghost cities, in real time and on a national scale, over a period of six months. You can see the interactive project here.

Google has been blocked in China for years, and Baidu dominates the market in terms of search, mobile maps and other offerings. That gave the researchers a huge database to work with — 770 million users, a hefty chunk of China’s 1.36 billion people.

To identify potential ghost cities, the researchers created an algorithm that identifies urban areas with a relatively sparse population. They define a ghost city as an urban region with a population of fewer than 5,000 people per square kilometer – about half the density recommended by the Chinese Ministry of Housing and Urban-Rural Development….(More)”
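
The classification rule described above is essentially a density threshold. Here is a minimal sketch, assuming you already have gridded population-density estimates (people per square kilometre) derived from anonymised location data; the grid values below are invented for illustration:

```python
# Illustrative sketch only: flag "ghost city" grid cells by population density.
# Assumes `density` holds estimated residents per square kilometre for each
# urban grid cell, e.g. derived from anonymised location pings.
import numpy as np

GHOST_THRESHOLD = 5_000  # people per km^2, the cut-off cited in the study

def flag_ghost_cells(density: np.ndarray, threshold: float = GHOST_THRESHOLD) -> np.ndarray:
    """Return a boolean mask of urban cells whose density falls below the threshold."""
    return density < threshold

# Toy example: a 3x3 urban grid with one sparsely populated corner.
density = np.array([
    [12_000, 9_500, 8_200],
    [11_000, 7_800, 3_100],
    [10_500, 2_900, 1_400],
])
mask = flag_ghost_cells(density)
print(f"{mask.sum()} of {mask.size} cells fall below {GHOST_THRESHOLD} people/km^2")
```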

Mobile data: Made to measure


Neil Savage in Nature: “For decades, doctors around the world have been using a simple test to measure the cardiovascular health of patients. They ask them to walk on a hard, flat surface and see how much distance they cover in six minutes. This test has been used to predict the survival rates of lung transplant candidates, to measure the progression of muscular dystrophy, and to assess overall cardiovascular fitness.

The walk test has been studied in many trials, but even the biggest rarely top a thousand participants. Yet when Euan Ashley launched a cardiovascular study in March 2015, he collected test results from 6,000 people in the first two weeks. “That’s a remarkable number,” says Ashley, a geneticist who heads Stanford University’s Center for Inherited Cardiovascular Disease. “We’re used to dealing with a few hundred patients, if we’re lucky.”

Numbers on that scale, he hopes, will tell him a lot more about the relationship between physical activity and heart health. The reason they can be achieved is that millions of people now have smartphones and fitness trackers with sensors that can record all sorts of physical activity. Health researchers are studying such devices to figure out what sort of data they can collect, how reliable those data are, and what they might learn when they analyse measurements of all sorts of day-to-day activities from many tens of thousands of people and apply big-data algorithms to the readings.

By July, more than 40,000 people in the United States had signed up to participate in Ashley’s study, which uses an iPhone application called MyHeart Counts. He expects the numbers to surge as the app becomes more widely available around the world. The study — designed by scientists, approved by institutional review boards, and requiring informed consent — asks participants to answer questions about their health and risk factors, and to use their phone’s motion sensors to collect data about their activities for seven days. They also do a six-minute walk test, and the phone measures the distance they cover. If their own doctors have ordered blood tests, users can enter information such as cholesterol or glucose measurements. Every three months, the app checks back to update their data.
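
The article does not say how the app derives walk distance from the phone’s sensors; a hedged sketch of one simple possibility is to count step events within the six-minute window and multiply by an assumed stride length:

```python
# Hedged sketch only: one simple way a phone might estimate six-minute walk
# distance from step events. MyHeart Counts' actual method is not described here.
from datetime import datetime, timedelta

def six_minute_walk_distance(step_timestamps: list[datetime],
                             start: datetime,
                             stride_length_m: float = 0.75) -> float:
    """Estimate distance (metres) from steps recorded in a six-minute window."""
    end = start + timedelta(minutes=6)
    steps_in_window = sum(1 for t in step_timestamps if start <= t < end)
    return steps_in_window * stride_length_m

# Toy usage: 700 evenly spaced steps over six minutes at an assumed 0.75 m stride.
start = datetime(2015, 3, 10, 9, 0, 0)
steps = [start + timedelta(seconds=0.5 * i) for i in range(700)]
print(f"Estimated distance: {six_minute_walk_distance(steps, start):.0f} m")
```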

Physicians know that physical activity is a strong predictor of long-term heart health, Ashley says. But it is less clear what kind of activity is best, or whether different groups of people do better with different types of exercise. MyHeart Counts may open a window on such questions. “We can start to look at subgroups and find differences,” he says.

It is the volume of the data that makes such studies possible. In traditional studies, there may not be enough data to find statistically significant results for such subgroups. And rare events may not occur in the smaller samples, or may produce a signal so weak that it is lost in statistical noise. Big data can overcome those problems, and if the data set is big enough, small errors can be smoothed out. “You can take pretty noisy data, but if you have enough of it, you can find a signal,” Ashley says….(More)”.
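
Ashley’s point about noise and scale can be illustrated with a toy simulation (invented numbers, not study data): a small true difference in walk distance between two subgroups is swamped by measurement noise at a few hundred participants, but stands out once tens of thousands are averaged.

```python
# Toy simulation (not study data): a weak signal buried in noisy measurements
# becomes detectable as the sample size grows.
import numpy as np

rng = np.random.default_rng(0)
true_difference = 5.0      # hypothetical difference in six-minute walk distance (metres)
noise_sd = 120.0           # per-measurement noise, far larger than the signal

for n in (300, 3_000, 40_000):
    group_a = rng.normal(500.0, noise_sd, n)                    # baseline group
    group_b = rng.normal(500.0 + true_difference, noise_sd, n)  # slightly fitter group
    observed = group_b.mean() - group_a.mean()
    # Standard error of the difference shrinks roughly as 1/sqrt(n).
    se = np.sqrt(2 * noise_sd**2 / n)
    print(f"n={n:>6}: observed diff = {observed:6.2f} m, standard error = {se:5.2f} m")
```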

How big data and The Sims are helping us to build the cities of the future


The Next Web: “By 2050, the United Nations predicts that around 66 percent of the world’s population will be living in urban areas. It is expected that the greatest expansion will take place in developing regions such as Africa and Asia. Cities in these parts will be challenged to meet the needs of their residents, and provide sufficient housing, energy, waste disposal, healthcare, transportation, education and employment.

So, understanding how cities will grow – and how we can make them smarter and more sustainable along the way – is a high priority among researchers and governments the world over. We need to get to grips with the inner mechanisms of cities, if we’re to engineer them for the future. Fortunately, there are tools to help us do this. And even better, using them is a bit like playing SimCity….

Cities are complex systems. Increasingly, scientists studying cities have gone from thinking about “cities as machines”, to approaching “cities as organisms”. Viewing cities as complex, adaptive organisms – similar to natural systems like termite mounds or slime mould colonies – allows us to gain unique insights into their inner workings. …So, if cities are like organisms, it follows that we should examine them from the bottom-up, and seek to understand how unexpected large-scale phenomena emerge from individual-level interactions. Specifically, we can simulate how the behaviour of individual “agents” – whether they are people, households, or organisations – affect the urban environment, using a set of techniques known as “agent-based modelling”….These days, increases in computing power and the proliferation of big data give agent-based modelling unprecedented power and scope. One of the most exciting developments is the potential to incorporate people’s thoughts and behaviours. In doing so, we can begin to model the impacts of people’s choices on present circumstances and the future.

For example, we might want to know how changes to the road layout might affect crime rates in certain areas. By modelling the activities of individuals who might try to commit a crime, we can see how altering the urban environment influences how people move around the city, the types of houses that they become aware of, and consequently which places have the greatest risk of becoming the targets of burglary.
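
As a toy illustration of agent-based modelling (our sketch, not the researchers’ crime model), the snippet below lets simple agents wander a street grid and records how often each cell is visited, a crude proxy for which places potential offenders become aware of:

```python
# Toy agent-based model: agents take random walks over a street grid and each
# cell accumulates an "awareness" count. An illustration of the technique only,
# not the burglary model described in the article.
import random

GRID = 20          # 20 x 20 street grid
AGENTS = 200       # number of simulated individuals
STEPS = 500        # time steps per agent

awareness = [[0] * GRID for _ in range(GRID)]

for _ in range(AGENTS):
    x, y = random.randrange(GRID), random.randrange(GRID)    # random home location
    for _ in range(STEPS):
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = min(max(x + dx, 0), GRID - 1)                     # stay on the grid
        y = min(max(y + dy, 0), GRID - 1)
        awareness[y][x] += 1                                  # cell becomes better known

# In this toy model, the most-visited cells are the ones most exposed to agents.
hotspots = sorted(
    ((count, (col, row)) for row, line in enumerate(awareness) for col, count in enumerate(line)),
    reverse=True,
)[:5]
print("Most-visited cells:", hotspots)
```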

To fully realise the goal of simulating cities in this way, models need a huge amount of data. For example, to model the daily flow of people around a city, we need to know what kinds of things people spend their time doing, where they do them, who they do them with, and what drives their behaviour.

Without good-quality, high-resolution data, we have no way of knowing whether our models are producing realistic results. Big data could offer researchers a wealth of information to meet these twin needs. The kinds of data that are exciting urban modellers include:

  • Electronic travel cards that tell us how people move around a city.
  • Twitter messages that provide insight into what people are doing and thinking.
  • The density of mobile telephones that hint at the presence of crowds.
  • Loyalty and credit-card transactions to understand consumer behaviour.
  • Participatory mapping of hitherto unknown urban spaces, such as Open Street Map.

These data can often be refined to the level of a single person. As a result, models of urban phenomena no longer need to rely on assumptions about the population as a whole – they can be tailored to capture the diversity of a city full of individuals, who often think and behave differently from one another….(More)

Open government: a new paradigm in social change?


Rosie Williams: In a recent speech to the Australia and New Zealand School of Government (ANZSOG) annual conference, technology journalist and academic Suelette Dreyfus explained the growing ‘information asymmetry’ that characterises the current-day relationship between government and citizenry.

According to Dreyfus:

‘Big Data makes government very powerful in its relationship with the citizen. This is even more so with the rise of intelligent systems, software that increasingly trawls, matches and analyses that Big Data. And it is moving toward making more decisions once made by human beings.’

The role of technology in the delivery of government services gives much food for thought, in terms of both its potential for good and the dangers it may pose. The concept of open government is an important one for the future of policy and democracy in Australia. Open government has at its core a recognition that the world has changed, that the ways people engage and who they engage with have transformed in ways that governments around the world must respond to in both technological and policy terms.

As described in the ANZSOG speech, the change within government in how it uses technology is well underway; however, in many regards we are at the very beginning of understanding and implementing the potential of data and technology in providing solutions to many of our shared problems. Australia’s pending membership of the Open Government Partnership is integral to how Australia responds to these challenges. Membership of the multilateral partnership requires the Australian government to create a National Action Plan based on consultation and to demonstrate our credentials in the areas of Fiscal Transparency, Access to Information, Income and Asset Disclosure, and Citizen Engagement.

What are the implications of the National Action Plan for policy consultation, formulation, implementation and evaluation? In relative terms, Australia’s history with open government is fairly recent. Policies on open data have seen the roll-out of data.gov.au – a repository of data published by government agencies and made available for re-use in efforts such as the author’s own financial transparency site OpenAus.

In this way citizen activity and government come together for the purposes of achieving open government. These efforts express a new paradigm in government and activism where the responsibility for solving the problems of democracy is shared between government and the people, as opposed to the government ‘solving’ the problems of a passive, receptive citizenry.

As the famous whistle-blowers have shown, citizens are no longer passive, but this new capability also requires a consciousness of the responsibilities and accountability that go along with the powers newly developed by citizen activists through technological change.

The opening of data and communication channels in the formulation of public policy provides a way forward to create both a better informed citizenry and also better informed policy evaluation. When new standards of transparency are applied to wicked problems what shortcomings does this highlight?

This question was tested with my recent request for a basic fact missing from relevant government research and reviews but key to social issues of homelessness and domestic violence….(More)”

The Human Face of Big Data


A film by Sandy Smolan [56 minutes]: “Big Data is defined as the real-time collection, analysis, and visualization of vast amounts of information. In the hands of data scientists, this raw information is fueling a revolution which many people believe may have as big an impact on humanity going forward as the Internet has over the past two decades. It enables us to sense, measure, and understand aspects of our existence in ways never before possible.

The Human Face of Big Data captures an extraordinary revolution sweeping, almost invisibly, through business, academia, government, healthcare, and everyday life. It’s already enabling us to provide a healthier life for our children. To provide our seniors with independence while keeping them safe. To help us conserve precious resources like water and energy. To alert us to tiny changes in our health, weeks or years before we develop a life-threatening illness. To peer into our own individual genetic makeup. To create new forms of life. And soon, as many predict, to re-engineer our own species. And we’ve barely scratched the surface…

This massive gathering and analyzing of data in real time is allowing us to address some of humanity’s biggest challenges. Yet, as Edward Snowden and the release of the NSA documents have shown, the accessibility of all this data can come at a steep price….(More)”

New Human Needs Index fills a data void to help those in need


Scott W. Allard at Brookings: “My 2009 book, “Out of Reach,” examined why it can be hard for poor families to get help from the safety net. One critical barrier is the lack of information about local program resources and nonprofit social service organizations. Good information is key to finding help, but it is also important if we are to target resources effectively and assess whether program investments were successful.

As I prepared data for the book in 2005, my research team struggled to compile useful information about services and programs in the three major metro areas at the center of the study. We grappled with out-of-date print directories, incomplete online listings, bad addresses, disconnected phone numbers, and inaccurate information about the availability of services. It wasn’t clear that families experiencing hardship could easily find the help they needed. It also wasn’t clear how potential volunteers or donors could know where to direct their energies, or how communities could know whether they were deploying adequate and relevant safety net resources. In the book’s conclusion, however, I was optimistic things would get better. A mix of emerging technology, big data systems, and a generation of young entrepreneurs would certainly close these information gaps over the next several years.

Recently, I embarked upon an effort to again identify the social service organizations operating in one of the book’s original study sites. To my surprise, the work was much harder this time around. Print directories are artifacts of the past. Online referral tools provided only spotty coverage. Addresses and service information can still be quite out of date. In many local communities, it felt as if there was less information available now than a decade ago.

Lack of data about local safety net programs, particularly nonprofit organizations, has long been a problem for scholars, community advocates, nonprofit leaders, and philanthropists. Data about providers and populations served are expensive to collect, update, and disseminate. There are no easy ways to monetize data resources or find regular revenue streams to support data work. There are legal obstacles and important concerns about confidentiality. Many organizations don’t have the resources to do much analytic or learning work.

The result is striking. We spend tens of billions of dollars on social services for low-income households each year, but we have only the vaguest ideas of where those dollars go, what impact they have, and where unmet needs exist.

Into this information void step the Salvation Army and the Lilly Family School of Philanthropy at Indiana University with a possible path forward. Working together and with an advisory board of scholars, the Salvation Army and the Lilly School have created a real-time Human Needs Index drawn from service provision tracking systems maintained by more than 7,000 Salvation Army sites nationwide. The index provides useful insight into consumption of an array of emergency services (e.g., food, shelter, clothing) at a given place and point in time across the entire country…(More)”
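
The excerpt does not spell out the index formula, so the following is only a hedged sketch of how a composite index of this kind is commonly built: normalise each service category’s consumption against its own baseline for a given site and period, then average across categories. The categories, counts and equal weighting below are assumptions for illustration:

```python
# Illustrative composite index, assuming equal weights across service categories.
# The real Human Needs Index methodology may differ; this only shows the idea of
# normalising each category against its baseline and averaging.
from statistics import mean

# Hypothetical monthly service counts for one site.
baseline = {"meals": 1_000, "shelter_nights": 400, "clothing_orders": 250}
current  = {"meals": 1_300, "shelter_nights": 520, "clothing_orders": 240}

def needs_index(current: dict, baseline: dict) -> float:
    """Average relative change in service consumption across categories."""
    return mean(current[k] / baseline[k] for k in baseline)

print(f"Index for this site/month: {needs_index(current, baseline):.2f}")
```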

How Big Data Could Open The Financial System For Millions Of People


But that’s changing as the poor start leaving data trails on the Internet and on their cell phones. Now that data can be mined for what it says about someone’s creditworthiness, likeliness to repay, and all that hardcore stuff lenders want to know.

“Every time these individuals make a phone call, send a text, browse the Internet, engage social media networks, or top up their prepaid cards, they deepen the digital footprints they are leaving behind,” says a new report from the Omidyar Network. “These digital footprints are helping to spark a new kind of revolution in lending.”

The report, called “Big Data, Small Credit,” looks at the potential to expand credit access by analyzing mobile and smartphone usage data, utility records, Internet browsing patterns and social media behavior….

“In the last few years, a cluster of fast-emerging and innovative firms has begun to use highly predictive technologies and algorithms to interrogate and generate insights from these footprints,” the report says.

“Though these are early days, there is enough to suggest that hundreds of millions of mass-market consumers may not have to remain ‘invisible’ to formal, unsecured credit for much longer.”…(More)
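
As a rough, hypothetical illustration of the kind of modelling the report gestures at (the features, data and classifier below are our assumptions, not any lender’s actual system), behavioural signals such as top-up frequency or call activity can be fed into a standard classifier to estimate repayment likelihood:

```python
# Illustrative sketch: scoring repayment likelihood from behavioural features.
# Feature names and data are hypothetical; real lenders' models and inputs differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: monthly top-ups, calls per day, mobile-money transactions per month.
X_train = np.array([
    [2, 1.5, 0], [8, 6.0, 12], [1, 0.8, 1], [10, 7.2, 20],
    [3, 2.1, 2], [9, 5.5, 15], [2, 1.2, 1], [7, 4.8, 10],
])
y_train = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # 1 = repaid a prior small loan

model = LogisticRegression().fit(X_train, y_train)

applicant = np.array([[6, 4.0, 9]])           # a new thin-file applicant
prob_repay = model.predict_proba(applicant)[0, 1]
print(f"Estimated repayment probability: {prob_repay:.2f}")
```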

Toward a manifesto for the ‘public understanding of big data’


Mike Michael and Deborah Lupton in Public Understanding of Science: “….we sketch a ‘manifesto’ for the ‘public understanding of big data’. On the one hand, this entails questions tinged by public understanding of science and public engagement with science and technology, such as the following: How, when and where are people exposed to, or do they engage with, big data? Who are regarded as big data’s trustworthy sources, or credible commentators and critics? What are the mechanisms by which big data systems are opened to public scrutiny? On the other hand, big data generate many challenges for public understanding of science and public engagement with science and technology: How do we address publics that are simultaneously the informant, the informed and the information of big data? What counts as understanding of, or engagement with, big data, when big data themselves are multiplying, fluid and recursive? As part of our manifesto, we propose a range of empirical, conceptual and methodological exhortations. We also provide Appendix 1, which outlines three novel methods for addressing some of the issues raised in the article….(More)”

Open data, open mind: Why you should share your company data with the world


Mark Samuels at ZDNet: “If information really is the lifeblood of modern organisations, then CIOs could create huge benefits from opening their data to new, creative pairs of eyes. Research from the consultancy McKinsey suggests that seven sectors alone could generate more than $3 trillion a year in additional value as a result of open data: that is, taking previously proprietary data (often starting with public sector data) and opening up access.

So, should your business consider giving outsiders access to insider information? ZDNet speaks to three experts.

More viewpoints can mean better results

Former Tullow Oil CIO Andrew Marks says debates about the potential openness of data in a private sector context are likely to be dominated by one major concern: information security.

“It’s a perfectly reasonable debate until people start thinking about privacy,” he says. “Putting information at risk, both in terms of customer data and competitive advantage, will be a risk too far for many senior executives.”

But what if CIOs could allay c-suite peers’ concerns and create a new opportunity? Marks points to the Goldcorp Challenge, which saw the mining specialist share its proprietary geological data to allow outside experts to pick likely spots for mining. The challenge, which included prize money of $575,000, helped identify more than 110 sites, 50 per cent of which were previously unknown to the company. The value of gold found through the competition exceeded $6bn. Marks wonders whether other firms could take similarly brave steps.

“There is a period of time when information is very sensitive,” he says. “Once the value of data starts to become finite, then it might be beneficial for businesses to open the doors and to let outsiders play with the information. That approach, in terms of gamification, might lead to the creation of new ideas and innovations.”…

Marks says these projects help prove that, when it comes to data, more is likely to mean different – and possibly better – results. “Whether using big data algorithms or the human touch, the more viewpoints you bring together, the more you increase the chances of success and reduce risk,” he says.

“There is, therefore, always likely to be value in seeking an alternative perspective. Opening access to data means your firm is going to get more ideas, but CIOs and other senior executives need to think very carefully about what such openness means for the business, and the potential benefits.”….Some leading firms are already taking steps towards openness. Take Christina Scott, chief product and information officer at the Financial Times, who says the media organisation has used data analysts to help push the benefits of information-led insight across the business.

Her team has democratised data in order to make sure that all parts of the organisation can get the information they need to complete their day-to-day jobs. Scott says the approach is best viewed as an open data strategy, but within the safe confines of the existing enterprise firewall. While the tactic is currently focused internally, Scott says the FT is keen to find ways to make the most of external talent in the future.

“We’re starting to consider how we might open data beyond the organisation, too,” she says. “Our data holds a lot of value and insight, including across the metadata we’ve created. So it would be great to think about how we could use that information in a more open way.” Part of the FT’s business includes trade-focused magazines. Scott says opening the data could provide new insight to its B2B customers across a range of sectors. In fact, the firm has already dabbled at a smaller scale.

“We’ve run hackathons, where we’ve exposed our APIs and given people the chance to come up with some new ideas,” she says. “But I don’t think we’ve done as much work on open data as we could. And I think that’s the direction in which better organisations are moving. They recognise that not all innovation is going to happen within the company.”…

CIO Omid Shiraji is another IT expert who recognises that there is a general move towards a more open society. Any executive who expects to work within a tightly defined enterprise firewall is living in cloud cuckoo land, he argues. More to the point, they will miss out on big advantages.

“If you can expose your sources to a range of developers, you can start to benefit from massive innovation,” he says. “You can get really big benefits from opening your data to external experts who can focus on areas that you don’t have the capability to develop internally.”

Many IT leaders would like to open data to outside experts, suggests Shiraji. For CIOs who are keen to expose their sources, he suggests letting small-scale developers take a close look at in-house data silos in an attempt to discover what relationships might exist and what advantages could accrue….(More)”

Big data problems we face today can be traced to the social ordering practices of the 19th century.


Hamish Robertson and Joanne Travaglia in LSE’s The Impact Blog: “This is not the first ‘big data’ era but the second. The first was the explosion in data collection that occurred from the early 19th century – Hacking’s ‘avalanche of numbers’, precisely situated between 1820 and 1840. This was an analogue big data era, different to our current digital one but characterised by some very similar problems and concerns. Contemporary data analysis and control problems are considered ‘big’ for a variety of accepted reasons, generally including size, complexity and technology issues. We also suggest that digitisation is a central process in this second big data era, one that seems obvious but which also appears to have reached a new threshold. Until a decade or so ago, ‘big data’ looked just like a digital version of conventional analogue records and systems, ones whose management had become normalised through statistical and mathematical analysis. Now, however, we see a level of concern and anxiety similar to that faced in the first big data era.

This situation brings with it a socio-political dimension of interest to us, one in which our understanding of people and our actions on individuals, groups and populations are deeply implicated. The collection of social data had a purpose – understanding and controlling the population in a time of significant social change. To achieve this, new kinds of information and new methods for generating knowledge were required. Many ideas, concepts and categories developed during that first data revolution remain intact today, some uncritically accepted more now than when they were first developed. In this piece we draw out some connections between these two data ‘revolutions’ and the implications for the politics of information in contemporary society. It is clear that many of the problems in this first big data age and, more specifically, their solutions persist down to the present big data era….Our question then is how do we go about re-writing the ideological inheritance of that first data revolution? Can we or will we unpack the ideological sequelae of that past revolution during this present one? The initial indicators are not good in that there is a pervasive assumption in this broad interdisciplinary field that reductive categories are both necessary and natural. Our social ordering practices have influenced our social epistemology. We run the risk in the social sciences of perpetuating the ideological victories of the first data revolution as we progress through the second. The need for critical analysis grows apace not just with the production of each new technique or technology but with the uncritical acceptance of the concepts, categories and assumptions that emerged from that first data revolution. That first data revolution proved to be a successful anti-revolutionary response to the numerous threats to social order posed by the incredible changes of the nineteenth century, rather than the Enlightenment emancipation that was promised. (More)”

This is part of a wider series on the Politics of Data. For more on this topic, also see Mark Carrigan’s Philosophy of Data Science interview series and the Discover Society special issue on the Politics of Data (Science).