Selfiecity


New Project aimed at investigating the style of self-portraits (selfies) in five cities across the world: “Selfiecity investigates selfies using a mix of theoretic, artistic and quantitative methods:

  • We present our findings about the demographics of people taking selfies, their poses and expressions.
  • Rich media visualizations (imageplots) assemble thousands of photos to reveal interesting patterns.
  • The interactive selfiexploratory allows you to navigate the whole set of 3200 photos.
  • Finally, theoretical essays discuss selfies in the history of photography, the functions of images in social media, and methods and dataset.”

Can Twitter Predict Major Events Such As Mass Protests?


Emerging Technology From the arXiv: “The idea that social media sites such as Twitter can predict the future has a controversial history. In the last few years, various groups have claimed to be able to predict everything from the outcome of elections to the box office takings for new movies.
It’s fair to say that these claims have generated their fair share of criticism. So it’s interesting to see a new claim come to light.
Today, Nathan Kallus at the Massachusetts Institute of Technology in Cambridge says he has developed a way to predict crowd behaviour using statements made on Twitter. In particular, he has analysed the tweets associated with the 2013 coup d’état in Egypt and says that the civil unrest associated with this event was clearly predictable days in advance.
It’s not hard to imagine how the future behaviour of crowds might be embedded in the Twitter stream. People often signal their intent to meet in advance and even coordinate their behaviour using social media. So this social media activity is a leading indicator of future crowd behaviour.
That makes it seem clear that predicting future crowd behaviour is simply a matter of picking this leading indicator out of the noise.
Kallus says this is possible by mining tweets for any mention of future events and then analysing trends associated with them. “The gathering of crowds into a single action can often be seen through trends appearing in this data far in advance,” he says.
It turns out that exactly this kind of analysis is available from a company called Recorded Future based in Cambridge, which scans 300,000 different web sources in seven different languages from all over the world. It then extracts mentions of future events for later analysis….
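
As a rough illustration of the idea described above (collecting messages that mention a future event and looking at how far ahead they appear), here is a minimal Python sketch. It is not Kallus's method or Recorded Future's pipeline; the function name `lead_time_counts` and the sample records are hypothetical, and extracting the referenced date from real text is assumed to have been done already.

```python
from collections import Counter
from datetime import date


def lead_time_counts(mentions, event_day):
    """Count advance mentions of `event_day`, keyed by days of lead time."""
    counts = Counter()
    for posted, referenced in mentions:
        # Keep only messages that point at the event and were posted before it.
        if referenced == event_day and posted < event_day:
            counts[(event_day - posted).days] += 1
    return dict(counts)


# Made-up records: (day the message was posted, future day it refers to).
mentions = [
    (date(2014, 1, 7), date(2014, 1, 10)),
    (date(2014, 1, 8), date(2014, 1, 10)),
    (date(2014, 1, 8), date(2014, 1, 10)),
    (date(2014, 1, 9), date(2014, 1, 15)),   # refers to a different day
    (date(2014, 1, 11), date(2014, 1, 10)),  # posted after the event, ignored
]

print(lead_time_counts(mentions, date(2014, 1, 10)))
```

A rising count as the lead time shrinks would be the kind of trend the article refers to; deciding whether such a trend actually predicts anything is the open question raised in the paragraphs that follow.
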
The bigger question is whether it’s possible to pick out this evidence in advance. In other words, is it possible to make predictions before the events actually occur?
That’s not so clear but there are good reasons to be cautious. First of all, while it’s possible to correlate Twitter activity to real protests, it’s also necessary to rule out false positives. There may be significant Twitter trends that do not lead to significant protests in the streets. Kallus does not adequately address the question of how to tell these things apart.
Then there is the question of whether tweets are trustworthy. It’s not hard to imagine that when it comes to issues of great national consequence, propaganda, rumor and irony may play a significant role. So how to deal with this?
There is also the question of demographics and whether tweets truly represent the intentions and activity of the population as a whole. People who tweet are overwhelmingly likely to be young but there is another silent majority that plays a hugely important role. So can the Twitter firehose really represent the intentions of this part of the population too?
The final challenge is in the nature of prediction. If the Twitter feed is predictive, then what’s needed is evidence that it can be used to make real predictions about the future and not just historical predictions about the past.
We’ve looked at some of these problems with the predictive power of social media before and the challenge is clear: if there is a claim to be able to predict the future, then this claim must be accompanied by convincing evidence of an actual prediction about an event before it happens.
Until then, it would surely be wise to be circumspect about the predictive powers of Twitter and other forms of social media.
Ref: arxiv.org/abs/1402.2308: Predicting Crowd Behavior with Big Public Data”

Crowdsourcing and regulatory reviews: A new way of challenging red tape in British government?


New paper by Martin Lodge and Kai Wegrich in Regulation and Governance: “Much has been said about the appeal of digital government devices to enhance consultation on rulemaking. This paper explores the most ambitious attempt by the UK central government so far to draw on “crowdsourcing” to consult and act on regulatory reform, the “Red Tape Challenge.” We find that the results of this exercise do not represent any major change to traditional challenges to consultation processes. Instead, we suggest that the extensive institutional arrangements for crowdsourcing were hardly significant in informing actual policy responses: neither the tone of the crowdsourced comments, the direction of the majority views, nor specific comments were seen to matter. Instead, it was processes within the executive that shaped the overall governmental responses to this initiative. The findings, therefore, provoke wider debates about the use of social media in rulemaking and consultation exercises.”

DIY Citizenship: Critical Making and Social Media


New book edited by Matt Ratto and Megan Boler: “Today, DIY–do-it-yourself–describes more than self-taught carpentry. Social media enables DIY citizens to organize and protest in new ways (as in Egypt’s “Twitter revolution” of 2011) and to repurpose corporate content (or create new user-generated content) in order to offer political counternarratives. This book examines the usefulness and limits of DIY citizenship, exploring the diverse forms of political participation and “critical making” that have emerged in recent years. The authors and artists in this collection describe DIY citizens whose activities range from activist fan blogging and video production to knitting and the creation of community gardens.
Contributors examine DIY activism, describing new modes of civic engagement that include Harry Potter fan activism and the activities of the Yes Men. They consider DIY making in learning, culture, hacking, and the arts, including do-it-yourself media production and collaborative documentary making. They discuss DIY and design and how citizens can unlock the black box of technological infrastructures to engage and innovate open and participatory critical making. And they explore DIY and media, describing activists’ efforts to remake and reimagine media and the public sphere. As these chapters make clear, DIY is characterized by its emphasis on “doing” and making rather than passive consumption. DIY citizens assume active roles as interventionists, makers, hackers, modders, and tinkerers, in pursuit of new forms of engaged and participatory democracy.”

Open Data (Updated and Expanded)


As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. We start our series with a focus on Open Data. To suggest additional readings on this or any other topic, please email [email protected].

Data and its uses for Governance

Open data refers to data that is publicly available for anyone to use and which is licensed in a way that allows for its re-use. The common requirement that open data be machine-readable means not only that data is distributed via the Internet in a digitized form, but also that it can be processed by computers through automation, ensuring both wide dissemination and ease of re-use. Much of the focus of the open data advocacy community is on government data and government-supported research data. For example, in May 2013, the US Open Data Policy defined open data as publicly available data structured in a way that enables the data to be fully discoverable and usable by end users, and consistent with a number of principles focused on availability, accessibility and reusability.

Annotated Selected Reading List (in alphabetical order)
Fox, Mark S. “City Data: Big, Open and Linked.” Working Paper, Enterprise Integration Laboratory (2013). http://bit.ly/1bFr7oL.

  • This paper examines concepts that underlie Big City Data using data from multiple cities as examples. It begins by explaining the concepts of Open, Unified, Linked, and Grounded data, which are central to the Semantic Web. Fox then explores Big Data as an extension of Data Analytics, and provides case examples of good data analytics in cities.
  • Fox concludes that we can develop the tools that will enable anyone to analyze data, both big and small, by adopting the principles of the Semantic Web:
    • Data being openly available over the internet,
    • Data being unifiable using common vocabularies,
    • Data being linkable using Internationalized Resource Identifiers,
    • Data being accessible using a common data structure, namely triples,
    • Data being semantically grounded using Ontologies.
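
To make the "triples" principle in the list above concrete, here is a small Python sketch of data held as (subject, predicate, object) statements with a wildcard lookup. It is only an illustration, not code from Fox's paper: the `http://example.org/` identifiers, the vocabulary terms and the `match` helper are all invented for this example.

```python
# Base for made-up identifiers; a real deployment would use a shared vocabulary.
EX = "http://example.org/"

# Each fact is one (subject, predicate, object) triple.
triples = [
    (EX + "city/A", EX + "vocab/hasDataset", EX + "dataset/potholes"),
    (EX + "city/B", EX + "vocab/hasDataset", EX + "dataset/potholes-b"),
    (EX + "dataset/potholes", EX + "vocab/format", "CSV"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# All datasets published by any city, found through the shared predicate.
datasets = [o for _, _, o in match(triples, p=EX + "vocab/hasDataset")]
print(datasets)
```

Because every city uses the same predicate identifier, one query covers all of them, which is the practical point of the "common vocabularies" and "linkable" principles.
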

Foulonneau, Muriel, Sébastien Martin, and Slim Turki. “How Open Data Are Turned into Services?” In Exploring Services Science, edited by Mehdi Snene and Michel Leonard, 31–39. Lecture Notes in Business Information Processing 169. Springer International Publishing, 2014. http://bit.ly/1fltUmR.

  • In this chapter, the authors argue that, considering the important role the development of new services plays as a motivation for open data policies, the impact of new services created through open data should play a more central role in evaluating the success of open data initiatives.
  • Foulonneau, Martin and Turki argue that the following metrics should be considered when evaluating the success of open data initiatives: “the usage, audience, and uniqueness of the services, according to the changes it has entailed in the public institutions that have open their data…the business opportunity it has created, the citizen perception of the city…the modification to particular markets it has entailed…the sustainability of the services created, or even the new dialog created with citizens.”

Goldstein, Brett, and Lauren Dyson. Beyond Transparency: Open Data and the Future of Civic Innovation. 1st edition. (Code for America Press: 2013). http://bit.ly/15OAxgF

  • This “cross-disciplinary survey of the open data landscape” features stories from practitioners in the open data space — including Michael Flowers, Brett Goldstein, Emer Coleman and many others — discussing what they’ve accomplished with open civic data. The book “seeks to move beyond the rhetoric of transparency for transparency’s sake and towards action and problem solving.”
  • The book’s editors seek to accomplish the following objectives:
    • Help local governments learn how to start an open data program
    • Spark discussion on where open data will go next
    • Help community members outside of government better engage with the process of governance
    • Lend a voice to many aspects of the open data community.
  • The book is broken into five sections: Opening Government Data, Building on Open Data, Understanding Open Data, Driving Decisions with Data and Looking Ahead.

Granickas, Karolis. “Understanding the Impact of Releasing and Re-using Open Government Data.” European Public Sector Information Platform, ePSIplatform Topic Report No. 2013/08, (2013). http://bit.ly/GU0Nx4.

  • This paper examines the impact of open government data by exploring the latest research in the field, with an eye toward enabling an environment for open data, as well as identifying the benefits of open government data and its political, social, and economic impacts.
  • Granickas concludes that to maximize the benefits of open government data: a) further research is required that structures and measures potential benefits of open government data; b) “government should pay more attention to creating feedback mechanisms between policy implementers, data providers and data-re-users”; c) “finding a balance between demand and supply requires mechanisms of shaping demand from data re-users and also demonstration of data inventory that governments possess”; and lastly, d) “open data policies require regular monitoring.”

Gurin, Joel. Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation, (New York: McGraw-Hill, 2014). http://amzn.to/1flubWR.

  • In this book, GovLab Senior Advisor and Open Data 500 director Joel Gurin explores the broad realized and potential benefits of Open Data, and how, “unlike Big Data, Open Data is transparent, accessible, and reusable in ways that give it the power to transform business, government, and society.”
  • The book provides “an essential guide to understanding all kinds of open databases – business, government, science, technology, retail, social media, and more – and using those resources to your best advantage.”
  • In particular, Gurin discusses a number of applications of Open Data with very real potential benefits:
    • “Hot Startups: turn government data into profitable ventures;
    • Savvy Marketing: understanding how reputational data drives your brand;
    • Data-Driven Investing: apply new tools for business analysis;
    • Consumer Information: connect with your customers using smart disclosure;
    • Green Business: use data to bet on sustainable companies;
    • Fast R&D: turn the online world into your research lab;
    • New Opportunities: explore open fields for new businesses.”

Jetzek, Thorhildur, Michel Avital, and Niels Bjørn-Andersen. “Generating Value from Open Government Data.” Thirty Fourth International Conference on Information Systems, 5. General IS Topics 2013. http://bit.ly/1gCbQqL.

  • In this paper, the authors “developed a conceptual model portraying how data as a resource can be transformed to value.”
  • Jetzek, Avital and Bjørn-Andersen propose a conceptual model featuring four Enabling Factors (openness, resource governance, capabilities and technical connectivity) acting on four Value Generating Mechanisms (efficiency, innovation, transparency and participation) leading to the impacts of Economic and Social Value.
  • The authors argue that their research supports that “all four of the identified mechanisms positively influence value, reflected in the level of education, health and wellbeing, as well as the monetary value of GDP and environmental factors.”

Kassen, Maxat. “A promising phenomenon of open data: A case study of the Chicago open data project.” Government Information Quarterly (2013). http://bit.ly/1ewIZnk.

  • This paper uses the Chicago open data project to explore the “empowering potential of an open data phenomenon at the local level as a platform useful for promotion of civic engagement projects and provide a framework for future research and hypothesis testing.”
  • Kassen argues that “open data-driven projects offer a new platform for proactive civic engagement” wherein governments can harness “the collective wisdom of the local communities, their knowledge and visions of the local challenges, governments could react and meet citizens’ needs in a more productive and cost-efficient manner.”
  • The paper highlights the need for independent IT developers to network in order for this trend to continue, as well as the importance of the private sector in “overall diffusion of the open data concept.”

Keen, Justin, Radu Calinescu, Richard Paige, John Rooksby. “Big data + politics = open data: The case of health care data in England.” Policy and Internet 5 (2), (2013): 228–243. http://bit.ly/1i231WS.

  • This paper examines the assumptions regarding open datasets, technological infrastructure and access, using healthcare systems as a case study.
  • The authors specifically address two assumptions surrounding enthusiasm about Big Data in healthcare: the assumption that healthcare datasets and technological infrastructure are up to task, and the assumption of access to this data from outside the healthcare system.
  • By using the National Health Service in England as an example, the authors identify data, technology, and information governance challenges. They argue that “public acceptability of third party access to detailed health care datasets is, at best, unclear,” and that the prospects of Open Data depend on Open Data policies, which are inherently political, and the government’s assertion of property rights over large datasets. Thus, they argue that the “success or failure of Open Data in the NHS may turn on the question of trust in institutions.”

Kulk, Stefan and Bastiaan Van Loenen. “Brave New Open Data World?” International Journal of Spatial Data Infrastructures Research, May 14, 2012. http://bit.ly/15OAUYR.

  • This paper examines the evolving tension between the open data movement and the European Union’s privacy regulations, especially the Data Protection Directive.
  • The authors argue, “Technological developments and the increasing amount of publicly available data are…blurring the lines between non-personal and personal data. Open data may not seem to be personal data on first glance especially when it is anonymised or aggregated. However, it may become personal by combining it with other publicly available data or when it is de-anonymised.”

Kundra, Vivek. “Digital Fuel of the 21st Century: Innovation through Open Data and the Network Effect.” Joan Shorenstein Center on the Press, Politics and Public Policy, Harvard College: Discussion Paper Series, January 2012, http://hvrd.me/1fIwsjR.

  • In this paper, Vivek Kundra, the first Chief Information Officer of the United States, explores the growing impact of open data, and argues that, “In the information economy, data is power and we face a choice between democratizing it and holding on to it for an asymmetrical advantage.”
  • Kundra offers four specific recommendations to maximize the impact of open data: Citizens and NGOs must demand open data in order to fight government corruption, improve accountability and government services; Governments must enact legislation to change the default setting of government to open, transparent and participatory; The press must harness the power of the network effect through strategic partnerships and crowdsourcing to cut costs and provide better insights; and Venture capitalists should invest in startups focused on building companies based on public sector data.

Noveck, Beth Simone and Daniel L. Goroff. “Information for Impact: Liberating Nonprofit Sector Data.” The Aspen Institute Philanthropy & Social Innovation Publication Number 13-004. 2013. http://bit.ly/WDxd7p.

  • This report is focused on “obtaining better, more usable data about the nonprofit sector,” which encompasses, as of 2010, “1.5 million tax-exempt organizations in the United States with $1.51 trillion in revenues.”
  • Toward that goal, the authors propose liberating data from the Form 990, an Internal Revenue Service form that “gathers and publishes a large amount of information about tax-exempt organizations,” including information related to “governance, investments, and other factors not directly related to an organization’s tax calculations or qualifications for tax exemption.”
  • The authors recommend a two-track strategy: “Pursuing the longer-term goal of legislation that would mandate electronic filing to create open 990 data, and pursuing a shorter-term strategy of developing a third party platform that can demonstrate benefits more immediately.”

Robinson, David G., Harlan Yu, William P. Zeller, and Edward W. Felten, “Government Data and the Invisible Hand.” Yale Journal of Law & Technology 11 (2009), http://bit.ly/1c2aDLr.

  • This paper proposes a new approach to online government data that “leverages both the American tradition of entrepreneurial self-reliance and the remarkable low-cost flexibility of contemporary digital technology.”
  • “In order for public data to benefit from the same innovation and dynamism that characterize private parties’ use of the Internet, the federal government must reimagine its role as an information provider. Rather than struggling, as it currently does, to design sites that meet each end-user need, it should focus on creating a simple, reliable and publicly accessible infrastructure that ‘exposes’ the underlying data.”

Ubaldi, Barbara. “Open Government Data: Towards Empirical Analysis of Open Government Data Initiatives.” OECD Working Papers on Public Governance. Paris: Organisation for Economic Co-operation and Development, May 27, 2013. http://bit.ly/15OB6qP.

  • This working paper from the OECD seeks to provide an all-encompassing look at the principles, concepts and criteria framing open government data (OGD) initiatives.
  • Ubaldi also analyzes a variety of challenges to implementing OGD initiatives, including policy, technical, economic and financial, organizational, cultural and legal impediments.
  • The paper also proposes a methodological framework for evaluating OGD Initiatives in OECD countries, with the intention of eventually “developing a common set of metrics to consistently assess impact and value creation within and across countries.”

Worthy, Ben. “David Cameron’s Transparency Revolution? The Impact of Open Data in the UK.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, November 29, 2013. http://bit.ly/NIrN6y.

  • In this article, Worthy “examines the impact of the UK Government’s Transparency agenda, focusing on the publication of spending data at local government level. It measures the democratic impact in terms of creating transparency and accountability, public participation and everyday information.”
  • Worthy’s findings, based on surveys of local authorities, interviews and FOI requests, are disappointing. He finds that:
    • Open spending data has led to some government accountability, but largely from those already monitoring government, not regular citizens.
    • Open Data has not led to increased participation, “as it lacks the narrative or accountability instruments to fully bring such effects.”
    • It has also not “created a new stream of information to underpin citizen choice, though new innovations offer this possibility. The evidence points to third party innovations as the key.”
  • Despite these initial findings, “Interviewees pointed out that Open Data holds tremendous opportunities for policy-making. Joined up data could significantly alter how policy is made and resources targeted. From small scale issues e.g. saving money through prescriptions to targeting homelessness or health resources, it can have a transformative impact.”

Zuiderwijk, Anneke, Marijn Janssen, Sunil Choenni, Ronald Meijer and Roexsana Sheikh Alibaks. “Socio-technical Impediments of Open Data.” Electronic Journal of e-Government 10, no. 2 (2012). http://bit.ly/17yf4pM.

  • This paper seeks to identify the socio-technical impediments to open data impact based on a review of the open data literature, as well as workshops and interviews.
  • The authors discovered 118 impediments across ten categories: 1) availability and access; 2) find-ability; 3) usability; 4) understandability; 5) quality; 6) linking and combining data; 7) comparability and compatibility; 8) metadata; 9) interaction with the data provider; and 10) opening and uploading.

Zuiderwijk, Anneke and Marijn Janssen. “Open Data Policies, Their Implementation and Impact: A Framework for Comparison.” Government Information Quarterly 31, no. 1 (January 2014): 17–29. http://bit.ly/1bQVmYT.

  • In this article, Zuiderwijk and Janssen argue that “currently there is a multiplicity of open data policies at various levels of government, whereas very little systematic and structured research [being] done on the issues that are covered by open data policies, their intent and actual impact.”
  • With this evaluation deficit in mind, the authors propose a new framework for comparing open data policies at different government levels using the following elements for comparison:
    • Policy environment and context, such as level of government organization and policy objectives;
    • Policy content (input), such as types of data not publicized and technical standards;
    • Performance indicators (output), such as benefits and risks of publicized data; and
    • Public values (impact).


The News: A User’s Manual


New book by Alain De Botton: “The news is everywhere, we can’t stop checking it constantly on our screens, but what is it doing to our minds?

The news occupies the same dominant position in modern society as religion once did, asserts Alain de Botton – but we don’t begin to understand its impact on us. In this dazzling new book, de Botton takes 25 archetypal news stories – from an aircrash to a murder, a celebrity interview to a political scandal – and submits them to unusually intense analysis.

He raises questions like: How come disaster stories are often so uplifting? What makes the love lives of celebrities so interesting? Why do we enjoy politicians being brought down? Why are upheavals in far off lands often so… boring?

De Botton has written the ultimate manual for our news-addicted age, one sure to bring calm, understanding and a measure of sanity to our daily (perhaps even hourly) interactions with the news machine.

Inspired by writing the book, he has also created a news outlet, which can be visited here: www.philosophersmail.com.”

New Programming Language Removes Human Error from Privacy Equation


MIT Technology Review: “Anytime you hear about Facebook inadvertently making your location public, or revealing who is stalking your profile, it’s likely because a programmer added code that inadvertently led to a bug.
But what if there was a system in place that could substantially reduce such privacy breaches and effectively remove human error from the equation?
One MIT PhD thinks she has the answer, and its name is Jeeves.
This past month, Jean Yang released an open-source Python version of “Jeeves,” a programming language with built-in privacy features that free programmers from having to provide on-the-go ad-hoc maintenance of privacy settings.
Given that somewhere between 10 and 20 percent of all code is related to privacy policy, Yang thinks that Jeeves will be an attractive option for social app developers who are looking to be more efficient in their use of programmer resources – as well as those who are hoping to assuage users’ privacy concerns about if and how they use your data.
For more information about Jeeves visit the project site.
For more information on Yang visit her CSAIL page.”

AskThem.io – Questions-and-Answers with Every Elected Official


Press Release: “AskThem.io, launching Feb. 10th, is a free & open-source website for questions-and-answers with public figures. AskThem is like a version of the White House’s “We The People” petition platform, where over 8 million people have taken action to support questions for a public response – but for the first time, for every elected official nationwide…AskThem.io has official government data for over 142,000 U.S. elected officials at every level of government: federal, state, county, and municipal. Also, AskThem allows anyone to ask a question to any verified Twitter account, for online dialogue with public figures.

Here’s how AskThem works for online public dialogue:

  • For the first time in an open-source website, visitors enter their street address to see all their elected officials, from federal down to the city levels, or search for a verified Twitter account.
  • Individuals & organizations submit a question to their elected officials – for example, asking a city council member about a proposed ban on plastic bags.
  • People then sign on to the questions and petitions they support, voting them up on AskThem and sharing them over social media, as with online petitions.
  • When a question passes a certain threshold of signatures, AskThem delivers it to the recipient over email & social media and encourages a public response – creating a continual, structured dialogue with elected officials at every level of government.
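
The sign-and-deliver flow in the steps above can be sketched in a few lines of Python. This is an illustrative model only, not AskThem's code: the `Question` class, its field names and the threshold of 3 are assumptions, since the press release does not say what the actual signature threshold is.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    recipient: str
    threshold: int                      # hypothetical; not stated in the release
    signers: set = field(default_factory=set)
    delivered: bool = False

    def sign(self, user_id):
        """Record a signature; return True only when it triggers delivery."""
        self.signers.add(user_id)       # a set ignores repeat signatures
        if not self.delivered and len(self.signers) >= self.threshold:
            self.delivered = True       # deliver once, when the threshold is first met
            return True
        return False


q = Question(
    text="Will you support the proposed ban on plastic bags?",
    recipient="city-council-member",    # placeholder identifier
    threshold=3,
)
results = [q.sign(user) for user in ["ann", "bob", "ann", "cy", "dee"]]
print(results)
```

Storing signers in a set means a person signing twice does not move the question closer to delivery, and the `delivered` flag keeps later signatures from sending it again.
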

AskThem also incorporates open government data, such as city council agendas and key vote information, to inform good questions of people in power. Open government advocate, Chicago, IL Clerk Susana Mendoza, joined AskThem because she believes that “technology should bring residents and the Office of the Chicago City Clerk closer together.”

Elected officials who sign up with AskThem agree to respond to the most popular questions from their constituents (about two per month). Interested elected officials can sign up now to become verified, free & open to everyone.

Issue-based organizations can use question & petition info from AskThem to surface political issues in their area that people care about, stay continuously engaged with government, and promote public accountability. Participating groups on AskThem include the internet freedom non-profit Fight For the Future, the social media crowd-speaking platform Thunderclap.it, the Roosevelt Institute National Student Network, and more.”

Citizen Engagement: 3 Cities And Their Civic Tech Tools


Melissa Jun Rowley at the Toolbox: “Though democratic governments are of the people, by the people, and for the people, it often seems that our only input is electing officials who pass laws on our behalf. After all, I don’t know many people who attend town hall meetings these days. But the evolution of technology has given citizens a new way to participate. Governments are using technology to include as many voices from their communities as possible in civic decisions and activities. Here are three examples.
Raleigh, NC
Raleigh, North Carolina’s open government initiative is a great example of passive citizen engagement. By following an open source strategy, Open Raleigh has made city data available to the public. Citizens then use the data in a myriad of ways, from simply visualizing daily crime in their city, to creating an app that lets users navigate and interactively utilize the city’s greenway system.
Fort Smith, AR
Using MindMixer, Fort Smith, Arkansas, has created an online forum for residents to discuss the city’s comprehensive plan, effectively putting the community’s future in the hands of the community itself. Citizens are invited to share their own ideas, vote on ideas submitted by others, and engage with city officials who are “listening” to the conversation on the site.
Seattle, WA
Being a tech town, it’s no surprise that Seattle is using social media as a citizen engagement tool. The Seattle Police Department (SPD) uses a variety of social media tools to reach the public. In 2012, the department launched a first-of-its-kind hyper-local Twitter initiative. A police scanner for the Twitter generation, Tweets by Beat provides Twitter feeds of police dispatches in each of Seattle’s 51 police beats so that residents can find out what is happening right on their block.
In addition to Twitter and Facebook, SPD created a Tumblr to, in their own words, “show you your police department doing police-y things in your city.” In a nutshell, the department’s Tumblr serves as an extension of their other social media outlets.”

"Natural Cities" Emerge from Social Media Location Data


Emerging Technology From the arXiv: “Nobody agrees on how to define a city. But the emergence of “natural cities” from social media data sets may change that, say computational geographers…
A city is a large, permanent human settlement. But try and define it more carefully and you’ll soon run into trouble. A settlement that qualifies as a city in Sweden may not qualify in China, for example. And the reasons why one settlement is classified as a town while another as a city can sometimes seem almost arbitrary.
City planners know this problem well. They tend to define cities by administrative, legal or even historical boundaries that have little logic to them. Indeed, the same city can sometimes be defined in various different ways.
That causes all kinds of problems from counting the total population to working out who pays for the upkeep of the place. Which definition do you use?
Now help may be at hand thanks to the work of Bin Jiang and Yufan Miao at the University of Gävle in Sweden. These guys have found a way to use people’s location recorded by social media to define the boundaries of so-called natural cities which have a close resemblance to real cities in the US.
Jiang and Miao began with a dataset from the Brightkite social network, which was active between 2008 and 2010. The site encouraged users to log in with their location details so that they could see other users nearby. So the dataset consists of almost 3 million locations in the US and the dates on which they were logged.
To start off, Jiang and Miao simply placed a dot on a map at the location of each login. They then connected these dots to their neighbours to form triangles that end up covering the entire mainland US.
Next, they calculated the size of each triangle on the map and plotted this size distribution, which turns out to follow a power law. So there are lots of tiny triangles but only a few large ones.
Finally, they calculated the average size of the triangles and then coloured in all those that were smaller than average. The coloured areas are “natural cities”, say Jiang and Miao.
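
The last two steps of the procedure as the article describes it (measure each triangle, then keep those smaller than the average) can be sketched in Python as follows. This is a simplified illustration and not Jiang and Miao's implementation: it assumes the triangulation already exists and is passed in as index triples, it treats coordinates as a flat plane, and the function names and sample points are invented.

```python
def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c (shoelace formula)."""
    return abs(
        a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1])
    ) / 2


def natural_city_triangles(points, triangles):
    """Return the triangles whose area is below the mean triangle area."""
    areas = [triangle_area(*(points[i] for i in tri)) for tri in triangles]
    mean_area = sum(areas) / len(areas)
    return [tri for tri, area in zip(triangles, areas) if area < mean_area]


# Made-up check-in locations: a tight cluster plus two distant points.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (10, 10)]

# Hand-made triangulation of those points, given as triples of point indices.
triangles = [(0, 1, 2), (1, 3, 2), (1, 4, 3), (3, 4, 5), (2, 3, 5)]

small = natural_city_triangles(points, triangles)
print(small)
```

In this toy input one large triangle pulls the mean up to 11, so the four smaller triangles are the ones that would be coloured in. With real data the triangulation step would need a computational geometry library, and the power-law shape of the size distribution is not checked here.
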
It’s easy to imagine that the resulting map of triangles is of little value. But to the evident surprise of the researchers, it produces a pretty good approximation of the cities in the US. “We know little about why the procedure works so well but the resulting patterns suggest that the natural cities effectively capture the evolution of real cities,” they say.
That’s handy because it suddenly gives city planners a way to study and compare cities on a level playing field. It allows them to see how cities evolve and change over time too. And it gives them a way to analyse how cities in different parts of the world differ.
Of course, Jiang and Miao will want to find out why this approach reveals city structures in this way. That’s still something of a puzzle but the answer itself may provide an important insight into the nature of cities (or at least into the nature of this dataset).
A few days ago, this blog wrote about how a new science of cities is emerging from the analysis of big data. This is another example; expect to see more.
Ref: http://arxiv.org/abs/1401.6756 : The Evolution of Natural Cities from the Perspective of Location-Based Social Media”