Seven Principles for Big Data and Resilience Projects

PopTech & Rockefeller Bellagio Fellows: “The following is a draft ‘Code of Conduct’ that seeks to provide guidance on best practices for resilience-building projects that leverage Big Data and Advanced Computing. These seven core principles serve to guide data projects to ensure they are socially just, encourage local wealth- & skill-creation, require informed consent, and are maintainable over long timeframes. This document is a work in progress, so we very much welcome feedback. Our aim is not to enforce these principles on others but rather to hold ourselves accountable and in the process encourage others to do the same. Initial versions of this draft were written during the PopTech & Rockefeller Foundation workshop in Bellagio in August 2013.
Open Source Data Tools – Wherever possible, data analytics and manipulation tools should be open source, architecture independent and broadly prevalent (R, python, etc.). Open source, hackable tools are generative, and building generative capacity is an important element of resilience….
Transparent Data Infrastructure – Infrastructure for data collection and storage should operate based on transparent standards to maximize the number of users that can interact with the infrastructure. Data infrastructure should strive for built-in documentation, extensibility and easy access. Data is only as useful to the data scientist as her/his understanding of its collection is correct…
Develop and Maintain Local Skills – Make “Data Literacy” more widespread. Leverage local data labor and build on existing skills. The key and most constrained ingredient of effective data solutions remains human skill and knowledge, which needs to be retained locally. In doing so, consider cultural issues and language. Catalyze the next generation of data scientists and generate new required skills in the cities where the data is being collected…
Local Data Ownership – Use Creative Commons and licenses that state that data is not to be used for commercial purposes. The community directly owns the data it generates, along with the learning algorithms (machine learning classifiers) and derivatives. Strong data protection protocols need to be in place to protect identities and personally identifying information…
Ethical Data Sharing – Adopt existing data sharing protocols like the ICRC’s (2013). Permission for sharing is essential. How the data will be used should be clearly articulated. An opt in approach should be the preference wherever possible, and the ability for individuals to remove themselves from a data set after it has been collected must always be an option. Projects should always explicitly state which third parties will get access to data, if any, so that it is clear who will be able to access and use the data…
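The opt-in and removal requirements above can be made concrete with a minimal sketch. The `ConsentedDataset` class and its fields below are invented for illustration and are not part of the ICRC protocols or any other cited framework:

```python
from dataclasses import dataclass, field

@dataclass
class ConsentedDataset:
    """Toy store in which every record is tied to revocable consent."""
    records: dict = field(default_factory=dict)  # participant_id -> data

    def add(self, participant_id, data, opted_in):
        # Opt-in is the default posture: nothing is stored without consent.
        if opted_in:
            self.records[participant_id] = data

    def revoke(self, participant_id):
        # Removal after collection must always be possible.
        self.records.pop(participant_id, None)

ds = ConsentedDataset()
ds.add("p1", {"survey": 4}, opted_in=True)
ds.add("p2", {"survey": 2}, opted_in=False)  # never stored
ds.revoke("p1")                              # removed after collection
print(len(ds.records))  # → 0
```

A real implementation would also have to propagate revocation into derived datasets and trained models, which is far harder than this sketch suggests.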
Right Not To Be Sensed – Local communities have a right not to be sensed. Large scale city sensing projects must have a clear framework for how people are able to be involved or choose not to participate. All too often, sensing projects are established without any ethical framework or any commitment to informed consent. It is essential that the collection of any sensitive data, from social and mobile data to video and photographic records of houses, streets and individuals, is done with full public knowledge, community discussion, and the ability to opt out…
Learning from Mistakes – Big Data and Resilience projects need to be open to facing, reporting, and discussing failures. Big Data technology is still very much in a learning phase. Failure, and the learning and insights resulting from it, should be accepted and appreciated. Without admitting what does not work, we are not learning effectively as a community. Quality control and assessment for data-driven solutions is notably harder than comparable efforts in other technology fields. The uncertainty about the quality of a solution is created by the uncertainty inherent in its data…”

Five Ways to Make Government Procurement Better

Mark Headd at Civic Innovations: “Nothing in recent memory has focused attention on the need for wholesale reform of the government IT procurement system more than the troubled launch of the federal health insurance marketplace website.
A myriad of blog posts, stories and articles written in the last few weeks have detailed all of the problems that led to the ignominious launch of the website meant to allow people to sign up for health care coverage.
Though the details of this high-profile flop are in the latest headlines, the underlying cause has been talked about many times before – the process by which governments contract with outside parties to obtain IT services is broken…
With all of this in mind, here are – in no particular order – five suggested changes that can be adopted to improve the government procurement process.
Raise the threshold on simplified / streamlined procurement
Many governments use a separate, more streamlined process for smaller projects that do not require a full RFP (in the City of Philadelphia, professional services projects that do not exceed $32,000 annually go through this more streamlined bidding process). In Philadelphia, we’ve had great success in using these smaller projects to test new ideas and strategies for partnering with IT vendors. There is much we can learn from these experiments, and a modest increase to enable more experimentation would allow governments to gain valuable new insights.
Narrowing the focus of any enhanced thresholds for streamlined bidding to web-based projects would help mitigate risk and foster a quicker process for testing new ideas.
Identify clear standards for projects
Having a clear set of vendor-agnostic IT standards to use when developing RFPs and in performing work can make a huge difference in how a project turns out. Clearly articulating standards for:

  • The various components that a system will use.
  • The environment in which it will be housed.
  • The testing it must undergo prior to final acceptance.

…can go a long way toward reducing the risk and uncertainty inherent in IT projects.
It’s worth noting that most governments probably already have a set of IT standards that are usually made part of any IT solicitation. But these standards documents can quickly become out of date – they must undergo constant review and refinement. In addition, many of the people writing these standards may confuse a specific vendor product or platform with a true standard.
Require open source
Requiring that IT projects be open source during development or after completion can be an effective way to reduce risk on an IT project and enhance transparency. This is particularly true of web-based projects.
In addition, government RFPs should encourage the use of existing open source tools – leveraging existing software components that are in use in similar projects and maintained by an active community – to foster external participation by vendors and volunteers alike. When governments make the code behind their project open source, they enable anyone that understands software development to help make them better.
Develop a more robust internal capacity for IT project management and implementation
Governments must find ways to develop the internal capacity for developing, implementing and managing technology projects.
Part of the reason that governments make use of a variety of different risk mitigation provisions in public bidding is that there is a lack of people in government with hands on experience building or maintaining technology. There is a dearth of makers in government, and there is a direct relationship between the perceived risk that governments take on with new technology projects and the lack of experienced technologists working in government.
Governments need to find ways to develop a maker culture within their workforces and should prioritize recruitment from the local technology and civic hacking communities.
Make contracting, lobbying and campaign contribution data public as open data
One of the more disheartening revelations to come out of the analysis of the implementation is that some of the firms that were awarded work as part of the project also spent non-trivial amounts of money on lobbying. It’s a good bet that this kind of thing happens at the state and local level as well.
This can seriously undermine confidence in the bidding process, and may cause many smaller firms – who lack funds or interest in lobbying elected officials – to simply throw up their hands and walk away.
In the absence of statutory or regulatory changes to prevent this from happening, governments can enhance the transparency around the bidding process by working to ensure that all contracting data as well as data listing publicly registered lobbyists and contributions to political campaigns is open.
Ensuring that all prospective participants in the public bidding process have confidence that the process will be fair and transparent is essential to getting as many firms to participate as possible – including small firms more adept at agile software development methodologies. More bids typically equates to higher quality proposals and lower prices.
None of the changes listed above will be easy, and governments are positioned differently in how well they may achieve any one of them. Nor do they represent the entire universe of things we can do to improve the system in the near term – these are items that I personally think are important and very achievable.
One thing that could help speed the adoption of these and other changes is the development of a robust communication framework between government contracting and IT professionals in different cities and different states. I think a “Municipal Procurement Academy” could go a long way toward achieving this.”

More Top-Down Participation, Please! Institutionalized empowerment through open participation

Michelle Ruesch and Oliver Märker in DDD: “…this is not another article on the empowering potential of bottom-up digital political participation. Quite the contrary: It instead seeks to stress the empowering potential of top-down digital political participation. Strikingly, the democratic institutionalization of (digital) political participation is rarely considered when we speak about power in the context of political participation. Wouldn’t it be true empowerment though if the right of citizens to speak their minds were directly integrated into political and administrative decision-making processes?

Institutionalized political participation

Political participation, defined as any act that aims to influence politics in some way, can be initiated either by citizens, referred to as “bottom-up” participation, or by government, often referred to as “top-down” participation.  For many, the word “top-down” instantly evokes negative connotations, even though top-down participatory spaces are actually the foundation of democracy. These are the spaces of participation offered by the state and guaranteed by democratic constitutions. For a long time, top-down participation could be equated with formal democratic participation such as elections, referenda or party politics. Today, however, in states like Germany we can observe a new form of top-down political participation, namely government-initiated participation that goes beyond what is legally required and usually makes extensive use of digital media.
Like many other Western states, Germany has to cope with decreasing voter turnout and a lack of trust in political parties. At the same time, according to a recent study from 2012, two-thirds of eligible voters would like to be more involved in political decisions. The case of “Stuttgart 21” served as a late wake-up call for many German municipalities. Plans to construct a new train station in the center of the city of Stuttgart resulted in a petition for a local referendum, which was rejected. Protests against the train station culminated in widespread demonstrations in 2010, forcing construction to be halted. Even though a referendum was finally held in 2011 and a slight majority voted in favor of the train station, the Stuttgart 21 case has since been cited by Chancellor Angela Merkel and others as an example of the negative consequences of taking decisions without consulting with citizens early on. More and more municipalities and federal ministries in Germany have therefore started acknowledging that the conventional democratic model of participation in elections every few years is no longer sufficient. The Federal Ministry of Transport, Building and Urban Development, for example, published a manual for “good participation” in urban development projects….

What’s so great about top-down participation?

Semi-formal top-down participation processes have one major thing in common, regardless of the topic they address: Governmental institutions voluntarily open up a space for dialogue and thereby obligate themselves to take citizens’ concerns and ideas into account.
As a consequence, government-initiated participation offers the potential for institutionalized empowerment beyond elections. It grants the possibility of integrating participation into political and administrative decision-making processes….
Bottom-up participation will surely always be an important mobilizer of democratic change. Nevertheless, the provision of spaces of open participation by governments can aid in the institutionalization of citizens’ involvement in political decision-making. Had Stuttgart offered an open space of participation early in the train station construction process, maybe protests would never have escalated the way they did.
So is top-down participation the next step in the process of democratization? It could be, but only under certain conditions. Most importantly, top-down open participation requires a genuine willingness to abandon the old principle of doing business behind closed doors. This is not an easy undertaking; it requires time and endurance. Serious open participation also requires creating state institutions that ensure the relevance of the results by evaluating them and considering them in political decisions. We have formulated ten conditions that we consider necessary for the genuine institutionalization of open political participation [14]:

  • There needs to be some scope for decision-making. Top-down participation only makes sense when the results of the participation can influence decisions.
  • The government must genuinely aim to integrate the results into decision-making processes.
  • The limits of participation must be communicated clearly. Citizens must be informed if final decision-making power rests with a political body, for example.
  • The subject matter, rules and procedures need to be transparent.
  • Citizens need to be aware that they have the opportunity to participate.
  • Access to participation must be easy, the channels of participation chosen according to the citizens’ media habits. Using the Internet should not be a goal in itself.
  • The participatory space should be “neutral ground”. A moderator can help ensure this.
  • The set-up must be interactive. Providing information is only a prerequisite for participation.
  • Participation must be possible without providing real names or personal data.
  • Citizens must receive continuous feedback regarding how results are handled and the implementation process.”

Smart Cities Turn Big Data Into Insight [Infographic]

Mark van Rijmenam in SmartDataCollective: “Cities around the globe are confronted with growing populations, aging infrastructure, reduced budgets, and the challenge of doing more with less. Applying big data technologies within cities can provide valuable insights that can keep a city habitable. The City of Songdo is a great example of a connected city, where all connected devices create a smart city that is optimized for the ever-changing conditions in that same city. IBM recently released an infographic showing the vast opportunities of smart cities and the possible effects on the economy.”
Infographic Smarter Cities. Turning Big Data into Insight

The Art of Making City Code Beautiful

Nancy Scola in Next City: “Some rather pretty legal websites have popped up lately, the most recent as of last Thursday. This is how municipal code would design itself if it actually wanted to be read.
The network of [city] sites is the output of The State Decoded, a project of the OpenGov Foundation (See correction below), which has its own fascinating provenance. That D.C.-based non-profit grew out of the fight in Congress over the SOPA and PIPA digital copyright bills a few winters ago. At the time, the office of Rep. Darrell Issa, more recently of Benghazi fame, built a platform called Madison that invited the public to help edit an alternative bill. Madison outlived the SOPA debate, and was spun out last summer as the flagship project of the OpenGov Foundation, helmed by former Issa staffer Seamus Kraft.
“What we discovered,” Kraft says, “is that co-authoring legislation is high up there on what [the public wants to] do with government information, but it’s not at the top.” What heads the list, he says, is simply knowing “what are the laws?” Pre-SanFranciscoCode, the city’s laws on everything from elections to electrical installations to transportation were trapped in an interface, run by publisher American Legal, that would not have looked out of place in “WarGames.” (Here’s the comparable “old” site for Chicago. It’s probably enough to say that Philadelphia’s comes with a “Frames/No Frames” option.) Madison needed a base of clean, structured municipal code upon which to function, and Kraft and company were finding that in cities across the country, that just didn’t exist.
Fixing the code, Kraft says, starts with him “unlawyering the text” that is either supplied to them by the city or scraped from online. This involves reading through the city code and looking for signposts that indicate when sections start, how provisions nest within them, and other structural cues that establish a pattern. That breakdown gets passed to the organization’s developers, who use it to automatically parse the full corpus. The process is time consuming. In San Francisco, 16 different patterns were required to capture each of the code’s sections. Often, the parser needs to be tweaked. “Sometimes it happens in a few minutes or a few hours,” Kraft says, “and sometimes it takes a few days.”
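The “unlawyering” step Kraft describes — finding signposts that mark where sections start so a parser can split the corpus — can be sketched as a pattern-based parser. The section format and regular expression below are hypothetical stand-ins, not the OpenGov Foundation's actual patterns:

```python
import re

# Hypothetical signpost pattern: headings like "SEC. 101. ELECTIONS."
# on their own line, with the section body following until the next heading.
SECTION_RE = re.compile(
    r"^SEC\.\s+(?P<num>[\d.]+)\.\s+(?P<title>[^\n]+)$", re.MULTILINE
)

def parse_code(text):
    """Split a flat legal text into structured sections using signposts."""
    sections = []
    matches = list(SECTION_RE.finditer(text))
    for i, m in enumerate(matches):
        body_start = m.end()
        # A section's body runs until the next heading (or end of file).
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append({
            "number": m.group("num").rstrip("."),
            "title": m.group("title").strip().rstrip("."),
            "body": text[body_start:body_end].strip(),
        })
    return sections

sample = """SEC. 101. ELECTIONS.
(a) Polling places shall open at 7 a.m.
SEC. 102. ELECTRICAL INSTALLATIONS.
(a) Permits are required for new wiring.
"""
for s in parse_code(sample):
    print(s["number"], "-", s["title"])
# prints:
# 101 - ELECTIONS
# 102 - ELECTRICAL INSTALLATIONS
```

San Francisco reportedly needed 16 such patterns; each would be a variant of `SECTION_RE` above, tuned to a different heading style in the source text.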

Over the long haul, Kraft has in mind adopting the customizability of YouVersion, the online digital Bible that allows users to choose fonts, colors and more. Kraft, a 2007 graduate of Georgetown who will cite the Catholic Church’s distributed structure as a model for networked government, proclaims YouVersion “the most kick-ass Bible you’ve ever seen. It’s stunning.” He’d like to do the same with municipal code, for the benefit of both the average American and those who have more regular engagement with local legal texts. “If you’re spending all day reading law,” he says, “you should at the very least have the most comfortable view possible.”

AskThem

AskThem is a project of the Participatory Politics Foundation, a 501(c)3 non-profit organization with a mission to increase civic engagement. AskThem is supported by a charitable grant from the Knight Foundation’s Tech For Engagement initiative.
AskThem is a free & open-source website for questions-and-answers with public figures. It’s a not-for-profit tool for a stronger democracy, with open data for informed and engaged communities.
AskThem allows you to:

  • Find and ask questions to over 142,000 elected officials nationwide: federal, state and city levels of government.
  • Get signatures for your question or petition, have it delivered over email or Twitter, and push for a public response.
  • See questions from people near you, sign-on to questions you care about, and review answers from public figures.

It’s like a version of “We The People” for every elected official, from local city council members all the way up to U.S. senators. Enter your email above to be the first to ask a question when we launch and see previews of the site this Fall.
Elected officials: enter your email above and we’ll send you more information about signing up to answer questions on AskThem. It’s a free and non-partisan service to respond to your constituents in an open public forum and update them over email about your work. Or, be a leader in open-government and sign up now.
Issue-based organizations and media: sign up to help promote questions to government from people in your area. We’re working to launch with partnerships that build greater public accountability.
Previously known as the project, AskThem is open-source and uses open government data – our code is available on GitHub – contributions welcome. For more development updates & discussion, join our low-traffic Google Group.
We’re a small non-profit organization actively seeking charitable funding support – help us launch this powerful new tool for public dialogue! Email us for a copy of our non-profit funding prospectus. If you can make a tax-exempt gift to support our work, please donate to PPF via OpenCongress. More background on the project is available on our Knight NewsChallenge proposal from March 2013.
Questions, feedback, ideas? Email David Moore, Executive Director of PPF – david at, Twitter: @ppolitics; like our page on Facebook & follow @AskThemPPF on Twitter. Stay tuned!”

Towards an effective framework for building smart cities: Lessons from Seoul and San Francisco

New paper by JH Lee, MG Hancock, MC Hu in Technological Forecasting and Social Change: “This study aims to shed light on the process of building an effective smart city by integrating various practical perspectives with a consideration of smart city characteristics taken from the literature. We developed a framework for conducting case studies examining how smart cities were being implemented in San Francisco and Seoul Metropolitan City. The study’s empirical results suggest that effective, sustainable smart cities emerge as a result of dynamic processes in which public and private sector actors coordinate their activities and resources on an open innovation platform. The different yet complementary linkages formed by these actors must further be aligned with respect to their developmental stage and embedded cultural and social capabilities. Our findings point to eight ‘stylized facts’, based on both quantitative and qualitative empirical results that underlie the facilitation of an effective smart city. In elaborating these facts, the paper offers useful insights to managers seeking to improve the delivery of smart city developmental projects.”

Using Big Data to Ask Big Questions

Chase Davis in the SOURCE: “First, let’s dispense with the buzzwords. Big Data isn’t what you think it is: Every federal campaign contribution over the last 30-plus years amounts to several tens of millions of records. That’s not Big. Neither is a dataset of 50 million Medicare records. Or even 260 gigabytes of files related to offshore tax havens—at least not when Google counts its data in exabytes. No, the stuff we analyze in pursuit of journalism and app-building is downright tiny by comparison.
But you know what? That’s ok. Because while super-smart Silicon Valley PhDs are busy helping Facebook crunch through petabytes of user data, they’re also throwing off intellectual exhaust that we can benefit from in the journalism and civic data communities. Most notably: the ability to ask Big Questions.
Most of us who analyze public data for fun and profit are familiar with small questions. They’re focused, incisive, and often have the kind of black-and-white, definitive answers that end up in news stories: How much money did Barack Obama raise in 2012? Is the murder rate in my town going up or down?
Big Questions, on the other hand, are speculative, exploratory, and systemic. As the name implies, they are also answered at scale: Rather than distilling a small slice of a dataset into a concrete answer, Big Questions look at entire datasets and reveal small questions you wouldn’t have thought to ask.
Can we track individual campaign donor behavior over decades, and what does that tell us about their influence in politics? Which neighborhoods in my city are experiencing spikes in crime this week, and are police changing patrols accordingly?
Or, by way of example, how often do interest groups propose cookie-cutter bills in state legislatures?

Looking at Legislation

Even if you don’t follow politics, you probably won’t be shocked to learn that lawmakers don’t always write their own bills. In fact, interest groups sometimes write them word-for-word.
Sometimes those groups even try to push their bills in multiple states. The conservative American Legislative Exchange Council has gotten some press, but liberal groups, social and business interests, and even sororities and fraternities have done it too.
On its face, something about elected officials signing their names to cookie-cutter bills runs head-first against people’s ideal of deliberative democracy — hence, it tends to make news. Those can be great stories, but they’re often limited in scope to a particular bill, politician, or interest group. They’re based on small questions.
Data science lets us expand our scope. Rather than focusing on one bill, or one interest group, or one state, why not ask: How many model bills were introduced in all 50 states, period, by anyone, during the last legislative session? No matter what they’re about. No matter who introduced them. No matter where they were introduced.
Now that’s a Big Question. And with some basic data science, it’s not particularly hard to answer—at least at a superficial level.

Analyze All the Things!

Just for kicks, I tried building a system to answer this question earlier this year. It was intended as an example, so I tried to choose methods that would make intuitive sense. But it also makes liberal use of techniques applied often to Big Data analysis: k-means clustering, matrices, graphs, and the like.
If you want to follow along, the code is here….
To make exploration a little easier, my code represents similar bills in graph space, shown at the top of this article. Each dot (known as a node) represents a bill. And a line connecting two bills (known as an edge) means they were sufficiently similar, according to my criteria (a cosine similarity of 0.75 or above). Thrown into a visualization software like Gephi, it’s easy to click around the clusters and see what pops out. So what do we find?
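The edge-building criterion — connect two bills when their similarity clears a threshold — can be sketched with the standard library alone. The bag-of-words vectors over bill titles below are a simplification of Davis's approach (his code works from richer features), and the sample bills are invented:

```python
import math
from collections import Counter
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two bags of words (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similarity_graph(titles, threshold=0.75):
    """Nodes are bills; an edge joins two bills whose title vectors
    are at least `threshold` similar (the 0.75 cutoff used above)."""
    vectors = [Counter(t.lower().split()) for t in titles]
    return [(i, j) for i, j in combinations(range(len(titles)), 2)
            if cosine(vectors[i], vectors[j]) >= threshold]

bills = [
    "An act relating to limited lines travel insurance",
    "An act relating to limited lines travel insurance products",
    "Campaign finance reform",
]
print(similarity_graph(bills))  # → [(0, 1)]
```

The resulting edge list is exactly what a tool like Gephi consumes: connected components in this graph are the clusters of near-identical bills the article goes on to examine.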
There are 375 clusters in total. Because of the limitations of our data, many of them represent vague, subject-specific bills that just happen to have similar titles even though the legislation itself is probably very different (think things like “Budget Bill” and “Campaign Finance Reform”). This is where having full bill text would come in handy.
But mixed in with those bills are a handful of interesting nuggets. Several bills that appear to be modeled after legislation by the National Conference of Insurance Legislators appear in multiple states, among them: a bill related to limited lines travel insurance; another related to unclaimed insurance benefits; and one related to certificates of insurance.”

User-Generated Content Is Here to Stay

In the Huffington Post: “The way media are transmitted has changed dramatically over the last 10 years. User-generated content (UGC) has completely changed the landscape of social interaction, media outreach, consumer understanding, and everything in between. Today, UGC is media generated by the consumer instead of by traditional journalists and reporters. This is a movement defying and redefining traditional norms at the same time. Current events are largely publicized on Twitter and Facebook by the average person, and not by a photojournalist hired by a news organization. In the past, these large news corporations dominated the headlines — literally — and owned the monopoly on public media. Yet with the advent of smartphones and the spread of social media, everything has changed. The entire industry has been transformed; smartphones have changed how information is collected, packaged, edited, and conveyed for mass distribution. UGC allows for raw and unfiltered movement of content at lightning speed. With the way that the world works today, it is the most reliable way to get information out. One thing that is certain is that UGC is here to stay whether we like it or not, and it is driving much more of modern journalistic content than the average person realizes.
Think about recent natural disasters where images are captured by citizen journalists using their iPhones. During Hurricane Sandy, 800,000 photos were uploaded to Instagram tagged “#Sandy.” Time magazine even hired five iPhoneographers to photograph the wreckage for its Instagram page. During the May 2013 Oklahoma City tornadoes, the first photo released was actually captured by a smartphone. This real-time footage brings environmental chaos to your doorstep in a chillingly personal way, especially considering the photographer of the first tornado photos ultimately died because of the tornado. UGC has also been monumental for criminal investigations of man-made catastrophes. Most notably, the Boston Marathon bombing was covered by UGC in the most unforgettable way. Dozens of images poured in identifying possible Boston bombers, to both the detriment and benefit of public officials and investigators. Though these images inflicted considerable damage on innocent bystanders sporting suspicious backpacks, ultimately it was also smartphone images that revealed the presence of the Tsarnaev brothers. This phenomenon isn’t limited to America. Would the so-called Arab Spring have happened without social media and UGC? Syrians, Egyptians, and citizens of numerous nations facing protests can easily publicize controversial images and statements to be shared worldwide….
This trend is not temporary but will only expand. The first iPhone launched in 2007, and the world has never been the same. New smartphones are released each month with better cameras and faster processors than computers had even just a few years ago….”

Using Participatory Crowdsourcing in South Africa to Create a Safer Living Environment

New Paper by Bhaveer Bhana, Stephen Flowerday, and Aharon Satt in the International Journal of Distributed Sensor Networks: “The increase in urbanisation is making the management of city resources a difficult task. Data collected through observations (utilising humans as sensors) of the city surroundings can be used to improve decision making in terms of managing these resources. However, the data collected must be of a certain quality in order to ensure that effective and efficient decisions are made. This study is focused on the improvement of emergency and non-emergency services (city resources) through the use of participatory crowdsourcing (humans as sensors) as a data collection method (collect public safety data), utilising voice technology in the form of an interactive voice response (IVR) system.
The study illustrates how participatory crowdsourcing (specifically humans as sensors) can be used as a Smart City initiative focusing on public safety by illustrating what is required to contribute to the Smart City, and developing a roadmap in the form of a model to assist decision making when selecting an optimal crowdsourcing initiative. Public safety data quality criteria were developed to assess and identify the problems affecting data quality.
This study is guided by design science methodology and applies three driving theories: the Data Information Knowledge Action Result (DIKAR) model, the characteristics of a Smart City, and a credible Data Quality Framework. Four critical success factors were developed to ensure high quality public safety data is collected through participatory crowdsourcing utilising voice technologies.”