The move toward 'crowdsourcing' public safety


PhysOrg: “Earlier this year, Martin Dias, assistant professor in the D’Amore-McKim School of Business, presented research for the National Law Enforcement Telecommunications System in which he examined Nlets’ network and how its governance and technology helped enable inter-agency information sharing. This work builds on his research aimed at understanding design principles for these public safety “social networks” and other collaborative networks. We asked Dias to discuss how information sharing around public safety has evolved in recent years and the benefits and challenges of what he describes as “crowdsourcing public safety.” …

What is “crowdsourcing public safety” and why are public safety agencies moving toward this trend?
Crowdsourcing—the term coined by our own assistant professor of journalism Jeff Howe—involves taking a task or job traditionally performed by a distinct agent, or employee, and having that activity be executed by an “undefined, generally large group of people in an open call.” Crowdsourcing public safety involves engaging and enabling private citizens to assist public safety professionals in addressing natural disasters, terror attacks, organized crime incidents, and large-scale industrial accidents.
Public safety agencies have long recognized the need for citizen involvement. Tip lines and missing persons bulletins have been used to engage citizens for years, but with advances in mobile applications and big data analytics, the ability to receive, process, and make use of high volumes of tips and leads makes crowdsourcing searches and investigations more feasible. You saw this in the FBI Boston Marathon Bombing web-based Tip Line. You see it in the “See Something Say Something” initiatives throughout the country. You see it in AMBER alerts or even remote search and rescue efforts. You even see it in more routine instances like Washington State’s HERO program to reduce traffic violations.
Have these efforts been successful, and what challenges remain?
There are a number of issues to overcome with regard to crowdsourcing public safety—such as maintaining privacy rights, ensuring data quality, and improving trust between citizens and officers. Controversies over the National Security Agency’s surveillance program and neighborhood watch programs – particularly the shooting death of teenager Trayvon Martin by neighborhood watch captain George Zimmerman – reflect some of these challenges. Research has not yet established a precise set of success criteria, but the efforts that appear successful at the moment have tended to center on a particular crisis incident—such as a specific attack or missing person. But as more crowdsourcing public safety mobile applications are developed, adoption and use are likely to increase. One trend to watch is whether national public safety programs are able to tap into the existing social networks of community-based responders like American Red Cross volunteers, Community Emergency Response Teams, and United Way mentors.
The move toward crowdsourcing is part of an overall trend toward improving community resilience, which refers to a system’s ability to bounce back after a crisis or disturbance. Stephen Flynn and his colleagues at Northeastern’s George J. Kostas Research Institute for Homeland Security are playing a key role in driving a national conversation in this area. Community resilience is inherently multi-disciplinary, so you see research being done regarding transportation infrastructure, social media use after a crisis event, and designing sustainable urban environments. Northeastern is a place where use-inspired research is addressing real-world problems. It will take a village to improve community resilience capabilities, and our institution is a vital part of thought leadership for that village.”
 

If big data is an atomic bomb, disarmament begins in Silicon Valley


At GigaOM: “Big data is like atomic energy, according to scientist Albert-László Barabási in a Monday column on Politico. It’s very beneficial when used ethically, and downright destructive when turned into a weapon. He argues scientists can help resolve the damage done by government spying by embracing the principles of nuclear nonproliferation that helped bring an end to Cold War fears and distrust.
Barabási’s analogy is rather poetic:

“Powered by the right type of Big Data, data mining is a weapon. It can be just as harmful, with long-term toxicity, as an atomic bomb. It poisons trust, straining everything from human relations to political alliances and free trade. It may target combatants, but it cannot succeed without sifting through billions of data points scraped from innocent civilians. And when it is a weapon, it should be treated like a weapon.”

I think he’s right, but I think the fight to disarm the big data bomb begins in places like Silicon Valley and Madison Avenue. And it’s not just scientists; all citizens should have a role…
I write about big data and data mining for a living, and I think the underlying technologies and techniques are incredibly valuable, even if the applications aren’t always ideal. On the one hand, advances in machine learning from companies such as Google and Microsoft are fantastic. On the other hand, Facebook’s newly expanded Graph Search makes Europe’s proposed right-to-be-forgotten laws seem a lot more sensible.
But it’s all within the bounds of our user agreements and beauty is in the eye of the beholder.
Perhaps the reason we don’t vote with our feet by moving to web platforms that embrace privacy, even though we suspect it’s being violated, is that we really don’t know what privacy means. Instead of regulating what companies can and can’t do, perhaps lawmakers can mandate a degree of transparency that actually lets users understand how data is being used, not just what data is being collected. Great, some company knows my age, race, ZIP code and web history: What I really need to know is how it’s using that information to target, discriminate against or otherwise serve me.
An intelligent national discussion about the role of the NSA is probably in order. For all anyone knows, it could even turn out we’re willing to put up with more snooping than the government might expect. But until we get a handle on privacy from the companies we choose to do business with, I don’t think most Americans have the stomach for such a difficult fight.”

Where in the World are Young People Using the Internet?


Georgia Tech: “According to a common myth, today’s young people are all glued to the Internet. But in fact, only 30 percent of the world’s youth population between the ages of 15 and 24 years old has been active online for at least five years. In South Korea, 99.6 percent of young people are active, the highest percentage in the world. The least? The Southeast Asian nation of Timor-Leste, with less than 1 percent.

[Figure: Digital natives as a percentage of total population, 2012 (Courtesy: ITU)]

Those are among the many findings in a study from the Georgia Institute of Technology and International Telecommunication Union (ITU). The study is the first attempt to measure, by country, the world’s “digital natives.” The term is typically used to categorize young people born around the time the personal computer was introduced and who have spent their lives connected with technology.
Nearly 96 percent of American millennials are digital natives. That figure is behind Japan (99.5 percent) and several European countries, including Finland, Denmark and the Netherlands.
But the percentage that Georgia Tech Associate Professor Michael Best thinks is the most important is the number of digital natives as compared to a country’s total population….
The countries with the highest proportion of digital natives among their population are mostly rich nations, which have high levels of overall Internet penetration. Iceland is at the top of the list with 13.9 percent. The United States is sixth (13.1 percent). A big surprise is Malaysia, a middle-income country with one of the highest proportions of digital natives (ranked 4th at 13.4 percent). Malaysia has a strong history of investing in educational technology.
The countries with the smallest estimated proportion of digital natives are Timor-Leste, Myanmar and Sierra Leone. The bottom 10 consists entirely of African or Asian nations, many of which are suffering from conflict and/or have very low Internet availability.”

The Art of Making City Code Beautiful


Nancy Scola in Next City: “Some rather pretty legal websites have popped up lately: PhillyCode.org, ChicagoCode.org and, as of last Thursday, SanFranciscoCode.org. This is how municipal code would design itself if it actually wanted to be read.
The network of [city]Code.org sites is the output of The State Decoded, a project of the OpenGov Foundation, which has its own fascinating provenance. That D.C.-based non-profit grew out of the fight in Congress over the SOPA and PIPA digital copyright bills a few winters ago. At the time, the office of Rep. Darrell Issa, more recently of Benghazi fame, built a platform called Madison that invited the public to help edit an alternative bill. Madison outlived the SOPA debate, and was spun out last summer as the flagship project of the OpenGov Foundation, helmed by former Issa staffer Seamus Kraft.
“What we discovered,” Kraft says, “is that co-authoring legislation is high up there on what [the public wants to] do with government information, but it’s not at the top.” What heads the list, he says, is simply knowing “what are the laws?” Pre-SanFranciscoCode, the city’s laws on everything from elections to electrical installations to transportation were trapped in an interface, run by publisher American Legal, that would not have looked out of place in “WarGames.” (Here’s the comparable “old” site for Chicago. It’s probably enough to say that Philadelphia’s comes with a “Frames/No Frames” option.) Madison needed a base of clean, structured municipal code upon which to function, and Kraft and company were finding that in cities across the country, that just didn’t exist.
Fixing the code, Kraft says, starts with him “unlawyering the text” that is either supplied to them by the city or scraped from online. This involves reading through the city code and looking for signposts that indicate when sections start, how provisions nest within them, and other structural cues that establish a pattern. That breakdown gets passed to the organization’s developers, who use it to automatically parse the full corpus. The process is time-consuming. In San Francisco, 16 different patterns were required to capture each of the code’s sections. Often, the parser needs to be tweaked. “Sometimes it happens in a few minutes or a few hours,” Kraft says, “and sometimes it takes a few days.”
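To make the idea of “signposts” concrete, here is a minimal, hypothetical sketch of pattern-based parsing in Python. It is not the OpenGov Foundation’s actual parser; the structural levels, regular expressions, and sample text are all invented for illustration.

```python
import re

# Hypothetical signpost patterns, one per structural level of the code
# (San Francisco reportedly needed 16 such patterns).
PATTERNS = [
    ("article", re.compile(r"^ARTICLE\s+([IVXLC]+)\.\s+(.*)$")),
    ("section", re.compile(r"^SEC\.\s+(\d+(?:\.\d+)*)\.\s+(.*)$")),
    ("subsection", re.compile(r"^\(([a-z])\)\s+(.*)$")),
]

def parse_code(raw_text):
    """Walk the raw text line by line and emit (level, identifier, text) tuples."""
    structure = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        for level, pattern in PATTERNS:
            match = pattern.match(line)
            if match:
                structure.append((level, match.group(1), match.group(2)))
                break
        else:
            # Lines matching no signpost are body text of the last unit seen.
            if structure:
                level, ident, text = structure[-1]
                structure[-1] = (level, ident, text + " " + line)
    return structure

sample = """ARTICLE I. GENERAL PROVISIONS
SEC. 1.1. TITLE.
(a) This Code shall be known as the Municipal Code.
(b) References to the Code include all amendments."""

for unit in parse_code(sample):
    print(unit)
```

In practice each city’s code calls for its own set of patterns, and the parser is adjusted and re-run until the full corpus breaks down cleanly into nested sections.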

Over the long haul, Kraft has in mind adopting the customizability of YouVersion, the online digital Bible that allows users to choose fonts, colors and more. Kraft, a 2007 graduate of Georgetown who will cite the Catholic Church’s distributed structure as a model for networked government, proclaims YouVersion “the most kick-ass Bible you’ve ever seen. It’s stunning.” He’d like to do the same with municipal code, for the benefit of both the average American and those who have more regular engagement with local legal texts. “If you’re spending all day reading law,” he says, “you should at the very least have the most comfortable view possible.”

The Brave New World of Good


Brad Smith: “Welcome to the Brave New World of Good. Once almost the exclusive province of nonprofit organizations and the philanthropic foundations that fund them, today the terrain of good is disputed by social entrepreneurs, social enterprises, impact investors, big business, governments, and geeks. Their tools of choice are markets, open data, innovation, hackathons, and disruption. They cross borders, social classes, and paradigms with the swipe of a touch screen. We seem poised to unleash a whole new era of social and environmental progress, accompanied by unimagined economic prosperity.
As a brand, good is unassailably brilliant. Who could be against it? It is virtually impossible to write an even mildly skeptical blog post about good without sounding, well, bad — or at least a bit old-fashioned. For the record, I firmly believe there is much in the brave new world of good that is helping us find our way out of the tired and often failed models of progress and change on which we have for too long relied. Still, there are assumptions worth questioning and questions worth answering to ensure that the good we seek is the good that can be achieved.

Open Data
Second only to “good” in terms of marketing genius is the concept of “open data.” An offspring of previous movements such as “open source,” “open content,” and “open access,” open data in the Internet age has come to mean data that is machine-readable, free to access, and free to use, re-use, and re-distribute, subject to attribution. Fully open data goes way beyond posting your .pdf document on a Web site (as neatly explained by Tim Berners-Lee’s five-star framework).
When it comes to government, there is a rapidly accelerating movement around the world that is furthering transparency by making vast stores of data open. Ditto on the data of international aid funders like the United States Agency for International Development, the World Bank, and the Organisation for Economic Co-operation and Development. The push has now expanded to the tax return data of nonprofits and foundations (IRS Forms 990). Collection of data by government has a business model; it’s called tax dollars. However, open data is not born pure. Cleaning that data, making it searchable, and building and maintaining reliable user interfaces is complex, time-consuming, and often expensive. That requires a consistent stream of income of the kind that can only come from fees, subscriptions, or, increasingly less so, government.
Foundation grants are great for short-term investment, experimentation, or building an app or two, but they are no substitute for a scalable business model. Structured, longitudinal data are vital to social, environmental, and economic progress. In a global economy where government is retreating from the funding of public goods, figuring out how to pay for the cost of that data is one of our greatest challenges.”

Towards an effective framework for building smart cities: Lessons from Seoul and San Francisco


New paper by JH Lee, MG Hancock, MC Hu in Technological Forecasting and Social Change: “This study aims to shed light on the process of building an effective smart city by integrating various practical perspectives with a consideration of smart city characteristics taken from the literature. We developed a framework for conducting case studies examining how smart cities were being implemented in San Francisco and Seoul Metropolitan City. The study’s empirical results suggest that effective, sustainable smart cities emerge as a result of dynamic processes in which public and private sector actors coordinate their activities and resources on an open innovation platform. The different yet complementary linkages formed by these actors must further be aligned with respect to their developmental stage and embedded cultural and social capabilities. Our findings point to eight ‘stylized facts’, based on both quantitative and qualitative empirical results that underlie the facilitation of an effective smart city. In elaborating these facts, the paper offers useful insights to managers seeking to improve the delivery of smart city developmental projects.”
 

Using Big Data to Ask Big Questions


Chase Davis in the SOURCE: “First, let’s dispense with the buzzwords. Big Data isn’t what you think it is: Every federal campaign contribution over the last 30-plus years amounts to several tens of millions of records. That’s not Big. Neither is a dataset of 50 million Medicare records. Or even 260 gigabytes of files related to offshore tax havens—at least not when Google counts its data in exabytes. No, the stuff we analyze in pursuit of journalism and app-building is downright tiny by comparison.
But you know what? That’s ok. Because while super-smart Silicon Valley PhDs are busy helping Facebook crunch through petabytes of user data, they’re also throwing off intellectual exhaust that we can benefit from in the journalism and civic data communities. Most notably: the ability to ask Big Questions.
Most of us who analyze public data for fun and profit are familiar with small questions. They’re focused, incisive, and often have the kind of black-and-white, definitive answers that end up in news stories: How much money did Barack Obama raise in 2012? Is the murder rate in my town going up or down?
Big Questions, on the other hand, are speculative, exploratory, and systemic. As the name implies, they are also answered at scale: Rather than distilling a small slice of a dataset into a concrete answer, Big Questions look at entire datasets and reveal small questions you wouldn’t have thought to ask.
Can we track individual campaign donor behavior over decades, and what does that tell us about their influence in politics? Which neighborhoods in my city are experiencing spikes in crime this week, and are police changing patrols accordingly?
Or, by way of example, how often do interest groups propose cookie-cutter bills in state legislatures?

Looking at Legislation

Even if you don’t follow politics, you probably won’t be shocked to learn that lawmakers don’t always write their own bills. In fact, interest groups sometimes write them word-for-word.
Sometimes those groups even try to push their bills in multiple states. The conservative American Legislative Exchange Council has gotten some press, but liberal groups, social and business interests, and even sororities and fraternities have done it too.
On its face, something about elected officials signing their names to cookie-cutter bills runs head-first against people’s ideal of deliberative democracy—hence, it tends to make news. Those can be great stories, but they’re often limited in scope to a particular bill, politician, or interest group. They’re based on small questions.
Data science lets us expand our scope. Rather than focusing on one bill, or one interest group, or one state, why not ask: How many model bills were introduced in all 50 states, period, by anyone, during the last legislative session? No matter what they’re about. No matter who introduced them. No matter where they were introduced.
Now that’s a Big Question. And with some basic data science, it’s not particularly hard to answer—at least at a superficial level.

Analyze All the Things!

Just for kicks, I tried building a system to answer this question earlier this year. It was intended as an example, so I tried to choose methods that would make intuitive sense. But it also makes liberal use of techniques applied often to Big Data analysis: k-means clustering, matrices, graphs, and the like.
If you want to follow along, the code is here….
To make exploration a little easier, my code represents similar bills in graph space, shown at the top of this article. Each dot (known as a node) represents a bill. And a line connecting two bills (known as an edge) means they were sufficiently similar, according to my criteria (a cosine similarity of 0.75 or above). Thrown into visualization software like Gephi, it’s easy to click around the clusters and see what pops out. So what do we find?
There are 375 clusters in total. Because of the limitations of our data, many of them represent vague, subject-specific bills that just happen to have similar titles even though the legislation itself is probably very different (think things like “Budget Bill” and “Campaign Finance Reform”). This is where having full bill text would come in handy.
But mixed in with those bills are a handful of interesting nuggets. Several bills that appear to be modeled after legislation by the National Conference of Insurance Legislators appear in multiple states, among them: a bill related to limited lines travel insurance; another related to unclaimed insurance benefits; and one related to certificates of insurance.”
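Davis links to his actual code in the piece; purely as an illustrative sketch of the pipeline he describes (vectorizing bill text, connecting pairs whose cosine similarity clears 0.75, and reading clusters off the resulting graph), something like the following Python could work. The bill titles, labels, and library choices here are assumptions made for illustration, and the real analysis worked over far more text and also used techniques such as k-means clustering.

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical bill titles standing in for a real corpus of state legislation.
bills = [
    ("CA AB 101", "An act relating to limited lines travel insurance"),
    ("TX HB 205", "Limited lines travel insurance act"),
    ("NY S 330", "Unclaimed life insurance benefits act"),
    ("FL HB 412", "An act relating to unclaimed insurance benefits"),
    ("WA SB 518", "Campaign finance reform"),
]
labels = [bill_id for bill_id, _ in bills]
titles = [title for _, title in bills]

# Represent each bill as a TF-IDF vector and compute pairwise cosine similarity.
vectors = TfidfVectorizer(stop_words="english").fit_transform(titles)
similarity = cosine_similarity(vectors)

# Nodes are bills; an edge means the pair cleared the 0.75 similarity cutoff.
graph = nx.Graph()
graph.add_nodes_from(labels)
THRESHOLD = 0.75
for i in range(len(bills)):
    for j in range(i + 1, len(bills)):
        if similarity[i, j] >= THRESHOLD:
            graph.add_edge(labels[i], labels[j], weight=float(similarity[i, j]))

# Connected components stand in for the clusters explored visually in Gephi.
for cluster in nx.connected_components(graph):
    if len(cluster) > 1:
        print(sorted(cluster))
```

A graph exported from networkx (for example with nx.write_gexf) can then be opened in Gephi for the kind of visual exploration described above.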

Best Practices for Government Crowdsourcing Programs


Anton Root: “Crowdsourcing helps communities connect and organize, so it makes sense that governments are increasingly making use of crowd-powered technologies and processes.
Just recently, for instance, we wrote about the Malaysian government’s initiative to crowdsource the national budget. Closer to home, we’ve seen government agencies from USAID to NASA make use of the crowd.
Daren Brabham, professor at the University of Southern California, recently published a report titled “Using Crowdsourcing In Government” that introduces readers to the basics of crowdsourcing, highlights effective use cases, and establishes best practices when it comes to governments opening up to the crowd. Below, we take a look at a few of the suggestions Brabham makes to those considering crowdsourcing.
Brabham splits up his ten best practices into three phases: planning, implementation, and post-implementation. The first suggestion in the planning phase he makes may be the most critical of all: “Clearly define the problem and solution parameters.” If the community isn’t absolutely clear on what the problem is, the ideas and solutions that users submit will be equally vague and largely useless.
This applies not only to government agencies, but also to SMEs and large enterprises making use of crowdsourcing. At Massolution NYC 2013, for instance, we heard again and again the importance of meticulously defining a problem. And open innovation platform InnoCentive’s CEO Andy Zynga stressed the big role his company plays in helping organizations do away with the “curse of knowledge.”
Brabham also has advice for projects in their implementation phase, the key bit being: “Launch a promotional plan and a plan to grow and sustain the community.” Simply put, crowdsourcing cannot work without a crowd, so it’s important to build up the community before launching a campaign. It does take some balance, however, as a community that’s too large by the time a campaign launches can turn off newcomers who “may not feel welcome or may be unsure how to become initiated into the group or taken seriously.”
Brabham’s key advice for the post-implementation phase is: “Assess the project from many angles.” The author suggests tracking website traffic patterns, asking users to volunteer information about themselves when registering, and doing original research through surveys and interviews. The results of follow-up research can help to better understand the responses submitted, and also make it easier to show the successes of the crowdsourcing campaign. This is especially important for organizations partaking in ongoing crowdsourcing efforts.”

The Solution Revolution


New book by William D. Eggers and Paul Macmillan from Deloitte: “Where tough societal problems persist, citizens, social enterprises, and yes, even businesses, are relying less and less on government-only solutions. More likely, they are crowd funding, ride-sharing, app-developing or impact-investing to design lightweight solutions for seemingly intractable problems. No challenge is too daunting, from malaria in Africa to traffic congestion in California.
These wavemakers range from edgy social enterprises growing at a clip of 15% a year, to mega-foundations that are eclipsing development aid, to Fortune 500 companies delivering social good on the path to profit. The collective force of these new problem solvers is creating dynamic and rapidly evolving markets for social good. They trade solutions instead of dollars to fill the gap between what government can provide and what citizens need. By erasing public-private sector boundaries, they are unlocking trillions of dollars in social benefit and commercial value.
The Solution Revolution explores how public and private are converging to form the Solution Economy. By examining scores of examples, Eggers and Macmillan reveal the fundamentals of this new – globally prevalent – economic and social order. The book is designed to help guide those willing to invest time, knowledge or capital toward sustainable, social progress.”

Imagining Data Without Division


Thomas Lin in Quanta Magazine: “As science dives into an ocean of data, the demands of large-scale interdisciplinary collaborations are growing increasingly acute…Seven years ago, when David Schimel was asked to design an ambitious data project called the National Ecological Observatory Network, it was little more than a National Science Foundation grant. There was no formal organization, no employees, no detailed science plan. Emboldened by advances in remote sensing, data storage and computing power, NEON sought answers to the biggest question in ecology: How do global climate change, land use and biodiversity influence natural and managed ecosystems and the biosphere as a whole?…
For projects like NEON, interpreting the data is a complicated business. Early on, the team realized that its data, while mid-size compared with the largest physics and biology projects, would be big in complexity. “NEON’s contribution to big data is not in its volume,” said Steve Berukoff, the project’s assistant director for data products. “It’s in the heterogeneity and spatial and temporal distribution of data.”
Unlike the roughly 20 critical measurements in climate science or the vast but relatively structured data in particle physics, NEON will have more than 500 quantities to keep track of, from temperature, soil and water measurements to insect, bird, mammal and microbial samples to remote sensing and aerial imaging. Much of the data is highly unstructured and difficult to parse — for example, taxonomic names and behavioral observations, which are sometimes subject to debate and revision.
And, as daunting as the looming data crush appears from a technical perspective, some of the greatest challenges are wholly nontechnical. Many researchers say the big science projects and analytical tools of the future can succeed only with the right mix of science, statistics, computer science, pure mathematics and deft leadership. In the big data age of distributed computing — in which enormously complex tasks are divided across a network of computers — the question remains: How should distributed science be conducted across a network of researchers?
Part of the adjustment involves embracing “open science” practices, including open-source platforms and data analysis tools, data sharing and open access to scientific publications, said Chris Mattmann, 32, who helped develop a precursor to Hadoop, a popular open-source data analysis framework that is used by tech giants like Yahoo, Amazon and Apple and that NEON is exploring. Without developing shared tools to analyze big, messy data sets, Mattmann said, each new project or lab will squander precious time and resources reinventing the same tools. Likewise, sharing data and published results will obviate redundant research.
To this end, international representatives from the newly formed Research Data Alliance met this month in Washington to map out their plans for a global open data infrastructure.”