New paper by JH Lee, MG Hancock, MC Hu in Technological Forecasting and Social Change: “This study aims to shed light on the process of building an effective smart city by integrating various practical perspectives with a consideration of smart city characteristics taken from the literature. We developed a framework for conducting case studies examining how smart cities were being implemented in San Francisco and Seoul Metropolitan City. The study’s empirical results suggest that effective, sustainable smart cities emerge as a result of dynamic processes in which public and private sector actors coordinate their activities and resources on an open innovation platform. The different yet complementary linkages formed by these actors must further be aligned with respect to their developmental stage and embedded cultural and social capabilities. Our findings point to eight ‘stylized facts’, based on both quantitative and qualitative empirical results that underlie the facilitation of an effective smart city. In elaborating these facts, the paper offers useful insights to managers seeking to improve the delivery of smart city developmental projects.”
Using Big Data to Ask Big Questions
Chase Davis in the SOURCE: “First, let’s dispense with the buzzwords. Big Data isn’t what you think it is: Every federal campaign contribution over the last 30-plus years amounts to several tens of millions of records. That’s not Big. Neither is a dataset of 50 million Medicare records. Or even 260 gigabytes of files related to offshore tax havens—at least not when Google counts its data in exabytes. No, the stuff we analyze in pursuit of journalism and app-building is downright tiny by comparison.
But you know what? That’s ok. Because while super-smart Silicon Valley PhDs are busy helping Facebook crunch through petabytes of user data, they’re also throwing off intellectual exhaust that we can benefit from in the journalism and civic data communities. Most notably: the ability to ask Big Questions.
Most of us who analyze public data for fun and profit are familiar with small questions. They’re focused, incisive, and often have the kind of black-and-white, definitive answers that end up in news stories: How much money did Barack Obama raise in 2012? Is the murder rate in my town going up or down?
Big Questions, on the other hand, are speculative, exploratory, and systemic. As the name implies, they are also answered at scale: Rather than distilling a small slice of a dataset into a concrete answer, Big Questions look at entire datasets and reveal small questions you wouldn’t have thought to ask.
Can we track individual campaign donor behavior over decades, and what does that tell us about their influence in politics? Which neighborhoods in my city are experiencing spikes in crime this week, and are police changing patrols accordingly?
Or, by way of example, how often do interest groups propose cookie-cutter bills in state legislatures?
Looking at Legislation
Even if you don’t follow politics, you probably won’t be shocked to learn that lawmakers don’t always write their own bills. In fact, interest groups sometimes write them word-for-word.
Sometimes those groups even try to push their bills in multiple states. The conservative American Legislative Exchange Council has gotten some press, but liberal groups, social and business interests, and even sororities and fraternities have done it too.
On its face, something about elected officials signing their names to cookie-cutter bills runs head-first against people’s ideal of deliberative Democracy—hence, it tends to make news. Those can be great stories, but they’re often limited in scope to a particular bill, politician, or interest group. They’re based on small questions.
Data science lets us expand our scope. Rather than focusing on one bill, or one interest group, or one state, why not ask: How many model bills were introduced in all 50 states, period, by anyone, during the last legislative session? No matter what they’re about. No matter who introduced them. No matter where they were introduced.
Now that’s a Big Question. And with some basic data science, it’s not particularly hard to answer—at least at a superficial level.
Analyze All the Things!
Just for kicks, I tried building a system to answer this question earlier this year. It was intended as an example, so I tried to choose methods that would make intuitive sense. But it also makes liberal use of techniques applied often to Big Data analysis: k-means clustering, matrices, graphs, and the like.
If you want to follow along, the code is here….
To make exploration a little easier, my code represents similar bills in graph space, shown at the top of this article. Each dot (known as a node) represents a bill. And a line connecting two bills (known as an edge) means they were sufficiently similar, according to my criteria (a cosine similarity of 0.75 or above). Thrown into a visualization software like Gephi, it’s easy to click around the clusters and see what pops out. So what do we find?
There are 375 clusters in total. Because of the limitations of our data, many of them represent vague, subject-specific bills that just happen to have similar titles even though the legislation itself is probably very different (think things like “Budget Bill” and “Campaign Finance Reform”). This is where having full bill text would come handy.
But mixed in with those bills are a handful of interesting nuggets. Several bills that appear to be modeled after legislation by the National Conference of Insurance Legislators appear in multiple states, among them: a bill related to limited lines travel insurance; another related to unclaimed insurance benefits; and one related to certificates of insurance.”
Best Practices for Government Crowdsourcing Programs
Anton Root: “Crowdsourcing helps communities connect and organize, so it makes sense that governments are increasingly making use of crowd-powered technologies and processes.
Just recently, for instance, we wrote about the Malaysian government’s initiative to crowdsource the national budget. Closer to home, we’ve seen government agencies from U.S. AID to NASA make use of the crowd.
Daren Brabham, professor at the University of Southern California, recently published a report titled “Using Crowdsourcing In Government” that introduces readers to the basics of crowdsourcing, highlights effective use cases, and establishes best practices when it comes to governments opening up to the crowd. Below, we take a look at a few of the suggestions Brabham makes to those considering crowdsourcing.
Brabham splits up his ten best practices into three phases: planning, implementation, and post-implementation. The first suggestion in the planning phase he makes may be the most critical of all: “Clearly define the problem and solution parameters.” If the community isn’t absolutely clear on what the problem is, the ideas and solutions that users submit will be equally vague and largely useless.
This applies not only to government agencies, but also to SMEs and large enterprises making use of crowdsourcing. At Massolution NYC 2013, for instance, we heard again and again the importance of meticulously defining a problem. And open innovation platform InnoCentive’s CEO Andy Zynga stressed the big role his company plays in helping organizations do away with the “curse of knowledge.”
Brabham also has advice for projects in their implementation phase, the key bit being: “Launch a promotional plan and a plan to grow and sustain the community.” Simply put, crowdsourcing cannot work without a crowd, so it’s important to build up the community before launching a campaign. It does take some balance, however, as a community that’s too large by the time a campaign launches can turn off newcomers who “may not feel welcome or may be unsure how to become initiated into the group or taken seriously.”
Brabham’s key advice for the post-implementation phase is: “Assess the project from many angles.” The author suggests tracking website traffic patterns, asking users to volunteer information about themselves when registering, and doing original research through surveys and interviews. The results of follow-up research can help to better understand the responses submitted, and also make it easier to show the successes of the crowdsourcing campaign. This is especially important for organizations partaking in ongoing crowdsourcing efforts.”
The Solution Revolution
New book by William D. Eggers and Paul Macmillan from Deloitte: “Where tough societal problems persist, citizens, social enterprises, and yes, even businesses, are relying less and less on government-only solutions. More likely, they are crowd funding, ride-sharing, app- developing or impact- investing to design lightweight solutions for seemingly intractable problems. No challenge is too daunting, from malaria in Africa to traffic congestion in California.
These wavemakers range from edgy social enterprises growing at a clip of 15% a year, to mega-foundations that are eclipsing development aid, to Fortune 500 companies delivering social good on the path to profit. The collective force of these new problem solvers is creating dynamic and rapidly evolving markets for social good. They trade solutions instead of dollars to fill the gap between what government can provide and what citizens need. By erasing public-private sector boundaries, they are unlocking trillions of dollars in social benefit and commercial value.
The Solution Revolution explores how public and private are converging to form the Solution Economy. By examining scores of examples, Eggers and Macmillan reveal the fundamentals of this new – globally prevalent – economic and social order. The book is designed to help guide those willing to invest time, knowledge or capital toward sustainable, social progress.”
Imagining Data Without Division
Thomas Lin in Quanta Magazine: “As science dives into an ocean of data, the demands of large-scale interdisciplinary collaborations are growing increasingly acute…Seven years ago, when David Schimel was asked to design an ambitious data project called the National Ecological Observatory Network, it was little more than a National Science Foundation grant. There was no formal organization, no employees, no detailed science plan. Emboldened by advances in remote sensing, data storage and computing power, NEON sought answers to the biggest question in ecology: How do global climate change, land use and biodiversity influence natural and managed ecosystems and the biosphere as a whole?…
For projects like NEON, interpreting the data is a complicated business. Early on, the team realized that its data, while mid-size compared with the largest physics and biology projects, would be big in complexity. “NEON’s contribution to big data is not in its volume,” said Steve Berukoff, the project’s assistant director for data products. “It’s in the heterogeneity and spatial and temporal distribution of data.”
Unlike the roughly 20 critical measurements in climate science or the vast but relatively structured data in particle physics, NEON will have more than 500 quantities to keep track of, from temperature, soil and water measurements to insect, bird, mammal and microbial samples to remote sensing and aerial imaging. Much of the data is highly unstructured and difficult to parse — for example, taxonomic names and behavioral observations, which are sometimes subject to debate and revision.
And, as daunting as the looming data crush appears from a technical perspective, some of the greatest challenges are wholly nontechnical. Many researchers say the big science projects and analytical tools of the future can succeed only with the right mix of science, statistics, computer science, pure mathematics and deft leadership. In the big data age of distributed computing — in which enormously complex tasks are divided across a network of computers — the question remains: How should distributed science be conducted across a network of researchers?
Part of the adjustment involves embracing “open science” practices, including open-source platforms and data analysis tools, data sharing and open access to scientific publications, said Chris Mattmann, 32, who helped develop a precursor to Hadoop, a popular open-source data analysis framework that is used by tech giants like Yahoo, Amazon and Apple and that NEON is exploring. Without developing shared tools to analyze big, messy data sets, Mattmann said, each new project or lab will squander precious time and resources reinventing the same tools. Likewise, sharing data and published results will obviate redundant research.
To this end, international representatives from the newly formed Research Data Alliance met this month in Washington to map out their plans for a global open data infrastructure.”
User-Generated Content Is Here to Stay
Azeem Khan in the Huffington Post: “The way media are transmitted has changed dramatically over the last 10 years. User-generated content (UGC) has completely changed the landscape of social interaction, media outreach, consumer understanding, and everything in between. Today, UGC is media generated by the consumer instead of the traditional journalists and reporters. This is a movement defying and redefining traditional norms at the same time. Current events are largely publicized on Twitter and Facebook by the average person, and not by a photojournalist hired by a news organization. In the past, these large news corporations dominated the headlines — literally — and owned the monopoly on public media. Yet with the advent of smartphones and spread of social media, everything has changed. The entire industry has been replaced; smartphones have supplanted how information is collected, packaged, edited, and conveyed for mass distribution. UGC allows for raw and unfiltered movement of content at lightening speed. With the way that the world works today, it is the most reliable way to get information out. One thing that is for certain is that UGC is here to stay whether we like it or not, and it is driving much more of modern journalistic content than the average person realizes.
Think about recent natural disasters where images are captured by citizen journalists using their iPhones. During Hurricane Sandy, 800,000 photos uploaded onto Instagram with “#Sandy.” Time magazine even hired five iPhoneographers to photograph the wreckage for its Instagram page. During the May 2013 Oklahoma City tornadoes, the first photo released was actually captured by a smartphone. This real-time footage brings environmental chaos to your doorstep in a chillingly personal way, especially considering the photographer of the first tornado photos ultimately died because of the tornado. UGC has been monumental for criminal investigations and man-made catastrophes. Most notably, the Boston Marathon bombing was covered by UGC in the most unforgettable way. Dozens of images poured in identifying possible Boston bombers, to both the detriment and benefit of public officials and investigators. Though these images inflicted considerable damage to innocent bystanders sporting suspicious backpacks, ultimately it was also smartphone images that highlighted the presence of the Tsarnaev brothers. This phenomenon isn’t limited to America. Would the so-called Arab Spring have happened without social media and UGC? Syrians, Egyptians, and citizens from numerous nations facing protests can easily publicize controversial images and statements to be shared worldwide….
This trend is not temporary but will only expand. The first iPhone launched in 2007, and the world has never been the same. New smartphones are released each month with better cameras and faster processors than computers had even just a few years ago….”
When the Government Shuts, Even Web Sites Go Down
In keeping with the senseless nature of the shutdown, some Web sites are down while others are still up. The Federal Trade Commission, for instance, has blocked access to its site. It has posted a notice online saying that it’s closed indefinitely as are its systems for people to register complaints or enter telephone numbers on the do-not call list. By contrast, the Department of Education has left its site up with a notice informing visitors that it will not be updated during the shutdown. Sites for the White House, Treasury and the Internal Revenue Service, are being updated at least in part. (Here’s a pretty comprehensive list of which sites are up and which are not.)
Each department and agency has had to decide what to do with its Web site based on its interpretation of federal laws and rules. In a memo (PDF) written last month, the Office of Management and Budget offered some guidance to officials trying to figure out what to do. ..In further keeping with the truly bizarre nature of government shutdowns, the O.M.B. also reminded government officials that they should pay no attention to whether it will cost more to shut down their Web site than it does to keep it going.”
See also: Blacked Out Government Websites Available Through Wayback Machine
5 Ways Cities Are Using Big Data
Eric Larson in Mashable: “New York City released more than 200 high-value data sets to the public on Monday — a way, in part, to provide more content for open-sourced mapping projects like OpenStreetMap.
It’s one of the many releases since the Local Law 11 of 2012 passed in February, which calls for more transparency of the city government’s collected data.
But it’s not just New York: Cities across the world, large and small, are utilizing big data sets — like traffic statistics, energy consumption rates and GPS mapping — to launch projects to help their respective communities.
We rounded up a few of our favorites below….
1. Seattle’s Power Consumption
The city of Seattle recently partnered with Microsoft and Accenture on a pilot project to reduce the area’s energy usage. Using Microsoft’s Azure cloud, the project will collect and analyze hundreds of data sets collected from four downtown buildings’ management systems.
With predictive analytics, then, the system will work to find out what’s working and what’s not — i.e. where energy can be used less, or not at all. The goal is to reduce power usage by 25%.
2. SpotHero
SpotHero is an app, for both iOS and Android devices, that tracks down parking spots in a select number of cities. How it works: Users type in an address or neighborhood (say, Adams Morgan in Washington, D.C.) and are taken to a listing of available garages and lots nearby — complete with prices and time durations.
The app tracks availability in real-time, too, so a spot is updated in the system as soon as it’s snagged.
Seven cities are currently synced with the app: Washington, D.C., New York, Chicago, Baltimore, Boston, Milwaukee and Newark, N.J.
3. Adopt-a-Hydrant
In January, the city’s Office of New Urban Mechanics released an app called Adopt-a-Hydrant. The program is mapped with every fire hydrant in the city proper — more than 13,000, according to a Harvard blog post — and lets residents pledge to shovel out one, or as many as they choose, in the almost inevitable event of a blizzard.
Once a pledge is made, volunteers receive a notification if their hydrant — or hydrants — become buried in snow.
4. Adopt-a-Sidewalk
Similar to Adopt-a-Hydrant, Chicago’s Adopt-a-Sidewalk app lets residents of the Windy City pledge to shovel sidewalks after snowfall. In a city just as notorious for snowstorms as Boston, it’s an effective way to ensure public spaces remain free of snow and ice — especially spaces belonging to the elderly or disabled.
If you’re unsure which part of town you’d like to “adopt,” just register on the website and browse the map — you’ll receive a pop-up notification for each street you swipe that’s still available.
5. Less Congestion for Lyon
The system, called the “Decision Support System Optimizer (DSSO),” uses real-time traffic reports to detect and predict congestions. If an operator sees that a traffic jam is likely to occur, then, she/he can adjust traffic signals accordingly to keep the flow of cars moving smoothly.
It’s an especially helpful tool for emergencies — say, when an ambulance is en route to the hospital. Over time, the algorithms in the system will “learn” from its most successful recommendations, then apply that knowledge when making future predictions.”
Data Swap
This isn’t your mother’s hackathon.
There’s no conference room full over over-caffeinated and under-deodorized engineers, no 72 hour time limit, and no room for shoddy prototypes. This is an opportunity for a select number of gifted researchers to join interdisciplinary teams to work on the pressing and meaningful problems facing Boston communities.
Unlike hackathons, meant to generate quick ideas and prototypes in a short period of time, DataSwap is about forging and supporting long-term collaborations between researchers, communities and data guardians. Groups sharing common interests and complementary skills will collaborate around specific problems. Each problem will be proposed by the owners of one of the datasets who present. On day one at The Boston Globe, you’ll learn more about that dataset and others to help you in your research. You’ll be given a community facilitator to help you craft useful research that is relevant outside the bounds of academia. Then, it’s up to you! Over the next several months, you and your team are challenged to craft a presentation around the problem you were given. At the conclusion of the time frame, we’ll reconvene to share our findings with one another and choose a winner.”
Embracing Expertise
It concerns and bothers me that most technologists are male and white but I am not concerned, in fact I am quite thrilled, these experts are taking political charge. I tend to agree with Michael Shudson’s reading of Walter Lippman that when it comes to democracy we need more experts not less: “The intellectual challenge is not to invent democracy without experts, but to seek a way to harness experts to a legitimately democratic function.”
Imagine if as many doctors and professors mobilized their moral authority and expertise as hackers have done, to rise up and intervene in the problems plaguing their vocational domains. Professors would be visibly denouncing the dismal and outrageous labor conditions of adjuncts whose pay is a pittance. Doctors would be involved in the fight for more affordable health care in the United States. Mobilizing expertise does not mean other stakeholders can’t and should not have a voice but there are many practical and moral reasons why we should embrace a politics of expertise, especially if configured to allow more generally contributions.
More than any other group of experts, hackers have shown how productive an expert based politics can be. And many domains of hacker and geek politics such as the Pirate Parties and Anonymous are interesting precisely for how they marry an open participatory element along with a more technical, expert-based one. Expertise can co-exist with participation if configured as such.
My sense is that hacker (re: technically informed) based politics will grow more important in years to come. Just last week I went to visit one hacker-activist, Jeremy Hammond who is in jail for his politically motivated acts of direct action. I asked him what he thought of Edward Snowden’s revelations about the NSA’s blanket surveillance of American citizens. Along with saying he was encouraged for someone dared to expose this wrongdoing (as many of us are), he captured the enormous power held by hackers and technologists when he followed with this statement: “there are all these nerds who don’t agree with what is politically happening and they have power.”
Hammond and others are exercising their technical power and I generally think this is a net gain for democracy. But it is why we must diligently work toward establishing more widespread digital and technical literacy. The low numbers of female technologists and other minorities in and out of hacker-dom are appalling and disturbing (and why I am involved with initiatives like those of NCWIT to rectify this problem). There are certainly barriers internal to the hacker world but the problems are so entrenched and so systematic unless those are solved, the numbers of women in voluntary and political domains will continue to be low.
So it is not that expertise is the problem. It is the barriers that prevent a large class of individuals from ever becoming experts that concerns me the most”.