Why Crowdsourcing is the Next Cloud Computing


Alpheus Bingham, co-founder and a member of the board of directors at InnoCentive, in Wired: “But over the course of a decade, what we now call cloud-based or software-as-a-service (SaaS) applications has taken the world by storm and become mainstream. Today, cloud computing is an umbrella term that applies to a wide variety of successful technologies (and business models), from business apps like Salesforce.com, to infrastructure like Amazon Elastic Compute Cloud (Amazon EC2), to consumer apps like Netflix. It took years for all these things to become mainstream, and if the last decade saw the emergence (and eventual dominance) of the cloud over previous technologies and models, this decade will see the same thing with crowdsourcing.
Both an art and a science, crowdsourcing taps into the global experience and wisdom of individuals, teams, communities, and networks to accomplish tasks and work. It doesn’t matter who you are, where you live, or what you do or believe — in fact, the more diversity of thought and perspective, the better. Diversity is king and it’s common for people on the periphery of — or even completely outside of — a discipline or science to end up solving important problems.
The specific nature of the work offers few constraints – from a small business needing a new logo, to the large consumer goods company looking to ideate marketing programs, or to the nonprofit research organization looking to find a biomarker for ALS, the value is clear as well.
To get to the heart of the matter on why crowdsourcing is this decade’s cloud computing, several immediate reasons come to mind:
Crowdsourcing Is Disruptive
Much as cloud computing has created a new guard that in many ways threatens the old guard, so too has crowdsourcing. …
Crowdsourcing Provides On-Demand Talent Capacity
Labor is expensive and good talent is scarce. Think about the cost of adding ten additional researchers to a 100-person R&D team. You’ve increased your research capacity by 10% (more or less), but at a significant cost – and, a significant FIXED cost at that. …
Crowdsourcing Enables Pay-for-Performance
You pay as you go with cloud computing — gone are the days of massive upfront capital expenditures followed by years of ongoing maintenance and upgrade costs. Crowdsourcing does even better: you pay for solutions, not effort, which predictably sometimes results in failure. In fact, with crowdsourcing, the marketplace bears the cost of failure, not you….
Crowdsourcing “Consumerizes” Innovation
Crowdsourcing can provide a platform for bi-directional communication and collaboration with diverse individuals and groups, whether internal or external to your organization — employees, customers, partners and suppliers. Much as cloud computing has consumerized technology, crowdsourcing has the same potential to consumerize innovation, and more broadly, how we collaborate to bring new ideas, products and services to market.
Crowdsourcing Provides Expert Services and Skills That You Don’t Possess
One of the early value propositions of cloud-based business apps was that you didn’t need to engage IT to deploy them or Finance to help procure them, thereby allowing general managers and line-of-business heads to do their jobs more fluently and more profitably…”

The small-world effect is a modern phenomenon


New paper by Seth A. Marvel, Travis Martin, Charles R. Doering, David Lusseau, M. E. J. Newman: “The “small-world effect” is the observation that one can find a short chain of acquaintances, often of no more than a handful of individuals, connecting almost any two people on the planet. It is often expressed in the language of networks, where it is equivalent to the statement that most pairs of individuals are connected by a short path through the acquaintance network. Although the small-world effect is well-established empirically for contemporary social networks, we argue here that it is a relatively recent phenomenon, arising only in the last few hundred years: for most of mankind’s tenure on Earth the social world was large, with most pairs of individuals connected by relatively long chains of acquaintances, if at all. Our conclusions are based on observations about the spread of diseases, which travel over contact networks between individuals and whose dynamics can give us clues to the structure of those networks even when direct network measurements are not available. As an example we consider the spread of the Black Death in 14th-century Europe, which is known to have traveled across the continent in well-defined waves of infection over the course of several years. Using established epidemiological models, we show that such wave-like behavior can occur only if contacts between individuals living far apart are exponentially rare. We further show that if long-distance contacts are exponentially rare, then the shortest chain of contacts between distant individuals is on average a long one. The observation of the wave-like spread of a disease like the Black Death thus implies a network without the small-world effect.”
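The link the abstract draws between rare long-distance contacts and long chains of acquaintances can be illustrated with a toy network. The sketch below is not the paper's epidemiological model: the ring layout, the function names, the network size of 200 and the ten shortcuts are all assumptions chosen for illustration. It measures the mean shortest path in a network where everyone knows only their two nearest neighbours, then again after a handful of long-range contacts are added.

```python
import random
from collections import deque


def ring_lattice(n):
    """Toy 'large world': node i knows only its two nearest neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_long_range_contacts(graph, count, seed=0):
    """Add `count` contacts between nodes at least a quarter-ring apart."""
    n = len(graph)
    rng = random.Random(seed)  # seeded so the run is repeatable
    for _ in range(count):
        a = rng.randrange(n)
        b = (a + rng.randrange(n // 4, 3 * n // 4)) % n
        graph[a].add(b)
        graph[b].add(a)
    return graph


def mean_path_length(graph):
    """Average shortest-path length over all connected ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


local_only = mean_path_length(ring_lattice(200))
with_shortcuts = mean_path_length(
    add_long_range_contacts(ring_lattice(200), count=10)
)
print(f"local contacts only: {local_only:.1f} steps on average")
print(f"with 10 long-range contacts: {with_shortcuts:.1f} steps on average")
```

With local contacts only, a typical pair in this 200-node ring is roughly 50 steps apart; adding even a few long-range contacts shortens the average chain, which is the small-world effect the authors argue was absent for most of history.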

Facilitating scientific discovery through crowdsourcing and distributed participation


Antony Williams in EMBnet.journal: “Science has evolved from the isolated individual tinkering in the lab, through the era of the “gentleman scientist” with his or her assistant(s), to group-based then expansive collaboration and now to an opportunity to collaborate with the world. With the advent of the internet the opportunity for crowd-sourced contribution and large-scale collaboration has exploded and, as a result, scientific discovery has been further enabled. The contributions of enormous open data sets, liberal licensing policies and innovative technologies for mining and linking these data has given rise to platforms that are beginning to deliver on the promise of semantic technologies and nanopublications, facilitated by the unprecedented computational resources available today, especially the increasing capabilities of handheld devices. The speaker will provide an overview of his experiences in developing a crowdsourced platform for chemists allowing for data deposition, annotation and validation. The challenges of mapping chemical and pharmacological data, especially in regards to data quality, will be discussed. The promise of distributed participation in data analysis is already in place.”

Smart Machines: IBM's Watson and the Era of Cognitive Computing


New book from Columbia Business School Publishing: “We are crossing a new frontier in the evolution of computing and entering the era of cognitive systems. The victory of IBM’s Watson on the television quiz show Jeopardy! revealed how scientists and engineers at IBM and elsewhere are pushing the boundaries of science and technology to create machines that sense, learn, reason, and interact with people in new ways to provide insight and advice.
In Smart Machines, John E. Kelly III, director of IBM Research, and Steve Hamm, a writer at IBM and a former business and technology journalist, introduce the fascinating world of “cognitive systems” to general audiences and provide a window into the future of computing. Cognitive systems promise to penetrate complexity and assist people and organizations in better decision making. They can help doctors evaluate and treat patients, augment the ways we see, anticipate major weather events, and contribute to smarter urban planning. Kelly and Hamm’s comprehensive perspective describes this technology inside and out and explains how it will help us conquer the harnessing and understanding of “big data,” one of the major computing challenges facing businesses and governments in the coming decades. Absorbing and impassioned, their book will inspire governments, academics, and the global tech industry to work together to power this exciting wave in innovation.”
See also Why cognitive systems?

And Data for All: On the Validity and Usefulness of Open Government Data


Paper presented at the 13th International Conference on Knowledge Management and Knowledge Technologies: “Open Government Data (OGD) stands for a relatively young trend to make data that is collected and maintained by state authorities available for the public. Although various Austrian OGD initiatives have been started in the last few years, less is known about the validity and the usefulness of the data offered. Based on the data-set on Vienna’s stock of trees, we address two questions in this paper. First of all, we examine the quality of the data by validating it according to knowledge from a related discipline. It shows that the data-set we used correlates with findings from meteorology. Then, we explore the usefulness and exploitability of OGD by describing a concrete scenario in which this data-set can be supportive for citizens in their everyday life and by discussing further application areas in which OGD can be beneficial for different stakeholders and even commercially used.”

Choose Your Own Route on Finland's Algorithm-Driven Public Bus


Brian Merchant at Motherboard: “Technology should probably be transforming public transit a lot faster than it is. Yes, apps like Hopstop have made finding stops easier and I’ve started riding the bus in unfamiliar parts of town a bit more often thanks to Google Maps’ route info. But these are relatively small steps, and it’s all limited to making scheduling information more widely available. Where’s the innovation on the other side? Where’s the Uber-like interactivity, the bus that comes to you after a tap on the iPhone?
In Finland, actually. The Kutsuplus is Helsinki’s groundbreaking mass transit hybrid program that lets riders choose their own routes, pay for fares on their phones, and summon their own buses. It’s a pretty interesting concept. With a ten minute lead time, you summon a Kutsuplus bus to a stop using the official app, just as you’d call a livery cab on Uber. Each minibus in the fleet seats at least nine people, and there’s room for baby carriages and bikes.
You can call your own private Kutsuplus, but if you share the ride, you share the costs—it’s about half the price of a cab fare, and a dollar or two more expensive than old school bus transit. You can then pick your own stop, also using the app.
The interesting part is the scheduling, which is entirely automated. If you’re sharing the ride, an algorithm determines the most direct route, and you only get charged as though you were riding solo. You can pay with a Kutsuplus wallet on the app, or, eventually, bill the charge to your phone bill.”

NEW Publication: “Reimagining Governance in Practice: Benchmarking British Columbia’s Citizen Engagement Efforts”


Over the last few years, the Government of British Columbia (BC), Canada has initiated a variety of practices and policies aimed at providing more legitimate and effective governance. Leveraging advances in technology, the BC Government has focused on changing how it engages with its citizens with the goal of optimizing the way it seeks input and develops and implements policy. The efforts are part of a broader trend among a wide variety of democratic governments to re-imagine public service and governance.
At the beginning of 2013, BC’s Ministry of Citizens’ Services and Open Government, now the Ministry of Technology, Innovation and Citizens’ Services, partnered with the GovLab to produce “Reimagining Governance in Practice: Benchmarking British Columbia’s Citizen Engagement Efforts.” The GovLab’s May 2013 report, made public today, makes clear that BC’s current practices to create a more open government, leverage citizen engagement to inform policy decisions, create new innovations, and provide improved public monitoring (though in many cases relatively new) are consistently among the strongest examples at either the provincial or national level.
According to Stefaan Verhulst, Chief of Research at the GovLab: “Our benchmarking study found that British Columbia’s various initiatives and experiments to create a more open and participatory governance culture have made it a leader in how to re-imagine governance. Leadership, along with the elimination of imperatives that may limit further experimentation, will be critical moving forward. And perhaps even more important, as with all initiatives to re-imagine governance worldwide, much more evaluation of what works, and why, will be needed to keep strengthening the value proposition behind the new practices and policies and provide proof-of-concept.”
See also the GovLab Blog.

The Value of Personal Data


The Digital Enlightenment Yearbook 2013 is dedicated this year to Personal Data:  “The value of personal data has traditionally been understood in ethical terms as a safeguard for personality rights such as human dignity and privacy. However, we have entered an era where personal data are mined, traded and monetized in the process of creating added value – often in terms of free services including efficient search, support for social networking and personalized communications. This volume investigates whether the economic value of personal data can be realized without compromising privacy, fairness and contextual integrity. It brings scholars and scientists from the disciplines of computer science, law and social science together with policymakers, engineers and entrepreneurs with practical experience of implementing personal data management.
The resulting collection will be of interest to anyone concerned about privacy in our digital age, especially those working in the field of personal information management, whether academics, policymakers, or those working in the private sector.”

Using Big Data to Ask Big Questions


Chase Davis in the SOURCE: “First, let’s dispense with the buzzwords. Big Data isn’t what you think it is: Every federal campaign contribution over the last 30-plus years amounts to several tens of millions of records. That’s not Big. Neither is a dataset of 50 million Medicare records. Or even 260 gigabytes of files related to offshore tax havens—at least not when Google counts its data in exabytes. No, the stuff we analyze in pursuit of journalism and app-building is downright tiny by comparison.
But you know what? That’s ok. Because while super-smart Silicon Valley PhDs are busy helping Facebook crunch through petabytes of user data, they’re also throwing off intellectual exhaust that we can benefit from in the journalism and civic data communities. Most notably: the ability to ask Big Questions.
Most of us who analyze public data for fun and profit are familiar with small questions. They’re focused, incisive, and often have the kind of black-and-white, definitive answers that end up in news stories: How much money did Barack Obama raise in 2012? Is the murder rate in my town going up or down?
Big Questions, on the other hand, are speculative, exploratory, and systemic. As the name implies, they are also answered at scale: Rather than distilling a small slice of a dataset into a concrete answer, Big Questions look at entire datasets and reveal small questions you wouldn’t have thought to ask.
Can we track individual campaign donor behavior over decades, and what does that tell us about their influence in politics? Which neighborhoods in my city are experiencing spikes in crime this week, and are police changing patrols accordingly?
Or, by way of example, how often do interest groups propose cookie-cutter bills in state legislatures?

Looking at Legislation

Even if you don’t follow politics, you probably won’t be shocked to learn that lawmakers don’t always write their own bills. In fact, interest groups sometimes write them word-for-word.
Sometimes those groups even try to push their bills in multiple states. The conservative American Legislative Exchange Council has gotten some press, but liberal groups, social and business interests, and even sororities and fraternities have done it too.
On its face, something about elected officials signing their names to cookie-cutter bills runs head-first against people’s ideal of deliberative Democracy—hence, it tends to make news. Those can be great stories, but they’re often limited in scope to a particular bill, politician, or interest group. They’re based on small questions.
Data science lets us expand our scope. Rather than focusing on one bill, or one interest group, or one state, why not ask: How many model bills were introduced in all 50 states, period, by anyone, during the last legislative session? No matter what they’re about. No matter who introduced them. No matter where they were introduced.
Now that’s a Big Question. And with some basic data science, it’s not particularly hard to answer—at least at a superficial level.

Analyze All the Things!

Just for kicks, I tried building a system to answer this question earlier this year. It was intended as an example, so I tried to choose methods that would make intuitive sense. But it also makes liberal use of techniques applied often to Big Data analysis: k-means clustering, matrices, graphs, and the like.
If you want to follow along, the code is here….
To make exploration a little easier, my code represents similar bills in graph space, shown at the top of this article. Each dot (known as a node) represents a bill. And a line connecting two bills (known as an edge) means they were sufficiently similar, according to my criteria (a cosine similarity of 0.75 or above). Thrown into a visualization software like Gephi, it’s easy to click around the clusters and see what pops out. So what do we find?
There are 375 clusters in total. Because of the limitations of our data, many of them represent vague, subject-specific bills that just happen to have similar titles even though the legislation itself is probably very different (think things like “Budget Bill” and “Campaign Finance Reform”). This is where having full bill text would come in handy.
But mixed in with those bills are a handful of interesting nuggets. Several bills that appear to be modeled after legislation by the National Conference of Insurance Legislators appear in multiple states, among them: a bill related to limited lines travel insurance; another related to unclaimed insurance benefits; and one related to certificates of insurance.”
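The similarity graph described in the excerpt can be sketched in a few lines. This is not Davis's code: the bag-of-words vectors, the grouping by connected components (swapped in here for the k-means step he mentions), the function names and the sample titles are all assumptions made for illustration. Only the idea of linking two bills when their cosine similarity is 0.75 or above comes from the article.

```python
import math
import re
from collections import Counter
from itertools import combinations


def vectorize(title):
    """Bag-of-words counts for one bill title (lowercased, letters only)."""
    return Counter(re.findall(r"[a-z]+", title.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_clusters(titles, threshold=0.75):
    """Link bills whose similarity meets the threshold, then return the
    connected components that contain more than one bill."""
    vectors = [vectorize(t) for t in titles]
    neighbors = {i: set() for i in range(len(titles))}
    for i, j in combinations(range(len(titles)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            neighbors[i].add(j)  # an edge between two similar bills
            neighbors[j].add(i)

    seen, clusters = set(), []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        if len(component) > 1:
            clusters.append(sorted(component))
    return clusters


# Hypothetical titles, invented for this example.
titles = [
    "An act relating to limited lines travel insurance",
    "An act relating to limited lines travel insurance producers",
    "Budget bill",
    "An act concerning certificates of insurance",
]
print(similarity_clusters(titles))
```

Here only the first two titles are similar enough to be linked, so they form the single cluster. Working from titles alone also reproduces the weakness the author notes: generic titles can match even when the underlying legislation differs.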

Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo


New paper by Michael J. Madison: “The knowledge commons research framework is applied to a case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis via the Internet. In the second place Galaxy Zoo is a highly successful example of peer production, sometimes known colloquially as crowdsourcing, by which data are gathered, supplied, and/or analyzed by very large numbers of anonymous and pseudonymous contributors to an enterprise that is centrally coordinated or managed. In the third place Galaxy Zoo is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies. This chapter synthesizes these three perspectives on Galaxy Zoo via the knowledge commons framework.”