From Faith-Based to Evidence-Based: The Open Data 500 and Understanding How Open Data Helps the American Economy


Beth Noveck in Forbes: “Public funds have, after all, paid for their collection, and the law says that federal government data are not protected by copyright. By the end of 2009, the US and the UK had the only two open data one-stop websites where agencies could post and citizens could find open data. Now there are over 300 such portals for government data around the world with over 1 million available datasets. This kind of Open Data — including weather, safety and public health information as well as information about government spending — can serve the country by increasing government efficiency, shedding light on regulated industries, and driving innovation and job creation.

It’s becoming clear that open data has the potential to improve people’s lives. With huge advances in data science, we can take this data and turn it into tools that help people choose a safer hospital, pick a better place to live, improve the performance of their farm or business by having better climate models, and know more about the companies with whom they are doing business. Done right, people can even contribute data back, giving everyone a better understanding, for example of nuclear contamination in post-Fukushima Japan or incidences of price gouging in America’s inner cities.

The promise of open data is limitless (see the GovLab index for stats on open data). But it’s important to back up our faith with real evidence of what works. Last September the GovLab began the Open Data 500 project, funded by the John S. and James L. Knight Foundation, to study the economic value of government Open Data extensively and rigorously. A recent McKinsey study pegged the annual global value of Open Data (including free data from sources other than government) at $3 trillion or more. We’re digging in and talking to those companies that use Open Data as a key part of their business model. We want to understand whether and how open data is contributing to the creation of new jobs, the development of scientific and other innovations, and adding to the economy. We also want to know what government can do better to help industries that want high quality, reliable, up-to-date information that government can supply. Of those 1 million datasets, for example, 96% are not updated on a regular basis.

The GovLab just published an initial working list of 500 American companies that we believe to be using open government data extensively.  We’ve also posted in-depth profiles of 50 of them — a sample of the kind of information that will be available when the first annual Open Data 500 study is published in early 2014. We are also starting a similar study for the UK and Europe.

Even at this early stage, we are learning that Open Data is a valuable resource. As my colleague Joel Gurin, author of Open Data Now: the Secret to Hot Start-Ups, Smart Investing, Savvy Marketing and Fast Innovation, who directs the project, put it, “Open Data is a versatile and powerful economic driver in the U.S. for new and existing businesses around the country, in a variety of ways, and across many sectors. The diversity of these companies in the kinds of data they use, the way they use it, their locations, and their business models is one of the most striking things about our findings so far.” Companies are paradoxically building value-added businesses on top of public data that anyone can access for free….”


The Emergence Of The Connected City


Glen Martin at Forbes: “If the modern city is a symbol for randomness — even chaos — the city of the near future is shaping up along opposite metaphorical lines. The urban environment is evolving rapidly, and a model is emerging that is more efficient, more functional, more — connected, in a word.
This will affect how we work, commute, and spend our leisure time. It may well influence how we relate to one another, and how we think about the world. Certainly, our lives will be augmented: better public transportation systems, quicker responses from police and fire services, more efficient energy consumption. But there could also be dystopian impacts: dwindling privacy and imperiled personal data. We could even lose some of the ferment that makes large cities such compelling places to live; chaos is stressful, but it can also be stimulating.
It will come as no surprise that converging digital technologies are driving cities toward connectedness. When conjoined, ISM band transmitters, sensors, and smart phone apps form networks that can make cities pretty darn smart — and maybe more hygienic. This latter possibility, at least, is proposed by Samrat Saha of the DCI Marketing Group in Milwaukee. Saha suggests “crowdsourcing” municipal trash pick-up via BLE modules, proximity sensors and custom mobile device apps.
“My idea is a bit tongue in cheek, but I think it shows how we can gain real efficiencies in urban settings by gathering information and relaying it via the Cloud,” Saha says. “First, you deploy sensors in garbage cans. Each can provides a rough estimate of its fill level and communicates that to a BLE 112 Module.”
As pedestrians who have downloaded custom “garbage can” apps on their BLE-capable iPhone or Android devices pass by, continues Saha, the information is collected from the module and relayed to a Cloud-hosted service for action — garbage pick-up for brimming cans, in other words. The process will also allow planners to optimize trash can placement, redeploying receptacles from areas where need is minimal to more garbage-rich environs….
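The relay Saha describes can be sketched from the Cloud side. This is a minimal illustration, not his design: the `Reading` schema, the function names, and the 0.8 fill threshold are all assumptions made for the example. It keeps the most recent report for each can and lists the cans that look full enough to warrant a pick-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One fill-level report relayed by a passing phone (hypothetical schema)."""
    can_id: str
    fill_fraction: float  # 0.0 = empty, 1.0 = full (rough sensor estimate)
    timestamp: int        # seconds since an arbitrary epoch


def latest_readings(readings):
    """Keep only the most recent reading for each can."""
    latest = {}
    for r in readings:
        current = latest.get(r.can_id)
        if current is None or r.timestamp > current.timestamp:
            latest[r.can_id] = r
    return latest


def cans_needing_pickup(readings, threshold=0.8):
    """Return sorted IDs of cans whose latest fill estimate meets the threshold."""
    latest = latest_readings(readings)
    return sorted(cid for cid, r in latest.items() if r.fill_fraction >= threshold)


# Made-up reports, as several phones might relay them over a day.
readings = [
    Reading("can-A", 0.40, 100),
    Reading("can-B", 0.85, 110),
    Reading("can-A", 0.90, 200),  # newer report supersedes the earlier one
    Reading("can-C", 0.95, 120),
    Reading("can-C", 0.10, 300),  # emptied since the earlier report
]
print(cans_needing_pickup(readings))
```

Because phones pass by at unpredictable times, the same can may be reported many times or out of order; keeping only the newest timestamp per can is one simple way to cope with that.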
Garbage can connectivity has larger implications than just, well, garbage. Brett Goldstein, the former Chief Data and Information Officer for the City of Chicago and a current lecturer at the University of Chicago, says city officials found clear patterns between damaged or missing garbage cans and rat problems.
“We found areas that showed an abnormal increase in missing or broken receptacles started getting rat outbreaks around seven days later,” Goldstein said. “That’s very valuable information. If you have sensors on enough garbage cans, you could get a temporal leading edge, allowing a response before there’s a problem. In urban planning, you want to emphasize prevention, not reaction.”
Such Cloud-based app-centric systems aren’t suited only for trash receptacles, of course. Companies such as Johnson Controls are now marketing apps for smart buildings — the base component for smart cities. (Johnson’s Metasys management system, for example, feeds data to its app-based Panoptix Platform to maximize energy efficiency in buildings.) In short, instrumented cities already are emerging. Smart nodes — including augmented buildings, utilities and public service systems — are establishing connections with one another, like axon-linked neurons.
But Goldstein, who was best known in Chicago for putting tremendous quantities of the city’s data online for public access, emphasizes instrumented cities are still in their infancy, and that their successful development will depend on how well we “parent” them.
“I hesitate to refer to ‘Big Data,’ because I think it’s a terribly overused term,” Goldstein said. “But the fact remains that we can now capture huge amounts of urban data. So, to me, the biggest challenge is transitioning the fields — merging public policy with computer science into functional networks.”…”

Design in public and social innovation


New paper by Geoff Mulgan (Nesta): “What’s going right and what’s going wrong? Is design a key to more efficient and effective public services, or a costly luxury, good for conferences and consultants but not for the public?
This paper looks at the elements of the design method; the strengths of current models; some of their weaknesses and the common criticisms made of them; and what might be the way forward.

Contents:

  • Strengths of the design method in social innovation and public service
  • Social design tools table
  • Weaknesses of design projects and methods
  • The challenge
  • Design in the context of innovation”

A World Of Wikipedia And Bitcoin: Is That The Promise Of Open Collaboration?


Science 2.0: “Open Collaboration, defined in a new paper as “any system of innovation or production that relies on goal-oriented yet loosely coordinated participants who interact to create a product (or service) of economic value, which they make available to contributors and non-contributors alike” brought the world Wikipedia, Bitcoin and, yes, even Science 2.0.
But what does that mean, really? That’s the first problem with vague terms in an open environment. It is anything people want it to be and sometimes what people want it to be is money, but hidden behind a guise of public weal.
TED’s lesser cousin TEDx is a result of open collaboration but there is no doubt it has successfully leveraged the marketing of TED to sell seats in auditoriums, just as it was designed to do. Generally, Open Collaboration now is less like its early days, where a group of like-minded people got together to create an Open Source tool, and more like corporations. Only they avoid the label: they are not quite non-profits and not quite corporations.
And because they are neither, they can operate free of the cultural stigma. Despite efforts to claim that Wikipedia is a hotbed of misogyny and blocks out minorities, the online encyclopedia has endured just fine. Their defense is a simple one; they have no idea what gender or race or religion anyone is and anyone can contribute – it is a true open collaboration. Open collaborations are goal-oriented and lack the infrastructure to obey demands that they become about social justice, so the environments can be less touchy-feely than corporations and avoid the social authoritarianism of academia.
Many open collaborations perform well even in ‘harsh’ environments, where some minorities are underrepresented and diversity is lacking or when products by different groups rival one another. It’s a real puzzle for sociologists. The authors conclude that open collaboration is likely to expand into new domains, displacing traditional organizations, because it is so mission-oriented. Business executives and civic leaders should take heed – the future could look a lot more like the 1940s.”
See also: Sheen S. Levine, Michael J. Prietula, ‘Open Collaboration for Innovation: Principles and Performance’, Organization Science, December 30, 2013. DOI: 10.1287/orsc.2013.0872

When Tech Culture And Urbanism Collide


John Tolva: “…We can build upon the success of the work being done at the intersection of technology and urban design, right now.

For one, the whole realm of social enterprise — for-profit startups that seek to solve real social problems — has a huge overlap with urban issues. Impact Engine in Chicago, for instance, is an accelerator squarely focused on meaningful change and profitable businesses. One of their companies, Civic Artworks, has set as its goal rebalancing the community planning process.

The Code for America Accelerator and Tumml, both located in San Francisco, morph the concept of social innovation into civic/urban innovation. The companies nurtured by CfA and Tumml are filled with technologists and urbanists working together to create profitable businesses. Like WorkHands, a kind of LinkedIn for blue collar trades. Would something like this work outside a city? Maybe. Are its effects outsized and scale-ready in a city? Absolutely. That’s the opportunity in urban innovation.

Scale is what powers the sharing economy and it thrives because of the density and proximity of cities. In fact, shared resources at critical density is one of the only good definitions for what a city is. It’s natural that entrepreneurs have overlaid technology on this basic fact of urban life to amplify its effects. Would TaskRabbit, Hailo or LiquidSpace exist in suburbia? Probably, but their effects would be minuscule and investors would get restless. The city in this regard is the platform upon which sharing economy companies prosper. More importantly, companies like this change the way the city is used. It’s not urban planning, but it is urban (re)design and it makes a difference.

A twist that many in the tech sector who complain about cities often miss is that change in a city is not the same thing as change in city government. Obviously they are deeply intertwined; change is mighty hard when it is done at cross-purposes with government leadership. But it happens all the time. Non-government actors — foundations, non-profits, architecture and urban planning firms, real estate developers, construction companies — contribute massively to the shape and health of our cities.

Often this contribution is powered through policies of open data publication by municipal governments. Open data is the raw material of a city, the vital signs of what has happened there, what is happening right now, and the deep pool of patterns for what might happen next.

Tech entrepreneurs would do well to look at the organizations and companies capitalizing on this data as the real change agents, not government itself. Even the data in many cases is generated outside government. Citizens often do the most interesting data-gathering, with tools like LocalData. The most exciting thing happening at the intersection of technology and cities today — what really makes them “smart” — is what is happening at the periphery of city government. It’s easy to belly-ache about government and certainly there are administrations that do not make data public (or shut it down), but tech companies who are truly interested in city change should know that there are plenty of examples of how to start up and do it.

And yet, the somewhat staid world of architecture and urban-scale design presents the most opportunity to a tech community interested in real urban change. While technology obviously plays a role in urban planning — 3D visual design tools like Revit and mapping services like ArcGIS are foundational for all modern firms — data analytics as a serious input to design matters has only been used in specialized (mostly energy efficiency) scenarios. Where are the predictive analytics, the holistic models, the software-as-a-service providers for the brave new world of urban informatics and The Internet of Things? Technologists, it’s our move.

Something’s amiss when some city governments — rarely the vanguard in technological innovation — have more sophisticated tools for data-driven decision-making than the private sector firms who design the city. But some understand the opportunity. Vannevar Technology is working on it, as is Synthicity. There’s plenty of room for the most positive aspects of tech culture to remake the profession of urban planning itself. (Look to NYU’s Center for Urban Science and Progress and the University of Chicago’s Urban Center for Computation and Data for leadership in this space.)…”

Brainlike Computers, Learning From Experience


The New York Times: “Computers have entered the age when they are able to learn from their own mistakes, a development that is about to turn the digital world on its head.

The first commercial version of the new kind of computer chip is scheduled to be released in 2014. Not only can it automate tasks that now require painstaking programming — for example, moving a robot’s arm smoothly and efficiently — but it can also sidestep and even tolerate errors, potentially making the term “computer crash” obsolete.

The new computing approach, already in use by some large technology companies, is based on the biological nervous system, specifically on how neurons react to stimuli and connect with other neurons to interpret information. It allows computers to absorb new information while carrying out a task, and adjust what they do based on the changing signals.

In coming years, the approach will make possible a new generation of artificial intelligence systems that will perform some functions that humans do with ease: see, speak, listen, navigate, manipulate and control. That can hold enormous consequences for tasks like facial and speech recognition, navigation and planning, which are still in elementary stages and rely heavily on human programming.

Designers say the computing style can clear the way for robots that can safely walk and drive in the physical world, though a thinking or conscious computer, a staple of science fiction, is still far off on the digital horizon.

“We’re moving from engineering computing systems to something that has many of the characteristics of biological computing,” said Larry Smarr, an astrophysicist who directs the California Institute for Telecommunications and Information Technology, one of many research centers devoted to developing these new kinds of computer circuits.

Conventional computers are limited by what they have been programmed to do. Computer vision systems, for example, only “recognize” objects that can be identified by the statistics-oriented algorithms programmed into them. An algorithm is like a recipe, a set of step-by-step instructions to perform a calculation.

But last year, Google researchers were able to get a machine-learning algorithm, known as a neural network, to perform an identification task without supervision. The network scanned a database of 10 million images, and in doing so trained itself to recognize cats.

In June, the company said it had used those neural network techniques to develop a new search service to help customers find specific photos more accurately.

The new approach, used in both hardware and software, is being driven by the explosion of scientific knowledge about the brain. Kwabena Boahen, a computer scientist who leads Stanford’s Brains in Silicon research program, said that is also its limitation, as scientists are far from fully understanding how brains function.”

Crowdsourcing drug discovery: Antitumour compound identified


David Bradley in Spectroscopy.now: “American researchers have used “crowdsourcing” – the cooperation of a large number of interested non-scientists via the internet – to help them identify a new fungus. The species contains unusual metabolites, which were isolated and characterized with the help of vibrational circular dichroism (VCD). One compound reveals itself to have potential antitumour activity.
So far, a mere 7 percent of the more than 1.5 million species of fungi thought to exist have been identified and an even smaller fraction of these have been the subject of research seeking bioactive natural products. …Robert Cichewicz of the University of Oklahoma, USA, and his colleagues hoped to remedy this situation by working with a collection of several thousand fungal isolates from three regions: Arctic Alaska, tropical Hawaii, and subtropical to semiarid Oklahoma. Collaborator Susan Mooberry of the University of Texas at San Antonio carried out biological assays on many fungal isolates looking for antitumor activity among the metabolites in Cichewicz’s collection. A number of interesting substances were identified…
However, the researchers realized quickly enough that the efforts of a single research team were inadequate if samples representing the immense diversity of the thousands of fungi they hoped to test were to be obtained and tested. They thus turned to the help of citizen scientists in a “crowdsourcing” initiative. In this approach, lay people with an interest in science, and even fellow scientists in other fields, were recruited to collect and submit soil from their gardens.
As the samples began to arrive, the team quickly found among them a previously unknown fungal strain – a Tolypocladium species – growing in a soil sample from Alaska. Colleague Andrew Miller of the University of Illinois did the identification of this new fungus, which was found to be highly responsive to making new compounds based on changes in its laboratory growth conditions. Moreover, extraction of the active chemicals from the isolate revealed a unique metabolite which was shown to have significant antitumour activity in laboratory tests. The team suggests that this novel substance may represent a valuable new approach to cancer treatment because it precludes certain biochemical mechanisms that lead to the emergence of drug resistance in cancer with conventional drugs…
The researchers point out the essential roles that citizen scientists can play. “Many of the groundbreaking discoveries, theories, and applied research during the last two centuries were made by scientists operating from their own homes,” Cichewicz says. “Although much has changed, the idea that citizen scientists can still participate in research is a powerful means for reinvigorating the public’s interest in science and making important discoveries,” he adds.”

The Postmodernity of Big Data


Essay in the New Inquiry: “Big Data fascinates because its presence has always been with us in nature. Each tree, drop of rain, and the path of each grain of sand, both responds to and creates millions of data points, even on a short journey. Nature is the original algorithm, the most efficient and powerful. Mathematicians since the ancients have looked to it for inspiration; techno-capitalists now look to unlock its mysteries for private gain. Playing God has become all the more brisk and profitable thanks to cloud computing.
But beyond economic motivations for Big Data’s rise, are there also epistemological ones? Has Big Data come to try to fill the vacuum of certainty left by postmodernism? Does data science address the insecurities of postmodern thought?
It turns out that trying to explain Big Data is like trying to explain postmodernism. Neither can be summarized effectively in a phrase, despite their champions’ efforts. Broad epistemological developments are compressed into cursory, ex post facto descriptions. Attempts to define Big Data, such as IBM’s marketing copy, which promises “insights gleaned” from “enterprise data warehouses that implement massively parallel processing,” “real-time scalability” and “parsing structured and unstructured sources,” focus on its implementation at the expense of its substance, decontextualizing it entirely. Similarly, definitions of postmodernism, like art critic Thomas McEvilley’s claim that it is “a renunciation that involves recognition of the relativity of the self—of one’s habit systems, their tininess, silliness, and arbitrariness” are accurate but abstract to the point of vagueness….
Big Data might come to be understood as Big Postmodernism: the period in which the influx of unstructured, non-teleological, non-narrative inputs ceased to destabilize the existing order but was instead finally mastered, processed by a sufficiently complex, distributed, and pluralized algorithmic regime. If Big Data has a skepticism built in, how this is different from the skepticism of postmodernism is perhaps impossible to yet comprehend.”

Big Data Becomes a Mirror


Book Review of ‘Uncharted,’ by Erez Aiden and Jean-Baptiste Michel in the New York Times: “Why do English speakers say “drove” rather than “drived”?

As graduate students at the Harvard Program for Evolutionary Dynamics about eight years ago, Erez Aiden and Jean-Baptiste Michel pondered the matter and decided that something like natural selection might be at work. In English, the “-ed” past-tense ending of Proto-Germanic, like a superior life form, drove out the Proto-Indo-European system of indicating tenses by vowel changes. Only the small class of verbs we know as irregular managed to resist.

To test this evolutionary premise, Mr. Aiden and Mr. Michel wound up inventing something they call culturomics, the use of huge amounts of digital information to track changes in language, culture and history. Their quest is the subject of “Uncharted: Big Data as a Lens on Human Culture,” an entertaining tour of the authors’ big-data adventure, whose implications they wildly oversell….

Invigorated by the great verb chase, Mr. Aiden and Mr. Michel went hunting for bigger game. Given a large enough storehouse of words and a fine filter, would it be possible to see cultural change at the micro level, to follow minute fluctuations in human thought processes and activities? Tiny factoids, multiplied endlessly, might assume imposing dimensions.

By chance, Google Books, the megaproject to digitize every page of every book ever printed — all 130 million of them — was starting to roll just as the authors were looking for their next target of inquiry.

Meetings were held, deals were struck and the authors got to it. In 2010, working with Google, they perfected the Ngram Viewer, which takes its name from the computer-science term for a word or phrase. This “robot historian,” as they call it, can search the 30 million volumes already digitized by Google Books and instantly generate a usage-frequency timeline for any word, phrase, date or name, a sort of stock-market graph illustrating the ups and downs of cultural shares over time.
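To make the idea of a usage-frequency timeline concrete, here is a toy sketch. It is not Google's implementation: the function names, the whitespace tokenizer, and the two-year corpus are assumptions for illustration only. For each year it reports what share of that year's n-grams match a given word or phrase.

```python
from collections import Counter


def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def frequency_timeline(corpus, phrase):
    """Map each year to the share of that year's n-grams equal to `phrase`.

    `corpus` is an iterable of (year, text) pairs; n is the number of
    words in `phrase`, so a single word is counted against all unigrams.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = text.lower().split()
        for gram in ngrams(tokens, n):
            totals[year] += 1
            if gram == target:
                hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# A made-up corpus, far smaller than any real collection of books.
corpus = [
    (1900, "the horse drove the cart"),
    (2000, "she drove and he drove too"),
]
print(frequency_timeline(corpus, "drove"))
print(frequency_timeline(corpus, "he drove"))
```

Dividing by the year's total rather than plotting raw counts matters because far more text survives from some years than others; without that normalization nearly every word would appear to rise over time.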

Mr. Aiden, now director of the Center for Genome Architecture at Rice University, and Mr. Michel, who went on to start the data-science company Quantified Labs, play the Ngram Viewer (books.google.com/ngrams) like a Wurlitzer…

The Ngram Viewer delivers the what and the when but not the why. Take the case of specific years. All years get attention as they approach, peak when they arrive, then taper off as succeeding years occupy the attention of the public. Mentions of the year 1872 had declined by half in 1896, a slow fade that took 23 years. The year 1973 completed the same trajectory in less than half the time.
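The "declined by half" measurement can be sketched as a small calculation over such a timeline. The function name and the numbers below are invented for illustration and are not the authors' data or method: the code finds the peak year in a series of mention frequencies, then the first later year in which the frequency has fallen to half the peak or less.

```python
def years_to_half(series):
    """Return (peak_year, half_year, elapsed) for a {year: frequency} series.

    half_year is the first year after the peak whose frequency is at or
    below half the peak value. Returns None if the series never falls
    that far.
    """
    peak_year = max(series, key=series.get)
    half_level = series[peak_year] / 2
    for year in sorted(y for y in series if y > peak_year):
        if series[year] <= half_level:
            return peak_year, year, year - peak_year
    return None


# Synthetic frequencies (arbitrary units), not taken from the Ngram Viewer.
mentions = {1950: 2.0, 1951: 8.0, 1952: 6.0, 1953: 5.0, 1954: 3.9, 1955: 3.0}
print(years_to_half(mentions))
```

Comparing the elapsed figure for different years is what yields the observation that more recent years fade faster; the calculation describes the pattern without explaining it.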

“What caused that change?” the authors ask. “We don’t know. For now, all we have are the naked correlations: what we uncover when we look at collective memory through the digital lens of our new scope.” Someone else is going to have to do the heavy lifting.”

Web Science: Understanding the Emergence of Macro-Level Features on the World Wide Web


Monograph by Kieron O’Hara, Noshir S. Contractor, Wendy Hall, James A. Hendler and Nigel Shadbolt in Foundations and Trends in Web Science: “Web Science considers the development of Web Science since the publication of ‘A Framework for Web Science’ (Berners-Lee et al., 2006). This monograph argues that the requirement for understanding should ideally be accompanied by some measure of control, which makes Web Science crucial in the future provision of tools for managing our interactions, our politics, our economics, our entertainment, and – not least – our knowledge and data sharing…
In this monograph we consider the development of Web Science since the launch of this journal and its inaugural publication ‘A Framework for Web Science’ [44]. The theme of emergence is discussed as the characteristic phenomenon of Web-scale applications, where many unrelated micro-level actions and decisions, uninformed by knowledge about the macro-level, still produce noticeable and coherent effects at the scale of the Web. A model of emergence is mapped onto the multitheoretical multilevel (MTML) model of communication networks explained in [252]. Four specific types of theoretical problem are outlined. First, there is the need to explain local action. Second, the global patterns that form when local actions are repeated at scale have to be detected and understood. Third, those patterns feed back into the local, with intricate and often fleeting causal connections to be traced. Finally, as Web Science is an engineering discipline, issues of control of this feedback must be addressed. The idea of a social machine is introduced, where networked interactions at scale can help to achieve goals for people and social groups in civic society; an important aim of Web Science is to understand how such networks can operate, and how they can control the effects they produce on their own environment.”