The trouble with Big Data? It is called the “recency bias”.


One of the problems with such a rate of information increase is that the present moment will always loom far larger than even the recent past. Imagine looking back over a photo album representing the first 18 years of your life, from birth to adulthood. Let’s say that you have two photos for your first two years. Assuming a rate of information increase matching that of the world’s data, you will have an impressive 2,000 photos representing the years six to eight; 200,000 for the years 10 to 12; and a staggering 200,000,000 for the years 16 to 18. That’s more than three photographs for every single second of those final two years.
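A quick sanity check of that arithmetic, using nothing but the photo counts quoted above (a rough illustrative sketch in Python, not part of the original essay):

```python
# Sanity check of the photo-album analogy, using only the counts quoted above.
photos_final_two_years = 200_000_000
seconds_in_two_years = 2 * 365 * 24 * 3600             # 63,072,000 seconds
print(photos_final_two_years / seconds_in_two_years)   # ~3.17 photos per second

# Share of the whole album taken up by the most recent two years
total_quoted = 2 + 2_000 + 200_000 + 200_000_000       # only the years mentioned above
print(photos_final_two_years / total_quoted)           # ~0.999, i.e. ~99.9% of the album
```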

The moment you start looking backwards to seek the longer view, you have far too much of the recent stuff and far too little of the old

This isn’t a perfect analogy with global data, of course. For a start, much of the world’s data increase is due to more sources of information being created by more people, along with far larger and more detailed formats. But the point about proportionality stands. If you were to look back over a record like the one above, or try to analyse it, the more distant past would shrivel into meaningless insignificance. How could it not, with so many times less information available?

Here’s the problem with much of the big data currently being gathered and analysed. The moment you start looking backwards to seek the longer view, you have far too much of the recent stuff and far too little of the old. Short-sightedness is built into the structure, in the form of an overwhelming tendency to over-estimate short-term trends at the expense of history.

To understand why this matters, consider the findings from social science about ‘recency bias’, which describes the tendency to assume that future events will closely resemble recent experience. It’s a version of what is also known as the availability heuristic: the tendency to base your thinking disproportionately on whatever comes most easily to mind. It’s also a universal psychological attribute. If the last few years have seen exceptionally cold summers where you live, for example, you might be tempted to state that summers are getting colder – or that your local climate may be cooling. In fact, you shouldn’t read anything whatsoever into the data. You would need to take a far, far longer view to learn anything meaningful about climate trends. In the short term, you’d be best not speculating at all – but who among us can manage that?
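To see how easily short windows mislead, here is a toy simulation (a sketch only; the trend and noise figures below are invented for illustration and are not real climate data). Even with a genuine warming trend, a large share of five-year windows will appear to show cooling:

```python
# Toy illustration of recency bias: a slow warming trend buried in year-to-year noise.
# Short windows frequently show a *negative* slope even though the true trend is positive.
# All numbers here are invented for illustration, not real climate data.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(150)
true_trend = 0.01                                  # +0.01 degrees per year
temps = true_trend * years + rng.normal(0, 0.3, size=years.size)

def slope(window):
    """Least-squares slope of temperature vs. year over a window."""
    x = np.arange(window.size)
    return np.polyfit(x, window, 1)[0]

for w in (5, 15, 50):
    slopes = [slope(temps[i:i + w]) for i in range(temps.size - w)]
    frac_cooling = np.mean([s < 0 for s in slopes])
    print(f"{w:>2}-year windows that look like cooling: {frac_cooling:.0%}")
# Typical output: short windows "show" cooling a large fraction of the time,
# while 50-year windows almost never do.
```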

Short-term analyses aren’t only invalid – they’re actively unhelpful and misleading

The same tends to be true of most complex phenomena in real life: stock markets, economies, the success or failure of companies, war and peace, relationships, the rise and fall of empires. Short-term analyses aren’t only invalid – they’re actively unhelpful and misleading. Just look at the legions of economists who lined up to pronounce events like the 2008 financial crisis unthinkable right up until they happened. The very notion that valid predictions could be made on that kind of scale was itself part of the problem.

It’s also worth remembering that novelty tends to be a dominant consideration when deciding what data to keep or delete. Out with the old and in with the new: that’s the digital trend in a world where search algorithms are intrinsically biased towards freshness, and where so-called link rot infests everything from Supreme Court decisions to entire social media services. A bias towards the present is structurally engrained in almost all the technology surrounding us, not least thanks to our habit of ditching most of our once-shiny machines after about five years.

What to do? This isn’t just a question of being better at preserving old data – although this wouldn’t be a bad idea, given just how little is currently able to last decades rather than years. More importantly, it’s about determining what is worth preserving in the first place – and what it means to meaningfully cull information in the name of knowledge.

What’s needed is something that I like to think of as “intelligent forgetting”: teaching our tools to become better at letting go of the immediate past in order to keep its larger continuities in view. It’s an act of curation akin to organising a photograph album – albeit with more maths….(More)

The Spanish Town That Runs on Twitter


Mark Scott at the New York Times: “…For the town’s residents, more than half of whom have Twitter accounts, their main way to communicate with local government officials is now the social network. Need to see the local doctor? Send a quick Twitter message to book an appointment. See something suspicious? Let Jun’s policeman know with a tweet.

People in Jun can still use traditional methods, like completing forms at the town hall, to obtain public services. But Mr. Rodríguez Salas said that by running most of Jun’s communications through Twitter, he not only has shaved on average 13 percent, or around $380,000, from the local budget each year since 2011, but he also has created a digital democracy where residents interact online almost daily with town officials.

“Everyone can speak to everyone else, whenever they want,” said Mr. Rodríguez Salas in his office surrounded by Twitter paraphernalia, while sporting a wristband emblazoned with #LoveTwitter. “We are on Twitter because that’s where the people are.”…

By incorporating Twitter into every aspect of daily life — even the local school’s lunch menu is sent out through social media — this Spanish town has become a test bed for how cities may eventually use social networks to offer public services….

Using Twitter has also reduced the need for some jobs. Jun cut its police force by three-quarters, to just one officer, soon after turning to Twitter as its main form of communication when residents began tweeting potential problems directly to the mayor.

“We don’t have one police officer,” Mr. Rodríguez Salas said. “We have 3,500.”

For Justo Ontiveros, Jun’s remaining police officer, those benefits are up close and personal. He now receives up to 20, mostly private, messages from locals daily with concerns ranging from advice on filling out forms to reporting crimes like domestic abuse and speeding.

Mr. Ontiveros said his daily Twitter interactions have given him both greater visibility within the community and a higher level of personal satisfaction, as neighbors now regularly stop him in the street to discuss things that he has posted on Twitter.

“It gives people more power to come and talk to me about their problems,” said Mr. Ontiveros, whose department Twitter account has more than 3,500 followers.

Still, Jun’s reliance on Twitter has not been universally embraced….(More)”

Searching for Someone: From the “Small World Experiment” to the “Red Balloon Challenge,” and beyond


Essay by Manuel Cebrian, Iyad Rahwan, Victoriano Izquierdo, Alex Rutherford, Esteban Moro and Alex (Sandy) Pentland: “Our ability to search social networks for people and information is fundamental to our success. We use our personal connections to look for new job opportunities, to seek advice about what products to buy, to match with romantic partners, to find a good physician, to identify business partners, and so on.

Despite living in a world populated by seven billion people, we are able to navigate our contacts efficiently, only needing a handful of personal introductions before finding the answer to our question, or the person we are seeking. How does this come to be? In folk culture, the answer to this question is that we live in a “small world.” The catch-phrase was coined in 1929 by the visionary author Frigyes Karinthy in his Chain-Links essay, where these ideas are put forward for the first time.

Let me put it this way: Planet Earth has never been as tiny as it is now. It shrunk — relatively speaking of course — due to the quickening pulse of both physical and verbal communication. We never talked about the fact that anyone on Earth, at my or anyone’s will, can now learn in just a few minutes what I think or do, and what I want or what I would like to do. Now we live in fairyland. The only slightly disappointing thing about this land is that it is smaller than the real world has ever been. — Frigyes Karinthy, Chain-Links, 1929

At the time, it was just a dystopian idea reflecting the anxiety of living in an increasingly connected world. But there was no empirical evidence that this was actually the case, and it took almost 30 years to find any.

Six Degrees of Separation

In 1967, legendary psychologist Stanley Milgram conducted a ground-breaking experiment to test this “small world” hypothesis. He started with random individuals in the U.S. Midwest and asked them to send packages to people in Boston, Massachusetts, whose addresses were not given. They could contribute to this “search” only by forwarding the package to individuals they knew on a first-name basis. Milgram expected that successful searches (if any!) would require hundreds of individuals along the chain from the initial sender to the final recipient.

Surprisingly, however, Milgram found that the average path length was somewhere between 5.5 and six individuals, which made social search look astonishingly efficient. Although the experiment drew some methodological criticism, its findings were profound. What it did not answer, however, is why social networks have such short paths in the first place. The answer was not obvious. In fact, there were reasons to suspect that short paths were just a myth: social networks are very cliquish. Your friends’ friends are likely to also be your friends, and so most social paths are short and circular. This “cliquishness” suggests that a search through the social network can easily get “trapped” within our close social community, making social search highly inefficient.

Architectures for Social Search

Again, it took a long time — more than 40 years — before this riddle was solved. In a seminal 1998 paper in Nature, Duncan Watts and Steven Strogatz came up with an elegant mathematical model to explain the existence of these short paths. They started from a social network that is very cliquish, i.e., most of your friends are also friends of one another. In this model, the world is “large,” since the social distance between individuals is very long. However, if we take only a tiny fraction of these connections (say, one out of every hundred links) and rewire them to random individuals in the network, that same world suddenly becomes “small.” These random connections allow individuals to jump to faraway communities very quickly — using them as social network highways — thus reducing the average path length dramatically.
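That collapse in path length is easy to reproduce numerically. The sketch below is not the authors’ code; it simply uses the networkx library’s built-in Watts–Strogatz generator, with illustrative sizes and rewiring probabilities:

```python
# Illustration of the Watts-Strogatz result: rewiring a tiny fraction of links
# in a cliquish ring lattice dramatically shortens average path lengths.
import networkx as nx

n, k = 1000, 10   # 1,000 nodes, each initially linked to its 10 nearest neighbours

for p in (0.0, 0.01, 0.1):                      # fraction of links rewired at random
    g = nx.watts_strogatz_graph(n, k, p, seed=42)
    if nx.is_connected(g):
        length = nx.average_shortest_path_length(g)
        clustering = nx.average_clustering(g)
        print(f"p={p:<5} avg path length={length:6.2f}  clustering={clustering:.2f}")
# With p=0 the "world" is large (average paths of roughly 50 hops); rewiring just
# 1% of links shrinks it dramatically while leaving the network almost as
# clustered as before.
```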

While this theoretical insight suggests that social networks are searchable due to the existence of short paths, it does not yet say much about the “procedure” that people use to find these paths. There is no reason, a priori, that we should know how to find these short chains, especially since there are many chains, and no individuals have knowledge of the network structure beyond their immediate communities. People do not know how the friends of their friends are connected among themselves, and therefore it is not obvious that they would have a good way of navigating their social network while searching.

Soon after Watts and Strogatz came up with this model at Cornell University, a computer scientist across campus, Jon Kleinberg, set out to investigate whether such “small world” networks are searchable. In a landmark Nature article, “Navigation in a Small World,” published in 2000, he showed that social search is easy without global knowledge of the network, but only for a very specific value of the probability of long-range connectivity (i.e., the probability that we know somebody far removed from us, socially, in the social network). With the advent of publicly available social media datasets such as LiveJournal, David Liben-Nowell and colleagues showed that real-world social networks do indeed have these particular long-range ties. It appears the social architecture of the world we inhabit is remarkably fine-tuned for searchability….
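Kleinberg’s notion of navigability can likewise be sketched in a few lines. The example below is not the original experiment; it uses networkx’s navigable_small_world_graph with illustrative parameters and routes messages greedily, so that each node relies only on local knowledge of its grid neighbours and a single long-range contact:

```python
# Sketch of decentralized ("greedy") search on a Kleinberg-style navigable grid:
# each node knows only its own grid neighbours plus one long-range contact, and
# forwards the message to whichever out-neighbour is closest to the target.
# Grid size and trial count are illustrative choices, not values from the paper.
import random
import networkx as nx

def lattice_distance(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_hops(g, source, target):
    """Hops taken when every node forwards to its out-neighbour nearest the target."""
    current, hops = source, 0
    while current != target:
        current = min(g.successors(current), key=lambda v: lattice_distance(v, target))
        hops += 1
    return hops

n = 40   # 40 x 40 grid; long-range links follow Kleinberg's inverse-square law (r=2)
g = nx.navigable_small_world_graph(n, p=1, q=1, r=2, dim=2, seed=7)

random.seed(7)
nodes = list(g.nodes())
pairs = [(random.choice(nodes), random.choice(nodes)) for _ in range(100)]
greedy = [greedy_hops(g, s, t) for s, t in pairs]
direct = [lattice_distance(s, t) for s, t in pairs]
print(f"average lattice distance: {sum(direct) / len(direct):.1f}")
print(f"average greedy hops:      {sum(greedy) / len(greedy):.1f}")
# Greedy search never does worse than walking the grid (every hop gets strictly
# closer to the target), and the long-range contacts typically shorten the route;
# Kleinberg showed this local strategy is efficient precisely when the link
# exponent matches the grid dimension.
```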

The Tragedy of the Crowdsourcers

Some recent efforts have been made to try and disincentivize sabotage. If verification is also rewarded along the recruitment tree, then the individuals who recruited the saboteurs would have a clear incentive to verify, halt, and punish the saboteurs. This theoretical solution is yet to be tested in practice, and it is conjectured that a coalition of saboteurs, where saboteurs recruit other saboteurs pretending to “vet” them, would make recursive verification futile.

If theory is to be believed, it does not shed a promising light on reducing sabotage in social search. We recently proposed the “Crowdsourcing Dilemma,” in which we perform a game-theoretic analysis of the fundamental tradeoff between the potential for increased productivity of social search and the possibility of being set back by malicious behavior, including misinformation. Our results show that, in competitive scenarios, such as those with multiple social searches competing for the same information, malicious behavior is the norm, not an anomaly — a result contrary to conventional wisdom. Even worse: counterintuitively, making sabotage more costly does not deter saboteurs, but leads all the competing teams to a less desirable outcome, with more aggression and less efficient collective search for talent.

These empirical and theoretical findings have cautionary implications for the future of social search, and crowdsourcing in general. Social search is surprisingly efficient, cheap, easy to implement, and functional across multiple applications. But there are also surprises in the amount of evildoing that the social searchers will stumble upon while recruiting. As we get deeper and deeper into the recruitment tree, we stumble upon that evil force lurking in the dark side of the network.

Evil mutates and regenerates in the crowd in new forms impossible to anticipate by the designers or participants themselves. Crowdsourcing and its enemies will always be engaged in a co-evolutionary arms race.

Talent is there to be searched for and recruited. But so are evil and malice. Ultimately, crowdsourcing experts need to figure out how to recruit more of the former while deterring more of the latter. We might be living in a small world, but the cost and fragility of navigating it could harm any potential strategy to leverage the power of social networks….

Being searchable is a way of being closely connected to everyone else, which is conducive to contagion, group-think, and, most crucially, makes it hard for individuals to differentiate from each other. Evolutionarily, for better or worse, our brain makes us mimic others, and whether this copying of others ends up being part of the Wisdom of the Crowds, or the “stupidity of many,” it is highly sensitive to the scenario at hand.

Katabasis, or the myth of the hero who descends to the underworld and comes back stronger, is as old as time and pervasive across ancient cultures. Creative people seem to need to “get lost.” Grigori Perelman, Shinichi Mochizuki, and Bob Dylan all disappeared for a few years, to re-emerge later as more creative versions of themselves. Others, like J. D. Salinger and Bobby Fischer, also vanished and never came back to the public sphere. If others cannot search for and find us, we gain some slack, some room to escape from what we are known for. Searching for our true creative selves may rest on the difficulty of others finding us….(More)”

Fan Favorites


Erin Reilly at Strategy + Business: “…In theory, new technological advances such as big data and machine learning, combined with more direct access to audience sentiment, behaviors, and preferences via social media and over-the-top delivery channels, give the entertainment and media industry unprecedented insight into what the audience actually wants. But as a professional in the television industry put it, “We’re drowning in data and starving for insights.” Just as my data trail didn’t trace an accurate picture of my true interest in soccer, no data set can quantify all that consumers are as humans. At USC’s Annenberg Innovation Lab, our research has led us to an approach that blends data collection with a deep understanding of the social and cultural context in which the data is created. This can be a powerful practice for helping researchers understand the behavior of fans — fans of sports, brands, celebrities, and shows.

A Model for Understanding Fans

Marketers and creatives often see audiences and customers as passive assemblies of listeners or spectators. But we believe it’s more useful to view them as active participants. The best analogy may be fans. Broadly characterized, fans have a continued connection with the property they are passionate about. Some are willing to declare their affinity through engagement, some have an eagerness to learn more about their passion, and some want to connect with others who share their interests. Fans are emotionally linked to the object of their passion, and experience their passion through their own subjective lenses. We all start out as audience members. But sometimes, when the combination of factors aligns in just the right way, we become engaged as fans.

For businesses, the key to building this engagement and solidifying the relationship is understanding the different types of fan motivations in different contexts, and learning how to turn the data gathered about them into actionable insights. Even if Jane Smith and her best friend are fans of the same show, the same team, or the same brand, they’re likely passionate for different reasons. For example, some viewers may watch the ABC melodrama Scandal because they’re fashionistas and can’t wait to see the newest wardrobe of star Kerry Washington; others may do so because they’re obsessed with politics and want to see how the newly introduced Donald Trump–like character will behave. And those differences mean fans will respond in varied ways to different situations and content.

Though traditional demographics may give us basic information about who fans are and where they’re located, current methods of understanding and measuring engagement are missing the answers to two essential questions: (1) Why is a fan motivated? and (2) What triggers the fan’s behavior? Our Innovation Lab research group is developing a new model called Leveraging Engagement, which can be used as a framework when designing media strategy….(More)”

Why Didn’t E-Gov Live Up To Its Promise?


Excerpt from the report “Delivering on Digital: The Innovators and Technologies that are Transforming Government” by William Eggers: “Digital is becoming the new normal. Digital technologies have quietly and quickly pervaded every facet of our daily lives, transforming how we eat, shop, work, play and think.

An aging population, millennials assuming managerial positions, budget shortfalls and ballooning entitlement spending all will significantly impact the way government delivers services in the coming decade, but no single factor will alter citizens’ experience of government more than the pure power of digital technologies.

Ultimately, digital transformation means reimagining virtually every facet of what government does, from headquarters to the field, from health and human services to transportation and defense.

By now, some of you readers with long memories can’t be blamed for feeling a sense of déjà vu.

After all, technology was supposed to transform government 15 years ago; an “era of electronic government” was poised to make government faster, smaller, digitized and increasingly transparent.

Many analysts (including yours truly, in a book called “Government 2.0”) predicted that by 2016, digital government would already long be a reality. In practice, the “e-gov revolution” has been an exceedingly slow-moving one. Sure, technology has improved some processes, and scores of public services have moved online, but the public sector has hardly been transformed.

What initial e-gov efforts managed was to construct pretty storefronts—in the form of websites—as the entrance to government systems stubbornly built for the industrial age. Few fundamental changes altered the structures, systems and processes of government behind those websites.

With such halfhearted implementation, the promise of cost savings from information technology failed to materialize, instead disappearing into the black hole of individual agency and division budgets. Government websites mirrored departments’ short-term orientation rather than citizens’ long-term needs. In short, government became wired—but not transformed.

So why did the reality of e-gov fail to live up to the promise?

For one thing, we weren’t yet living in a digitized economy—our homes, cars and workplaces were still mostly analog—and the technology wasn’t as far along as we thought; without the innovations of cloud computing and open-source software, for instance, the process of upgrading giant, decades-old legacy systems proved costly, time-consuming and incredibly complex.

And not surprisingly, most governments—and private firms, for that matter—lacked deep expertise in managing digital services. What we now call “agile development”—an iterative development model that allows for constant evolution through recurrent testing and evaluation—was not yet mainstreamed.

Finally, most governments explicitly decided to focus first on the Hollywood storefront and postpone the bigger and tougher issues of reengineering underlying processes and systems. When budgets nosedived—even before the recession—staying solvent and providing basic services took precedence over digital transformation.

The result: Agencies automated some processes but failed to transform them; services were put online, but rarely were they focused logically and intelligently around the citizen.

Given this history, it’s natural to be skeptical after years of hype about government’s amazing digital future. But conditions on the ground (and in the cloud) are finally in place for change, and citizens are not only ready for digital government—many are demanding it.

Digital-native millennials are now consumers of public services, and millions of them work in and around government; they won’t tolerate balky and poorly designed systems, and they’ll let the world know through social media. Gen Xers and baby boomers, too, have become far more savvy consumers of digital products and services….(More)”

Soon Your City Will Know Everything About You


Currently, the biggest users of these sensor arrays are in cities, where city governments use them to collect large amounts of policy-relevant data. In Los Angeles, the crowdsourced traffic and navigation app Waze collects data that helps residents navigate the city’s choked highway networks. In Chicago, an ambitious program makes public data available to startups eager to build apps for residents. The city’s 49th ward has been experimenting with participatory budgeting and online voting to take the pulse of the community on policy issues. Chicago has also been developing the “Array of Things,” a network of sensors that track, among other things, the urban conditions that affect bronchitis.

Edmonton uses the cloud to track the condition of playground equipment. And a growing number of countries have purpose-built smart cities, like South Korea’s high-tech utopian city of Songdo, where pervasive sensor networks and ubiquitous computing generate immense amounts of civic data for public services.

The drive for smart cities isn’t restricted to the developed world. Rio de Janeiro coordinates the information flows of 30 different city agencies. In Beijing and Da Nang (Vietnam), mobile phone data is actively tracked in the name of real-time traffic management. Urban sensor networks, in other words, are also developing in countries with few legal protections governing the usage of data.

These services are promising and useful. But you don’t have to look far to see why the Internet of Things has serious privacy implications. Public data is used for “predictive policing” in at least 75 cities across the U.S., including New York City, where critics maintain that using social media or traffic data to help officers evaluate probable cause is a form of digital stop-and-frisk. In Los Angeles, the security firm Palantir scoops up publicly generated data on car movements, merges it with license plate information collected by the city’s traffic cameras, and sells analytics back to the city so that police officers can decide whether or not to search a car. In Chicago, concern is growing about discriminatory profiling because so much information is collected and managed by the police department — an agency with a poor reputation for handling data in consistent and sensitive ways. In 2015, video surveillance of the police shooting of Laquan McDonald outside a Burger King was erased by a police employee who, ironically, did not know his activities were being digitally recorded by cameras inside the restaurant.

Since most national governments have bungled privacy policy, cities — which have a reputation for being better with administrative innovations — will need to fill this gap. A few countries, such as Canada and the U.K., have independent “privacy commissioners” who are responsible for advocating for the public when bureaucracies must decide how to use or give out data. It is pretty clear that cities need such advocates too.

What would Urban Privacy Commissioners do? They would teach the public — and other government staff — about how policy algorithms work. They would evaluate the political context in which city agencies make big data investments. They would help a city negotiate contracts that protect residents’ privacy while providing effective analysis to policy makers and ensuring that open data is consistently serving the public good….(more)”.

Big Crisis Data: Social Media in Disasters and Time-Critical Situations


Book by Carlos Castillo: “Social media is an invaluable source of time-critical information during a crisis. However, emergency response and humanitarian relief organizations that would like to use this information struggle with an avalanche of social media messages that exceeds human capacity to process. Emergency managers, decision makers, and affected communities can make sense of social media through a combination of machine computation and human compassion – expressed by thousands of digital volunteers who publish, process, and summarize potentially life-saving information. This book brings together computational methods from many disciplines: natural language processing, semantic technologies, data mining, machine learning, network analysis, human-computer interaction, and information visualization, focusing on methods that are commonly used for processing social media messages under time-critical constraints, and offering more than 500 references to in-depth information…(More)”

Private Data and the Public Good


Gideon Mann‘s remarks on the occasion of the Robert Kahn distinguished lecture at The City College of New York on 5/22/16: “…the risks and opportunities of a specific aspect of this relationship, the broader need for computer science to engage with the real world. Right now, a key aspect of this relationship is being built around the risks and opportunities of the emerging role of data.

Ultimately, I believe that these relationships, between computer science and the real world, between data science and real problems, hold the promise to vastly increase our public welfare. And today, we, the people in this room, have a unique opportunity to debate and define a more moral data economy….

The hybrid research model proposes something different. The hybrid research model embeds, as it were, researchers as practitioners. The thought was always that you would be going about your regular run of business, would face a need to innovate to solve a crucial problem, and would do something novel. At that point, you might choose to work some extra time and publish a paper explaining your innovation. In practice, this model rarely works as expected. Tight deadlines mean the innovation that people do in their normal course of business is incremental…

This model separates research from scientific publication and shortens the time window of research to what can be realized within a few years. For me, this always felt like a tremendous loss, with respect to the older, so-called “ivory tower” research model. It didn’t seem at all clear how this kind of model would produce the sea change of thought engendered by Shannon’s work, nor did it seem that Claude Shannon would ever want to work there. This kind of environment would never support the freestanding wonder, like the robot mouse that Shannon worked on. Moreover, I always believed that crucial to research is publication and participation in the scientific community. Without this engagement, it feels like something different — innovation, perhaps.

It is clear that the monopolistic environment that enabled AT&T to support this ivory tower research doesn’t exist anymore.

Now, the hybrid research model was one model of research at Google, but there is another model as well, the moonshot model as exemplified by Google X. Google X brought together focused research teams to drive research and development around a particular project — Google Glass and the self-driving car being two notable examples. Here the focus isn’t research, but building a new product, with research as potentially a crucial blocking issue. Since the goal of Google X is directly to develop a new product, by definition they don’t publish papers along the way, but they’re not as tied to short-term deliverables as the rest of Google is. However, they are again decidedly un-Bell-Labs-like — a secretive, tightly focused, non-publishing group. DeepMind is a similarly constituted initiative — working, for example, on a best-in-the-world Go-playing algorithm, with publications happening sparingly.

Unfortunately, both of these approaches, the hybrid research model and the moonshot model, stack the deck towards a particular kind of research — research that leads to relatively short-term products that generate corporate revenue. While this kind of research is good for society, it isn’t the only kind of research that we need. We urgently need research that is long-term, and that is undertaken even without a clear, local financial impact. In some sense this is a “tragedy of the commons,” where a shared public good (the commons) is not supported because everyone can benefit from it without giving back. Academic research is thus a non-rival, non-excludable good, and will reasonably be underfunded. In certain cases, this takes on an ethical dimension — particularly in health care, where the choice of what diseases to study and address has a tremendous potential to affect human life. Should we research heart disease or malaria? This decision makes a huge impact on global human health, but is vastly informed by the potential profit from each of these various medicines….

Private Data means research is out of reach

The larger point that I want to make, is that in the absence of places where long-term research can be done in industry, academia has a tremendous potential opportunity. Unfortunately, it is actually quite difficult to do the work that needs to be done in academia, since many of the resources needed to push the state of the art are only found in industry: in particular data.

Of course, academia also lacks machine resources, but this is a simpler problem to fix — it’s a matter of money: resources from the government could go toward enabling research groups to build their own data centers or to acquire computational resources from the market, e.g., Amazon. This is aided by the compute philanthropy that Google and Microsoft practice, granting compute cycles to academic organizations.

But the data problem is much harder to address. The data being collected and generated at private companies could enable amazing discoveries and research, but is impossible for academics to access. The lack of access to private company data has effects far more significant than simply inhibiting research. In particular, the consumer-level data collected by social networks and internet companies could do much more than ad targeting.

Just for public health — suicide prevention, addiction counseling, mental health monitoring — there is enormous potential in the use of our online behavior to aid the most needy, and academia and non-profits are set-up to enable this work, while companies are not.

To give one example, anorexia and other eating disorders are vicious killers. Twenty million women and 10 million men suffer from a clinically significant eating disorder at some time in their life, and eating disorders have the highest mortality rate of any mental health disorder — with a jaw-dropping estimated mortality rate of 10%, both directly from injuries sustained through the disorder and from suicide resulting from it.

Eating disorders are particular in that sufferers often seek out confirmatory information: blogs, images and pictures that glorify and validate what sufferers see as “lifestyle” choices. Browsing behavior that seeks out images and guidance on how to starve yourself is a key indicator that someone is suffering. Tumblr, Pinterest and Instagram are places where people host and seek out this information. Tumblr has tried to help address this severe mental health issue by banning blogs that advocate for self-harm and by adding PSA announcements to searches for queries related to anorexia. But clearly this is not the be-all and end-all of the work that could be done to detect and assist people at risk of dying from eating disorders. Moreover, this data could also help us understand the nature of those disorders themselves….

There is probably a role for a data ombudsman within private organizations — someone to protect the interests of the public’s data inside an organization, rather like a newspaper’s “public editor,” depending on how you set it up. The ombudsman would be there to protect and articulate the interests of the public, which probably means working both sides: making sure a company’s data is used for public good where appropriate, and making sure the public’s “right” to privacy is appropriately safeguarded (and probably making sure the public is informed when their data is compromised).

Next, we need a platform to make collaboration around social good easier, both between companies and between companies and academics. This platform would give trusted users access to a wide variety of data and speed the process of research.

Finally, I wonder if there is a way that government could support research sabbaticals inside of companies. Clearly, the opportunities for this research far outstrip what is currently being done…(more)”

How Open Data Is Creating New Opportunities in the Public Sector


Martin Yan at GovTech: “Increased availability of open data in turn increases the ease with which citizens and their governments can collaborate, as well as equipping citizens to be active in identifying and addressing issues themselves. Technology developers are able to explore innovative uses of open data in combination with digital tools, new apps or other products that can tackle recognized inefficiencies. Currently, both the public and private sectors are teeming with such apps and projects….

Open data has proven to be a catalyst for the creation of new tools across industries and public-sector uses. Examples of a few successful projects include:

  • Citymapper — The popular real-time public transport app uses open data from Apple, Google, CycleStreets, OpenStreetMap and more sources to help citizens navigate cities. Features include A-to-B trip planning with ETA, real-time departures, bike routing, transit maps, public transport line status, real-time disruption alerts and integration with Uber.
  • Dataverse Project — This project from Harvard’s Institute for Quantitative Social Science makes it easy to share, explore and analyze research data. By simplifying access to this data, the project allows researchers to replicate others’ work to the benefit of all.
  • Liveplasma — An interactive search engine, Liveplasma lets users listen to music and view a web-like visualization of similar songs and artists, seeing how they are related and enabling discovery. Content from YouTube is streamed into the data visualizations.
  • Provenance — The England-based online platform lets users trace the origin and history of a product, also providing its manufacturing information. The mission is to encourage transparency in the practices of the corporations that produce the products we all use.

These examples demonstrate open data’s reach, value and impact well beyond the public sector. As open data continues to be put to wider use, the results will not be limited to increased efficiency and reduced wasteful spending in government, but will also create economic growth and jobs due to the products and services using the information as a foundation.

However, in the end, it won’t be the data alone that solves issues. Rather, it will be dependent on individual citizens, developers and organizations to see the possibilities, take up the call to arms and use this available data to introduce changes that make our world better….(More)”

Is civic technology the killer app for democracy?


 at TechCrunch: “Smartphone apps have improved convenience for public transportation in many urban centers. In Washington, DC, riders can download apps to help them figure out where to go, when to show up and how long to wait for a bus or train. However, the problem with public transport in DC is not the lack of modern, helpful and timely information. The problem is that the Metro subway system is on fire. 

Critical infrastructure refers to the vital systems that connect us. Like the water catastrophe in Flint, Michigan and our crumbling roads, bridges and airports, the Metro system in DC is experiencing a systems failure. The Metro’s problems arise from typical public challenges like poor management and deferred maintenance.

Upgrades of physical infrastructure are not easy and nimble like a software patch or an agile design process. They are slow, expensive and subject to deliberation and scrutiny. In other words, they are the fundamental substance of democratic decision-making: big decisions with long-term implications that require thoughtful strategy, significant investment, political leadership and public buy-in.

A killer app is an application you love so much you buy into a whole new way of doing things. Email and social media are good examples of killer apps. The killer app for Metro would have to get political leaders to look beyond their narrow, short-term interests and be willing to invest in modern public transportation for our national capital region.

The same is true for fixing our critical infrastructure throughout the nation. The killer apps for the systems on which we rely daily won’t be technical; they will be human. It will be Americans working together to build a technology-enabled resilient democracy — one that is inclusive, responsive and successful in the Information Age.

In 2007, the I-35 bridge in Minneapolis collapsed into the Mississippi River. During his presidential bid, Senator John McCain used this event as an example of the failure of our leaders to make trade-offs for common national purpose. Case in point: an extravagantly expensive congressionally funded Alaskan “bridge to nowhere” that served just a handful of people on an island. But how many apps to nowhere are we building?

In DC, commuters who can afford alternatives will leave Metro. They’ll walk, drive, order a car service or locate a bikeshare. The people who suffer from the public service risk and imbalance of the current Metro system are those who have no choice.

So here’s the challenge: Modern technology needs to create an inclusive society. Our current technical approach too often means that we’re prioritizing progress or profit for the few over the many. This pattern defeats the purpose of both the technology revolution and American democracy. Government and infrastructure are supposed to serve everyone, but technology thus far has made it so that public failures affect some Americans more than others. …

For democracy to succeed in the Information Age, we’ll need some new rules of engagement with technology. The White House recently released its third report on data and its implications for society. The 2016 report pays special attention to the ethics of machine automation and algorithms. The authors stress the importance of ethical analytics and propose the principle of “equal opportunity by design.” It’s an excellent point of departure as we recalibrate old systems and build new bridges to a more resilient, inclusive and prosperous nation….(more)”