5 Big Data Projects That Could Impact Your Life


Mashable: “We reached out to a few organizations using information, both hand- and algorithm-collected, to create helpful tools for their communities. This is only a small sample of what’s out there — plenty more pop up each day, and as more information becomes public, the trend will only grow….
1. Transit Time NYC
Transit Time NYC, an interactive map developed by WNYC, lets New Yorkers click a spot in any of the city’s five boroughs for an estimate of subway or train travel times. To create it, WNYC lead developer Steve Melendez broke the city into 2,930 hexagons, then pulled data from open source itinerary platform OpenTripPlanner — the Wikipedia of mapping software — and coupled it with the MTA’s publicly downloadable subway schedule….
2. Twitter’s ‘Topography of Tweets’
In a blog post, Twitter unveiled a new data visualization map that displays billions of geotagged tweets in a 3D landscape format. The purpose is to display, topographically, which parts of certain cities most people are tweeting from…
3. Homicide Watch D.C.
Homicide Watch D.C. is a community-driven data site that aims to cover every murder in the District of Columbia. It’s sorted by “suspect” and “victim” profiles, where it breaks down each person’s name, age, gender and race, as well as original articles reported by Homicide Watch staff…
4. Falling Fruit
Can you find a hidden apple tree along your daily bike commute? Falling Fruit can.
The website highlights overlooked or hidden edibles in urban areas across the world. By collecting public information from the U.S. Department of Agriculture, municipal tree inventories, foraging maps and street tree databases, the site has created a network of 615 types of edibles in more than 570,000 locations. The purpose is to remind urban dwellers that agriculture does exist within city boundaries — it’s just more difficult to find….
5. AIDSVu
AIDSVu is an interactive map that illustrates the prevalence of HIV in the United States. The data is pulled from the U.S. Centers for Disease Control and Prevention’s national HIV surveillance reports, which are collected at both state and county levels each year…”
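
As a rough illustration of the hexagon approach described for Transit Time NYC (item 1 above), the sketch below tiles a rectangular area with hexagon centers. The function name `hex_centers`, the bounding box, and the cell size are assumptions made for illustration. None of it comes from WNYC's code, and a real map would also need to clip the grid to the city boundary and ask a trip planner for a travel time per cell.

```python
import math

def hex_centers(min_x, min_y, max_x, max_y, size):
    """Return centers of pointy-top hexagons (circumradius `size`) covering a box.

    Coordinates are planar (for example metres). Hypothetical helper,
    not WNYC's implementation.
    """
    dx = math.sqrt(3) * size      # distance between neighbouring centers in a row
    dy = 1.5 * size               # vertical distance between rows
    centers = []
    row = 0
    y = min_y
    while y <= max_y:
        # Odd rows are shifted by half a cell so the hexagons interlock.
        x = min_x + (dx / 2 if row % 2 else 0.0)
        while x <= max_x:
            centers.append((x, y))
            x += dx
        y += dy
        row += 1
    return centers

# Assumed 10 km x 10 km box with 500 m cells.
grid = hex_centers(0.0, 0.0, 10_000.0, 10_000.0, size=500.0)
print(len(grid), "cells")
```

With this layout every center sits the same distance from each of its six neighbours, which is one reason hexagons suit "click anywhere" travel-time maps.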

‘Medical Instagram’ helps build a library of reference photos for doctors


Springwise: “The power of the visual sharing that makes platforms such as Instagram so popular has been harnessed by retailers like Ask CT Food to share knowledge about cooking, but could the same be done for the medical world? Figure1 enables health professionals to upload and share photos of conditions, creating online discussion as well as crowdsourcing a database of reference images.
Developed by healthcare tech startup Movable Science, the platform is designed in a similar vein to Instagram and enables medical professionals to create their own feed of images from the cases they deal with. In order to protect patients’ identities, the app uses facial recognition to block out faces, while users can add their own marks to cover up other identifiable marks. They can also add pointers and annotations, as well as choosing who sees it, before uploading the image. Photos can be tagged with relevant terms to allow the community to easily find them through search and others can comment on the images, fostering discussion among users. Images can also be starred, which acts simultaneously as an indication of quality as well as enabling users to save useful images for later reference. …
Although Instagram was developed with the broad purpose of entertainment and social sharing, Figure1 has tweaked the platform’s functions to provide a tool that could help doctors and students share their knowledge and learn from others in an engaging way…”

Metadata Liberation Movement


Holman Jenkins in the Wall Street Journal: “The biggest problem, then, with metadata surveillance may simply be that the wrong agencies are in charge of it. One particular reason why this matters is that the potential of metadata surveillance might actually be quite large but is being squandered by secret agencies whose narrow interest is only looking for terrorists….
“Big data” is only as good as the algorithms used to find out things worth finding out. The efficacy and refinement of big-data techniques are advanced by repetition, by giving more chances to find something worth knowing. Bringing metadata out of its black box wouldn’t only be a way to improve public trust in what government is doing. It would be a way to get more real value for society out of techniques that are being squandered on a fairly minor threat.
Bringing metadata out of the black box would open up new worlds of possibility—from anticipating traffic jams to locating missing persons after a disaster. It would also create an opportunity to make big data more consistent with the constitutional prohibition of unwarranted search and seizure. In the first instance, with the computer withholding identifying details of the individuals involved, any red flag could be examined by a law-enforcement officer to see, based on accumulated experience, whether the indication is of interest.
If so, a warrant could be obtained to expose the identities involved. If not, the record could immediately be expunged. All this could take place in a reasonably aboveboard, legal fashion, open to inspection in court when and if charges are brought or—this would be a good idea—a court is informed of investigations that led to no action.
Our guess is that big data techniques would pop up way too many false positives at first, and only considerable learning and practice would allow such techniques to become a useful tool. At the same time, bringing metadata surveillance out of the shadows would help the Googles, Verizons and Facebooks defend themselves from a wholly unwarranted suspicion that user privacy is somehow better protected by French or British or (heavens) Chinese companies from their own governments than U.S. data is from the U.S. government.
Most of all, it would allow these techniques to be put to work on solving problems that are actual problems for most Americans, which terrorism isn’t.”
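
The review flow Jenkins describes (identities withheld, a flag examined by an officer, then either a warrant to expose the identity or an immediate expunging) can be sketched as below. The column does not name any mechanism, so the keyed-hash pseudonyms, the `FlagStore` class, and every other name here are assumptions made for illustration only.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-held-by-a-court"  # assumption: key kept away from analysts

def pseudonym(identity: str) -> str:
    """Replace an identity with a keyed hash so reviewers see only a token."""
    return hmac.new(SECRET_KEY, identity.encode(), hashlib.sha256).hexdigest()[:16]

class FlagStore:
    """Holds flagged records under pseudonyms; the identity map stays sealed."""

    def __init__(self):
        self._sealed = {}   # token -> identity, readable only with a warrant
        self.flags = {}     # token -> reason, visible to the reviewing officer

    def flag(self, identity: str, reason: str) -> str:
        token = pseudonym(identity)
        self._sealed[token] = identity
        self.flags[token] = reason
        return token

    def unmask(self, token: str, warrant_granted: bool) -> str:
        if not warrant_granted:
            raise PermissionError("a warrant is required to expose an identity")
        return self._sealed[token]

    def expunge(self, token: str) -> None:
        """Drop a flag of no interest, along with its sealed identity."""
        self.flags.pop(token, None)
        self._sealed.pop(token, None)

store = FlagStore()
t1 = store.flag("caller-A", "unusual call pattern")
t2 = store.flag("caller-B", "unusual call pattern")
store.expunge(t2)                               # officer: not of interest
print(store.unmask(t1, warrant_granted=True))   # officer obtained a warrant
```

The point of the design is that the officer works only with tokens and reasons, and the mapping back to a person is released through a single, auditable step.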

Eduardo Paes on Open Government


Mayor, Rio de Janeiro, Brazil in the Huffington Post: “The Internet revolution has transformed the way knowledge is disseminated and how people unite over causes. Social networks are playing a key role in this movement, just as books and the press have done over the last six centuries. During the recent demonstrations in Brazil, approximately 62 percent of the people were informed of the event via Facebook, a much higher rate than TV, which was the first source of information for 14 percent of attendees, according to Ibope Institute. Three out of four agitators used social networks to round up support. As generations succeed and the digital gap narrows, these statistics could possibly rise.
This revolution is also accentuating the imperfections of the representative democracy, the only plausible alternative, as Churchill famously said. We live in an era of “Liquid Modernity” as defined by sociologist Zygmunt Bauman, which describes the ephemeral nature of contemporary social interactions. Bauman says that these days society, in a similar manner to liquid, adopts various unstable forms under small amounts of pressure. They are incapable of stabilizing in a consistent form, which results in consequences to social relationships and politics. Meanwhile, political parties, bureaucracy and institutions seem to remain firmly in the 17th Century.
Democracy has to reinvent itself in accordance with this new “liquid society” where collaboration happens between many millions of people directly. Leadership is not vertical, as in the past, but horizontal. Nowadays some say following is more important than leading. Cyber culture understands open code as a principle, something the music industry has reluctantly had to learn. There is no time and space limitation for public accountability on the Internet. Creative commonality is standard and does not resemble the authoritarian style of the dead communist experience. It seems that it is no longer society’s obligation to understand legislation, it is a duty for governments to be understood by their people.”

Feedback Labs


Feedback Labs: “If you find yourself asking the following three questions, then you have come to the right place:

  1. “What do citizens want?”
  2. “Are they getting it?”
  3. “If not, how will things change?”

Much excellent work has been done over recent years to answer the first and second questions. Our goal is to catalyze that work and make it matter by focusing on the third question – “How will things change?”
Aid, philanthropy, and government programs are often designed, implemented and evaluated by experts. We think that citizens should increasingly be in the driver’s seat. Experts are still important, but in many cases their role needs to shift from making decisions to enriching and informing conversations among citizens.

What will Feedback Labs do?

Based on what we have heard so far, we think we can add value in three ways:

  • Frame the issues – for example, what exactly do we mean by feedback loops? What works and what doesn’t? What is the evidence for impact?
  • Help close the feedback loop – uncover approaches that are succeeding at finding out what people want and whether they are getting it, and then help close the loop by understanding (and in some cases funding) what it takes to translate citizen voice into real changes in programs.
  • Facilitate mainstreaming – i.e., help aid, philanthropy and government organizations adopt feedback loops in their normal course of operation. We want to make feedback loops the norm rather than the exception.

Historically we have often assumed that the flow of knowledge is from the richer countries to the poorer.  But learning goes both ways, and in the case of feedback loops, some of the most innovative approaches are being pioneered in developing countries.  So we plan to support work both internationally and domestically.”

Facebook Is Being Redefined by Its Developing World Users


Tom Simonite in MIT Technology Review: “…as Facebook’s user base continues to expand, a growing proportion of its users think of it quite differently, as a luxury brand, badge of status, or even a place to make a little extra money. That’s due to the rapid growth in the number of Facebook users signing on from developing countries, a trend underscored by news from the company today that more than 100 million people use a mobile app the company makes for feature phones.
Little research has been done on Facebook’s growth in developing countries (and a lot would be needed to capture even some of the diversity included under the blanket term “developing world”). Two small, recent studies of Kenyan Facebook users in poor areas by Susan Wyche of Michigan State University are among the first to be published, and they provide some interesting insights.
One of Wyche’s ethnographic studies took place in rural Internet cafes, where the researchers were told that “Facebook is a luxury,” only to be indulged if someone had money to spare (here’s a PDF of Wyche’s paper). When study participants thought about social networking, the challenges of low bandwidth and sometimes unreliable electricity supplies were foremost in their minds.
The barriers of cost and infrastructure associated with Facebook led people in another community Wyche and colleagues visited, a slum of Nairobi, to see the service as being for more than just socializing. They used it, with mixed success, as a way to make a little money, look for jobs, market themselves, and seek remittances from friends and family overseas. (This reminded me of a recent report on people in Kuwait using Instagram to sell things and run retail businesses.)…
Should it want to, Facebook could even become a powerful tool for efforts to improve the lives of people in poor areas, where the site is gaining traction. The company has already dabbled with using social engineering to boost organ donations in the U.S. (see “Thank God for Facebook: When Platforms Proselytize”). There’s no shortage of similar experiments that could be run in places with more fundamental health problems, where Facebook’s status as a luxury could make it very influential.”

Predictive Policing: Don’t even think about it


The Economist: “PredPol is one of a range of tools using better data, more finely crunched, to predict crime. They seem to promise better law-enforcement. But they also bring worries about privacy, and of justice systems run by machines not people.
Criminal offences, like infectious disease, form patterns in time and space. A burglary in a placid neighbourhood represents a heightened risk to surrounding properties; the threat shrinks swiftly if no further offences take place. These patterns have spawned a handful of predictive products which seem to offer real insight. During a four-month trial in Kent, 8.5% of all street crime occurred within PredPol’s pink boxes, with plenty more next door to them; predictions from police analysts scored only 5%. An earlier trial in Los Angeles saw the machine score 6% compared with human analysts’ 3%.
Intelligent policing can convert these modest gains into significant reductions in crime…
Predicting and forestalling crime does not solve its root causes. Positioning police in hotspots discourages opportunistic wrongdoing, but may encourage other criminals to move to less likely areas. And while data-crunching may make it easier to identify high-risk offenders—about half of American states use some form of statistical analysis to decide when to parole prisoners—there is little that it can do to change their motivation.
Misuse and overuse of data can amplify biases…But mathematical models might make policing more equitable by curbing prejudice.”
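
The trial figures quoted above can be read two ways, as a ratio or as a difference in percentage points. The short sketch below works out both from the numbers in the excerpt. The helper name `lift` is illustrative and not a term from the article.

```python
def lift(model_rate: float, analyst_rate: float) -> float:
    """Ratio of the model's hit rate to the human analysts' hit rate."""
    return model_rate / analyst_rate

# Share of street crime (in percent) inside predicted areas, as quoted above.
trials = {"Kent": (8.5, 5.0), "Los Angeles": (6.0, 3.0)}

for place, (model, analysts) in trials.items():
    print(f"{place}: {lift(model, analysts):.1f}x the analysts' hit rate, "
          f"{model - analysts:.1f} points higher")
```

In relative terms the software does roughly 1.7 to 2 times as well as the analysts, yet in absolute terms it gains only about three percentage points, which is consistent with the article calling the gains modest.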

9 models to scale open data – past, present and future


Open Knowledge Foundation Blog: “The possibilities of open data have been enthralling us for 10 years…But that excitement isn’t what matters in the end. What matters is scale – which organisational structures will make this movement explode?  This post quickly and provocatively goes through some that haven’t worked (yet!) and some that have.
Ones that are working now
1) Form a community to enter in new data. OpenStreetMap and MusicBrainz are two big examples. It works as the community is the originator of the data. That said, neither has dominated its industry as much as I thought they would have by now.
2) Sell tools to an upstream generator of open data. This is what CKAN does for central Governments (and the new ScraperWiki CKAN tool helps with). It’s what mySociety does, when selling FixMyStreet installs to local councils, thereby publishing their potholes as RSS feeds.
3) Use open data (quietly). Every organisation does this and never talks about it. It’s key to quite old data resellers like Bloomberg. It is what most of ScraperWiki’s professional services customers ask us to do. The value to society is enormous and invisible. The big flaw is that it doesn’t help scale supply of open data.
4) Sell tools to downstream users. This isn’t necessarily open data specific – existing software like spreadsheets and Business Intelligence can be used with open or closed data. Lots of open data is on the web, so tools like the new ScraperWiki which work well with web data are particularly suited to it.
Ones that haven’t worked
5) Collaborative curation. ScraperWiki started as an audacious attempt to create an open data curation community, based on editing scraping code in a wiki. In its original form (now called ScraperWiki Classic) this didn’t scale. …With a few exceptions, notably OpenCorporates, there aren’t yet open data curation projects.
6) General purpose data marketplaces, particularly ones that are mainly reusing open data, haven’t taken off. They might one day; however, I think they need well-adopted higher level standards for data formatting and syncing first (perhaps something like dat, perhaps something based on CSV files).
Ones I expect more of in the future
These are quite exciting models which I expect to see a lot more of.
7) Give labour/money to upstream to help them create better data. This is quite new. The only, and most excellent, example of it is the UK’s National Archives curating the Statute Law Database. They do the work with the help of staff seconded from commercial legal publishers and other parts of Government.
It’s clever because it generates money for upstream, which people trust the most, and which has the most ability to improve data quality.
8) Viral open data licensing. MySQL made lots of money this way, offering proprietary dual licenses of GPLd software to embedded systems makers. In data this could use OKFN’s Open Database License, and organisations would pay when they wanted to mix the open data with their own closed data. I don’t know anyone actively using it, although Chris Taggart from OpenCorporates mentioned this model to me years ago.
9) Corporations release data for strategic advantage. Companies are starting to release their own data for strategic gain. This is very new. Expect more of it.”

Let’s Shake Up the Social Sciences


Nicholas Christakis in The New York Times: “TWENTY-FIVE years ago, when I was a graduate student, there were departments of natural science that no longer exist today. Departments of anatomy, histology, biochemistry and physiology have disappeared, replaced by innovative departments of stem-cell biology, systems biology, neurobiology and molecular biophysics. Taking a page from Darwin, the natural sciences are evolving with the times. The perfection of cloning techniques gave rise to stem-cell biology; advances in computer science contributed to systems biology. Whole new fields of inquiry, as well as university departments and majors, owe their existence to fresh discoveries and novel tools.

In contrast, the social sciences have stagnated. They offer essentially the same set of academic departments and disciplines that they have for nearly 100 years: sociology, economics, anthropology, psychology and political science. This is not only boring but also counterproductive, constraining engagement with the scientific cutting edge and stifling the creation of new and useful knowledge. Such inertia reflects an unnecessary insecurity and conservatism, and helps explain why the social sciences don’t enjoy the same prestige as the natural sciences.

One reason citizens, politicians and university donors sometimes lack confidence in the social sciences is that social scientists too often miss the chance to declare victory and move on to new frontiers. Like natural scientists, they should be able to say, “We have figured this topic out to a reasonable degree of certainty, and we are now moving our attention to more exciting areas.” But they do not.”

Digital Public Spaces


FutureEverything Publications: “This publication gathers a range of short explorations of the idea of the Digital Public Space. The central vision of the Digital Public Space is to give everyone everywhere unrestricted access to an open resource of culture and knowledge. This vision has emerged from ideas around building platforms for engagement around cultural archives to become something wider, which this publication is seeking to hone and explore.
This is the first publication to look at the emergence of the Digital Public Space. Contributors include some of the people who are working to make the Digital Public Space happen.
The Digital Public Spaces publication has been developed by FutureEverything working with Bill Thompson of the BBC and in association with The Creative Exchange.”