Big health data: the need to earn public trust


Tjeerd-Pieter van Staa et al in the BMJ: “Better use of large scale health data has the potential to benefit patient care, public health, and research. The handling of such data, however, raises concerns about patient privacy, even when the risks of disclosure are extremely small.

The problems are illustrated by recent English initiatives trying to aggregate and improve the accessibility of routinely collected healthcare and related records, sometimes loosely referred to as “big data.” One such initiative, care.data, was set to link and provide access to health and social care information from different settings, including primary care, to facilitate the planning and provision of healthcare and to advance health science.1 Data were to be extracted from all primary care practices in England. A related initiative, the Clinical Practice Research Datalink (CPRD), evolved from the General Practice Research Database (GPRD). CPRD was intended to build on GPRD by linking patients’ primary care records to hospital data, around 50 disease registries and clinical audits, genetic information from UK Biobank, and even the loyalty cards of a large supermarket chain, creating an integrated data repository and linked services for all of England that could be sold to universities, drug companies, and non-healthcare industries. Care.data has now been abandoned and CPRD has stalled. The flawed implementation of care.data plus earlier examples of data mismanagement have made privacy issues a mainstream public concern. We look at what went wrong and how future initiatives might gain public support….(More)”

US start-up aims to steer through flood of data


Richard Waters in the Financial Times: “The “open data” movement has produced a deluge of publicly available information this decade, as governments like those in the UK and US have released large volumes of data for general use.

But the flood has left researchers and data scientists with a problem: how do they find the best data sets, ensure these are accurate and up to date, and combine them with other sources of information?

The most ambitious in a spate of start-ups trying to tackle this problem is set to be unveiled on Monday, when data.world opens for limited release. A combination of online repository and social network, the site is designed to be a central platform to support the burgeoning activity around freely available data.

The aim closely mirrors Github, which has been credited with spurring the open source software movement by becoming both a place to store and find free programs as well as a crowdsourcing tool for identifying the most useful.

“We are at an inflection point,” said Jeff Meisel, chief marketing officer for the US Census Bureau. A “massive amount of data” has been released under open data provisions, he said, but “what hasn’t been there are the tools, the communities, the infrastructure to make that data easier to mash up”….

Data.world plans to seed its site with about a thousand data sets and attract academics as its first users, said Mr Hurt. By letting users create personal profiles on the site, follow others and collaborate around the information they are working on, the site hopes to create the kind of social dynamic that makes it more useful the more it is used.

An attraction of the service is the ability to upload data in any format and then use common web standards to link different data sets and create mash-ups with the information, said Dean Allemang, an expert in online data….(More)”

The big health data sale


Philip Hunter at the EMBO Journal: “Personal health and medical data are a valuable commodity for a number of sectors from public health agencies to academic researchers to pharmaceutical companies. Moreover, “big data” companies are increasingly interested in tapping into this resource. One such firm is Google, whose subsidiary Deep Mind was granted access to medical records on 1.6 million patients who had been treated at some time by three major hospitals in London, UK, in order to develop a diagnostic app. The public discussion it raised was just another sign of the long‐going tensions between drug companies, privacy advocates, regulators, legislators, insurers and patients about privacy, consent, rights of access and ownership of medical data that is generated in pharmacies, hospitals and doctors’ surgeries. In addition, the rapid growth of eHealth will add a boon of even more health data from mobile phones, portable diagnostic devices and other sources.

These developments are driving efforts to create a legal framework for protecting confidentiality, controlling communication and governing access rights to data. Existing data protection and human rights laws are being modified to account for personal medical and health data in parallel to the campaign for greater transparency and access to clinical trial data. Healthcare agencies in particular will have to revise their procedures for handling medical or research data that is associated with patients.

Google’s foray into medical data demonstrates the key role of health agencies, in this case the Royal Free NHS Trust, which operates the three London hospitals that granted Deep Mind access to patient data. Royal Free approached Deep Mind with a request to develop an app for detecting acute kidney injury, which, according to the Trust, affects more than one in six inpatients….(More)”

Power to the people: how cities can use digital technology to engage and empower citizens


Tom Saunders at NESTA: “You’re sat in city hall one day and you decide it would be a good idea to engage residents in whatever it is you’re working on – next year’s budget, for example, or the redevelopment of a run down shopping mall. How do you go about it?

In the past, you might have held resident meetings and exhibitions where people could view proposed designs or talk to city government employees. You can still do that today, but now there’s digital: apps, websites and social media. So you decide on a digital engagement strategy: you build a website or you run a social media campaign inviting feedback on your proposals. What happens next?

Two scenarios: 1) You get 50 responses, mostly from campaign groups and local political activists; or 2) you receive such a huge number of responses that you don’t know what to do with them. Besides which, you don’t have the power or budget to implement 90 per cent of the suggestions and neither do you have the time to tell people why their proposals will be ignored. The main outcome of your citizen engagement exercise seems to be that you have annoyed the very people you were trying to get buy in from. What went wrong?

Four tips for digital engagement

With all the apps and platforms out there, it’s hard to make sense of what is going on in the world of digital tools for citizen engagement. It seems there are three distinct activities that digital tools enable: delivering council services online – say applying for a parking permit; using citizen generated data to optimise city government processes and engaging citizens in democratic exercises. In Conneced Councils Nesta sets out what future models of online service delivery could look like. Here I want to focus on the ways that engaging citizens with digital technology can help city governments deliver services more efficiently and improve engagement in democratic processes.

  1. Resist the temptation to build an app…

  1. Think about what you want to engage citizens for…

Sometimes engagement is statutory: communities have to be shown new plans for their area. Beyond this, there are a number of activities that citizen engagement is useful for. When designing a citizen engagement exercise it may help to think which of the following you are trying to achieve (note: they aren’t mutually exclusive):

  • Better understanding of the facts

If you want to use digital technologies to collect more data about what is happening in your city, you can buy a large number of sensors and install them across the city, to track everything from people movements to how full bins are. A cheaper and possibly more efficient way for cities to do this might involve working with people to collect this data – making use of the smartphones that an increasing number of your residents carry around with them. Prominent examples of this included flood mapping in Jakarta using geolocated tweets and pothole mapping in Boston using a mobile app.

For developed world cities, the thought of outsourcing flood mapping to citizens might fill government employees with horror. But for cities in developing countries, these technologies present an opportunity, potentially, for them to leapfrog their peers – to reach a level of coverage now that would normally require decades of investment in infrastructure to achieve. This is currently a hypothetical situation: cities around the world are only just starting to pilot these ideas and technologies and it will take a number of years before we know how useful they are to city governments.

  • Generating better ideas and options

The examples above involve passive data collection. Moving beyond this to more active contributions, city governments can engage citizens to generate better ideas and options. There are numerous examples of this in urban planning – the use of Minecraft by the UN in Nairobi to collect and visualise ideas for the future development of the community, or the Carticipe platform in France, which residents can use to indicate changes they would like to see in their city on a map.

It’s all very well to create a digital suggestion box, but there is a lot of evidence that deliberation and debate lead to much better ideas. Platforms like BetterReykjavic include a debate function for any idea that is proposed. Based on feedback, the person who submitted the idea can then edit it, before putting it to a public vote – only then, if the proposal gets the required number of votes, is it sent to the city council for debate.

  • Better decision making

As well as enabling better decision making by giving city government employees, better data and better ideas, digital technologies can give the power to make decisions directly to citizens. This is best encapsulated by participatory budgeting – which involves allowing citizens to decide how a percentage of the city budget is spent. Participatory budgeting emerged in Brazil in the 1980s, but digital technologies help city governments reach a much larger audience. ‘Madame Mayor, I have an idea’ is a participatory budgeting process that lets citizens propose and vote on ideas for projects in Paris. Over 20,000 people have registered on the platform and the pilot phase of the project received over 5000 submissions.

  1. Remember that there’s a world beyond the internet…

  1. Pick the right question for the right crowd…

When we talk to city governments and local authorities, they express a number of fears about citizen engagement: Fear of relying on the public for the delivery of critical services, fear of being drowned in feedback and fear of not being inclusive – only engaging with those that are online and motivated. Hopefully, thinking through the issues discussed above may help alleviate some of these fears and make city government more enthusiastic about digital engagement….(More)

Data at the Speed of Life


Marc Gunther at The Chronicle of Philanthropy: “Can pregnant women in Zambia be persuaded to deliver their babies in hospitals or clinics rather than at home? How much are villagers in Cambodia willing to pay for a simple latrine? What qualities predict success for a small-scale entrepreneur who advises farmers?

Governments, foundations, and nonprofits that want to help the world’s poor regularly face questions like these. Answers are elusive. While an estimated $135 billion in government aid and another $15 billion in charitable giving flow annually to developing countries, surprisingly few projects benefit from rigorous evaluations. Those that do get scrutinized in academic studies often don’t see the results for years, long after the projects have ended.

IDinsight puts data-driven research on speed. Its goal is to produce useful, low-cost research results fast enough that nonprofits can use it make midcourse corrections to their programs….

IDinsight calls this kind of research “decision-focused evaluation,” which sets it apart from traditional monitoring and evaluation (M&E) and academic research. M&E, experts say, is mostly about accountability and outputs — how many training sessions were held, how much food was distributed, and so on. Usually, it occurs after a program is complete. Academic studies are typically shaped by researchers’ desire to break new ground and publish on topics of broad interest. The IDinsight approach aims instead “for contemporaneous decision-making rather than for publication in the American Economic Review,” says Ruth Levine, who directs the global development program at the William and Flora Hewlett Foundation.

A decade ago, Ms. Levine and William Savedoff, a senior fellow at the Center for Global Development, wrote an influential paper entitled “When Will We Ever Learn? Improving Lives Through Impact Evaluation.” They lamented that an “absence of evidence” for the effectiveness of global development programs “not only wastes money but denies poor people crucial support to improve their lives.”

Since then, impact evaluation has come a “huge distance,” Ms. Levine says….

Actually, others are. Innovations for Poverty Action recently created the Goldilocks Initiative to do what it calls “right fit” evaluations leading to better policy and programs, according to Thoai Ngo, who leads the effort. Its first clients include GiveDirectly, which facilitates cash transfers to the extreme poor, and Splash, a water charity….All this focus on data has generated pushback. Many nonprofits don’t have the resources to do rigorous research, according to Debra Allcock Tyler, chief executive at Directory of Social Change, a British charity that provides training, data, and other resources for social enterprises.

All this focus on data has generated pushback. Many nonprofits don’t have the resources to do rigorous research, according to Debra Allcock Tyler, chief executive at Directory of Social Change, a British charity that provides training, data, and other resources for social enterprises.

“A great deal of the time, data is pointless,” Allcock Tyler said last year at a London seminar on data and nonprofits. “Very often it is dangerous and can be used against us, and sometimes it takes away precious resources from other things that we might more usefully do.”

A bigger problem may be that the accumulation of knowledge does not necessarily lead to better policies or practices.

“People often trust their experience more than a systematic review,” says Ms. Levine of the Hewlett Foundation. IDinsight’s Esther Wang agrees. “A lot of our frustration is looking at the development world and asking why are we not accountable for the money that we are spending,” she says. “That’s a waste that none of us really feels is justifiable.”…(More)”

OpenData.Innovation: an international journey to discover innovative uses of open government data


Nesta: “This paper by Mor Rubinstein (Open Knowledge International) and Josh Cowls and Corinne Cath (Oxford Internet Institute) explores the methods and motivations behind innovative uses of open government data in five specific country contexts – Chile, Argentine, Uruguay, Israel, and Denmark; and considers how the insights it uncovers might be adopted in a UK context.

Through a series of interviews with ‘social hackers’ and open data practitioners and experts in countries with recognised open government data ‘hubs’, the authors encountered a diverse range of practices and approaches in how actors in different sectors of society make innovative uses of open government data. This diversity also demonstrated how contextual factors shape the opportunities and challenges for impactful open government data use.

Based on insights from these international case studies, the paper offers a number of recommendations – around community engagement, data literacy and practices of opening data – which aim to support governments and citizens unlock greater knowledge exchange and social impact through open government data….(More)”

Due Diligence? We need an app for that


Ken Banks at kiwanja.net: “The ubiquity of mobile phones, the reach of the Internet, the shear number of problems facing the planet, competitions and challenges galore, pots of money and strong media interest in tech-for-good projects has today created the perfect storm. Not a day goes by without the release of an app hoping to solve something, and the fact so many people are building so many apps to fix so many problems can only be a good thing. Right?

The only problem is this. It’s become impossible to tell good from bad, even real from fake. It’s something of a Wild West out there. So it was no surprise to see this happening recently. Quoting The Guardian:

An app which purported to offer aid to refugees lost in the Mediterranean has been pulled from Apple’s App Store after it was revealed as a fake. The I Sea app, which also won a Bronze medal at the Cannes Lions conference on Monday night, presented itself as a tool to help report refugees lost at sea, using real-time satellite footage to identify boats in trouble and highlighting their location to the Malta-based Migrant Offshore Aid Station (Moas), which would provide help.

In fact, the app did nothing of the sort. Rather than presenting real-time satellite footage – a difficult and expensive task – it instead simply shows a portion of a static, unchanging image. And while it claims to show the weather in the southern Mediterranean, that too isn’t that accurate: it’s for Western Libya.

The worry isn’t only that someone would decide to build a fake app which ‘tackles’ such an emotive subject, but the fact that this particular app won an award and received favourable press. Wired, Mashable, the Evening Standard and Reuters all spoke positively about it. Did no-one check that it did what it said it did?

This whole episode reminds me of something Joel Selanikio wrote in his contributing chapter to two books I’ve recently edited and published. In his chapters, which touch on his work on the Magpi data collection tool in addition to some of the challenges facing the tech-for-development community, Joel wrote:

In going over our user activity logs for the online Magpi app, I quickly realised that no-one from any of our funding organisations was listed. Apparently no-one who was paying us had ever seen our working software! This didn’t seem to make sense. Who would pay for software without ever looking at it? And if our funders hadn’t seen the software, what information were they using when they decided whether to fund us each year?

…The shear number of apps available that claim to solve all manner of problems may seem encouraging on the surface – 1,500 (and counting) to help refugees might be a case in point – but how many are useful? How many are being used? How many solve a problem? And how many are real?

Due diligence? Maybe it’s time we had an app for that…(More)”

The Surprising History of the Infographic


Clive Thompson at the Smithsonian magazine: “As the 2016 election approaches, we’re hearing a lot about “red states” and “blue states.” That idiom has become so ingrained that we’ve almost forgotten where it originally came from: a data visualization.

In the 2000 presidential election, the race between Al Gore and George W. Bush was so razor close that broadcasters pored over electoral college maps—which they typically colored red and blue. What’s more, they talked about those shadings. NBC’s Tim Russert wondered aloud how George Bush would “get those remaining 61 electoral red states, if you will,” and that language became lodged in the popular imagination. America became divided into two colors—data spun into pure metaphor. Now Americans even talk routinely about “purple” states, a mental visualization of political information.

We live in an age of data visualization. Go to any news website and you’ll see graphics charting support for the presidential candidates; open your iPhone and the Health app will generate personalized graphs showing how active you’ve been this week, month or year. Sites publish charts showing how the climate is changing, how schools are segregating, how much housework mothers do versus fathers. And newspapers are increasingly finding that readers love “dataviz”: In 2013, the New York Times’ most-read story for the entire year was a visualization of regional accents across the United States. It makes sense. We live in an age of Big Data. If we’re going to understand our complex world, one powerful way is to graph it.

But this isn’t the first time we’ve discovered the pleasures of making information into pictures. Over a hundred years ago, scientists and thinkers found themselves drowning in their own flood of data—and to help understand it, they invented the very idea of infographics.

**********

The idea of visualizing data is old: After all, that’s what a map is—a representation of geographic information—and we’ve had maps for about 8,000 years. But it was rare to graph anything other than geography. Only a few examples exist: Around the 11th century, a now-anonymous scribe created a chart of how the planets moved through the sky. By the 18th century, scientists were warming to the idea of arranging knowledge visually. The British polymath Joseph Priestley produced a “Chart of Biography,” plotting the lives of about 2,000 historical figures on a timeline. A picture, he argued, conveyed the information “with more exactness, and in much less time, than it [would take] by reading.”

Still, data visualization was rare because data was rare. That began to change rapidly in the early 19th century, because countries began to collect—and publish—reams of information about their weather, economic activity and population. “For the first time, you could deal with important social issues with hard facts, if you could find a way to analyze it,” says Michael Friendly, a professor of psychology at York University who studies the history of data visualization. “The age of data really began.”

An early innovator was the Scottish inventor and economist William Playfair. As a teenager he apprenticed to James Watt, the Scottish inventor who perfected the steam engine. Playfair was tasked with drawing up patents, which required him to develop excellent drafting and picture-drawing skills. After he left Watt’s lab, Playfair became interested in economics and convinced that he could use his facility for illustration to make data come alive.

“An average political economist would have certainly been able to produce a table for publication, but not necessarily a graph,” notes Ian Spence, a psychologist at the University of Toronto who’s writing a biography of Playfair. Playfair, who understood both data and art, was perfectly positioned to create this new discipline.

In one famous chart, he plotted the price of wheat in the United Kingdom against the cost of labor. People often complained about the high cost of wheat and thought wages were driving the price up. Playfair’s chart showed this wasn’t true: Wages were rising much more slowly than the cost of the product.

JULAUG2016_H04_COL_Clive.jpg
Playfair’s trade-balance time-series chart, published in his Commercial and Political Atlas, 1786 (Wikipedia)

“He wanted to discover,” Spence notes. “He wanted to find regularities or points of change.” Playfair’s illustrations often look amazingly modern: In one, he drew pie charts—his invention, too—and lines that compared the size of various country’s populations against their tax revenues. Once again, the chart produced a new, crisp analysis: The British paid far higher taxes than citizens of other nations.

Neurology was not yet a robust science, but Playfair seemed to intuit some of its principles. He suspected the brain processed images more readily than words: A picture really was worth a thousand words. “He said things that sound almost like a 20th-century vision researcher,” Spence adds. Data, Playfair wrote, should “speak to the eyes”—because they were “the best judge of proportion, being able to estimate it with more quickness and accuracy than any other of our organs.” A really good data visualization, he argued, “produces form and shape to a number of separate ideas, which are otherwise abstract and unconnected.”

Soon, intellectuals across Europe were using data visualization to grapple with the travails of urbanization, such as crime and disease….(More)”

Directory of crowdsourcing websites


Directory by Donelle McKinley: “…Here is just a selection of websites for crowdsourcing cultural heritage. Websites are actively crowdsourcing unless indicated with an asterisk…The directory is organized by the type of crowdsourcing process involved, using the typology for crowdsourcing in the humanities developed by Dunn & Hedges (2012). In their study they explain that, “a process is a sequence of tasks, through which an output is produced by operating on an asset”. For example, the Your Paintings Tagger website is for the process of tagging, which is an editorial task. The assets being tagged are images, and the output of the project is metadata, which makes the images easier to discover, retrieve and curate.

Transcription

Alexander Research Library, Wanganui Library * (NZ) Transcription of index cards from 1840 to 2002.

Ancient Lives*, University of Oxford (UK) Transcription of papyri from Greco-Roman Egypt.

AnnoTate, Tate Britain (UK) Transcription of artists’ diaries, letters and sketchbooks.

Decoding the Civil War, The Huntington Library, Abraham Lincoln Presidential Library and Museum &  North Carolina State University (USA). Transcription and decoding of Civil War telegrams from the Thomas T. Eckert Papers.

DIY History, University of Iowa Libraries (USA) Transcription of historical documents.

Emigrant City, New York Public Library (USA) Transcription of handwritten mortgage and bond ledgers from the Emigrant Savings Bank records.

Field Notes of Laurence M. Klauber, San Diego Natural History Museum (USA) Transcription of field notes by the celebrated herpetologist.

Notes from Nature Transcription of natural history museum records.

Measuring the ANZACs, Archives New Zealand and Auckland War Memorial Museum (NZ). Transcription of first-hand accounts of NZ soldiers in WW1.

Old Weather (UK) Transcription of Royal Navy ships logs from the early twentieth century.

Scattered Seeds, Heritage Collections, Dunedin Public Libraries (NZ) Transcription of index cards for Dunedin newspapers 1851-1993

Shakespeare’s World, Folger Shakespeare Library (USA) & Oxford University Press (UK). Transcription of handwritten documents by Shakespeare’s contemporaries. Identification of words that have yet to be recorded in the authoritative Oxford English Dictionary.

Smithsonian Digital Volunteers Transcription Center (USA) Transcription of multiple collections.

Transcribe Bentham, University College London (UK) Transcription of historical manuscripts by philosopher and reformer Jeremy Bentham,

What’s on the menu? New York Public Library (USA) Transcription of historical restaurant menus. …

(Full Directory).

Civic Data Initiatives


Burak Arikan at Medium: “Big data is the term used to define the perpetual and massive data gathered by corporations and governments on consumers and citizens. When the subject of data is not necessarily individuals but governments and companies themselves, we can call it civic data, and when systematically generated in large amounts, civic big data. Increasingly, a new generation of initiatives are generating and organizing structured data on particular societal issues from human rights violations, to auditing government budgets, from labor crimes to climate justice.

These civic data initiatives diverge from the traditional civil society organizations in their outcomes,that they don’t just publish their research as reports, but also open it to the public as a database.Civic data initiatives are quite different in their data work than international non-governmental organizations such as UN, OECD, World Bank and other similar bodies. Such organizations track social, economical, political conditions of countries and concentrate upon producing general statistical data, whereas civic data initiatives aim to produce actionable data on issues that impact individuals directly. The change in the GDP value of a country is useless for people struggling for free transportation in their city. Incarceration rate of a country does not help the struggle of the imprisoned journalists. Corruption indicators may serve as a parameter in a country’s credit score, but does not help to resolve monopolization created with public procurement. Carbon emission statistics do not prevent the energy deals between corrupt governments that destroy the nature in their region.

Needless to say, civic data initiatives also differ from governmental institutions, which are reluctant to share any more that they are legally obligated to. Many governments in the world simply dump scanned hardcopies of documents on official websites instead of releasing machine-readable data, which prevents systematic auditing of government activities.Civic data initiatives, on the other hand, make it a priority to structure and release their data in formats that are both accessible and queryable.

Civic data initiatives also deviate from general purpose information commons such as Wikipedia. Because they consistently engage with problems, closely watch a particular societal issue, make frequent updates,even record from the field to generate and organize highly granular data about the matter….

Several civic data initiatives generate data on variety of issues at different geographies, scopes, and scales. The non-exhaustive list below have information on founders, data sources, and financial support. It is sorted according to each initiative’s founding year. Please send your suggestions to contact at graphcommons.com. See more detailed information and updates on the spreadsheet of civic data initiatives.

Open Secrets tracks data about the money flow in the US government, so it becomes more accessible for journalists, researchers, and advocates.Founded as a non-profit in 1983 by Center for Responsive Politics, gets support from variety of institutions.

PolitiFact is a fact-checking website that rates the accuracy of claims by elected officials and others who speak up in American politics. Uses on-the-record interviews as its data source. Founded in 2007 as a non-profit organization by Tampa Bay Times. Supported by Democracy Fund, Bill &Melinda Gates Foundation, John S. and James L. Knight Foundation, FordFoundation, Knight Foundation, Craigslist Charitable Fund, and the CollinsCenter for Public Policy…..

La Fabrique de La loi (The Law Factory) maps issues of local-regional socio-economic development, public investments, and ecology in France.Started in 2014, the project builds a database by tracking bills from government sources, provides a search engine as well as an API. The partners of the project are CEE Sciences Po, médialab Sciences Po, RegardsCitoyens, and Density Design.

Mapping Media Freedom identifies threats, violations and limitations faced by members of the press throughout European Union member states,candidates for entry and neighbouring countries. Initiated by Index onCensorship and European Commission in 2004, the project…(More)”