Meeting the Challenges of Big Data


Opinion by the European Data Protection Supervisor: “Big data, if done responsibly, can deliver significant benefits and efficiencies for society and individuals not only in health, scientific research, the environment and other specific areas. But there are serious concerns with the actual and potential impact of processing of huge amounts of data on the rights and freedoms of individuals, including their right to privacy. The challenges and risks of big data therefore call for more effective data protection.

Technology should not dictate our values and rights, but neither should promoting innovation and preserving fundamental rights be perceived as incompatible. New business models exploiting new capabilities for the massive collection, instantaneous transmission, combination and reuse of personal information for unforeseen purposes have placed the principles of data protection under new strains, which calls for thorough consideration on how they are applied.

European data protection law has been developed to protect our fundamental rights and values, including our right to privacy. The question is not whether to apply data protection law to big data, but rather how to apply it innovatively in new environments. Our current data protection principles, including transparency, proportionality and purpose limitation, provide the base line we will need to protect more dynamically our fundamental rights in the world of big data. They must, however, be complemented by ‘new’ principles which have developed over the years such as accountability and privacy by design and by default. The EU data protection reform package is expected to strengthen and modernise the regulatory framework.

The EU intends to maximise growth and competitiveness by exploiting big data. But the Digital Single Market cannot uncritically import the data-driven technologies and business models which have become economic mainstream in other areas of the world. Instead it needs to show leadership in developing accountable personal data processing. The internet has evolved in a way that surveillance – tracking people’s behaviour – is considered as the indispensable revenue model for some of the most successful companies. This development calls for critical assessment and search for other options.

In any event, and irrespective of the business models chosen, organisations that process large volumes of personal information must comply with applicable data protection law. The European Data Protection Supervisor (EDPS) believes that responsible and sustainable development of big data must rely on four essential elements:

  • organisations must be much more transparent about how they process personal data;
  • afford users a higher degree of control over how their data is used;
  • design user-friendly data protection into their products and services; and
  • become more accountable for what they do….(More)”

How Fast is Your Carrier? Crowdsourcing Mobile Network Quality with OpenSignal


Discover: “I was on a call with Teresa Murphy-Skorzova, Community Growth Manager for OpenSignal, an app that uses crowd-sourcing to aggregate cell phone signals and WiFi strength data throughout the world. …She explains that while cell phone networks like Verizon and AT&T measure the percent of the population that usually has coverage, OpenSignal is “measuring the experience of the user,” mapping signals from the devices themselves in real time. Individuals record their connection as they go about their day. The app recognizes that people and their cell phone devices are, well… mobile.

In response to Teresa’s curiosity about my connection, I opened the app and pressed the start button, trying a “Speedtest”. A number begins to fluctuate on my screen. Download speed: 14.9 Mbps. A new number begins to fluctuate, testing upload speed. 5.3 Mbps. I felt like I had just played slots, already anticipating my next results. I tried again, and saw that my download speed was up to 17.5 Mbps. I wondered what my speeds were at the coffee shops I frequent. What about in the woods where I took a hike last weekend, or in the subway tunnel where my texts rarely send?

…While individuals learn where to find their own best signals, they contribute to a much larger voice about network quality, Teresa explained. “When a user discovers an area that hasn’t been measured or when they discover an area with poor signal, they’re eager to contribute.” While users are interested in their personal signals, OpenSignal is interested in tracking the aggregated signal of all devices of a particular location and network. Individual device data is therefore kept anonymous.

Some surprising research projects have used OpenSignal’s data to discover implications about health, the economy, and weather. In one of these projects a team at the Royal Netherlands Meteorological Institute (RNMI) collaborated with OpenSignal to expand the rain radar program. Rainfall weakens the signal between cell phone towers, so cellular link data can be used to build a space-time map of rainfall, or rain radar map. RNMI looked at OpenSignal data from unlikely rain radar locations. Some areas were remote or impoverished while others had fairly arid climates. They can now determine whether rain radar is feasible on a larger scale….(More)”
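The link between rainfall and signal loss is commonly modelled with a power law that relates specific attenuation (dB per km) to rain rate (mm per hour), of the form attenuation = k * R ** alpha. The Python sketch below inverts that relation for a single link. It is not the method used by RNMI or OpenSignal, which the excerpt does not describe. The function name and the values of `k` and `alpha` are illustrative assumptions; real coefficients depend on the link frequency and polarization.

```python
def rain_rate_from_attenuation(attenuation_db, path_km, k, alpha):
    """Estimate rain rate (mm/h) from rain-induced signal loss on one link.

    Assumes the power law  specific_attenuation = k * R ** alpha,
    where specific attenuation is in dB per km. k and alpha are
    placeholders here, not measured coefficients.
    """
    if path_km <= 0:
        raise ValueError("path_km must be positive")
    if attenuation_db <= 0:
        # No extra loss attributed to rain.
        return 0.0
    specific_attenuation = attenuation_db / path_km  # dB per km
    return (specific_attenuation / k) ** (1.0 / alpha)


if __name__ == "__main__":
    # Hypothetical link: 2 dB of extra loss over 4 km, made-up coefficients.
    rate = rain_rate_from_attenuation(2.0, 4.0, k=0.1, alpha=1.0)
    print(f"Estimated rain rate: {rate:.1f} mm/h")
```

Repeating the estimate for many links and time steps is what would turn individual readings into a space-time map of rainfall.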

Data enriched research, data enhanced impact: the importance of UK data infrastructure


Matthew Woollard at LSE Impact Blog: “…Data made available for reuse, such as those in the UK Data Service collection, have huge potential. They can unlock new discoveries in research, provide evidence for policy decisions and help promote core data skills in the next generation of researchers. By being part of a single infrastructure, data owners and data creators can work together with the UK Data Service – rather than duplicating efforts – to engage with the people who can drive the impact of their research further to provide real benefit to society. As a service we are also identifying new ways to understand and promote our impact, and our Impact Fellow and Director of Impact and Communications, Victoria Moody, is focusing on raising the visibility of the UK Data Service holdings and developing and promoting the use and impact of the data and resources in policy-relevant research, especially to new audiences such as policymakers, government sectors, charities, the private sector and the media…

We are improving how we demonstrate the impact of both the Service and the data which we hold, by focusing on generating more and more authentic user corroboration. Our emphasis is on drawing together evidence about the reach and significance of the impact of our data and resources, and of the Service as a whole through our infrastructure and expertise. Headline impact indicators through which we will better understand our impact cover a range of areas (outlined above) where the Service brings efficiency to data access and re-use, benefit to its users and a financial and social return on investment.

We are working to understand more about how Service data contributes to impact by tracking the use of Service data in a range of initiatives focused on developing impact from research and by developing our insight into usage of our data by our users. Data in the collection have featured in a range of impact case studies in the Research Excellence Framework 2014. We are also developing a focus on understanding the specific beneficial effect, rather than simply that data were used in an output, that is – as it appears in policy, debate or the evidential process (although important). Early thoughts in developing this process are where (ideally) cited data can be tracked through the specific beneficial outcome and on to an evidenced effect, corroborated by the end user.

Our impact case studies demonstrate how the data have supported research which has led to policy change in a range of areas, including the development of mathematical models for Practice based Commissioning budgets for adult mental health in the UK and informing public policy on obesity, both using the Health Survey for England. Service data have also informed the development of impact around understanding public attitudes towards the police and other legal institutions using the Crime Survey for England and Wales and research to support the development of the national minimum wage using the Labour Force Survey. The cutting-edge new Demos Integration Hub maps the changing face of Britain’s diversity, revealing a mixed picture in the integration and upward mobility of ethnic minority communities and uses 2011 Census aggregate data (England and Wales) and Understanding Society….(More)”

Open government data: Out of the box


The Economist: “The open-data revolution has not lived up to expectations. But it is only getting started…

The app that helped save Mr Rich’s leg is one of many that incorporate government data—in this case, supplied by four health agencies. Six years ago America became the first country to make all data collected by its government “open by default”, except for personal information and that related to national security. Almost 200,000 datasets from 170 outfits have been posted on the data.gov website. Nearly 70 other countries have also made their data available: mostly rich, well-governed ones, but also a few that are not, such as India (see chart). The Open Knowledge Foundation, a London-based group, reckons that over 1m datasets have been published on open-data portals using its CKAN software, developed in 2010.

Will Open Data Policies Contribute to Solving Development Challenges?


Fabrizio Scrollini at IODC: “As the international open data charter gains momentum in the context of the wider development agenda related to the sustainable development goals set by the United Nations, a pertinent question to ask is: will open data policies contribute to solving development challenges? In this post I try to answer this question grounded in recent Latin American experience to contribute to a global debate.

Latin America has been exploring open data since 2013, when the first open data unconference (Abrelatam) and conference took place in Montevideo. In September 2015 in Santiago de Chile a vibrant community of activists, public servants, and entrepreneurs gathered in the third edition of Abrelatam and Condatos. It is now a more mature community. The days when it was sufficient to just open a few datasets and set up a portal are now gone. The focus of this meeting was on collaboration and use of data to address several social challenges.

Take for instance the health sector. Transparency in this sector is key to delivering better development goals. One of the panels at Condatos showed three different ways to use data to promote transparency and citizen empowerment in this sector. A tu servicio, a joint venture of DATA and the Uruguayan Ministry of Health, helped to standardize and open public datasets that allowed around 30,000 users to improve the way they choose health providers. Government-civil society collaboration was crucial in this process in terms of pooling resources and skills. The first prototype was only possible because some data was already open.

This contrasts with Cuidados Intensivos, a Peruvian endeavour aiming to provide key information about the health sector. Peruvian activists had to file right-to-information requests and then transform and standardize the data to eventually release it. Both experiences demanded a great deal of technical, policy, and communication craft. And both show the attitudes the public sector can take: either engaging or at the very best ignoring the potential of open data.

In the same sector, look at a recent study dealing with Dengue and open data developed by our research initiative. If international organizations and countries were persuaded to adopt common standards for reporting Dengue outbreaks, those outbreaks could potentially be predicted, provided the right public data is available and standardized. Open data in this sector not only delivers accountability but also efficiency and foresight to allocate scarce resources.

Latin American countries – gathered in the open data group of the Red Gealc – acknowledge the increasing public value of open data. This group engaged constructively in Condatos with the principles enshrined in the charter and will foster the formalization of open data policies in the region. A data revolution won’t yield results if data is closed. When you open data you allow for several initiatives to emerge and show its value.

Once a certain level of maturity is reached in a particular sector, more than data is needed. Standards are crucial to ensure comparability and ease the collection, processing, and use of open government data. Fostering and engaging with open data users is also needed, as several strategies deployed by some Latin American cities show.

Coming back to our question: will open data policies contribute to solving development challenges? The Latin American experience shows evidence that it will….(More)”

Batea: a Wikipedia hack for medical students


Tom Sullivan at HealthCareIT: “Medical students use Wikipedia in great numbers, but what if it were a more trusted source of information?

That’s the idea behind Batea, a piece of software that essentially collects data from clinical reference URLs medical students visit, then aggregates that information to share with WikiProject Medicine, such that relevant medical editors can glean insights about how best to enhance Wikipedia’s medical content.

Batea takes its name from the Spanish name for gold pan, according to Fred Trotter, a data journalist at DocGraph.

“It’s a data mining project,” Trotter explained, “so we wanted a short term that positively referenced mining.”

DocGraph built Batea with support from the Robert Wood Johnson Foundation and, prior to releasing it on Tuesday, operated beta testing pilots of the browser extension at the University of California, San Francisco and the University of Texas, Houston.

UCSF, for instance, has what Trotter described as “a unique program where medical students edit Wikipedia for credit. They helped us tremendously in testing the alpha versions of the software.”

Wikipedia houses some 25,000 medical articles that receive more than 200 million views each month, according to the DocGraph announcement, while 8,000 pharmacology articles are read more than 40 million times a month.

DocGraph is encouraging medical students around the country to download the Batea extension – and anonymously donate their clinical-related browsing history. Should Batea gain critical mass, the potential exists for it to substantively enhance Wikipedia….(More)”

The War on Campus Sexual Assault Goes Digital


As the problem of sexual assault on college campuses has become a hot-button issue for school administrators and federal education regulators, one question keeps coming up: Why don’t more students report attacks?

According to a recent study of 27 schools, about one-quarter of female undergraduates and students who identified as queer or transgender said they had experienced nonconsensual sex or touching since entering college, but most of the students said they did not report it to school officials or support services.

Some felt the incidents weren’t serious enough. Others said they did not think anyone would believe them or they feared negative social consequences. Some felt it would be too emotionally difficult.

Now, in an effort to give students additional options — and to provide schools with more concrete data — a nonprofit software start-up in San Francisco called Sexual Health Innovations has developed an online reporting system for campus sexual violence.

Students at participating colleges can use its site, called Callisto, to record details of an assault anonymously. The site saves and time-stamps those records. That allows students to decide later whether they want to formally file reports with their schools — identifying themselves by their school-issued email addresses — or download their information and take it directly to the police. The site also offers a matching system in which a user can elect to file a report with the school electronically only if someone else names the same assailant.
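The matching idea described above, where a report is released only once a second person names the same assailant, can be sketched as a small escrow structure. This is an illustration of the logic only, not Callisto's implementation, which the article does not detail. The class name, method name and identifiers are hypothetical, and a real system would also need encryption, identity verification and access controls that are omitted here.

```python
from collections import defaultdict


class MatchingEscrow:
    """Hold reports until two distinct reporters name the same assailant.

    Hypothetical sketch: assailant_key stands in for whatever identifier
    a real system would use to recognise that two reports match.
    """

    def __init__(self):
        self._reports = defaultdict(dict)   # assailant_key -> {reporter_id: record}
        self._released = defaultdict(set)   # assailant_key -> reporter_ids already released

    def submit(self, reporter_id, assailant_key, record):
        """Store a report and return the (reporter_id, record) pairs that become releasable."""
        reports = self._reports[assailant_key]
        reports[reporter_id] = record  # a repeat submission replaces the earlier record
        if len(reports) < 2:
            return []  # no match yet, so the report stays in escrow
        released = self._released[assailant_key]
        pending = [rid for rid in reports if rid not in released]
        released.update(pending)
        return [(rid, reports[rid]) for rid in pending]


if __name__ == "__main__":
    escrow = MatchingEscrow()
    print(escrow.submit("student-a", "person-x", "record A"))
    print(escrow.submit("student-b", "person-x", "record B"))
```

Counting distinct reporters rather than submissions means one person filing twice does not trigger a release.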

Callisto’s hypothesis is that some college students — who already socialize, study and shop online — will be more likely initially to document a sexual assault on a third-party site than to report it to school officials on the phone or in person.

“If you have to walk into a building to report, you can only go at certain times of day and you’re not certain who you have to talk to, how many people you have to talk to, what they will ask,” Jessica Ladd, the nonprofit’s founder and chief executive, said in a recent interview in New York. “Whereas online, you can fill out a form at any time of day or night from anywhere and push a button.”

Callisto is part of a wave of apps and sites that tackle different facets of the sexual assault problem on campus. Some colleges and universities have introduced third-party mobile apps that enable students to see maps of local crime hot spots, report suspicious activity, request a ride from campus security services or allow their friends to track their movements virtually as they walk home. Many schools now ask students to participate in online or in-person training programs that present different situations involving sexual assault, relationship violence and issues of consent…..(More)”

Looking for Open Data from a different country? Try the European Data portal


Wendy Carrara in DAE blog: “The Open Data movement is reaching all countries in Europe. Data Portals give you access to re-usable government information. But have you ever tried to find Open Data from another country whose language you do not speak? Or have you tried to see whether data from one country exist also in a similar way in another? The European Data Portal that we just launched can help you….

The European Data Portal project’s main work stream is the development of a new pan-European open data infrastructure. Its goal is to be a gateway offering access to data published by administrations in countries across Europe, from the EU and beyond. The portal is being launched during the European Data Forum in Luxembourg.

Additionally, we will support public administrations in publishing more data as open data and have targeted actions to stimulate re-use. By taking a look at the data released by other countries and made available on the European Data Portal, governments can also be inspired to publish new data sets they had not thought about in the first place.

The re-use of Open Data will further boost the economy. The benefits of Open Data are diverse and range from improved performance of public administrations and economic growth in the private sector to wider social welfare. The economic study conducted by the European Data Portal team estimates that between 2016 and 2020, the market size of Open Data is expected to increase by 36.9% to a value of 75.7 bn EUR in 2020.
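As a quick check on what those figures imply, the short Python sketch below backs out the starting market size and the average yearly growth rate. It assumes the 36.9% is cumulative growth over the four years from 2016 to 2020, which is our reading of the excerpt and not something the study is quoted as saying. The function names are our own.

```python
def implied_start_value(end_value, total_growth):
    """Value at the start of the period, given the end value and cumulative growth."""
    return end_value / (1.0 + total_growth)


def implied_annual_growth(total_growth, years):
    """Constant yearly growth rate that compounds to total_growth over the period."""
    return (1.0 + total_growth) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    end_value_bn_eur = 75.7   # 2020 market size quoted in the excerpt
    total_growth = 0.369      # 36.9% increase quoted in the excerpt
    years = 4                 # assumption: 2016 to 2020 counted as four years

    start = implied_start_value(end_value_bn_eur, total_growth)
    annual = implied_annual_growth(total_growth, years)
    print(f"Implied 2016 market size: {start:.1f} bn EUR")
    print(f"Implied average annual growth: {annual * 100:.1f}%")
```

Under that assumption the figures correspond to a 2016 market of roughly 55.3 bn EUR growing at about 8.2% a year.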

For data to be re-used, it has to be accessible

Currently, the portal includes over 240,000 datasets from 34 European countries. Information about the data available is structured into thirteen different categories ranging from agriculture to transport, including science, justice, health and so on. This enables you to quickly browse through categories and feel inspired by the data made accessible….(More)”

Building Trust and Protecting Privacy: Progress on the President’s Precision Medicine Initiative


The White House: “Today, the White House is releasing the Privacy and Trust Principles for the President’s Precision Medicine Initiative (PMI). These principles are a foundation for protecting participant privacy and building trust in activities within PMI.

PMI is a bold new research effort to transform how we characterize health and treat disease. PMI will pioneer a new model of patient-powered research that promises to accelerate biomedical discoveries and provide clinicians with new tools, knowledge, and therapies to select which treatments will work best for which patients. The initiative includes development of a new voluntary research cohort by the National Institutes of Health (NIH), a novel regulatory approach to genomic technologies by the Food and Drug Administration, and new cancer clinical trials by the National Cancer Institute at NIH.  In addition, PMI includes aligned efforts by the Federal government and private sector collaborators to pioneer a new approach for health research and healthcare delivery that prioritizes patient empowerment through access to information and policies that enable safe, effective, and innovative technologies to be tested and made available to the public.

Following President Obama’s launch of PMI in January 2015, the White House Office of Science and Technology Policy worked with an interagency group to develop the Privacy and Trust Principles that will guide the Precision Medicine effort. The White House convened experts from within and outside of government over the course of many months to discuss their individual viewpoints on the unique privacy challenges associated with large-scale health data collection, analysis, and sharing. This group reviewed the bioethics literature, analyzed privacy policies for large biobanks and research cohorts, and released a draft set of Principles for public comment in July 2015…..

The Privacy and Trust Principles are organized into 6 broad categories:

  1. Governance that is inclusive, collaborative, and adaptable;
  2. Transparency to participants and the public;
  3. Respecting participant preferences;
  4. Empowering participants through access to information;
  5. Ensuring appropriate data sharing, access, and use;
  6. Maintaining data quality and integrity….(More)”

Privacy in a Digital, Networked World: Technologies, Implications and Solutions


Book edited by Zeadally, Sherali and Badra, Mohamad: “This comprehensive textbook/reference presents a focused review of the state of the art in privacy research, encompassing a range of diverse topics. The first book of its kind designed specifically to cater to courses on privacy, this authoritative volume provides technical, legal, and ethical perspectives on privacy issues from a global selection of renowned experts. Features: examines privacy issues relating to databases, P2P networks, big data technologies, social networks, and digital information networks; describes the challenges of addressing privacy concerns in various areas; reviews topics of privacy in electronic health systems, smart grid technology, vehicular ad-hoc networks, mobile devices, location-based systems, and crowdsourcing platforms; investigates approaches for protecting privacy in cloud applications; discusses the regulation of personal information disclosure and the privacy of individuals; presents the tools and the evidence to better understand consumers’ privacy behaviors….(More)”