Open data, open mind: Why you should share your company data with the world


Mark Samuels at ZDNet: “If information really is the lifeblood of modern organisations, then CIOs could create huge benefits from opening their data to new, creative pairs of eyes. Research from consultant McKinsey suggests that seven sectors alone could generate more than $3 trillion a year in additional value as a result of open data: that is, taking previously proprietary data (often starting with public sector data) and opening up access.

So, should your business consider giving outsiders access to insider information? ZDNet speaks to three experts.

More viewpoints can mean better results

Former Tullow Oil CIO Andrew Marks says debates about the potential openness of data in a private sector context are likely to be dominated by one major concern: information security.

“It’s a perfectly reasonable debate until people start thinking about privacy,” he says. “Putting information at risk, both in terms of customer data and competitive advantage, will be a risk too far for many senior executives.”

But what if CIOs could allay c-suite peers’ concerns and create a new opportunity? Marks points to the Goldcorp Challenge, which saw the mining specialist share its proprietary geological data to allow outside experts to pick likely spots for mining. The challenge, which included prize money of $575,000, helped identify more than 110 sites, 50 per cent of which were previously unknown to the company. The value of gold found through the competition exceeded $6bn. Marks wonders whether other firms could take similarly brave steps.

“There is a period of time when information is very sensitive,” he says. “Once the value of data starts to become finite, then it might be beneficial for businesses to open the doors and to let outsiders play with the information. That approach, in terms of gamification, might lead to the creation of new ideas and innovations.”…

Marks says these projects help prove that, when it comes to data, more is likely to mean different – and possibly better – results. “Whether using big data algorithms or the human touch, the more viewpoints you bring together, the more you can increase the chances of success and reduce risk,” he says.

“There is, therefore, always likely to be value in seeking an alternative perspective. Opening access to data means your firm is going to get more ideas, but CIOs and other senior executives need to think very carefully about what such openness means for the business, and the potential benefits.”…

Some leading firms are already taking steps towards openness. Take Christina Scott, chief product and information officer at the Financial Times, who says the media organisation has used data analysts to help push the benefits of information-led insight across the business.

Her team has democratised data in order to make sure that all parts of the organisation can get the information they need to complete their day-to-day jobs. Scott says the approach is best viewed as an open data strategy, but within the safe confines of the existing enterprise firewall. While the tactic is internally focused currently, Scott says the FT is keen to find ways to make the most of external talent in the future.

“We’re starting to consider how we might open data beyond the organisation, too,” she says. “Our data holds a lot of value and insight, including across the metadata we’ve created. So it would be great to think about how we could use that information in a more open way.” Part of the FT’s business includes trade-focused magazines. Scott says opening the data could provide new insight to its B2B customers across a range of sectors. In fact, the firm has already dabbled at a smaller scale.

“We’ve run hackathons, where we’ve exposed our APIs and given people the chance to come up with some new ideas,” she says. “But I don’t think we’ve done as much work on open data as we could. And I think that’s the direction in which better organisations are moving. They recognise that not all innovation is going to happen within the company.”…

CIO Omid Shiraji is another IT expert who recognises that there is a general move towards a more open society. Any executive who expects to work within a tightly defined enterprise firewall is living in cloud cuckoo land, he argues. More to the point, they will miss out on big advantages.

“If you can expose your sources to a range of developers, you can start to benefit from massive innovation,” he says. “You can get really big benefits from opening your data to external experts who can focus on areas that you don’t have the capability to develop internally.”

Many IT leaders would like to open data to outside experts, suggests Shiraji. For CIOs who are keen to expose their sources, he suggests letting small-scale developers take a close look at in-house data silos in an attempt to discover what relationships might exist and what advantages could accrue….(More)”

Introducing Government as a Platform


Peter Williams, Jan Gravesen and Trinette Brownhill in Government Executive: “Governments around the world are facing competitive pressures and expectations from their constituents that are prompting them to innovate and dissolve age-old structures. Many governments have introduced a digital strategy in which at least one of the goals is aimed at bringing their organizations closer to citizens and businesses.

To achieve this, ideally IT and data in government would not be constrained by the different functional towers that make up the organization, as is often the case. They would not be constrained by complex, monolithic application design philosophies and lengthy implementation cycles, nor would development be constrained by the assumption that all activity has to be executed by the government itself.

Instead, applications would be created rapidly and cheaply, and modules would be shared as reusable blocks of code and integrated data. It would be relatively straightforward to integrate data from multiple departments to enable a focus on the complex needs of, say, a single parent who is diabetic and a student. Delivery would be facilitated in the manner best required, or preferred, by the citizen. Third parties would also be able to access these modules of code and data to build higher value government services that multiple agencies would then buy into. The code would run on a cloud infrastructure that maximizes the efficiency with which processing resources are used.

GaaP is an organized set of ideas and principles that allows organizations to approach these ideals. It allows governments to institute more efficient sharing of IT resources as well as unlock data and functionality via application programming interfaces to allow third parties to build higher value citizen services. In doing so, security plays a crucial role in protecting the privacy of constituents and enterprise assets.

We see increasingly well-established examples of GaaP services in many parts of the world. The notion has significantly influenced strategic thinking in the UK, Australia, Denmark, Canada and Singapore. In particular, it has evolved in a deliberate way in the UK’s Government Digital Service, building on the Blairite notion of “joined up government”; in Australia’s e-government strategy and its myGov program; and as a significant influencer in Singapore’s entire approach to building its “smarter nation” infrastructure.

Collaborative Government

GaaP assumes a transformational shift in efficiency, effectiveness and transparency, in which agencies move toward a collaborative government and away from today’s siloed approach. That collaboration may be among agencies, but also with other entities (nongovernmental organizations, the private sector, citizens, etc.).

GaaP’s focus on collaboration enables public agencies to move away from their traditional towered approach to IT and increasingly make use of shared and composable services offered by a common – usually a virtualized, cloud-enabled – platform. This leads to more efficient use of development resources, platforms and IT support. We are seeing examples of this already with a group of townships in New York state and also with two large Spanish cities that are embarking on this approach.

While efficient resource and service sharing is central to the idea of GaaP, it is not sufficient. The idea is that GaaP must allow app developers, irrespective of whether they are citizens, private organizations or other public agencies, to develop new value-added services using published government data and APIs. In this sense, the platform becomes a connecting layer between public agencies’ systems and data on the one hand, and private citizens, organizations and other public agencies on the other.

In its most fundamental form, GaaP is able to:

  • Consume data and government services from existing departmental systems.
  • Consume syndicated services from platform-as-a-service or software-as-a-service providers in the public marketplace.
  • Securely unlock these data and services and allow third parties – citizens, private organizations or other agencies – to combine services and data into higher-order services or more citizen-centric or business-centric services.

It is the openness, the secure interoperability, and the ability to compose new services on the basis of existing services and data that define the nature of the platform.

The Challenges

At one time, the challenge of creating a GaaP structure would have been technology. Today, it is governance….(More)”

The big questions for research using personal data


At the Royal Society’s “Verba”: “We live in an era of data. The world is generating 1.7 million billion bytes of data every minute and the total amount of global data is expected to grow 40% year on year for the next decade. In 2003 scientists declared the mapping of the human genome complete. It took over 10 years and cost $1 billion – today it takes mere days and can be done at a fraction of the cost.

Making the most of the data revolution will be key to future scientific and economic progress. Unlocking the value of data by improving the way that we collect, analyse and use data has the potential to improve lives across a multitude of areas, ranging from business to health, and from tackling climate change to aiding civic engagement. However, its potential for public benefit must be balanced against the need for data to be used intelligently and with respect for individuals’ privacy.

Getting regulation right

The UK Data Protection Act transposed the 1995 European Data Protection Directive into UK law. This was at a time before widespread use of the internet and smartphones. In 2012, recognising the pace of technological change, the European Commission proposed a comprehensive reform of EU data protection rules including a new Data Protection Regulation that would update and harmonise these rules across the EU.

The draft regulation is currently going through the EU legislative process. During this, the European Parliament has proposed changes to the Commission’s text. These changes have raised concerns for researchers across Europe that the Regulation could restrict the use of personal data for research, which could prevent much vital health research. For example, researchers currently use these data to better understand how to prevent and treat conditions such as cancer, diabetes and dementia. The final details of the regulation are now being negotiated and the research community has come together to highlight the importance of data in research and articulate their concerns in a joint statement, which the Society supports.

The Society considers that datasets should be managed according to a system of proportionate governance. Personal data should only be shared if it is necessary for research with the potential for high public value and should be proportionate to the particular needs of a research project. It should also draw on consent, authorisation and safe havens – secure sites for databases containing sensitive personal data that can only be accessed by authorised researchers – as appropriate…

However, many challenges remain that are unlikely to be resolved in the current European negotiations. The new legislation covers personal data but not anonymised data, which are data that have had information that can identify persons removed or replaced with a code. The assumption is that anonymisation is a foolproof way to protect personal identity. However, there have been examples of reidentification from anonymised data and computer scientists have long pointed out the flaws of relying on anonymisation to protect an individual’s privacy… There is also a risk of leaving the public behind with a lack of information and failed efforts to earn trust; and it is clear that a better understanding of the role of consent and ethical governance is needed to ensure the continuation of cutting-edge research that respects the principles of privacy.

These are problems that will require attention, and questions that the Society will continue to explore. …(More)”

The Internet of Things: Frequently Asked Questions


Eric A. Fischer at the Congressional Research Service: “Internet of Things” (IoT) refers to networks of objects that communicate with other objects and with computers through the Internet. “Things” may include virtually any object for which remote communication, data collection, or control might be useful, such as vehicles, appliances, medical devices, electric grids, transportation infrastructure, manufacturing equipment, or building systems. In other words, the IoT potentially includes huge numbers and kinds of interconnected objects. It is often considered the next major stage in the evolution of cyberspace. Some observers believe it might even lead to a world where cyberspace and human space would seem to effectively merge, with unpredictable but potentially momentous societal and cultural impacts.

Two features make an object part of the IoT: a unique identifier and Internet connectivity. Such “smart” objects each have a unique Internet Protocol (IP) address to identify the object sending and receiving information. Smart objects can form systems that communicate among themselves, usually in concert with computers, allowing automated and remote control of many independent processes and potentially transforming them into integrated systems. Those systems can potentially impact homes and communities, factories and cities, and every sector of the economy, both domestically and globally. Although the full extent and nature of the IoT’s impacts remain uncertain, economic analyses predict that it will contribute trillions of dollars to economic growth over the next decade. Sectors that may be particularly affected include agriculture, energy, government, health care, manufacturing, and transportation.

The IoT can contribute to more integrated and functional infrastructure, especially in “smart cities,” with projected improvements in transportation, utilities, and other municipal services. The Obama Administration announced a smart-cities initiative in September 2015. There is no single federal agency that has overall responsibility for the IoT. Agencies may find IoT applications useful in helping them fulfill their missions. Each is responsible for the functioning and security of its own IoT, although some technologies, such as drones, may fall under the jurisdiction of other agencies as well. Various agencies also have relevant regulatory, sector-specific, and other mission-related responsibilities, such as the Departments of Commerce, Energy, and Transportation, the Federal Communications Commission, and the Federal Trade Commission.

Security and privacy are often cited as major issues for the IoT, given the perceived difficulties of providing adequate cybersecurity for it, the increasing role of smart objects in controlling components of infrastructure, and the enormous increase in potential points of attack posed by the proliferation of such objects. The IoT may also pose increased risks to privacy, with cyberattacks potentially resulting in exfiltration of identifying or other sensitive information about an individual. With an increasing number of IoT objects in use, privacy concerns also include questions about the ownership, processing, and use of the data they generate….(More)”

Data Science of the People, for the People, by the People: A Viewpoint on an Emerging Dichotomy


Paper by Kush R. Varshney: “This paper presents a viewpoint on an emerging dichotomy in data science: applications in which predictions of data-driven algorithms are used to support people in making consequential decisions that can have a profound effect on other people’s lives and applications in which data-driven algorithms act autonomously in settings of low consequence and large scale. An example of the first type of application is prison sentencing and of the second type is selecting news stories to appear on a person’s web portal home page. It is argued that the two types of applications require data, algorithms and models with vastly different properties along several dimensions, including privacy, equitability, robustness, interpretability, causality, and openness. Furthermore, it is argued that the second type of application cannot always be used as a surrogate to develop methods for the first type of application. To contribute to the development of methods for the first type of application, one must really be working on the first type of application….(More)”

Web design plays a role in how much we reveal online


European Commission: “A JRC study, “Nudges to Privacy Behaviour: Exploring an Alternative Approach to Privacy Notices”, used behavioural sciences to look at how individuals react to different types of privacy notices. Specifically, the authors analysed users’ reactions to modified choice architecture (i.e. the environment in which decisions take place) of web interfaces.

Two types of privacy behaviour were measured: passive disclosure, when people unwittingly disclose personal information, and direct disclosure, when people make an active choice to reveal personal information. After testing different designs with over 3 000 users from the UK, Italy, Germany and Poland, the results show that web interface design affects decisions on disclosing personal information. The study also explored differences related to country of origin, gender, education level and age.

A depiction of a person’s face on the website led people to reveal more personal information. Also, this design choice and the visualisation of the user’s IP or browsing history had an impact on people’s awareness of a privacy notice. If confirmed, these features are particularly relevant for habitual and instinctive online behaviour.

With regard to education, users who had attended (though not necessarily graduated from) college felt significantly less observed or monitored and more comfortable answering questions than those who never went to college. This result challenges the assumption that the better educated are more aware of information tracking practices. Further investigation, perhaps of a qualitative nature, could help dig deeper into this issue. On the other hand, people with a lower level of education were more likely to reveal personal information unwittingly. This behaviour appeared to be due to the fact that non-college attendees were simply less aware that some online behaviour revealed personal information about themselves.

Strong differences between countries were noticed, indicating a relation between cultures and information disclosure. Even though participants in Italy revealed the most personal information in passive disclosure, in direct disclosure they revealed less than in other countries. Approximately 75% of participants in Italy chose to answer positively to at least one stigmatised question, compared to 81% in Poland, 83% in Germany and 92% in the UK.

Approximately 73% of women answered ‘never’ to the questions asking whether they had ever engaged in socially stigmatised behaviour, compared to 27% of males. This large difference could be due to the nature of the questions (e.g. about alcohol consumption, which might be more acceptable for males). It could also suggest women feel under greater social scrutiny or are simply more cautious when disclosing personal information.

These results could offer valuable insights to inform European policy decisions, despite the fact that the study has targeted a sample of users in four countries in an experimental setting. Major web service providers are likely to have extensive amounts of data on how slight changes to their services’ privacy controls affect users’ privacy behaviour. The authors of the study suggest that collaboration between web providers and policy-makers can lead to recommendations for web interface design that allow for conscientious disclosure of privacy information….(More)”

Nudge 2.0


Philipp Hacker: “This essay is both a review of the excellent book “Nudge and the Law. A European Perspective”, edited by Alberto Alemanno and Anne-Lise Sibony, and an assessment of the major themes and challenges that the behavioural analysis of law will and should face in the immediate future.

The book makes important and novel contributions in a range of topics, both on a theoretical and a substantial level. Regarding theoretical issues, four themes stand out: First, it highlights the differences between the EU and the US nudging environments. Second, it questions the reliance on expertise in rulemaking. Third, it unveils behavioural trade-offs that have too long gone unnoticed in behavioural law and economics. And fourth, it discusses the requirement of the transparency of nudges and the related concept of autonomy. Furthermore, the different authors discuss the impact of behavioural regulation on a number of substantial fields of law: health and lifestyle regulation, privacy law, and the disclosure paradigm in private law.

This paper aims to take some of the book’s insights one step further in order to point at crucial challenges – and opportunities – for the future of the behavioural analysis of law. In the last years, the movement has gained tremendously in breadth and depth. It is now time to make it scientifically even more rigorous, e.g. by openly embracing empirical uncertainty and by moving beyond the neo-classical/behavioural dichotomy. Simultaneously, the field ought to discursively readjust its normative compass. Finally and perhaps most strikingly, however, the power of big data holds the promise of taking behavioural interventions to an entirely new level. If these challenges can be overcome, this paper argues, the intersection between law and behavioural sciences will remain one of the most fruitful approaches to legal analysis in Europe and beyond….(More)”

Data-Driven Innovation: Big Data for Growth and Well-Being


“A new OECD report on data-driven innovation finds that countries could be getting much more out of data analytics in terms of economic and social gains if governments did more to encourage investment in “Big Data” and promote data sharing and reuse.

The migration of economic and social activities to the Internet and the advent of the Internet of Things – along with dramatically lower costs of data collection, storage and processing and rising computing power – mean that data analytics is increasingly driving innovation and is potentially an important new source of growth.

The report suggests countries act to seize these benefits by training more and better data scientists, reducing barriers to cross-border data flows, and encouraging investment in business processes to incorporate data analytics.

Few companies outside of the ICT sector are changing internal procedures to take advantage of data. For example, data gathered by companies’ marketing departments is not always used by other departments to drive decisions and innovation. And in particular, small and medium-sized companies face barriers to the adoption of data-related technologies such as cloud computing, partly because they have difficulty implementing organisational change due to limited resources, including the shortage of skilled personnel.

At the same time, governments will need to anticipate and address the disruptive effects of big data on the economy and overall well-being, as issues as broad as privacy, jobs, intellectual property rights, competition and taxation will be impacted.”

TABLE OF CONTENTS
Preface
Foreword
Executive summary
The phenomenon of data-driven innovation
Mapping the global data ecosystem and its points of control
How data now drive innovation
Drawing value from data as an infrastructure
Building trust for data-driven innovation
Skills and employment in a data-driven economy
Promoting data-driven scientific research
The evolution of health care in a data-rich environment
Cities as hubs for data-driven innovation
Governments leading by example with public sector data

 

Health Data Governance: Privacy, Monitoring and Research


OECD publishing: “All countries are investing in health data; however, there are significant cross-country differences in data availability and use. Some countries stand out for their innovative practices enabling privacy-protective respectful data use, while others are falling behind with insufficient data and restrictions that limit access to and use of data, even by government itself. Countries that develop a data governance framework that enables privacy-protective data use will not only have the information needed to promote quality, efficiency and performance in their health systems, they will become a more attractive centre for medical research. After examining the current situation in OECD countries, a multi-disciplinary advisory panel of experts identified eight key data governance mechanisms to maximise benefits to patients and to societies from the collection, linkage and analysis of health data and to, at the same time, minimise risks to the privacy of patients and to the security of health data. These mechanisms include coordinated development of high-value, privacy-protective health information systems; legislation that permits privacy-protective data use; open and transparent public communication; accreditation or certification of health data processors; transparent and fair project approval processes; data de-identification and data security practices that meet legal requirements and public expectations without compromising data utility; and a process to continually assess and renew the data governance framework as new data and new risks emerge…”

Big Data Privacy Scenarios


E. Bruce, K. Sollins, M. Vernon, and D. Weitzner at DSpace@MIT: “This paper is the first in a series on privacy in Big Data. As an outgrowth of a series of workshops on the topic, the Big Data Privacy Working Group undertook a study of a series of use scenarios to highlight the challenges to privacy that arise in the Big Data arena. This is a report on those scenarios. The deeper question explored by this exercise is what is distinctive about privacy in the context of Big Data. In addition, we discuss an initial list of issues for privacy that derive specifically from the nature of Big Data. These derive from observations across the real-world scenarios and use cases explored in this project as well as wider reading and discussions:

* Scale: The sheer size of the datasets leads to challenges in creating, managing and applying privacy policies.

* Diversity: The increased likelihood of more and more diverse participants in Big Data collection, management, and use, leads to differing agendas and objectives. By nature, this is likely to lead to contradictory agendas and objectives.

* Integration: With increased data management technologies (e.g. cloud services, data lakes, and so forth), integration across datasets, with new and often surprising opportunities for cross-product inferences, will also come new information about individuals and their behaviors.

* Impact on secondary participants: Because many pieces of information are reflective of not only the targeted subject, but secondary, often unattended, participants, the inferences and resulting information will increasingly be reflective of other people, not originally considered as the subject of privacy concerns and approaches.

* Need for emergent policies for emergent information: As inferences over merged data sets occur, emergent information or understanding will occur.

Although each unique data set may have existing privacy policies and enforcement mechanisms, it is not clear that it is possible to develop the requisite and appropriate emerged privacy policies and appropriate enforcement of them automatically…(More)”