A Research Briefing by Wood, Alexandra and O’Brien, David and Gasser, Urs: “Political leaders and civic advocates are increasingly recommending that open access be the “default state” for much of the information held by government agencies. Over the past several years, they have driven the launch of open data initiatives across hundreds of national, state, and local governments. These initiatives are founded on a presumption of openness for government data and have led to the public release of large quantities of data through a variety of channels. At the same time, much of the data that have been released, or are being considered for release, pertain to the behavior and characteristics of individual citizens, highlighting tensions between open data and privacy. This research briefing offers a snapshot of recent developments in the open data and privacy landscape, outlines an action map of various governance approaches to protecting privacy when releasing open data, and identifies key opportunities for decision-makers seeking to respond to challenges in this space….(More)”
Twitter, UN Global Pulse announce data partnership
Press Release: “Twitter and UN Global Pulse today announced a partnership that will provide the United Nations with access to Twitter’s data tools to support efforts to achieve the Sustainable Development Goals, which were adopted by world leaders last year.
Every day, people around the world send hundreds of millions of Tweets in dozens of languages. This public data contains real-time information on many issues including the cost of food, availability of jobs, access to health care, quality of education, and reports of natural disasters. This partnership will allow the development and humanitarian agencies of the UN to turn these social conversations into actionable information to aid communities around the globe.
“The Sustainable Development Goals are first and foremost about people, and Twitter’s unique data stream can help us truly take a real-time pulse on priorities and concerns — particularly in regions where social media use is common — to strengthen decision-making. Strong public-private partnerships like this show the vast potential of big data to serve the public good,” said Robert Kirkpatrick, Director of UN Global Pulse.
“We are incredibly proud to partner with the UN in support of the Sustainable Development Goals,” said Chris Moody, Twitter’s VP of Data Services. “Twitter data provides a live window into the public conversations that communities around the world are having, and we believe that the increased potential for research and innovation through this partnership will further the UN’s efforts to reach the Sustainable Development Goals.”
Organizations and businesses around the world currently use Twitter data in many meaningful ways, and this unique data source enables them to leverage public information at scale to better inform their policies and decisions. These partnerships enable innovative uses of Twitter data, while protecting the privacy and safety of Twitter users.
UN Global Pulse’s new collaboration with Twitter builds on existing R&D that has shown the power of social media for social impact, like measuring the impact of public health campaigns, tracking reports of rising food prices, or prioritizing needs after natural disasters….(More)”
Data and Analytics Innovation
GAO report from the Data and Analytics Innovation Forum Convened by the Comptroller General of the United States: “….discussions considered the implications of new data-related technologies and developments that are revolutionizing the basic three-step innovation process in the figure below. As massive amounts of varied data become available in many fields, data generation (step 1 in the process) is transformed. Continuing technological advances are bringing more powerful analytics and changing analysis possibilities (step 2 in the process). And approaches to new decision making include intelligent machines that may, for example, guide human decision makers. Additionally, data may be automatically generated on actions taken in response to data analytic results, creating an evaluative feedback loop.
Forum participants:
• saw the newly revolutionized and still-evolving process of data and analytics innovation (DAI) as generating far-reaching new economic opportunities, including a new Industrial Revolution based on combining data-transmitting cyber systems and physical systems, resulting in cyber-physical systems—which have alternatively been termed the Industrial Internet, also the Internet of Things;
• warned of an ongoing and potentially widening mismatch between the kinds of jobs that are or will be available and the skill levels of the U.S. labor force;
• identified beneficial DAI impacts that could help efforts to reach key societal goals—through defining DAI pathways to greater efficiency and effectiveness—in areas such as health care, transportation, financial markets, and “smart cities,” among others; and
• outlined areas of data-privacy concern, including for example, possible threats to personal autonomy, which could occur as data on individual persons are collected and used without their knowledge or against their will.
The overall goal of the forum’s discussions and of this report is to help lay the groundwork for future efforts to maximize DAI benefits and minimize potential drawbacks. As such, the forum was not directed toward identifying a specific set of policies relevant to DAI. However, participants suggested that efforts to help realize the promise of DAI opportunities would be directed toward improving data access, assessing the validity of new data and models, creating a welcoming DAI ecosystem, and more generally, raising awareness of DAI’s potential among both policymakers and the general public. Participants also noted a likely need for higher U.S. educational achievement and a measured approach to privacy issues that recognizes both their import and their complexity….(More)”
Responsible Data in Agriculture
Report by Lindsay Ferris and Zara Rahman for GODAN: “The agriculture sector is creating increasing amounts of data, from many different sources. From tractors equipped with GPS tracking, to open data released by government ministries, data is becoming ever more valuable, as agricultural business development and global food policy decisions are being made based upon data. But the sector is also home to severe resource inequality. The largest agricultural companies make billions of dollars per year, in comparison with subsistence farmers growing just enough to feed themselves, or smallholder farmers who grow enough to sell on a year-by-year basis. When it comes to data and technology, these differences in resources translate to stark power imbalances in data access and use. The most well resourced actors are able to delve into new technologies and make the most of those insights, whereas others are unable to take any such risks or divert any of their limited resources. Access to and use of data has radically changed the business models and behaviour of some of those well resourced actors, but in contrast, those with fewer resources are receiving the same, limited access to information that they always have.
In this paper, we have approached these issues from a responsible data perspective, drawing upon the experience of the Responsible Data community who over the past three years have created tools, questions and resources to deal with the ethical, legal, privacy and security challenges that come from new uses of data in various sectors. This piece aims to provide a broad overview of some of the responsible data challenges facing these actors, with a focus on the power imbalance between actors, and looking into how that inequality affects behaviour when it comes to the agricultural data ecosystem. What are the concerns of those with limited resources, when it comes to this new and rapidly changing data environment? In addition, what are the ethical grey areas or uncertainties that we need to address in the future? As a first attempt to answer these questions, we spoke to 14 individuals with various perspectives on the sector to understand what the challenges are for them and for the people they work with. We also carried out desk research to dive deeper into these issues, and we provide here an analysis of our findings and responsible data challenges….(More)”
Research Handbook on Digital Transformations
Book edited by F. Xavier Olleros and Majlinda Zhegu: “The digital transition of the world economy is now entering a phase of broad and deep societal impact. While there is one overall transition, there are many different sectoral transformations, from health and legal services to tax reports and taxi rides, as well as a rising number of transversal trends and policy issues, from widespread precarious employment and privacy concerns to market monopoly and cybercrime. This Research Handbook offers a rich and interdisciplinary synthesis of some of the recent research on the digital transformations currently under way.
This comprehensive study contains chapters covering sectoral and transversal analyses, all of which are specially commissioned and include cutting-edge research. The contributions featured are global, spanning four continents and seven different countries, as well as interdisciplinary, including experts in economics, sociology, law, finance, urban planning and innovation management. The digital transformations discussed are fertile ground for researchers, as established laws and regulations, organizational structures, business models, value networks and workflow routines are contested and displaced by newer alternatives….(More)”
Recent Developments in Open Data Policy
Presentation by Paul Uhlir: “Several international organizations have issued policy statements on open data policies in the past two years. This presentation provides an overview of those statements and their relevance to developing countries.
International Statements on Open Data Policy
Open data policies have become much more supported internationally in recent years. Policy statements in just the most recent 2014-2016 period that endorse and promote openness to research data derived from public funding include: the African Data Consensus (UNECA 2014); the CODATA Nairobi Principles for Data Sharing for Science and Development in Developing Countries (PASTD 2014); the Hague Declaration on Knowledge Discovery in the Digital Age (LIBER 2014); Policy Guidelines for Open Access and Data Dissemination and Preservation (RECODE 2015); Accord on Open Data in a Big Data World (Science International 2015). This presentation will present the principal guidelines of these policy statements.
The Relevance of Open Data from Publicly Funded Research for Development
There are many reasons that publicly funded research data should be made as freely and openly available as possible. Some of these are noted here, although many other benefits are possible. For research, it is closing the gap with more economically developed countries, making researchers more visible on the web, enhancing their collaborative potential, and linking them globally. For educational benefits, open data assists greatly in helping students learn how to do data science and to manage data better. From a socioeconomic standpoint, open data policies have been shown to enhance economic opportunities and to enable citizens to improve their lives in myriad ways. Such policies are more ethical in allowing access to those that have no means to pay and not having to pay for the data twice—once through taxes to create the data in the first place and again at the user level. Finally, access to factual data can improve governance, leading to better decision making by policymakers, improved oversight by constituents, and digital repatriation of objects held by former colonial powers.
Some of these benefits are cited directly in the policy statements themselves, while others are developed more fully in other documents (Bailey Mathae and Uhlir 2012, Uhlir 2015). Of course, not all publicly funded data and information can be made available and there are appropriate reasons—such as the protection of national security, personal privacy, commercial concerns, and confidentiality of all kinds—that make the withholding of them legal and ethical. However, the default rule should be one of openness, balanced against a legitimate reason not to make the data public….(More)”
Open data, transparency and accountability
Topic guide by Liz Carolan: “…introduces evidence and lessons learned about open data, transparency and accountability in the international development context. It discusses the definitions, theories, challenges and debates presented by the relationship between these concepts, summarises the current state of open data implementation in international development, and highlights lessons and resources for designing and implementing open data programmes.
Open data involves the release of data so that anyone can access, use and share it. The Open Data Charter (2015) describes six principles that aim to make data easier to find, use and combine:
- open by default
- timely and comprehensive
- accessible and usable
- comparable and interoperable
- for improved governance and citizen engagement
- for inclusive development and innovation
One of the main objectives of making data open is to promote transparency.
Transparency is a characteristic of government, companies, organisations and individuals that are open in the clear disclosure of information, rules, plans, processes and actions. Transparency of information is a crucial part of this. Within a development context, transparency and accountability initiatives have emerged over the last decade as a way to address developmental failures and democratic deficits.
There is a strong intersection between open data and transparency as concepts, yet as fields of study and practice, they have remained somewhat separate. This guide draws extensively on analysis and evidence from both sets of literature, beginning by outlining the main concepts and the theories behind the relationships between them.
Data release and transparency are parts of the chain of events leading to accountability. For open data and transparency initiatives to lead to accountability, the required conditions include:
- getting the right data published, which requires an understanding of the politics of data publication
- enabling actors to find, process and use information, and to act on any outputs, which requires an accountability ecosystem that includes equipped and empowered intermediaries
- enabling institutional or social forms of enforceability or citizens’ ability to choose better services, which requires infrastructure that can impose sanctions, or sufficient choice or official support for citizens
Programmes intended to increase access to information can be impacted by and can affect inequality. They can also pose risks to privacy and may enable the misuse of data for the exploitation of individuals and markets.
Despite a range of international open data initiatives and pressures, developing countries are lagging behind in the implementation of reforms at government level, in the overall availability of data, and in the use of open data for transparency and accountability. What is more, there are signs that ‘open-washing’ – superficial efforts to publish data without full integration with transparency commitments – may be obscuring backsliding in other aspects of accountability.
The topic guide pulls together lessons and guidance from open data, transparency and accountability work, including an outline of technical and non-technical aspects of implementing a government open data initiative. It also lists further resources, tools and guidance….(More)”
Managing Federal Information as a Strategic Resource
White House: “Today the Office of Management and Budget (OMB) is releasing an update to the Federal Government’s governing document for the management of Federal information resources: Circular A-130, Managing Information as a Strategic Resource.
The way we manage information technology (IT), security, data governance, and privacy has rapidly evolved since A-130 was last updated in 2000. In today’s digital world, we are creating and collecting large volumes of data to carry out the Federal Government’s various missions to serve the American people. This data is duplicated, stored, processed, analyzed, and transferred with ease. As government continues to digitize, we must ensure we manage data to not only keep it secure, but also allow us to harness this information to provide the best possible service to our citizens.
Today’s update to Circular A-130 gathers in one resource a wide range of policy updates for Federal agencies regarding cybersecurity, information governance, privacy, records management, open data, and acquisitions. It also establishes general policy for IT planning and budgeting through governance, acquisition, and management of Federal information, personnel, equipment, funds, IT resources, and supporting infrastructure and services. In particular, A-130 focuses on three key elements to help spur innovation throughout the government:
- Real Time Knowledge of the Environment. In today’s rapidly changing environment, threats and technology are evolving at previously unimagined speeds. In such a setting, the Government cannot afford to authorize a system and not look at it again for years at a time. In order to keep pace, we must move away from periodic, compliance-driven assessment exercises and, instead, continuously assess our systems and build in security and privacy with every update and re-design. Throughout the Circular, we make clear the shift away from check-list exercises and toward the ongoing monitoring, assessment, and evaluation of Federal information resources.
- Proactive Risk Management. To keep pace with the needs of citizens, we must constantly innovate. As part of such efforts, however, the Federal Government must modernize the way it identifies, categorizes, and handles risk to ensure both privacy and security. Significant increases in the volume of data processed and utilized by Federal resources require new ways of storing, transferring, and managing it. Circular A-130 emphasizes the need for strong data governance that encourages agencies to proactively identify risks, determine practical and implementable solutions to address said risks, and implement and continually test the solutions. This repeated testing of agency solutions will help to proactively identify additional risks, starting the process anew.
- Shared Responsibility. Citizens are connecting with each other in ways never before imagined. From social media to email, the connectivity we have with one another can lead to tremendous advances. The updated A-130 helps to ensure everyone remains responsible and accountable for assuring privacy and security of information – from managers to employees to citizens interacting with government services. …(More)”
How Big Data Analytics is Changing Legal Ethics
Renee Knake at Bloomberg Law: “Big data analytics are changing how lawyers find clients, conduct legal research and discovery, draft contracts and court papers, manage billing and performance, predict the outcome of a matter, select juries, and more. Ninety percent of corporate legal departments, law firms, and government lawyers note that data analytics are applied in their organizations, albeit in limited ways, according to a 2015 survey. The Legal Services Corporation, the largest funder of civil legal aid for low-income individuals in the United States, recommended in 2012 that all states collect and assess data on case progress/outcomes to improve the delivery of legal services. Lawyers across all sectors of the market increasingly recognize how big data tools can enhance their work.
A growing literature advocates for businesses and governmental bodies to adopt data ethics policies, and many have done so. It is not uncommon to find data-use policies prominently displayed on company or government websites, or required as part of a click-through consent before gaining access to a mobile app or webpage. Data ethics guidelines can help avoid controversies, especially when analytics are used in potentially manipulative or exploitive ways. Consider, for example, Target’s data analytics that uncovered a teen’s pregnancy before her father did, or Orbitz’s data analytics that offered pricier hotels to Mac users. These are just two of numerous examples in recent years where companies faced criticism for how they used data analytics.
While some law firms and legal services organizations follow data-use policies or codes of conduct, many do not. Perhaps this is because the legal profession was not transformed as early or rapidly as other industries, or because until now, big data in legal was largely limited to e-discovery, where the data use is confined to the litigation and is subject to judicial oversight. Another reason may be that lawyers believe their rules of professional conduct provide sufficient guidance and protection. Unlike other industries, lawyers are governed by a special code of ethical obligations to clients, the justice system, and the public. In most states, this code is based in part upon the American Bar Association (ABA) Model Rules of Professional Conduct, though rules often vary from jurisdiction to jurisdiction. Several of the Model Rules are relevant to big data use. That said, the Model Rules are insufficient for addressing a number of fundamental ethical concerns.
At the moment, legal ethics for big data analytics is at best an incomplete mix of professional conduct rules and informal policies adopted by some, but not all law practices. Given the increasing prevalence of data analytics in legal services, lawyers and law students should be familiar not only with the relevant professional conduct rules, but also the ethical questions left unanswered. Listed below is a brief summary of both, followed by a proposed legal ethics agenda for data analytics. …
Questions Unanswered by Lawyer Ethics Rules
Access/Ownership. Who owns the original data — the individual source or the holder of the pooled information? Who owns the insights drawn from its analysis? Who should receive access to the data compilation and the results?
Anonymity/Identity. Should all personally identifiable or sensitive information be removed from the data? What protections are necessary to respect individual autonomy? How should individuals be able to control and shape their electronic identity?
Consent. Should individuals affirmatively consent to use of their personal data? Or is it sufficient to provide notice, perhaps with an opt-out provision?
Privacy/Security. Should privacy be protected beyond the professional obligation of client confidentiality? How should data be secured? The ABA called upon private and public sector lawyers to implement cyber-security policies, including data use, in a 2012 resolution and produced a cyber-security handbook in 2013.
Process. How involved should lawyers be in the process of data collection and analysis? In the context of e-discovery, for example, a lawyer is expected to understand how documents are collected, produced, and preserved, or to work with a specialist. Should a similar level of knowledge be required for all forms of data analytics use?
Purpose. Why was the data first collected from individuals? What is the purpose for the current use? Is there a significant divergence between the original and secondary purposes? If so, is it necessary for the individuals to consent to the secondary purpose? How will unintended consequences be addressed?
Source. What is the source of the data? Did the lawyer collect it directly from clients, or is the lawyer relying upon a third-party source? Client-based data is, of course, subject to the lawyer’s professional conduct rules. Data from any source should be trustworthy, reasonable, timely, complete, and verifiable….(More)”
Why Zika, Malaria and Ebola should fear analytics
Frédéric Pivetta at Real Impact Analytics: “Big data is a hot business topic. It turns out to be an equally hot topic for the non profit sector now that we know the vital role analytics can play in addressing public health issues and reaching sustainable development goals.
Big players like IBM just announced they will help fight Zika by analyzing social media, transportation and weather data, among other indicators. Telecom data takes it further by helping to predict the spread of disease, identifying isolated and fragile communities and prioritizing the actions of aid workers.
The power of telecom data
Human mobility contributes significantly to epidemic transmission into new regions. However, there are gaps in understanding human mobility due to the limited and often outdated data available from travel records. In some countries, these are collected by health officials in the hospitals or in occasional surveys.
Telecom data, constantly updated and covering a large portion of the population, is rich in terms of mobility insights. But there are other benefits:
- it’s recorded automatically (in the Call Detail Records, or CDRs), so that we avoid data collection and response bias.
- it contains localization and time information, which is great for understanding human mobility.
- it contains info on connectivity between people, which helps in understanding social networks.
- it contains info on phone spending, which allows tracking of socio-economic indicators.
Aggregated and anonymized, mobile telecom data fills the public data gap without raising privacy issues. Mixing it with other public data sources results in a very precise and reliable view on human mobility patterns, which is key for preventing epidemic spreads.
Using telecom data to map epidemic risk flows
So how does it work? As in any other big data application, the challenge is to build the right predictive model, allowing decision-makers to take the most appropriate actions. In the case of epidemic transmission, the methodology typically includes five steps:
- Identify mobility patterns relevant for each particular disease. For example, short-term trips for fast-spreading diseases like Ebola. Or overnight trips for diseases like Malaria, as it spreads by mosquitoes that are active only at night. Such patterns can be deduced from the CDRs: we can actually find the home location of each user by looking at the most active night tower, and then tracking calls to identify short or long-term trips. Aggregating data per origin-destination pairs is useful as we look at intercity or interregional transmission flows. And it protects the privacy of individuals, as no one can be singled out from the aggregated data.
- Get data on epidemic incidence, typically from local organisations like national healthcare systems or, in case of emergency, from NGOs or dedicated emergency teams. This data should be aggregated at the same level of granularity as the CDRs.
- Knowing how many travelers go from one place to another, for how long, and the disease incidence at origin and destination, build an epidemiological model that can account for the way and speed of transmission of the particular disease.
- With an import/export scoring model, map epidemic risk flows and flag areas that are at risk of becoming the new hotspots because of human travel.
- On that basis, prioritize and monitor public health measures, focusing on restraining mobility to and from hotspots. Mapping risk also allows launching prevention campaigns at the right places and setting up the necessary infrastructure on time. Eventually, the tool reduces public health risks and helps stem the epidemic.
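The first four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the record layout (user, tower, hour of day), the night-hour window, the function names, and the sample data are all assumptions made for the example, and the scoring rule (travellers multiplied by incidence at the origin) is a deliberately simple stand-in for a real epidemiological model.

```python
from collections import Counter, defaultdict

# Assumed night window used to guess where a user sleeps (hypothetical choice).
NIGHT_HOURS = frozenset({22, 23, 0, 1, 2, 3, 4, 5})


def home_tower(records, night_hours=NIGHT_HOURS):
    """Map each user to the tower they use most often at night."""
    counts = defaultdict(Counter)
    for user, tower, hour in records:
        if hour in night_hours:
            counts[user][tower] += 1
    return {user: c.most_common(1)[0][0] for user, c in counts.items()}


def od_flows(records, homes):
    """Count distinct travellers per (home tower, visited tower) pair.

    Only aggregated counts leave this function, never individual trips.
    """
    visitors = defaultdict(set)
    for user, tower, _hour in records:
        home = homes.get(user)
        if home is not None and tower != home:
            visitors[(home, tower)].add(user)
    return {pair: len(users) for pair, users in visitors.items()}


def import_risk(flows, incidence):
    """Score each destination by travellers weighted by incidence at origin."""
    risk = defaultdict(float)
    for (origin, dest), travellers in flows.items():
        risk[dest] += travellers * incidence.get(origin, 0.0)
    return dict(risk)


# Made-up sample records: (user_id, tower_id, hour_of_day).
sample_records = [
    ("u1", "A", 23), ("u1", "A", 2), ("u1", "B", 14),
    ("u2", "A", 1), ("u2", "C", 10),
    ("u3", "B", 0), ("u3", "A", 12),
]
sample_incidence = {"A": 0.1, "B": 0.0}  # made-up incidence per tower area

homes = home_tower(sample_records)
flows = od_flows(sample_records, homes)
risk = import_risk(flows, sample_incidence)
```

Destinations with the highest scores in `risk` would be the candidate hotspots to monitor. A production version would also need to separate short from overnight trips and to suppress origin-destination pairs with very few travellers before sharing the aggregates.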
That kind of application works in a variety of epidemiological contexts, including Zika, Ebola, Malaria, Influenza or Tuberculosis. No doubt the global boom of mobile data will prove extraordinarily helpful in fighting these fierce enemies….(More)”