Open & Shut


Harsha Devulapalli: “Welcome to Open & Shut — a new blog dedicated to exploring the opportunities and challenges of working with open data in closed societies around the world. Although we’ll be exploring questions relevant to open data practitioners worldwide, we’re particularly interested in seeing how civil society groups and actors in the Global South are using open data to push for greater government transparency, and tackle daunting social and economic challenges facing their societies….Throughout this series we’ll be profiling and interviewing organisations working with open data worldwide, and providing do-it-yourself data tutorials that will be useful for beginners as well as data experts. …

What do we mean by the terms ‘open data’ and ‘closed societies’?

It’s important to be clear about what we’re dealing with here. So let’s establish some key terms. When we talk about ‘open data’, we mean data that anyone can access, use and share freely. And when we say ‘closed societies’, we’re referring to states or regions in which the political and social environment is actively hostile to notions of openness and public scrutiny, and which hold principles of freedom of information in low esteem. In closed societies, government data is either not published at all, or is published only in inaccessible formats, is incomplete, is hard to find, or is simply not digitised.

Iran is one such state that we would characterise as a ‘closed society’. At Small Media, we’ve had to confront the challenges of poor data practice, secrecy, and government opacity while working to support freedom of information and freedom of expression in the country. Based on these experiences, we’ve been building Iran Open Data — a civil society-led open data portal for Iran — in an effort to make Iranian government data more accessible and easier for researchers, journalists, and civil society actors to work with.

Iran Open Data — an open data portal for Iran, created by Small Media


…Open & Shut will shine a light on the exciting new ways that different groups are using data to question dominant narratives, transform public opinion, and bring about tangible change in closed societies. At the same time, it’ll demonstrate the challenges faced by open data advocates in opening up this valuable data. We intend to get the community talking about the need to build cross-border alliances in order to empower the open data movement, and to exchange knowledge and best practices despite the different needs and circumstances we all face….(More)

Where’s the ‘Civic’ in CivicTech?


Blog by Pius Enywaru: “The ideology of community participation and development is a crucial topic for any nation or community seeking to attain sustainable development. Here in Uganda, oftentimes when the opportunity for public participation arises — either in local planning or in holding local politicians to account — the ‘don’t care’ attitude reigns….

What works?

Some of these tools include Ask Your Government Uganda, a platform built to help members of the public get the information they want from 106 public agencies in Uganda. U-Report, developed by UNICEF, provides an SMS-based social monitoring tool designed to address issues affecting the youth of Uganda. Mentioned in a previous blog post, Parliament Watch brings the proceedings of the Parliament of Uganda to the citizens. The organization leverages technology to share live updates on social media and provides in-depth analysis to create a better understanding of the business of Parliament. Other tools used include citizen scorecards, public media campaigns and public petitions. Just recently, we have had a few calls to action to get people to sign petitions, with somewhat lackluster results.

What doesn’t work?

Although the usage of these tools has grown dramatically, there is still a lack of awareness and, consequently, of community participation. In order to understand the interventions which the Government of Uganda believes are necessary for sustainable urban development, it is important to examine the realities pertaining to urban areas and their planning processes. There are many challenges in deploying ICT-based community participation tools: limited funding and support for such initiatives, low literacy levels, low technical literacy, a large digital divide, communities rarely being asked for input when these tools are developed, a lack of adequate government involvement, and resistance to or distrust of change by both government and citizens. Furthermore, in many of these initiatives, a large marketing or sensitization push is needed to let citizens know that these services exist for their benefit.

There are great minds who have brilliant ideas to try and bring literally everyone on board through civic engagement. When you have a look at their ideas, you will agree that they might indeed make for a reputable service and bring about remarkable change in different communities. However, the biggest question has always been: “How do these ideas get executed and adopted by the communities they target?” These ideas suffer a major setback: a lack of inclusivity to enhance community participation. This still remains a puzzle for most folks who have these ideas….(More)”.

Why We Should Care About Bad Data


Blog by Stefaan G. Verhulst: “At a time of open and big data, data-led and evidence-based policy making has great potential to improve problem solving but will have limited, if not harmful, effects if the underlying components are riddled with bad data.

Why should we care about bad data? What do we mean by bad data? And what are the determining factors contributing to bad data that, if understood and addressed, could prevent or tackle bad data? These questions were the subject of my short presentation during a recent webinar on Bad Data: The Hobgoblin of Effective Government, hosted by the American Society for Public Administration and moderated by Richard Greene (Partner, Barrett and Greene Inc.). Other panelists included Ben Ward (Manager, Information Technology Audits Unit, California State Auditor’s Office) and Katherine Barrett (Partner, Barrett and Greene Inc.). The webinar was a follow-up to the excellent Special Issue of Governing on Bad Data written by Richard and Katherine….(More)”

Formalised data citation practices would encourage more authors to make their data available for reuse


Hyoungjoo Park and Dietmar Wolfram at the LSE Impact Blog: “Today’s researchers work in a heavily data-intensive and collaborative environment in order to further scientific discovery across and within fields. It is becoming routine for researchers (i.e. authors and data publishers) to submit their research data, such as datasets, biological samples in biomedical fields, and computer code, as supplementary information in order to comply with data sharing requirements of major funding agencies, high-profile journals, and data journals. This is part of open science, where data and any publication products are expected to be made available to anyone interested.

Given that researchers benefit from publicly shared data through data reuse in their own research, researchers who provide access to data should be acknowledged for their contributions, much in the same way that authors are recognised for their research publications through citation. Researchers who use shared data or other shared research products (e.g. open access software, tissue cultures) should also acknowledge the providers of these resources through formal citation. At present, data citation is not widely practised in most disciplines and as an object of study remains largely overlooked….
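
To make “formal citation” of data concrete, here is a minimal illustrative sketch (ours, not the authors’). The element set (creator, publication year, title, publisher, persistent identifier) follows common data citation recommendations such as DataCite’s; the dataset, repository, and DOI below are invented placeholders.

```python
# Illustrative only: assembling a DataCite-style data citation string.
# The dataset, repository name, and DOI are invented for this example.

def format_data_citation(creators, year, title, publisher, doi, version=None):
    """Return a human-readable data citation from its core elements."""
    authors = "; ".join(creators)
    version_part = f" Version {version}." if version else ""
    return f"{authors} ({year}). {title}.{version_part} {publisher}. https://doi.org/{doi}"

print(format_data_citation(
    creators=["Park, H.", "Wolfram, D."],
    year=2017,
    title="Example survey dataset",     # hypothetical dataset
    publisher="Example Data Repository",  # hypothetical repository
    doi="10.1234/example.5678",           # placeholder DOI
))
```

Citing a dataset this way, in the reference list rather than only in passing in the text, is precisely what would make the credit trackable by citation indexes.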

We found that data citations appear in the references section of an article less frequently than in the main text, making it difficult to identify the reward and credit for data authors (i.e. data sharers). Consistent data citation formats could not be found. Current data citation practices do not (yet) benefit data sharers. Also, data citation was sometimes located in the supplementary information, outside of the references. Data that had been reused was often not acknowledged in the reference lists, but was rather hidden in the representation of data (e.g. tables, figures, images, graphs, and other elements), which may be a consequence of the fact that data citation practices are not yet common in scholarly communications.

Ongoing challenges remain in identifying and documenting data citation. First, the practice of informal data citation presents a challenge for accurately documenting data citation. …

Second, data recitation by one or more co-authors of earlier studies (i.e. self-citation) is common, which reduces the broader impact of data sharing by limiting much of the reuse to the original authors…

Third, currently indexed data citations may not include rapidly advancing areas, such as in the hard sciences or computer engineering, because approximately 90% of indexed works were associated with journal articles…

Fourth, the number of authors associated with shared datasets raises questions of the ownership of and responsibility for a collective work, although some journals require one author to be responsible for the data used in the study…(More). (See also An examination of research data sharing and re-use: implications for data citation practice, published in Scientometrics)

Avoiding Garbage In – Garbage Out: Improving Administrative Data Quality for Research


Blog: “In June, I presented the webinar, “Improving Administrative Data Quality for Research and Analysis”, for members of the Association of Public Data Users (APDU). APDU is a national network that provides a venue to promote education, share news, and advocate on behalf of public data users.

The webinar served as a primer to help smaller organizations begin to use their data for research. Participants were given the tools to transform their administrative data into “research-ready” datasets.

I first reviewed seven major issues in administrative data quality and discussed how these issues can affect research and analysis. For instance, issues with incorrect value formats, unit of analysis, and duplicate records can make the data difficult to use. Invalid or inconsistent values lead to inaccurate analysis results. Missing or outlier values can produce inaccurate and biased analysis results. All these issues make the data less useful for research.

Next, I presented concrete strategies for reviewing the data to identify each of these quality issues. I also discussed several tips to make the data review process easier, faster, and easier to replicate. Most important among these tips are: (1) reviewing every variable in the data set, whether you expect problems or not, and (2) relying on data documentation to understand how the data should look….(More)”.
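
The kind of review described above maps naturally onto a few lines of code. Below is a minimal sketch (not the webinar’s actual material) of checks for duplicates, missing values, invalid codes, and out-of-range values, applied to every variable; the column names, valid codes, and valid ranges are invented for illustration.

```python
# A minimal data-quality review sketch; columns, codes, and ranges invented.
import pandas as pd

def review_quality(df, key, valid_values, valid_ranges):
    # Duplicate records on the unit of analysis.
    print("duplicate", key, "values:", df.duplicated(subset=[key]).sum())
    # Review EVERY variable, whether you expect problems or not.
    for col in df.columns:
        s = df[col]
        print(f"\n-- {col} ({s.dtype})")
        print("   missing:", s.isna().sum())
        if col in valid_values:   # invalid or inconsistent codes
            print("   invalid codes:", (~s.dropna().isin(valid_values[col])).sum())
        if col in valid_ranges:   # out-of-range (outlier) values
            lo, hi = valid_ranges[col]
            print("   out of range:", ((s < lo) | (s > hi)).sum())

df = pd.DataFrame({
    "case_id": [1, 2, 2, 4],                         # one duplicate id
    "age": [34, 290, 41, None],                      # outlier and missing value
    "status": ["open", "closed", "OPEN", "closed"],  # inconsistent coding
})
review_quality(df, key="case_id",
               valid_values={"status": ["open", "closed"]},
               valid_ranges={"age": (0, 120)})
```

Note how the loop embodies tip (1), touching every column, while the dictionaries of valid codes and ranges stand in for tip (2), the data documentation.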

Four lessons NHS Trusts can learn from the Royal Free case


Blog by Elizabeth Denham, Information Commissioner in the UK: “Today my office has announced that the Royal Free London NHS Foundation Trust did not comply with the Data Protection Act when it turned over the sensitive medical data of around 1.6 million patients to Google DeepMind, a private sector firm, as part of a clinical safety initiative. As a result of our investigation, the Trust has been asked to sign an undertaking committing it to changes to ensure it is acting in accordance with the law, and we’ll be working with them to make sure that happens.

But what about the rest of the sector? As organisations increasingly look to unlock the huge potential that creative uses of data can have for patient care, what are the lessons to be learned from this case?

It’s not a choice between privacy and innovation

It’s welcome that the trial looks to have been positive. The Trust has reported successful outcomes. Some may reflect that data protection rights are a small price to pay for this.

But what stood out to me on looking through the results of the investigation is that the shortcomings we found were avoidable. The price of innovation didn’t need to be the erosion of legally ensured fundamental privacy rights….

Don’t dive in too quickly

Privacy impact assessments are a key data protection tool of our era, as evolving law and best practice around the world demonstrate. They play an increasingly prominent role in data protection, and they’re a crucial part of digital innovation. ….

New cloud processing technologies mean you can, not that you always should

Changes in technology mean that vast data sets can be made more readily available and can be processed faster, using more powerful data processing technologies. That’s a positive thing, but just because evolving technologies allow you to do more doesn’t mean these tools should always be fully utilised, particularly during a trial initiative….

Know the law, and follow it

No-one suggests that red tape should get in the way of progress. But when you’re setting out to test the clinical safety of a new service, remember that the rules are there for a reason….(More)”

Blockchains, personal data and the challenge of governance


Theo Bass at NESTA: “…There are a number of dominant internet platforms (Google, Facebook, Amazon, etc.) that hoard, analyse and sell information about their users in the name of a more personalised and efficient service. This has become a problem.

People feel they are losing control over how their data is used and reused on the web. 500 million adblocker downloads is a symptom of a market which isn’t working well for people. As Irene Ng mentions in a recent guest blog on the Nesta website, the secondary data market is thriving (online advertising is a major player), as companies benefit from the opacity and lack of transparency about where profit is made from personal data.

It’s said that blockchain’s key characteristics could provide a foundational protocol for a fairer digital identity system on the web. Beyond its application as digital currency, blockchain could provide a new set of technical standards for transparency, openness, and user consent, on top of which a whole new generation of services might be built.

While the aim is ambitious, a handful of projects are rising to the challenge.

Blockstack is creating a global system of digital IDs, which are written into the bitcoin blockchain. Nobody can touch them other than the owner of that ID. Blockstack is building a new generation of applications on top of this infrastructure which promises to provide “a new decentralized internet where users own their data and apps run locally”.

Sovrin attempts to provide users with “self-sovereign identity”. The argument is that “centralized” systems for storing personal data make it a “treasure chest for attackers”. Sovrin argues that users should more easily be able to have “ownership” over their data, and the exchange of data should be made possible through a decentralised, tamper-proof ledger of transactions between users.

Our own DECODE project is piloting a set of collaboratively owned, local sharing economy platforms in Barcelona and Amsterdam. The blockchain aims to provide a public record of entitlements over where people’s data is stored, who can access it and for what purpose (with some additional help from new techniques in zero-knowledge cryptography to preserve people’s privacy).
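
To make the idea of “a public record of entitlements” concrete, here is a toy sketch of one common pattern — our illustration, not the actual DECODE, Blockstack, or Sovrin protocols. Only a salted hash of an entitlement record is anchored on a shared ledger, so anyone shown the record can verify it against the ledger without the personal data itself ever being published; the record fields and the in-memory “ledger” are invented stand-ins.

```python
# Toy illustration of hash-anchored entitlements; NOT a real protocol.
# A real system would anchor digests on a blockchain, not a Python list.
import hashlib
import json
import secrets

ledger = []  # stand-in for an append-only public ledger

def anchor_entitlement(record):
    """Publish only a salted hash of the record; keep the record private."""
    salt = secrets.token_hex(16)
    payload = json.dumps(record, sort_keys=True) + salt
    digest = hashlib.sha256(payload.encode()).hexdigest()
    ledger.append(digest)
    return digest, salt

def verify_entitlement(record, salt):
    """Anyone holding the record and salt can check it against the ledger."""
    payload = json.dumps(record, sort_keys=True) + salt
    return hashlib.sha256(payload.encode()).hexdigest() in ledger

record = {  # hypothetical entitlement over personal data
    "subject": "user-42",
    "data": "energy-usage",
    "may_access": ["city-of-amsterdam"],
    "purpose": "traffic-planning",
}
digest, salt = anchor_entitlement(record)
assert verify_entitlement(record, salt)        # record matches the ledger
record["may_access"].append("advertiser-x")    # tampering...
assert not verify_entitlement(record, salt)    # ...is detectable
```

The zero-knowledge techniques mentioned above go a step further, proving properties of a record (for instance, that a user has granted access for a given purpose) without revealing the record at all.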

There’s no doubt this is an exciting field of innovation. But the debate is characterised by a lot of hype. The following sections therefore discuss some of the challenges thrown up when we start thinking about implementations beyond bitcoin.

Blockchains and the challenge of governance

As mentioned above, bitcoin is a “bearer asset”. This is a necessary feature of decentralisation — all users maintain sole ownership over the digital money they hold on the network. If users get hacked (digital wallets sometimes do), or if a password gets lost, the money is irretrievable.

While the example of losing a password might seem trivial, it highlights some difficult questions for proponents of blockchain’s wider uses. What happens if there’s a dispute over an online transaction, but no intermediary to settle it? What happens if someone’s digital assets or digital identity is breached and sensitive data falls into the wrong hands? It might be necessary to assign responsibility to a governing actor to help resolve the issue, but of course this would require the introduction of a trusted middleman.

Bitcoin doesn’t try to answer these questions; its anonymous creators deliberately tried to avoid implementing a clear model of governance over the network, probably because they knew that bitcoin would be used by people as a method for subverting the law. Bitcoin still sees a lot of use in gray economies, including for the sale of drugs and gambling.

But if blockchains are set to enter the mainstream, providing for businesses, governments and nonprofits, then they won’t be able to function irrespective of the law. They will need to find use-cases that can operate alongside legal frameworks and jurisdictional boundaries. They will need to demonstrate regulatory compliance, create systems of rules and provide accountability when things go awry. This cannot just be solved through increasingly sophisticated coding.

All of this raises a potential paradox recently elaborated in a post by Vili Lehdonvirta of the Oxford Internet Institute: is it possible to successfully govern blockchains without undermining their entire purpose?….

If blockchain advocates only work towards purely technical solutions and ignore real-world challenges of trying to implement decentralisation, then we’ll only ever see flawed implementations of the technology. This is already happening in the form of centrally administered, proprietary or ‘half-baked’ blockchains, which don’t offer much more value than traditional databases….(More)”.

Facebook Disaster Maps


Molly Jackman et al at Facebook: “After a natural disaster, humanitarian organizations need to know where affected people are located, what resources are needed, and who is safe. This information is extremely difficult and often impossible to capture through conventional data collection methods in a timely manner. As more people connect and share on Facebook, our data is able to provide insights in near-real time to help humanitarian organizations coordinate their work and fill crucial gaps in information during disasters. This morning we announced a Facebook disaster map initiative to help organizations address the critical gap in information they often face when responding to natural disasters.

Facebook disaster maps provide information about where populations are located, how they are moving, and where they are checking in safe during a natural disaster. All data is de-identified and aggregated to a 360 square meter tile or local administrative boundaries (e.g. census boundaries). [1]

This blog describes the disaster maps datasets, how insights are calculated, and the steps taken to ensure that we’re preserving privacy….(More)”.
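
The privacy step described above, de-identifying location data by aggregating it to coarse tiles and publishing only counts, is easy to illustrate in general terms. The sketch below shows the generic aggregate-and-suppress technique, not Facebook’s actual pipeline; the tile size and suppression threshold are arbitrary illustrative choices.

```python
# Generic sketch of aggregate-and-suppress de-identification;
# not Facebook's pipeline. Tile size and threshold are arbitrary.
import pandas as pd

TILE_DEG = 0.01   # tile edge in degrees of latitude/longitude
MIN_COUNT = 10    # suppress tiles with fewer people than this

def aggregate_locations(points: pd.DataFrame) -> pd.DataFrame:
    """Snap each (lat, lon) point to a grid tile and count people per tile."""
    tiles = points.assign(
        tile_lat=(points["lat"] // TILE_DEG) * TILE_DEG,
        tile_lon=(points["lon"] // TILE_DEG) * TILE_DEG,
    )
    counts = (tiles.groupby(["tile_lat", "tile_lon"])
                   .size()
                   .reset_index(name="people"))
    # Publish only well-populated tiles to limit re-identification risk.
    return counts[counts["people"] >= MIN_COUNT]
```

The suppression threshold plays the same role as a k-anonymity floor: no published cell describes fewer than MIN_COUNT people.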

UK government watchdog examining political use of data analytics


“Given the big data revolution, it is understandable that political campaigns are exploring the potential of advanced data analysis tools to help win votes,” Elizabeth Denham, the information commissioner, writes on the ICO’s blog. However, “the public have the right to expect” that this takes place in accordance with existing data protection laws, she adds.

Political parties are able to use Facebook to target voters with different messages, tailoring the advert to recipients based on their demographic. In the 2015 UK general election, the Conservative party spent £1.2 million on Facebook campaigns and the Labour party £16,000. It is expected that Labour will vastly increase that spend for the general election on 8 June….

Political parties and third-party companies are allowed to collect data from sites like Facebook and Twitter that lets them tailor these ads to broadly target different demographics. However, if those ads target identifiable individuals, they run afoul of the law….(More)”

How to increase public support for policy: understanding citizens’ perspectives


Peter van Wijck and Bert Niemeijer at the LSE Blog: “To increase public support, it is essential to anticipate how citizens will react to policy. But how to do that? Our framework combines insights from scenario planning and frame analysis. Scenario planning starts from the premise that we cannot predict the future. We can, however, imagine different plausible scenarios, different plausible future developments. Scenarios can be used to ask a ‘what if’ question: if a certain scenario were to develop, what policy measures would be required? By the same token, scenarios may be used as test conditions for policy measures. Kees van der Heijden calls this ‘wind tunnelling’.

Frame analysis is about how we interpret the world around us. Frames are mental structures that shape the way we see the world. Based on a frame, an individual perceives societal problems, attributes these problems to causes, and forms ideas on instruments to address the problems. Our central idea is that policy-makers may use citizens’ frames to reflect on their policy frame. Citizens’ frames may, in other words, be used as test conditions in a wind tunnel. The line of reasoning is summarized in the figure.

Policy frames versus citizens’ frames

[Figure: policy framing, showing that citizens’ reactions depend on both the policy frame and citizens’ frames]

The starting points of the figure are the policy frame and the citizens’ frames. Arrows 1 and 2 indicate that citizens’ reactions depend on both frames. A citizen can be expected to respond positively in the case of frame alignment. Negative responses can be expected if policy-makers do not address “the real problems”, do not attribute problems to “the real causes”, or do not select “adequate instruments”. If frames do not align, policy-makers are faced with the question of how to deal with this (arrow 3). First, they may reconsider the policy frame (arrow 4). That is, are there reasons to reconsider the definition of problems, the attribution to causes, and/or the selection of instruments? Such “reframing” effectively amounts to the formulation of a new (or adjusted) policy frame. Second, policy-makers may try to influence citizens’ frames (arrow 5). This may lead to a change in what citizens define as problems, what they consider to be the causes of problems, and what they consider to be adequate instruments to deal with those problems.
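
Read as a decision procedure, the figure can be summarised in a few lines of code. The sketch below is our paraphrase of the arrows, not the authors’ formal model, under the simplifying assumption that “alignment” means the two frames match exactly.

```python
# Our paraphrase of the figure's arrows as a decision procedure, not the
# authors' formal model. A frame is a (problems, causes, instruments) triple.
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    problems: frozenset
    causes: frozenset
    instruments: frozenset

def citizen_reaction(policy_frame: Frame, citizen_frame: Frame) -> str:
    # Arrows 1 and 2: the reaction depends on both frames.
    return "positive" if policy_frame == citizen_frame else "negative"

def next_step(policy_frame: Frame, citizen_frame: Frame) -> str:
    if citizen_reaction(policy_frame, citizen_frame) == "positive":
        return "frames align: proceed"
    # Arrow 3: frames do not align, so policy-makers must choose between
    # arrow 4 (reframe the policy) and arrow 5 (influence citizens' frames).
    return "reframe policy (arrow 4) or influence citizens' frames (arrow 5)"
```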

Two cases: support for victims and confidence in the judiciary

To apply our framework in practice, we developed a three-step method. Firstly, we reconstruct the policy frame. Here we investigate what policy-makers see as social problems, what they assume to be the causes of these problems, and what they consider to be appropriate instruments to address them. Secondly, we reconstruct contrasting citizens’ frames. Here we use focus groups, where contrasting groups are selected based on a segmentation model. Finally, we engage in a “wind tunnelling” exercise: we present the citizens’ frames to policy-makers and ask them to reflect on how the different groups can be expected to react to the policy measures the policy-makers have selected. In fact, this step is what Schön and Rein called “frame reflection”….(More)”.