Crowdbreaks: Tracking Health Trends using Public Social Media Data and Crowdsourcing


Paper by Martin Mueller and Marcel Salathé: “In the past decade, tracking health trends using social media data has shown great promise, due to a powerful combination of massive adoption of social media around the world, and increasingly potent hardware and software that enables us to work with these new big data streams.

At the same time, many challenging problems have been identified. First, there is often a mismatch between how rapidly online data can change and how rapidly algorithms are updated, which means that there is limited reusability for algorithms trained on past data, as their performance decreases over time. Second, much of the work focuses on specific issues during a specific past period in time, even though public health institutions would need flexible tools to assess multiple evolving situations in real time. Third, most tools providing such capabilities are proprietary systems with little algorithmic or data transparency, and thus little buy-in from the global public health and research community.

Here, we introduce Crowdbreaks, an open platform which allows tracking of health trends by making use of continuous crowdsourced labelling of public social media content. The system is built in a way which automates the typical workflow of data collection, filtering, labelling, and training of machine learning classifiers, and can therefore greatly accelerate the research process in the public health domain. This work introduces the technical aspects of the platform and explores its future use cases…(More)”.
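The collect→filter→label→train loop the abstract describes can be sketched as follows. This is a hypothetical, minimal illustration, not the Crowdbreaks platform's actual code or API: the posts, keywords, and crowd labels are stand-ins for its streaming collection and crowdsourced labelling steps.

```python
# Hypothetical sketch of a Crowdbreaks-style workflow: collect -> filter ->
# crowd-label -> train -> classify. All data here is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# 1. Collected public posts (stand-in for a streaming collection step)
posts = [
    "got my flu shot today, arm is sore",
    "vaccines cause more harm than good",
    "flu season is here, get vaccinated",
    "never trusting these shots again",
]

# 2. Filter: keep posts matching topic keywords
keywords = ("flu", "shot", "vaccin")
filtered = [p for p in posts if any(k in p.lower() for k in keywords)]

# 3. Crowdsourced labels (stand-in: 1 = positive stance, 0 = negative)
labels = [1, 0, 1, 0]

# 4. Train a classifier on the labelled stream
vec = TfidfVectorizer()
X = vec.fit_transform(filtered)
clf = LogisticRegression().fit(X, labels)

# 5. Classify a new incoming post; retraining as fresh labels arrive is what
#    counters the performance decay over time noted above
new = vec.transform(["just got vaccinated for the flu"])
prediction = clf.predict(new)[0]
```

The key design point is that steps 3–4 run continuously, so the classifier is retrained on newly labelled data rather than frozen on a past period.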

Smarter Crowdsourcing for Anti-Corruption: A Handbook of Innovative Legal, Technical, and Policy Proposals and a Guide to their Implementation


Paper by Beth Simone Noveck, Kaitlin Koga, Rafael Aceves Garcia, Hannah Deleanu, and Dinorah Cantú-Pedraza: “Corruption presents a fundamental threat to the stability and prosperity of Mexico, and combating it demands approaches that are both principled and practical. In 2017, the Inter-American Development Bank (IDB) approved project ME-T1351 to support Mexico in its fight against corruption using Open Innovation. Thus, the IDB partnered with the Governance Lab at NYU to support Mexico’s Secretariat of Public Service (Secretaría de la Función Pública) to identify innovative ideas and then turn them into practical implementation plans for the measurement, detection, and prevention of corruption in Mexico, using the GovLab’s open innovation methodology called Smarter Crowdsourcing.

The purpose of Smarter Crowdsourcing was to identify concrete solutions that include the use of data analysis and technology to tackle corruption in the public sector. This document contains 13 implementation plans laying out practical ways to address corruption. The plans emerged from “Smarter Crowdsourcing Anti-Corruption,” an agile method that begins with robust problem definition, followed by online sourcing of global expertise to surface innovative solutions. Smarter Crowdsourcing Anti-Corruption focused on six specific challenges: (i) measuring corruption and its costs, (ii) strengthening integrity in the judiciary, (iii) engaging the public in anti-corruption efforts, (iv) whistleblowing, (v) effective prosecution, and (vi) tracking and analyzing money flows…(More)”.

International Data Flows and Privacy: The Conflict and its Resolution


World Bank Policy Research Working Paper by Aaditya Mattoo and Joshua P. Meltzer: “The free flow of data across borders underpins today’s globalized economy. But the flow of personal data outside the jurisdiction of national regulators also raises concerns about the protection of privacy. Addressing these legitimate concerns without undermining international integration is a challenge. This paper describes and assesses three types of responses to this challenge: unilateral development of national or regional regulation, such as the European Union’s Data Protection Directive and forthcoming General Data Protection Regulation; international negotiation of trade disciplines, most recently in the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP); and international cooperation involving regulators, most significantly in the EU-U.S. Privacy Shield Agreement.

The paper argues that unilateral restrictions on data flows are costly and can hurt exports, especially of data-processing and other data-based services; international trade rules that limit only the importers’ freedom to regulate cannot address the challenge posed by privacy; and regulatory cooperation that aims at harmonization and mutual recognition is not likely to succeed, given the desirable divergence in national privacy regulation. The way forward is to design trade rules (as the CPTPP seeks to do) that reflect the bargain central to successful international cooperation (as in the EU-US Privacy Shield): regulators in data destination countries would assume legal obligations to protect the privacy of foreign citizens in return for obligations on data source countries not to restrict the flow of data. Existing multilateral rules can help ensure that any such arrangements do not discriminate against and are open to participation by other countries….(More)”.

New Zealand explores machine-readable laws to transform government


Apolitical: “The team working to drive New Zealand’s government into the digital age believes that part of the problem is the way that laws themselves are written. In a three-week experiment earlier this year, they tested the theory by rewriting legislation itself as software code.

The team in New Zealand, led by the government’s service innovations team LabPlus, has attempted to improve the interpretation of legislation and vastly ease the creation of digital services by rewriting legislation as code.

Legislation-as-code means taking the “rules” or components of legislation — its logic, requirements and exemptions — and laying them out programmatically so that they can be parsed by a machine. If law can be broken down by a machine, then anyone, even those who aren’t legally trained, can work with it. It helps to standardise the rules in a consistent language across an entire system, giving a view of services, compliance and all the different rules of government.

Over the course of three weeks the team in New Zealand rewrote two sets of legislation as software code: the Rates Rebate Act, a tax rebate designed to lower the costs of owning a home for people on low incomes, and the Holidays Act, which was enacted to grant each employee in New Zealand a guaranteed four weeks a year of holiday.

The way that both policies are written makes them difficult to interpret and, consequently, to deliver. They were written for a paper-based world, and require different service responses from distinct bodies within government based on the legal status of the citizen using them. For instance, the residents of retirement villages are eligible for rebates through the Rates Rebate Act, but access them via different people and provide different information than ordinary ratepayers do.

The teams worked to rewrite the legislation, first as “pseudocode” — the rules behind the legislation in a logical chain — then as human-readable legislation and finally as software code, designed to make it far easier for public servants and the public to work out who was eligible for what outcome. In the end, the team had working code for how to digitally deliver two policies.
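The end product of the process above can be illustrated with a small sketch. This is a hypothetical example of what machine-readable eligibility rules might look like; the thresholds, field names, and rules are illustrative stand-ins, not the actual provisions of the Rates Rebate Act.

```python
# Hypothetical "legislation as code" sketch: an eligibility rule expressed as
# a function a machine can evaluate. Values and rules are illustrative only.

def rates_rebate_eligible(income: float, rates_paid: float,
                          is_ratepayer: bool, retirement_village_resident: bool) -> bool:
    """Return True if the applicant qualifies under these illustrative rules."""
    # Rule 1: the applicant must pay rates, either directly or through a
    # retirement village operator (mirroring the distinct service paths
    # for village residents described above).
    if not (is_ratepayer or retirement_village_resident):
        return False
    # Rule 2: an illustrative low-income threshold
    if income > 26_000:
        return False
    # Rule 3: a rebate only applies if rates were actually charged
    return rates_paid > 0

# A low-income ratepayer qualifies; a high earner does not
print(rates_rebate_eligible(income=20_000, rates_paid=1_500,
                            is_ratepayer=True, retirement_village_resident=False))
```

Once the rules exist in this form, eligibility can be checked automatically for any citizen, and a change to the legislation becomes a change to one function rather than to many manual interpretations.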

A step towards digital government

The implications of such techniques are significant. Firstly, machine-readable legislation could speed up interactions between government and business, sparing private organisations the costs in time and money they currently spend interpreting the laws they need to comply with.

If legislation changes, the machine can process it automatically and consistently, saving the cost of employing an expert, or a lawyer, to do this job.

More transformatively for policymaking itself, machine-readable legislation allows public servants to test the impact of policy before they implement it.

“What happens currently is that people design the policy up front and wait to see how it works when you eventually deploy it,” said Richard Pope, one of the original pioneers in the UK’s Government Digital Service (GDS) and the co-author of the UK’s digital service standard. “A better approach is to design the legislation in such a way that gives the teams that are making and delivering a service enough wiggle room to be able to test things.”…(More)”.

Open data privacy and security policy issues and its influence on embracing the Internet of Things


Radhika Garg in First Monday: “Information and communication technologies (ICT) are changing the way people interact with each other. Today, every physical device can have the capability to connect to the Internet (digital presence) to send and receive data. Internet-connected cameras, home automation systems, and connected cars are all examples of the interconnected Internet of Things (IoT). IoT can bring benefits to users in terms of monitoring and intelligent capabilities; however, these devices collect, transmit, and store vast amounts of personal and individual data, and have the potential to share it, encroaching on private spaces and remaining vulnerable to security breaches. The ecosystem of IoT comprises not only users, various sensors, and devices but also other stakeholders such as data collectors, processors, regulators, and policy-makers. Even though the number of commercially available IoT devices is on a steep rise, the uptake of these devices has been slow, and abandonment rapid. This paper explains how stakeholders (including users) and technologies form an assemblage in which these stakeholders are cumulatively responsible for making IoT an essential element of day-to-day living and connectivity. To this end, this paper examines open issues in data privacy and security policies (from the perspectives of the European Union and North America), and their effects on stakeholders in the ecosystem. This paper concludes by explaining how these open issues, if unresolved, can lead to another wave of digital division and discrimination in the use of IoT….(More)”.

Crowdsourcing & Data Analytics: The New Settlement Tools


Paper by Bernard Chao, Christopher T. Robertson, and David V. Yokum: “Through the jury trial rights, the State and Federal Constitutions recognize the fundamental value of having laypersons resolve civil and criminal disputes. Nonetheless, settlement allows parties to avoid the risks and cost of trials, and settlements help clear court dockets efficiently. But achieving settlement can be a challenge. Parties naturally view their cases from different perspectives, and these perspectives often cause both sides to be overly optimistic. This article describes a novel method of providing parties more accurate information about the value of their case by incorporating layperson perspectives. Specifically, we suggest that, working with mediators or settlement judges, the parties should create mini-trials and then recruit hundreds of online mock jurors to render decisions. By applying modern statistical techniques to these results, the mediators can show the parties the likelihood of possible outcomes and also collect qualitative information about strengths and weaknesses for each side. These data will counter the parties’ unrealistic views and thereby facilitate settlement….(More)”.
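The statistical step the abstract gestures at can be sketched in a few lines. This is an illustrative example, not the authors' actual method: given verdicts from many online mock jurors, estimate the plaintiff's win probability with a simple normal-approximation confidence interval. The sample size and verdict counts are invented for the example.

```python
# Illustrative sketch (not the paper's method): estimate a plaintiff's win
# probability from simulated mock-juror verdicts, with a 95% CI.
import math

def win_estimate(verdicts, z=1.96):
    """verdicts: list of 1 (plaintiff wins) / 0 (defendant wins).
    Returns (point estimate, CI lower bound, CI upper bound)."""
    n = len(verdicts)
    p = sum(verdicts) / n
    margin = z * math.sqrt(p * (1 - p) / n)  # normal approximation
    return p, max(0.0, p - margin), min(1.0, p + margin)

# 300 hypothetical mock-juror verdicts: 180 for the plaintiff, 120 against
verdicts = [1] * 180 + [0] * 120
p, lo, hi = win_estimate(verdicts)
print(f"plaintiff win rate: {p:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A mediator could show both sides this kind of interval: a party convinced it has a 90% chance of winning is confronted with crowd evidence that the realistic figure sits near 0.55–0.66, narrowing the gap between the parties' expectations.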

A Framework for Strengthening Data Ecosystems to Serve Humanitarian Purposes


Paper by Marc van den Homberg et al: “The incidence of natural disasters worldwide is increasing. As a result, a growing number of people are in need of humanitarian support, for which limited resources are available. This requires an effective and efficient prioritization of the most vulnerable people in the preparedness phase, and the most affected people in the response phase, of humanitarian action. Data-driven models have the potential to support this prioritization process. However, applying these models in a country requires a certain level of data preparedness.
To achieve this level of data preparedness on a large scale, we need to know how to facilitate, stimulate and coordinate data-sharing between humanitarian actors. We use a data ecosystem perspective to develop success criteria for establishing a “humanitarian data ecosystem”. We first present the development of a general framework with data ecosystem governance success criteria based on a systematic literature review. Subsequently, the applicability of this framework in the humanitarian sector is assessed through a case study on the “Community Risk Assessment and Prioritization toolbox” developed by the Netherlands Red Cross. The empirical evidence led to the adaptation of the framework to the specific criteria that need to be addressed when aiming to establish a successful humanitarian data ecosystem….(More)”.

Data sharing in PLOS ONE: An analysis of Data Availability Statements


Lisa M. Federer et al at PLOS One: “A number of publishers and funders, including PLOS, have recently adopted policies requiring researchers to share the data underlying their results and publications. Such policies help increase the reproducibility of the published literature, as well as make a larger body of data available for reuse and re-analysis. In this study, we evaluate the extent to which authors have complied with this policy by analyzing Data Availability Statements from 47,593 papers published in PLOS ONE between March 2014 (when the policy went into effect) and May 2016. Our analysis shows that compliance with the policy has increased, with a significant decline over time in papers that did not include a Data Availability Statement. However, only about 20% of statements indicate that data are deposited in a repository, which the PLOS policy states is the preferred method. More commonly, authors state that their data are in the paper itself or in the supplemental information, though it is unclear whether these data meet the level of sharing required in the PLOS policy. These findings suggest that additional review of Data Availability Statements or more stringent policies may be needed to increase data sharing….(More)”.

Optimal Scope for Free Flow of Non-Personal Data in Europe


Paper by Simon Forge for the European Parliament Think Tank: “Data is not static in a personal/non-personal classification — with modern analytic methods, certain non-personal data can help to generate personal data — so the distinction may become blurred. Thus, de-anonymisation techniques, aided by advances in artificial intelligence (AI) and the manipulation of large datasets, will become a major issue. In some new applications, such as smart cities and connected cars, the enormous volumes of data gathered may be used for personal information as well as for non-personal functions, so such data may cross over from the technical and non-personal into the personal domain. A debate is taking place on whether current EU restrictions on confidentiality of personal private information should be relaxed so as to include personal information in free and open data flows. However, it is unlikely that a loosening of such rules will be positive for the growth of open data. Public distrust of open data flows may be exacerbated by fears of potential commercial misuse of such data, as well as of leakages, cyberattacks, and so on. The proposed recommendations are: to promote the use of open data licences to build trust and openness; to promote sharing of private enterprises’ data within vertical sectors and across sectors to increase the volume of open data through incentive programmes; to support testing for contamination of open data mixed with personal data, ensuring open data is scrubbed clean and so reinforcing public confidence; and to ensure anti-competitive behaviour does not compromise the open data initiative….(More)”.

Open Social Innovation: Why and How Seekers Use Crowdsourcing for Societal Benefits


Paper by Krithika Randhawa, Ralf Wilden, and Joel West: “Despite the increased research attention on crowdsourcing, we know little about why and how seeker organizations use this open innovation mechanism. Furthermore, previous studies have focused on profit-seeking firms, despite the use of open innovation practices by public sector organizations to achieve societal benefits. In this study, we investigate the organizational and project level choices of governments (seekers) that crowdsource from citizens (solvers) to drive open social innovation, and thus develop new ways to address societal problems, a process referred to as “citizensourcing”.

Using a dataset of 18 local government seekers that use the same intermediary to conduct more than 2,000 crowdsourcing projects, we develop a model of seeker crowdsourcing implementation that links a previously unstudied variance in seeker intent and engagement strategies, at the organizational level, to differences in project team motivation and capabilities, in turn leading to varying online engagement behaviors and ultimately project outcomes. Comparing and contrasting governmental with the more familiar corporate context, we further find that the non-pecuniary orientation of both seekers and solvers means that the motives of government crowdsourcing differ fundamentally from corporate crowdsourcing, but that the process more closely resembles a corporate-sponsored community rather than government-sponsored contests. More broadly, we offer insights on how seeker organizational factors and choices shape project-level implementation and success of crowdsourcing efforts, as well as suggest implications for open innovation activities of other smaller, geographically bound organizations….(More)”.