Patient Power: Crowdsourcing in Cancer


Bonnie J. Addario at the HuffPost: “…Understanding how to manage and manipulate vast sums of medical data to improve research and treatments has become a top priority in the cancer enterprise. Researchers at the University of North Carolina Chapel Hill are using IBM’s Watson and its artificial intelligence computing power to great effect. Dr. Norman Sharpless told Charlie Rose from CBS’ 60 Minutes that Watson is reading tens of millions of medical papers weekly (8,000 new cancer research papers are published every day) and regularly scanning the web for new clinical trials most people, including researchers, are unaware of. The task is “essentially undoable” he said, for even the best, well-informed experts.

UNC’s effort is truly wonderful albeit a macro approach, less tailored and accessible only to certain medical centers. My experience tells me what the real problem is: How does a patient newly diagnosed with lung cancer, fragile and scared, find the most relevant information without being overwhelmed and giving up? If the experts can’t easily find key data without Watson’s help, and Google’s first try turns up millions upon millions of semi-useful results, how do we build hope that there are good online answers for our patients?

We’ve thought about this a lot at the Addario Lung Cancer Foundation and figured out that the answer lies with the patients themselves. Why not crowdsource it with people who have lung cancer, their caregivers and family members?

So, we created the first-ever global Lung Cancer Patient Registry that simplifies the collection, management and distribution of critical health-related information – all in one place so that researchers and patients can easily access and find data specific to lung cancer patients.

This is a data-rich environment for those focusing solely on finding a cure for lung cancer. And it gives patients access to other patients to compare notes and generally feel safe sharing intimate details with their peers….(More)”

Data-Driven Policy Making: The Policy Lab Approach


Paper by Anne Fleur van Veenstra and Bas Kotterink: “Societal challenges such as migration, poverty, and climate change can be considered ‘wicked problems’ for which no optimal solution exists. To address such problems, public administrations increasingly aim for data-driven policy making. Data-driven policy making aims to make optimal use of sensor data, and collaborate with citizens to co-create policy. However, few public administrations have realized this so far. Therefore, in this paper an approach for data-driven policy making is developed that can be used in the setting of a Policy Lab. A Policy Lab is an experimental environment in which stakeholders collaborate to develop and test policy. Based on literature, we first identify innovations in data-driven policy making. Subsequently, we map these innovations to the stages of the policy cycle. We found that most innovations are concerned with using new data sources in traditional statistics and that methodologies capturing the benefits of data-driven policy making are still under development. Further research should focus on policy experimentation while developing new methodologies for data-driven policy making at the same time….(More)”.

Open & Shut


Harsha Devulapalli: “Welcome to Open & Shut — a new blog dedicated to exploring the opportunities and challenges of working with open data in closed societies around the world. Although we’ll be exploring questions relevant to open data practitioners worldwide, we’re particularly interested in seeing how civil society groups and actors in the Global South are using open data to push for greater government transparency, and tackle daunting social and economic challenges facing their societies….Throughout this series we’ll be profiling and interviewing organisations working with open data worldwide, and providing do-it-yourself data tutorials that will be useful for beginners as well as data experts. …

What do we mean by the terms ‘open data’ and ‘closed societies’?

It’s important to be clear about what we’re dealing with, here. So let’s establish some key terms. When we talk about ‘open data’, we mean data that anyone can access, use and share freely. And when we say ‘closed societies’, we’re referring to states or regions in which the political and social environment is actively hostile to notions of openness and public scrutiny, and which hold principles of freedom of information in low esteem. In closed societies, data is either not published at all by the government, or else is only published in inaccessible formats, is missing data, is hard to find or else is just not digitised at all.

Iran is one such state that we would characterise as a ‘closed society’. At Small Media, we’ve had to confront the challenges of poor data practice, secrecy, and government opaqueness while undertaking work to support freedom of information and freedom of expression in the country. Based on these experiences, we’ve been working to build Iran Open Data — a civil society-led open data portal for Iran, in an effort to make Iranian government data more accessible and easier for researchers, journalists, and civil society actors to work with.


…Open & Shut will shine a light on the exciting new ways that different groups are using data to question dominant narratives, transform public opinion, and bring about tangible change in closed societies. At the same time, it’ll demonstrate the challenges faced by open data advocates in opening up this valuable data. We intend to get the community talking about the need to build cross-border alliances in order to empower the open data movement, and to exchange knowledge and best practices despite the different needs and circumstances we all face….(More)”

Algorithmic regulation: A critical interrogation


Karen Yeung in Regulation and Governance: “Innovations in networked digital communications technologies, including the rise of “Big Data,” ubiquitous computing, and cloud storage systems, may be giving rise to a new system of social ordering known as algorithmic regulation. Algorithmic regulation refers to decisionmaking systems that regulate a domain of activity in order to manage risk or alter behavior through continual computational generation of knowledge by systematically collecting data (in real time on a continuous basis) emitted directly from numerous dynamic components pertaining to the regulated environment in order to identify and, if necessary, automatically refine (or prompt refinement of) the system’s operations to attain a pre-specified goal. This study provides a descriptive analysis of algorithmic regulation, classifying these decisionmaking systems as either reactive or pre-emptive, and offers a taxonomy that identifies eight different forms of algorithmic regulation based on their configuration at each of the three stages of the cybernetic process: notably, at the level of standard setting (adaptive vs. fixed behavioral standards), information-gathering and monitoring (historic data vs. predictions based on inferred data), and at the level of sanction and behavioral change (automatic execution vs. recommender systems). It maps the contours of several emerging debates surrounding algorithmic regulation, drawing upon insights from regulatory governance studies, legal critiques, surveillance studies, and critical data studies to highlight various concerns about the legitimacy of algorithmic regulation….(More)”.
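The count of eight forms in the abstract follows from two options at each of three stages (2 × 2 × 2). The sketch below simply enumerates those combinations; the labels come from the abstract, while the variable names and the assumption that the taxonomy is the full cross product of the three binary choices are illustrative rather than taken from the paper.

```python
from itertools import product

# Stage labels and options as worded in the abstract; the dictionary
# itself is an illustrative structure, not something the paper defines.
STAGES = {
    "standard setting": ("adaptive", "fixed"),
    "information-gathering and monitoring": (
        "historic data",
        "predictions based on inferred data",
    ),
    "sanction and behavioral change": (
        "automatic execution",
        "recommender systems",
    ),
}

# One option per stage, in every possible combination.
forms = list(product(*STAGES.values()))

for number, form in enumerate(forms, start=1):
    described = "; ".join(
        f"{stage}: {choice}" for stage, choice in zip(STAGES, form)
    )
    print(f"{number}. {described}")
```

Each printed line is one candidate configuration, for example adaptive standards combined with historic data and automatic execution.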

Our digital journey: moving to electronic questionnaires


Jason Bradbury at the Office for National Statistics (UK): “Earlier this year we shared news about the Retail Sales Inquiry (RSI) – the monthly national survey of shops and shopping – moving to digital data collection. ONS is transforming the way it collects data, improving the speed and quality of the information while reducing the burden on respondents. The past six months have seen a significant expansion of our digital survey availability. In January 5,000 retailers were invited to sign up for an account giving them the option to send us their data for one of our business surveys digitally.

Electronic questionnaires

The take-up of the electronic questionnaire (eQ) was incredible, with over 80% of respondents choosing to supply their information for the RSI online. Over the last six months, we have continued to see the appetite for online completion grow. Each month, an average of 300 new businesses opt to return their Retail Sales data digitally, with many eager to move to digital methods for the other surveys they are required to complete….

Moving data collection from phone and paper to online has been a huge success, delivering improved quality and an ‘easy to access’ online experience. When thinking about the impact this change could have had on our core function as a statistical body, I am delighted to share that we have not witnessed any statistical issues and all of our outputs have been compiled and produced as normal.

Put simply, the easier it is for someone to complete our surveys, the more likely they are to take the time to provide more detailed, accurate data. It is worth noting that once a business has an account with ONS, they often send data back to us more quickly. The earlier and more detailed responses allow us more time to quality assure (QA) the information and reduce the need to re-contact the businesses.

Our digital journey

The digital world is a fast-paced and ever-changing environment. We have found it challenging to match this pace in both our team’s skill base and our digital service. We are in the process of up-skilling our teams and updating our data collection service and infrastructure. This will enable us to improve our data collection service and move even more surveys online….(More)”

Smart or dumb? The real impact of India’s proposal to build 100 smart cities


In The Conversation: “In 2014, the new Indian government declared its intention to achieve 100 smart cities.

In promoting this objective, it gave the example of a large development in the island city of Mumbai, Bhendi Bazaar. There, 3-5 storey housing would be replaced with towers of between 40 and 60 storeys to increase density. This has come to be known as “vertical with a vengeance”.

We have obtained details of the proposed project from the developer and the municipal authorities. Using an extended urban metabolism model, which measures the impacts of the built environment, we have assessed its overall impact. We determined how the flows of materials and energy will change as a result of the redevelopment.

Our research shows that the proposal is neither smart nor sustainable.

Measuring impacts

The Indian government clearly defined what it meant by “smart”. Over half of the 11 objectives were environmental and main components of the metabolism of a city. These include adequate water and sanitation, assured electricity, efficient transport, reduced air pollution and resource depletion, and sustainability.

We collected data from various primary and secondary sources. This included physical surveys during site visits, local government agencies, non-governmental organisations, the construction industry and research.

We then made three-dimensional models of the existing and proposed developments to establish morphological changes, including building heights, street widths, parking provision, roof areas, open space, landscaping and other aspects of built form.

Demographic changes (population density, total population) were based on census data, the developer’s calculations and an assessment of available space. Such information about the magnitude of the development and the associated population changes allowed us to analyse the additional resources required as well as the environmental impact….

Case studies such as Bhendi Bazaar provide an example of plans for increased density and urban regeneration. However, they do not offer an answer to the challenge of limited infrastructure to support the resource requirements of such developments.

The results of our research indicate significant adverse impacts on the environment. They show that the metabolism increases at a greater rate than the population grows. On this basis, this proposed development for Mumbai, or the other 99 cities, should not be called smart or sustainable.

With policies that aim to prevent urban sprawl, cities will inevitably grow vertically. But with high-rise housing comes dependence on centralised flows of energy, water supplies and waste disposal. Dependency in turn leads to vulnerability and insecurity….(More)”.

America is not a true democracy. But it could be with the help of technology


Nicole Softness at Quartz: “Many Americans aren’t aware they don’t live in a direct democracy. But with a little digital assistance, they could be….Once completely cut off from the global community, Estonia is now considered a world leader for its efforts to integrate technology with government administration. While standing in line for coffee, you could file your tax return, confirm sensitive personal medical information, and register a new company in just a few swipes, all on Estonia’s free wifi.

What makes this possible without the risk of fraud? Digital trust. Using a technology called blockchain, which verifies online communications and transactions at every step (and essentially eliminates the possibility of online fraud), Estonian leadership has moved the majority of citizenship processes online. Startups have now created new channels for democratic participation, like Rahvaalgatus, an online crowdsourcing platform that allows users to discuss and digitally vote on policy proposals submitted to the Estonian parliament.

Brazil has also utilized this trust quite valiantly. The country’s constitution, passed in 1988, legislated that signatures from 1% of the population could force the Brazilian leadership to recognize any signed document as an official draft bill and vote on it. Until recently, the notion of getting sufficient signatures on paper would have been laughable: that’s just over 2 million physical signatures. However, votes can now be cast online, which makes gathering digital signatures all the easier. As a result, Brazilians now have more control over the legislation being brought before parliament.

Again, blockchain technology is key here, as it creates an immutable record of signatures tied to the identities of voters. The government knows which voters are legitimate citizens, and citizens can be sure their votes remain accurate. When Brazilians are able to participate in this manner, their democracy shifts towards the sort of “direct” democracy that, until now, seemed logistically impossible in modern society.

Australian citizens have engaged in a slightly different experiment, dubbed “Government 2.0.” In March 2016, technology experts convened a new political party called Flux, which they describe as “democracy for the information age.” The party platform argues that bureaucracy stymies key government functions, which cannot process the requisite information required to govern.

If elected to government, members of Flux would vote on bills scheduled to appear before parliament based on the digital ballots of the supporters who voted them in. Voters could choose to participate in casting their vote for that bill themselves, or transfer their votes to trusted experts. Flux representatives in parliament would then cast their votes 100% based on the results of these member participants. (They are yet to win any seats in government, however.)
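The delegation step described above can be sketched as a small tally: each supporter either votes on a bill directly or hands the vote to someone they trust. This is only an illustration under stated assumptions, not Flux's actual system. The function name `tally`, the voter names, and the rule that a vote is dropped when its delegation chain loops or dead-ends are all hypothetical.

```python
from collections import Counter


def tally(direct_votes, delegations):
    """Count votes where each voter either votes directly or delegates.

    direct_votes: voter -> "yes" or "no"
    delegations:  voter -> the voter trusted to vote on their behalf
    A vote whose chain never reaches a direct vote (a dead end or a
    loop) is not counted. This rule is an assumption of the sketch.
    """
    counts = Counter()
    for voter in set(direct_votes) | set(delegations):
        seen = set()
        current = voter
        # Follow the chain of delegations until someone voted directly.
        while (
            current not in direct_votes
            and current in delegations
            and current not in seen
        ):
            seen.add(current)
            current = delegations[current]
        if current in direct_votes:
            counts[direct_votes[current]] += 1
    return counts


# Hypothetical supporters: three vote directly, two delegate through
# "expert", and one delegates to themselves and so is never counted.
direct_votes = {"ana": "yes", "ben": "no", "expert": "yes"}
delegations = {"cat": "expert", "dev": "cat", "eve": "eve"}

result = tally(direct_votes, delegations)
print(result["yes"], result["no"])
```

How the party's representatives would then turn such a tally into votes in parliament is not specified in the excerpt, so the sketch stops at the count.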

These solutions show us that bureaucratic boundaries no longer have to limit our access to a true democracy. The technology is here to make direct democracy the reality that the Greeks once imagined.

Moreover, increasing democratic participation will have positive ripple effects beyond participation in a direct democracy: Informed voting is the gateway to more active civic engagement and a more informed electorate, all of which raises the level of debate in a political environment desperately in need of participation….(More)”

Modernizing government’s approach to transportation and land use data: Challenges and opportunities


Adie Tomer and Ranjitha Shivaram at Brookings: “In the fields of transportation and land use planning, the public sector has long taken the leading role in the collection, analysis, and dissemination of data. Often, public data sets drawn from traveler diaries, surveys, and supply-side transportation maps were the only way to understand how people move around in the built environment – how they get to work, how they drop kids off at school, where they choose to work out or relax, and so on.

But, change is afoot: today, there are not only new data providers, but also new types of data. Cellphones, GPS trackers, and other navigation devices offer real-time demand-side data. For instance, mobile phone data can point to where distracted driving is a problem and help implement measures to deter such behavior. Insurance data and geo-located police data can guide traffic safety improvements, especially in accident-prone zones. Geotagged photo data can illustrate the use of popular public spaces by locals and tourists alike, enabling greater return on investment from public spaces. Data from exercise apps like Fitbit and Runkeeper can help identify recreational hot spots that attract people and those that don’t.

However, integrating all this data into how we actually plan and build communities—including the transportation systems that move all of us and our goods—will not be easy. There are several core challenges. Limited staff capacity and restricted budgets in public agencies can slow adoption. Governmental procurement policies are stuck in an analog era. Privacy concerns introduce risk and uncertainty. Private data could be simply unavailable to public consumers. And even if governments could acquire all of the new data and analytics that interest them, their planning and investment models must be updated to fully utilize these new resources.

Using a mix of primary research and expert interviews, this report catalogs emerging data sets related to transportation and land use, and assesses the ease with which they can be integrated into how public agencies manage the built environment. It finds that there is reason for the hype; we have the ability to know more about how humans move around today than at any time in history. But, despite all the obvious opportunities, not addressing core challenges will limit public agencies’ ability to put all that data to use for the collective good….(More)”

Open data on universities – New fuel for transformation


François van Schalkwyk at University World News: “Accessible, usable and relevant open data on South African universities makes it possible for a wide range of stakeholders to monitor, advise and challenge the transformation of South Africa’s universities from an informed perspective.

Some describe data as the new oil while others suggest it is a new form of capital or compare it to electricity. Either way, there appears to be a groundswell of interest in the potential of data to fuel development.

Whether the proliferation of data is skewing development in favour of globally networked elites or disrupting existing asymmetries of information and power, is the subject of ongoing debate. Certainly, there are those who will claim that open data, from a development perspective, could catalyse disruption and redistribution.

Open data is data that is free to use without restriction. Governments and their agencies, universities and their researchers, non-governmental organisations and their donors, and even corporations, are all potential sources of open data.

Open government data, as a public rather than a private resource, embedded in principles of universal access, participation and transparency, is touted as being able to restore the deteriorating levels of trust between citizens and their governments.

Open data promises to do so by making the decisions and processes of the state more transparent and inclusive, empowering citizens to participate and to hold public institutions to account for the distribution of public services and resources.

Benefits of open data

Open data has other benefits over its more cloistered cousins (data in private networks, big data, etc). By democratising access, open data makes possible the use of data on, for example, health services, crime, the environment, procurement and education by a range of different users, each bringing their own perspective to bear on the data. This can expose bias in the data or may improve the quality of the data by surfacing data errors. Both are important when data is used to shape government policies.

By removing barriers to reusing data such as copyright or licence-fees, tech-savvy entrepreneurs can develop applications to assist the public to make more informed decisions by making available easy-to-understand information on medicine prices, crime hot-spots, air quality, beneficial ownership, school performance, etc. And access to open research data can improve quality and efficiency in science.

Scientists can check and confirm the data on which important discoveries are based if the data is open, and, in some cases, researchers can reuse open data from other studies, saving them the cost and effort of collecting the data themselves.

‘Open washing’

But access alone is not enough for open data to realise its potential. Open data must also be used. And data is used if it holds some value for the user. Governments have been known to publish server rooms full of data that no one is interested in to support claims of transparency and of supporting the knowledge economy. That practice is called ‘open washing’. …(More)”

Formalised data citation practices would encourage more authors to make their data available for reuse


Hyoungjoo Park and Dietmar Wolfram at the LSE Impact Blog: “Today’s researchers work in a heavily data-intensive and collaborative environment in order to further scientific discovery across and within fields. It is becoming routine for researchers (i.e. authors and data publishers) to submit their research data, such as datasets, biological samples in biomedical fields, and computer code, as supplementary information in order to comply with data sharing requirements of major funding agencies, high-profile journals, and data journals. This is part of open science, where data and any publication products are expected to be made available to anyone interested.

Given that researchers benefit from publicly shared data through data reuse in their own research, researchers who provide access to data should be acknowledged for their contributions, much in the same way that authors are recognised for their research publications through citation. Researchers who use shared data or other shared research products (e.g. open access software, tissue cultures) should also acknowledge the providers of these resources through formal citation. At present, data citation is not widely practised in most disciplines and as an object of study remains largely overlooked….

We found that data citations appear in the references section of an article less frequently than in the main text, making it difficult to identify the reward and credit for data authors (i.e. data sharers). Consistent data citation formats could not be found. Current data citation practices do not (yet) benefit data sharers. Also, data citation was sometimes located in the supplementary information, outside of the references. Data that had been reused was often not acknowledged in the reference lists, but was rather hidden in the representation of data (e.g. tables, figures, images, graphs, and other elements), which may be a consequence of the fact that data citation practices are not yet common in scholarly communications.

Ongoing challenges remain in identifying and documenting data citation. First, the practice of informal data citation presents a challenge for accurately documenting data citation. …

Second, data recitation by one or more co-authors of earlier studies (i.e. self-citation) is common, which reduces the broader impact of data sharing by limiting much of the reuse to the original authors…

Third, currently indexed data citations may not include rapidly advancing areas, such as in the hard sciences or computer engineering, because approximately 90% of indexed works were associated with journal articles…

Fourth, the number of authors associated with shared datasets raises questions of the ownership of and responsibility for a collective work, although some journals require one author to be responsible for the data used in the study…(More)”. (See also “An examination of research data sharing and re-use: implications for data citation practice”, published in Scientometrics.)