Enabling Blockchain Innovation in the U.S. Federal Government


Primer by the American Council for Technology – Industry Advisory Council: “… intended to be a foundational tool in the understanding of blockchain and its use cases within the United States federal government. To that end, it should help allay the concerns that some may have about this new technology by providing an introduction to blockchain and its related technologies, and how blockchain can be safely and securely applied to the right government use cases. Blockchain has the potential to help government to reduce fraud, errors and the cost of paper-intensive processes, while enabling collaboration across multiple divisions and agencies to provide more efficient and effective services to citizens. Moreover, the adoption of blockchain may also allow governmental agencies to provide new value-added services to businesses and others which can generate new sources of revenue for these agencies….(More)”.

Our laws don’t do enough to protect our health data


At The Conversation: “A particularly sensitive type of big data is medical big data. Medical big data can consist of electronic health records, insurance claims, information entered by patients into websites such as PatientsLikeMe, and more. Health information can even be gleaned from web searches, Facebook and your recent purchases.

Such data can be used for beneficial purposes by medical researchers, public health authorities, and healthcare administrators. For example, they can use it to study medical treatments, combat epidemics and reduce costs. But others who can obtain medical big data may have more selfish agendas.

I am a professor of law and bioethics who has researched big data extensively. Last year, I published a book entitled Electronic Health Records and Medical Big Data: Law and Policy.

I have become increasingly concerned about how medical big data might be used and who could use it. Our laws currently don’t do enough to prevent harm associated with big data.

What your data says about you

Personal health information could be of interest to many, including employers, financial institutions, marketers and educational institutions. Such entities may wish to exploit it for decision-making purposes.

For example, employers presumably prefer healthy employees who are productive, take few sick days and have low medical costs. However, there are laws that prohibit employers from discriminating against workers because of their health conditions. These laws are the Americans with Disabilities Act (ADA) and the Genetic Information Nondiscrimination Act. So, employers are not permitted to reject qualified applicants simply because they have diabetes, depression or a genetic abnormality.

However, the same is not true for most predictive information regarding possible future ailments. Nothing prevents employers from rejecting or firing healthy workers out of the concern that they will later develop an impairment or disability, unless that concern is based on genetic information.

What non-genetic data can provide evidence regarding future health problems? Smoking status, eating preferences, exercise habits, weight and exposure to toxins are all informative. Scientists believe that biomarkers in your blood and other health details can predict cognitive decline, depression and diabetes.

Even bicycle purchases, credit scores and voting in midterm elections can be indicators of your health status.

Gathering data

How might employers obtain predictive data? An easy source is social media, where many individuals publicly post very private information. Through social media, your employer might learn that you smoke, hate to exercise or have high cholesterol.

Another potential source is wellness programs. These programs seek to improve workers’ health through incentives to exercise, stop smoking, manage diabetes, obtain health screenings and so on. While many wellness programs are run by third-party vendors that promise confidentiality, that is not always the case.

In addition, employers may be able to purchase information from data brokers that collect, compile and sell personal information. Data brokers mine sources such as social media, personal websites, U.S. Census records, state hospital records, retailers’ purchasing records, real property records, insurance claims and more. Two well-known data brokers are Spokeo and Acxiom.

Some of the data employers can obtain identify individuals by name. But even information that does not provide obvious identifying details can be valuable. Wellness program vendors, for example, might provide employers with summary data about their workforce but strip away particulars such as names and birthdates. Nevertheless, de-identified information can sometimes be re-identified by experts. Data miners can match information to data that is publicly available….(More)”.
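
As a rough illustration of the matching described above, the sketch below shows how a “de-identified” health file can be re-linked to names by joining it with a public record (such as a voter roll) on quasi-identifiers like ZIP code, birth date and sex. Every record and column name here is invented for illustration.

```python
# Toy linkage (re-identification) sketch; all records and column names are invented.
import pandas as pd

# "De-identified" health data: names stripped, quasi-identifiers kept.
deidentified = pd.DataFrame({
    "zip": ["44106", "44106", "60614"],
    "birth_date": ["1975-03-02", "1982-11-19", "1975-03-02"],
    "sex": ["F", "M", "F"],
    "diagnosis": ["type 2 diabetes", "depression", "hypertension"],
})

# A public record (e.g. a voter roll) that does carry names.
public_roll = pd.DataFrame({
    "name": ["Jane Roe", "John Doe"],
    "zip": ["44106", "44106"],
    "birth_date": ["1975-03-02", "1982-11-19"],
    "sex": ["F", "M"],
})

# Joining on the shared quasi-identifiers re-attaches names to diagnoses.
reidentified = deidentified.merge(public_roll, on=["zip", "birth_date", "sex"])
print(reidentified[["name", "diagnosis"]])
```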

Reboot for the AI revolution


Yuval Noah Harari in Nature: “The ongoing artificial-intelligence revolution will change almost every line of work, creating enormous social and economic opportunities — and challenges. Some believe that intelligent computers will push humans out of the job market and create a new ‘useless class’; others maintain that automation will generate a wide range of new human jobs and greater prosperity for all. Almost everybody agrees that we should take action to prevent the worst-case scenarios….

Governments might decide to deliberately slow down the pace of automation, to lessen the resulting shocks and allow time for readjustments. But it will probably be both impossible and undesirable to prevent automation and job loss completely. That would mean giving up the immense positive potential of AI and robotics. If self-driving vehicles drive more safely and cheaply than humans, it would be counterproductive to ban them just to protect the jobs of taxi and lorry drivers.

A more sensible strategy is to create new jobs. In particular, as routine jobs are automated, opportunities for new non-routine jobs will mushroom. For example, general physicians who focus on diagnosing known diseases and administering familiar treatments will probably be replaced by AI doctors. Precisely because of that, there will be more money to pay human experts to do groundbreaking medical research, develop new medications and pioneer innovative surgical techniques.

This calls for economic entrepreneurship and legal dexterity. Above all, it necessitates a revolution in education… Creating new jobs might prove easier than retraining people to fill them. A huge useless class might appear, owing to both an absolute lack of jobs and a lack of relevant education and mental flexibility….

With insights gleaned from early warning signs and test cases, scholars should strive to develop new socio-economic models. The old ones no longer hold. For example, twentieth-century socialism assumed that the working class was crucial to the economy, and socialist thinkers tried to teach the proletariat how to translate its immense economic power into political clout. In the twenty-first century, if the masses lose their economic value, they might have to struggle against irrelevance rather than exploitation…. The challenges posed in the twenty-first century by the merger of infotech and biotech are arguably bigger than those thrown up by steam engines, railways, electricity and fossil fuels. Given the immense destructive power of our modern civilization, we cannot afford more failed models, world wars and bloody revolutions. We have to do better this time….(More)”

Laboratories for news? Experimenting with journalism hackathons


Jan Lauren Boyles in Journalism: “Journalism hackathons are computationally based events in which participants create news product prototypes. In the ideal case, the gatherings are rooted in local community, enabling a wide set of institutional stakeholders (legacy journalists, hacker journalists, civic hackers, and the general public) to gather in conversation around key civic issues. This study explores how and to what extent journalism hackathons operate as a community-based laboratory for translating open data from practitioners to the public. Surfaced from in-depth interviews with event organizers encompassing nine countries, the findings illustrate that journalism hackathons are most successful when collaboration integrates civic organizations and community leaders….(More)”.

How “Big Data” Went Bust


The problem with “big data” is not that data is bad. It’s not even that big data is bad: Applied carefully, massive data sets can reveal important trends that would otherwise go undetected. It’s the fetishization of data, and its uncritical use, that tends to lead to disaster, as Julia Rose West recently wrote for Slate. And that’s what “big data,” as a catchphrase, came to represent.

By its nature, big data is hard to interpret. When you’re collecting billions of data points—clicks or cursor positions on a website; turns of a turnstile in a large public space; hourly wind speed observations from around the world; tweets—the provenance of any given data point is obscured. This in turn means that seemingly high-level trends might turn out to be artifacts of problems in the data or methodology at the most granular level possible. But perhaps the bigger problem is that the data you have are usually only a proxy for what you really want to know. Big data doesn’t solve that problem—it magnifies it….

Aside from swearing off data and reverting to anecdote and intuition, there are at least two viable ways to deal with the problems that arise from the imperfect relationship between a data set and the real-world outcome you’re trying to measure or predict.

One is, in short: moar data. This has long been Facebook’s approach. When it became apparent that users’ “likes” were a flawed proxy for what they actually wanted to see more of in their feeds, the company responded by adding more and more proxies to its model. It began measuring other things, like the amount of time they spent looking at a post in their feed, the amount of time they spent reading a story they had clicked on, and whether they hit “like” before or after they had read the piece. When Facebook’s engineers had gone as far as they could in weighting and optimizing those metrics, they found that users were still unsatisfied in important ways. So the company added yet more metrics to the sauce: It started running huge user-survey panels, added new reaction emojis by which users could convey more nuanced sentiments, and started using A.I. to detect clickbait-y language in posts by pages and publishers. The company knows none of these proxies are perfect. But by constantly adding more of them to the mix, it can theoretically edge ever closer to an algorithm that delivers to users the posts that they most want to see.
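
A minimal sketch of what “adding more proxies” to a ranking model can look like: several engagement signals folded into a single score with hand-tuned weights. The signal names and weights below are invented for illustration; they are not Facebook’s actual system.

```python
# Hypothetical illustration of combining engagement proxies into one ranking score.
# The signals and weights are invented, not any real platform's model.
from dataclasses import dataclass

@dataclass
class PostSignals:
    liked: bool            # did the user hit "like"?
    dwell_seconds: float   # time spent looking at the post in the feed
    read_seconds: float    # time spent reading after clicking through
    survey_score: float    # 0-1 rating from a user-survey panel, if any

def rank_score(s: PostSignals) -> float:
    """Weighted sum of proxies; each weight is a guess that gets re-tuned
    as new signals are added and old ones prove misleading."""
    return (
        1.0 * s.liked
        + 0.02 * s.dwell_seconds
        + 0.05 * s.read_seconds
        + 2.0 * s.survey_score
    )

print(rank_score(PostSignals(liked=True, dwell_seconds=12, read_seconds=45, survey_score=0.8)))
```

Each new proxy (survey responses, reaction types, a clickbait score) becomes another term and another weight to tune, which is where the complexity discussed next comes from.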

One downside of the moar data approach is that it’s hard and expensive. Another is that the more variables are added to your model, the more complex, opaque, and unintelligible its methodology becomes. This is part of the problem Pasquale articulated in The Black Box Society. Even the most sophisticated algorithm, drawing on the best data sets, can go awry—and when it does, diagnosing the problem can be nigh-impossible. There are also the perils of “overfitting” and false confidence: The more sophisticated your model becomes, the more perfectly it seems to match up with all your past observations, and the more faith you place in it, the greater the danger that it will eventually fail you in a dramatic way. (Think mortgage crisis, election prediction models, and Zynga.)
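
The overfitting danger is easy to reproduce in miniature. In the sketch below (synthetic data, purely illustrative), a ninth-degree polynomial hugs the noisy training points far more closely than a straight line does, yet it tracks the true underlying signal less faithfully:

```python
# Miniature overfitting demo: the high-degree polynomial fits the noisy training
# points almost perfectly but typically tracks the true signal worse than a line.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 12)
y_train = 2 * x_train + rng.normal(0, 0.2, size=x_train.size)  # noisy linear signal

x_grid = np.linspace(0, 1, 200)
y_true = 2 * x_grid  # the underlying relationship we actually care about

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    true_mse = np.mean((np.polyval(coeffs, x_grid) - y_true) ** 2)
    print(f"degree {degree}: fit-to-training MSE {train_mse:.4f}, "
          f"error vs. true signal {true_mse:.4f}")
```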

Another possible response to the problems that arise from biases in big data sets is what some have taken to calling “small data.” Small data refers to data sets that are simple enough to be analyzed and interpreted directly by humans, without recourse to supercomputers or Hadoop jobs. Like “slow food,” the term arose as a conscious reaction to the prevalence of its opposite….(More)”

 

Open Space: The Global Effort for Open Access to Environmental Satellite Data


Book by Mariel Borowitz: “Key to understanding and addressing climate change is continuous and precise monitoring of environmental conditions. Satellites play an important role in collecting climate data, offering comprehensive global coverage that can’t be matched by in situ observation. And yet, as Mariel Borowitz shows in this book, much satellite data is not freely available but restricted; this remains true despite the data-sharing advocacy of international organizations and a global open data movement. Borowitz examines policies governing the sharing of environmental satellite data, offering a model of data-sharing policy development and applying it in case studies from the United States, Europe, and Japan—countries responsible for nearly half of the unclassified government Earth observation satellites.

Borowitz develops a model that centers on the government agency as the primary actor while taking into account the roles of such outside actors as other government officials and non-governmental actors, as well as the economic, security, and normative attributes of the data itself. The case studies include the U.S. National Aeronautics and Space Administration (NASA), the National Oceanic and Atmospheric Administration (NOAA), and the United States Geological Survey (USGS); the European Space Agency (ESA) and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT); and the Japan Aerospace Exploration Agency (JAXA) and the Japan Meteorological Agency (JMA). Finally, she considers the policy implications of her findings for the future and provides recommendations on how to increase global sharing of satellite data….(More)”.

Our Gutenberg Moment: It’s Time To Grapple With The Internet’s Effect On Democracy


Alberto Ibargüen at HuffPost: “When clashes wracked Charlottesville, many Americans saw neo-Nazi demonstrators as the obvious instigators. But others focused on counter-demonstrators, a view amplified by the president blaming “many sides.” The rift in perception underscored an uncomfortable but unavoidable truth about the flow of information today: Americans no longer have a shared foundation of facts upon which we can agree.

Politics has long been a messy, divisive business. I lived through the 1960s, a period of similar dissatisfaction, disillusionment, and disunity, brilliantly chronicled by Ken Burns’ new film “The Vietnam War” on PBS. But common, local knowledge — of history and current events — has always been the great equalizer in American society. Today, however, a decrease in shared knowledge has led to a collapse in trust. Over the past few years, we have watched our capacity to compromise wane as not only our politics, but also our most basic value systems, have become polarized.

The key difference between then and now is how news is delivered and consumed. At the beginning of our Republic, the reach of media was local and largely verifiable. That direct relationship between media outlets and their communities — local newspapers and, later, radio and TV stations — held until the second half of the 20th century. Network TV began to create a sense of national community, but it fractured with the sudden ability to offer targeted, membership-based models via cable.

But cable was nothing compared to Internet. Internet’s unique ability to personalize and to create virtual communities of interest accelerated the decline of newspapers and television business models and altered the flow of information in ways that we are still uncovering. “Media” now means digital and cable, cool mediums that require hot performance. Trust in all media, including traditional media, is at an all-time low, and we’re just now beginning to grapple with the threat to democracy posed by this erosion of trust.

Internet is potentially the greatest democratizing tool in history. It is also democracy’s greatest challenge. In offering access to information that can support any position and confirm any bias, social media has propelled the erosion of our common set of everyday facts….(More)”.

Open data, democracy and public service reform


Mark Thompson at Computer Weekly: “Discussion around reforming public services is as important as better information sharing rules if government is to make the most of public data…

Our public services face two paradoxes in relation to data sharing. First, on the demand side, “Zuckerberg’s law” – which claims that the amount of data we’re happy to share with companies increases exponentially year-on-year – flies in the face of our wariness as citizens to share with the state….

The upcoming General Data Protection Regulation (GDPR) – a beefed-up version of the existing Data Protection Act (DPA) – is likely to only exacerbate a fundamental problem, therefore: citizens don’t want the state to know much about them, and public servants don’t want to share. Each behaviour is paradoxical, and thus complex to address culturally.

Worse, we need to accelerate our public conversation considerably if we are to maintain pace with accelerating technological developments.

Existing complexity in the data space will shortly be exacerbated by new abilities to process unstructured data such as images and natural language – abilities which offer entirely new opportunities for commercial exploitation as well as surveillance…(More)”.

Priceless? A new framework for estimating the cost of open government reforms


New paper by Praneetha Vissapragada and Naomi Joswiak: “The Open Government Costing initiative, seeded with funding from the World Bank, was undertaken to develop a practical and actionable approach to pinpointing the full economic costs of various open government programs. The methodology developed through this initiative represents an important step towards conducting more sophisticated cost-benefit analyses – and ultimately understanding the true value – of open government reforms intended to increase citizen engagement, promote transparency and accountability, and combat corruption, insights that have been sorely lacking in the open government community to date. The Open Government Costing Framework and Methods section (Section 2 of this report) outlines the critical components needed to conduct cost analysis of open government programs, with the ultimate objective of putting a price tag on key open government reform programs in various countries at a particular point in time. This framework introduces a costing process that employs six essential steps for conducting a cost study, including (1) defining the scope of the program, (2) identifying types of costs to assess, (3) developing a framework for costing, (4) identifying key components, (5) conducting data collection, and (6) conducting data analysis. While the costing methods are built on related approaches used for analysis in other sectors such as health and nutrition, this framework and methodology were specifically adapted for open government programs and thus address the unique challenges associated with these types of initiatives. Using the methods outlined in this document, we conducted a cost analysis of two case studies: (1) ProZorro, an e-procurement program in Ukraine; and (2) Sierra Leone’s Open Data Program….(More)”

The Supreme Court Is Allergic To Math


At FiveThirtyEight: “The Supreme Court does not compute. Or at least some of its members would rather not. The justices, the most powerful jurists in the land, seem to have a reluctance — even an allergy — to taking math and statistics seriously.

For decades, the court has struggled with quantitative evidence of all kinds in a wide variety of cases. Sometimes justices ignore this evidence. Sometimes they misinterpret it. And sometimes they cast it aside in order to hold on to more traditional legal arguments. (And, yes, sometimes they also listen to the numbers.) Yet the world itself is becoming more computationally driven, and some of those computations will need to be adjudicated before long. Some major artificial intelligence case will likely come across the court’s desk in the next decade, for example. By voicing an unwillingness to engage with data-driven empiricism, justices — and thus the court — are at risk of making decisions without fully grappling with the evidence.

This problem was on full display earlier this month, when the Supreme Court heard arguments in Gill v. Whitford, a case that will determine the future of partisan gerrymandering — and the contours of American democracy along with it. As my colleague Galen Druke has reported, the case hinges on math: Is there a way to measure a map’s partisan bias and to create a standard for when a gerrymandered map infringes on voters’ rights?…(More)”.
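
For readers unfamiliar with the math at issue: one measure central to the plaintiffs’ case in Gill v. Whitford is the “efficiency gap,” which compares how many votes each party “wastes” (every vote cast for a losing candidate, plus winners’ votes beyond the bare majority needed to carry a district). The sketch below runs a toy version of that calculation on made-up district totals, not the actual Wisconsin figures.

```python
# Toy "efficiency gap" calculation on made-up two-party district totals.
# A wasted vote is any vote for a losing candidate, plus any winning vote
# beyond the bare majority needed to carry the district.

districts = [  # (party A votes, party B votes), hypothetical numbers
    (70_000, 30_000),
    (68_000, 32_000),
    (45_000, 55_000),
    (42_000, 58_000),
    (41_000, 59_000),
]

wasted_a = wasted_b = total_votes = 0
for a, b in districts:
    needed = (a + b) // 2 + 1          # bare majority in this district
    if a > b:
        wasted_a += a - needed         # A's surplus votes
        wasted_b += b                  # all of B's votes lost
    else:
        wasted_b += b - needed
        wasted_a += a
    total_votes += a + b

# Positive gap: party A wasted more votes, i.e. the map disadvantages party A.
efficiency_gap = (wasted_a - wasted_b) / total_votes
print(f"Efficiency gap: {efficiency_gap:+.1%}")
```

In this made-up map, party A wins about 53 percent of the votes but only two of five seats, and the gap comes out strongly positive, exactly the kind of asymmetry the measure is designed to flag.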