Making Better Use of Health Care Data


Benson S. Hsu, MD and Emily Griese in Harvard Business Review: “At Sanford Health, a $4.5 billion rural integrated health care system, we deliver care to over 2.5 million people in 300 communities across 250,000 square miles. In the process, we collect and store vast quantities of patient data – everything from admission, diagnostic, treatment and discharge data to online interactions between patients and providers, as well as data on providers themselves. All this data clearly represents a rich resource with the potential to improve care, but until recently was underutilized. The question was, how best to leverage it.

While we have a mature data infrastructure including a centralized data and analytics team, a standalone virtual data warehouse linking all data silos, and strict enterprise-wide data governance, we reasoned that the best way forward would be to collaborate with other institutions that had additional and complementary data capabilities and expertise.

We reached out to potential academic partners who were leading the way in data science, from university departments of math, science, and computer informatics to business and medical schools and invited them to collaborate with us on projects that could improve health care quality and lower costs. In exchange, Sanford created contracts that gave these partners access to data whose use had previously been constrained by concerns about data privacy and competitive-use agreements. With this access, academic partners are advancing their own research while providing real-world insights into care delivery.

The resulting Sanford Data Collaborative, now in its second year, has attracted regional and national partners and is already beginning to deliver data-driven innovations that are improving care delivery, patient engagement, and care access. Here we describe three that hold particular promise.

  • Developing Prescriptive Algorithms…
  • Augmenting Patient Engagement…
  • Improving Access to Care…(More)”.

Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It?


The New York Times: “The idea has been around for a bit. Jaron Lanier, the tech philosopher and virtual-reality pioneer who now works for Microsoft Research, proposed it in his 2013 book, “Who Owns the Future?,” as a needed corrective to an online economy mostly financed by advertisers’ covert manipulation of users’ consumer choices.

It is being picked up in “Radical Markets,” a book due out shortly from Eric A. Posner of the University of Chicago Law School and E. Glen Weyl, principal researcher at Microsoft. And it is playing into European efforts to collect tax revenue from American internet giants.

In a report obtained last month by Politico, the European Commission proposes to impose a tax on the revenue of digital companies based on their users’ location, on the grounds that “a significant part of the value of a business is created where the users are based and data is collected and processed.”

Users’ data is a valuable commodity. Facebook offers advertisers precisely targeted audiences based on user profiles. YouTube, too, uses users’ preferences to tailor its feed. Still, this pales in comparison with how valuable data is about to become, as the footprint of artificial intelligence extends across the economy.

Data is the crucial ingredient of the A.I. revolution. Training systems to perform even relatively straightforward tasks like voice translation, voice transcription or image recognition requires vast amounts of data — like tagged photos, to identify their content, or recordings with transcriptions.

“Among leading A.I. teams, many can likely replicate others’ software in, at most, one to two years,” notes the technologist Andrew Ng. “But it is exceedingly difficult to get access to someone else’s data. Thus data, rather than software, is the defensible barrier for many businesses.”

We may think we get a fair deal, offering our data as the price of sharing puppy pictures. By other metrics, we are being victimized: In the largest technology companies, the share of income going to labor is only about 5 to 15 percent, Mr. Posner and Mr. Weyl write. That’s way below Walmart’s 80 percent. Consumer data amounts to work they get free….

The big question, of course, is how we get there from here. My guess is that it would be naïve to expect Google and Facebook to start paying for user data of their own accord, even if that improved the quality of the information. Could policymakers step in, somewhat the way the European Commission did, demanding that technology companies compute the value of consumer data?…(More)”.

Informed Diet Selection: Increasing Food Literacy through Crowdsourcing


Paper by Niels van Berkel et al: “The obesity epidemic is one of the greatest threats to health and wellbeing throughout much of the world. Despite information on healthy lifestyles and eating habits being more accessible than ever before, the situation seems to be growing worse. And for a person who wants to lose weight there are practically unlimited options and temptations to choose from. Food, or dieting, is a booming business, and thousands of companies and vendors want their cut by pitching their solutions, particularly online (Google) where people first turn to find weight loss information. In our work, we have set out to harness the wisdom of crowds in making sense of available diets, and to offer a direct way for users to increase their food literacy during diet selection. The Diet Explorer is a crowd-powered online knowledge base that contains an arbitrary number of weight loss diets that are all assessed in terms of an arbitrary set of criteria…(More)”.
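
As a rough illustration of the knowledge-base idea described above (any number of diets, each assessed against any number of criteria), the sketch below collects crowd ratings into a diet-by-criterion table. The excerpt does not say how the Diet Explorer combines individual assessments; averaging is an assumption made here, and the diet names, criteria, and the aggregate_ratings function are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_ratings(ratings):
    """Combine crowd ratings into a {diet: {criterion: average score}} table.

    `ratings` is an iterable of (diet, criterion, score) tuples.
    Averaging is an assumption; the paper excerpt does not specify
    how individual assessments are combined.
    """
    buckets = defaultdict(list)
    for diet, criterion, score in ratings:
        buckets[(diet, criterion)].append(score)

    table = defaultdict(dict)
    for (diet, criterion), scores in buckets.items():
        table[diet][criterion] = mean(scores)
    return {diet: dict(criteria) for diet, criteria in table.items()}


# Hypothetical crowd input: diet names and criteria are made up.
sample = [
    ("Diet A", "affordability", 2),
    ("Diet A", "affordability", 4),
    ("Diet A", "ease of following", 5),
    ("Diet B", "affordability", 5),
]
summary = aggregate_ratings(sample)
```

Because both diets and criteria are plain dictionary keys, new ones can be added without changing the structure, which matches the "arbitrary number" framing in the excerpt.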

Journalism and artificial intelligence


Notes by Charlie Beckett (at LSE’s Media Policy Project Blog): “…AI and machine learning is a big deal for journalism and news information. Possibly as important as the other developments we have seen in the last 20 years such as online platforms, digital tools and social media. My 2008 book on how journalism was being revolutionised by technology was called SuperMedia because these technologies offered extraordinary opportunities to make journalism much more efficient and effective – but also to transform what we mean by news and how we relate to it as individuals and communities. Of course, that can be super good or super bad.

Artificial intelligence and machine learning can help the news media with its three core problems:

  1. The overabundance of information and sources that leave the public confused
  2. The credibility of journalism in a world of disinformation and falling trust and literacy
  3. The Business model crisis – how can journalism become more efficient – avoiding duplication; be more engaged, add value and be relevant to the individual’s and communities’ need for quality, accurate information and informed, useful debate.

But like any technology they can also be used by bad people or for bad purposes: in journalism that can mean clickbait, misinformation, propaganda, and trolling.

Some caveats about using AI in journalism:

  1. Narratives are difficult to program. Trusted journalists are needed to understand and write meaningful stories.
  2. Artificial Intelligence needs human inputs. Skilled journalists are required to double check results and interpret them.
  3. Artificial Intelligence increases quantity, not quality. It’s still up to the editorial team and developers to decide what kind of journalism the AI will help create….(More)”.

A primer on political bots: Part one


Stuart W. Shulman et al at Data Driven Journalism: “The rise of political bots brings into sharp focus the role of automated social media accounts in today’s democratic civil society. Events during the Brexit referendum and the 2016 U.S. Presidential election revealed the scale of this issue for the first time to the majority of citizens and policy-makers. At the same time, the deployment of Russian-linked bots designed to promote pro-gun laws in the aftermath of the Florida school shooting demonstrates the state-sponsored, real-time readiness to shape, through information warfare, the dominant narratives on platforms such as Twitter. The regular news reports on these issues lead us to conclude that the foundations of democracy have become threatened by the presence of aggressive and socially disruptive bots, which aim to manipulate online political discourse.

While there is clarity on the various functions that bot accounts can be scripted to perform, as described below, the task of accurately defining this phenomenon and identifying bot accounts remains a challenge. At Texifter, we have endeavoured to bring nuance to this issue through a research project which explores the presence of automated accounts on Twitter. Initially, this project concerned itself with an attempt to identify bots which participated in online conversations around the prevailing cryptocurrency phenomenon. This article is the first in a series of three blog posts produced by the researchers at Texifter that outlines the contemporary phenomenon of Twitter bots….

Bots in their current iteration have a relatively short, albeit rapidly evolving, history. Initially constructed with non-malicious intentions, it wasn’t until the late 1990s with the advent of Web 2.0 that bots began to develop a more negative reputation. Although bots have been used maliciously in distributed denial-of-service (DDoS) attacks, spam emails, and mass identity theft, their purpose is not explicitly to incite mayhem.

Before the most recent political events, bots existed in chat rooms, operated as automated customer service agents on websites, and were a mainstay on dating websites. This familiar form of the bot is known to the majority of the general population as a “chatbot” – for instance, CleverBot was and still is a popular platform to talk to an “AI”. Another prominent example was Microsoft’s failed Twitter chatbot Tay, which made headlines in 2016 when “her” vocabulary and conversation functions were manipulated by Twitter users until “she” espoused neo-Nazi views, after which “she” was deleted.

Image: XKCD Comic #632.

A Twitter bot is an account controlled by an algorithm or script, which is typically hosted on a cloud platform such as Heroku. They are typically, though not exclusively, scripted to conduct repetitive tasks. For example, there are bots that retweet content containing particular keywords, reply to new followers, and send direct messages to new followers, although they can also be used for more complex tasks such as participating in online conversations. Bot accounts make up between 9% and 15% of all active accounts on Twitter; however, it is predicted that they account for a much greater percentage of total Twitter traffic. Twitter bots are generally not created with malicious intent; they are frequently used for online chatting or for raising the professional profile of a corporation – but their ability to pervade our online experience and shape political discourse warrants heightened scrutiny….(More)”.
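
To make the "repetitive task" idea concrete, here is a minimal sketch of the keyword rule behind a retweet bot of the kind described above. It is not taken from the Texifter article and deliberately makes no calls to any real Twitter API: the Tweet class, should_retweet, and select_retweets are hypothetical names, and posts are plain in-memory objects.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Tweet:
    """Hypothetical stand-in for a post; not a real Twitter API object."""
    author: str
    text: str


def should_retweet(tweet: Tweet, keywords: Iterable[str]) -> bool:
    """Return True if the tweet text contains any watched keyword."""
    # Normalise each word: strip common punctuation and hashtag/mention
    # markers, then lowercase, so "#Bitcoin!" matches "bitcoin".
    words = {w.strip(".,!?#@:;\"'").lower() for w in tweet.text.split()}
    watched = {k.lower() for k in keywords}
    return not words.isdisjoint(watched)


def select_retweets(stream: Iterable[Tweet], keywords: Iterable[str]) -> List[Tweet]:
    """Filter a stream of tweets down to those the bot would retweet."""
    watched = list(keywords)
    return [t for t in stream if should_retweet(t, watched)]


# Hypothetical input; the accounts and texts are made up.
stream = [
    Tweet("alice", "Buy #Bitcoin now!"),
    Tweet("bob", "Nice weather today"),
]
picked = select_retweets(stream, ["bitcoin"])
```

A rule this simple is enough to generate large volumes of traffic from a single script, which is one reason bot activity can exceed the bots' share of accounts.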

Do Academic Journals Favor Researchers from Their Own Institutions?


Yaniv Reingewertz and Carmela Lutmar at Harvard Business Review: “Are academic journals impartial? While many would suggest that academic journals work for the advancement of knowledge and science, we show this is not always the case. In a recent study, we find that two international relations (IR) journals favor articles written by authors who share the journal’s institutional affiliation. We term this phenomenon “academic in-group bias.”

In-group bias is a well-known phenomenon that is widely documented in the psychological literature. People tend to favor their group, whether it is their close family, their hometown, their ethnic group, or any other group affiliation. Before our study, the evidence regarding academic in-group bias was scarce, with only one study finding academic in-group bias in law journals. Studies from economics found mixed results. Our paper provides evidence of academic in-group bias in IR journals, showing that this phenomenon is not specific to law. We also provide tentative evidence which could potentially resolve the conflict in economics, suggesting that these journals might also exhibit in-group bias. In short, we show that academic in-group bias is general in nature, even if not necessarily large in scope….(More)”.

The Social Media Threat to Society and Security


George Soros at Project Syndicate: “It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. And there is a real chance that, once lost, those who grow up in the digital age – in which the power to command and shape people’s attention is increasingly concentrated in the hands of a few companies – will have difficulty regaining it.

The current moment in world history is a painful one. Open societies are in crisis, and various forms of dictatorships and mafia states, exemplified by Vladimir Putin’s Russia, are on the rise. In the United States, President Donald Trump would like to establish his own mafia-style state but cannot, because the Constitution, other institutions, and a vibrant civil society won’t allow it….

The rise and monopolistic behavior of the giant American Internet platform companies is contributing mightily to the US government’s impotence. These companies have often played an innovative and liberating role. But as Facebook and Google have grown ever more powerful, they have become obstacles to innovation, and have caused a variety of problems of which we are only now beginning to become aware…

Social media companies’ true customers are their advertisers. But a new business model is gradually emerging, based not only on advertising but also on selling products and services directly to users. They exploit the data they control, bundle the services they offer, and use discriminatory pricing to keep more of the benefits that they would otherwise have to share with consumers. This enhances their profitability even further, but the bundling of services and discriminatory pricing undermine the efficiency of the market economy.

Social media companies deceive their users by manipulating their attention, directing it toward their own commercial purposes, and deliberately engineering addiction to the services they provide. This can be very harmful, particularly for adolescents.

There is a similarity between Internet platforms and gambling companies. Casinos have developed techniques to hook customers to the point that they gamble away all of their money, even money they don’t have.

Something similar – and potentially irreversible – is happening to human attention in our digital age. This is not a matter of mere distraction or addiction; social media companies are actually inducing people to surrender their autonomy. And this power to shape people’s attention is increasingly concentrated in the hands of a few companies.

It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. Once lost, those who grow up in the digital age may have difficulty regaining it.

This would have far-reaching political consequences. People without the freedom of mind can be easily manipulated. This danger does not loom only in the future; it already played an important role in the 2016 US presidential election.

There is an even more alarming prospect on the horizon: an alliance between authoritarian states and large, data-rich IT monopolies, bringing together nascent systems of corporate surveillance with already-developed systems of state-sponsored surveillance. This may well result in a web of totalitarian control the likes of which not even George Orwell could have imagined….(More)”.

Regulatory sandbox lessons learned report


Financial Conduct Authority (UK): “The sandbox allows firms to test innovative products, services or business models in a live market environment, while ensuring that appropriate protections are in place. It was established to support the FCA’s objective of promoting effective competition in the interests of consumers and opened for applications in June 2016.

The sandbox has supported 50 firms from 146 applications received across the first two cohorts. This report sets out the sandbox’s overall impact on the market including the adoption of new technologies, increasing access and improving experiences for vulnerable consumers as well as lessons learnt from individual tests that have been, or are being, conducted as part of the sandbox.

Early indications suggest the sandbox is providing the benefits it set out to achieve with evidence of the sandbox enabling new products to be tested, reducing time and cost of getting innovative ideas to market, improving access to finance for innovators, and ensuring appropriate safeguards are built into new products and services.

We will be using these learnings to inform any future sandbox developments as well as our ongoing policymaking and supervision work….(More)”.

Small Data for Big Impact


Liz Luckett at the Stanford Social Innovation Review: “As an investor in data-driven companies, I’ve been thinking a lot about my grandfather—a baker, a small business owner, and, I now realize, a pioneering data scientist. Without much more than pencil, paper, and extraordinarily deep knowledge of his customers in Washington Heights, Manhattan, he bought, sold, and managed inventory while also managing risk. His community was poor, but his business prospered. This was not because of what we celebrate today as the power and predictive promise of big data, but rather because of what I call small data: nuanced market insights that come through regular and trusted interactions.

Big data takes into account volumes of information from largely electronic sources—such as credit cards, pay stubs, test scores—and segments people into groups. As a result, people participating in the formalized economy benefit from big data. But people who are paid in cash and have no recognized accolades, such as higher education, are left out. Small data captures those insights to address this market failure. My grandfather, for example, had critical customer information he carefully gathered over the years: who could pay now, who needed a few days more, and which tabs to close. If he had access to a big data algorithm, it likely would have told him all his clients were unlikely to repay him, based on the fact that they were low income (vs. high income) and low education level (vs. college degree). Today, I worry that in our enthusiasm for big data and aggregated predictions, we often lose the critical insights we can gain from small data, because we don’t collect it. In the process, we are missing vital opportunities to both make money and create economic empowerment.

We won’t solve this problem of big data by returning to my grandfather’s shop floor. What we need is more and better data—a small data movement to supply vital missing links in marketplaces and supply chains the world over. What are the proxies that allow large companies to discern whom among the low income are good customers in the absence of a shopkeeper? At The Social Entrepreneurs’ Fund (TSEF), we are profitably investing in a new breed of data company: enterprises that are intentionally and responsibly serving low-income communities, and generating new and unique insights about the behavior of individuals in the process. The value of the small data they collect is becoming increasingly useful to other partners, including corporations who are willing to pay for it. It is a kind of dual market opportunity that for the first time makes it economically advantageous for these companies to reach the poor. We are betting on small data to transform opportunities and quality of life for the underserved, tap into markets that were once seen as too risky or too costly to reach, and earn significant returns for investors….(More)”.

Building Trust in Data and Statistics


Shaida Badiee at UN World Data Forum: “…What do we want for a 2030 data ecosystem?

Hope to achieve: A world where data are part of the DNA and culture of decision-making, used by all and valued as an important public good. A world where citizens trust the systems that produce data and have the skills and means to use and verify their quality and accuracy. A world where there are safeguards in place to protect privacy, while bringing the benefits of open data to all. In this world, countries value their national statistical systems, which are working independently with trusted partners in the public and private sectors and citizens to continuously meet the changing and expanding demands from data users and policy makers. Private sector data generators are generously sharing their data with public sector. And gaps in data are closing, making the dream of “leaving no one behind” come true, with SDG goals on the path to being met by 2030.

Hope to avoid: A world where large corporations control the bulk of national and international data and statistics with only limited sharing with the public sector, academics, and citizens. The culture of every man for himself and who pays, wins, dominates data sharing practices. National statistical systems are under-resourced and under-valued, with low trust from users, further weakening them and undermining their independence from political interference and their ability to control quality. The divide between those who have and those who do not have access, skills, and the ability to use data for decision-making and policy has widened. Data systems and their promise to count the uncounted and “leave no one behind” are falling behind due to low capacity and poor standards and institutions, and the hope of the 2030 agenda is fading.

With this vision in mind, are we on the right path? An optimist would say we are closer to the data ecosystem that we want to achieve. However, there are also some examples of movement in the wrong direction. There is no magic wand to make our wish come true, but a powerful enabler would be building trust in data and statistics. Therefore, this should be included as a goal in all our data strategies and action plans.

Here are some important building blocks underlying trust in data and statistics:

  1. Building strong organizational infrastructure, governance, and partnerships;
  2. Following sound data standards and principles for production, sharing, interoperability, and dissemination; and
  3. Addressing the last mile in the data value chain to meet users’ needs, create value with data, and ensure meaningful impacts…(More)”.