Agora: Towards An Open Ecosystem for Democratizing Data Science & Artificial Intelligence


Paper by Jonas Traub et al: “Data science and artificial intelligence are driven by a plethora of diverse data-related assets including datasets, data streams, algorithms, processing software, compute resources, and domain knowledge. As providing all these assets requires a huge investment, data sciences and artificial intelligence are currently dominated by a small number of providers who can afford these investments. In this paper, we present a vision of a data ecosystem to democratize data science and artificial intelligence. In particular, we envision a data infrastructure for fine-grained asset exchange in combination with scalable systems operation. This will overcome lock-in effects and remove entry barriers for new asset providers. Our goal is to enable companies, research organizations, and individuals to have equal access to data, data science, and artificial intelligence. Such an open ecosystem has recently been put on the agenda of several governments and industrial associations. We point out the requirements and the research challenges as well as outline an initial data infrastructure architecture for building such a data ecosystem…(More)”.

Citizens need to know numbers


David Spiegelhalter at Aeon: “…Many criticised the Leave campaign for its claim that Britain sends the EU £350 million a week. When Boris Johnson repeated it in 2017 – by which time he was Foreign Secretary – the chair of the UK Statistics Authority (the official statistical watchdog) rebuked him, noting it was a ‘clear misuse of official statistics’. A private criminal prosecution was even made against Johnson for ‘misconduct in a public office’, but it was halted by the High Court.

The message on the bus had a strong emotional resonance with millions of people, even though it was essentially misinformation. The episode demonstrates both the power and weakness of statistics: they can be used to amplify an entire worldview, and yet they often do not stand up to scrutiny. This is why statistical literacy is so important – in an age in which data plays an ever-more prominent role in society, the ability to spot ways in which numbers can be misused, and to be able to deconstruct claims based on statistics, should be a standard civic skill.

Statistics are not cold hard facts – as Nate Silver writes in The Signal and the Noise (2012): ‘The numbers have no way of speaking for themselves. We speak for them. We imbue them with meaning.’ Not only has someone used extensive judgment in choosing what to measure, how to define crucial ideas, and to analyse them, but the manner in which they are communicated can utterly change their emotional impact. Let’s assume that £350 million is the actual weekly contribution to the EU. I often ask audiences to suggest what they would put on the side of the bus if they were on the Remain side. A standard option for making an apparently big number look small is to consider it as a proportion of an even bigger number: for example, the UK’s GDP is currently around £2.3 trillion, and so this contribution would comprise less than 1 per cent of GDP, around six months’ typical growth. An alternative device is to break down expenditure into smaller, more easily grasped units: for example, as there are 66 million people in the UK, £350 million a week is equivalent to around 75p a day, less than $1, say about the cost of a small packet of crisps (potato chips). If the bus had said: We each send the EU the price of a packet of crisps each day, the campaign might not have been so successful.

Numbers are often used to persuade rather than inform, statistical literacy needs to be improved, and so surely we need more statistics courses in schools and universities? Well, yes, but this should not mean more of the same. After years of researching and teaching statistical methods, I am not alone in concluding that the way in which we teach statistics can be counterproductive, with an overemphasis on mathematical foundations through probability theory, long lists of tests and formulae to apply, and toy problems involving, say, calculating the standard deviation of the weights of cod. The American Statistical Association’s Guidelines for Assessment and Instruction in Statistics Education (2016) strongly recommended changing the pedagogy of statistics into one based on problem-solving, real-world examples, and with an emphasis on communication….(More)”.
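
The two reframings in the excerpt above (the contribution as a share of GDP, and as a per-person daily amount) can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the excerpt (£350 million a week, 66 million people, GDP of around £2.3 trillion); the function and constant names are illustrative, and a 52-week year is assumed.

```python
# Figures as quoted in the excerpt; all are approximate.
WEEKLY_CONTRIBUTION_GBP = 350_000_000   # claimed weekly payment to the EU
UK_POPULATION = 66_000_000              # people in the UK
UK_GDP_GBP = 2.3e12                     # annual GDP, around £2.3 trillion


def pence_per_person_per_day(weekly_total, population):
    """Break a weekly total down into pence per person per day."""
    return weekly_total / population / 7 * 100


def share_of_gdp(weekly_total, annual_gdp, weeks_per_year=52):
    """Express a weekly total as a fraction of annual GDP (52-week year assumed)."""
    return weekly_total * weeks_per_year / annual_gdp


daily_pence = pence_per_person_per_day(WEEKLY_CONTRIBUTION_GBP, UK_POPULATION)
gdp_share = share_of_gdp(WEEKLY_CONTRIBUTION_GBP, UK_GDP_GBP)

print(f"{daily_pence:.1f}p per person per day")  # about 75p, as in the excerpt
print(f"{gdp_share:.2%} of GDP")                 # under 1 per cent
```

The same number therefore reads as "£350 million a week", "under 1 per cent of GDP", or "about 75p each per day", depending on the denominator chosen.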

Experimental Innovation Policy


Paper by Albert Bravo-Biosca: “Experimental approaches are increasingly being adopted across many policy fields, but innovation policy has been lagging. This paper reviews the case for policy experimentation in this field, describes the different types of experiments that can be undertaken, discusses some of the unique challenges to the use of experimental approaches in innovation policy, and summarizes some of the emerging lessons, with a focus on randomized trials. The paper concludes describing how at the Innovation Growth Lab we have been working with governments across the OECD to help them overcome the barriers to policy experimentation in order to make their policies more impactful….(More)”.

The business case for integrating claims and clinical data


Claudia Williams at MedCityNews: “The path to value-based care is arduous. For health plans, their ability to manage care, assess quality, lower costs, and streamline reporting is directly impacted by access to clinical data. For providers, the same can be said due to their lack of access to claims data. 

Providers and health plans are increasingly demanding integrated claims and clinical data to drive and support value-based care programs. These organizations know that clinical and claims information from more than a single organization is the only way to get a true picture of patient care. From avoiding medication errors to enabling an evidence-based approach to treatment or identifying at-risk patients, the value of integrated claims and clinical data is immense — and will have far-reaching influence on both health outcomes and costs of care over time.

On July 30, Medicare announced the Data at the Point of Care pilot to share valuable claims data with Medicare providers in order to “fill in information gaps for clinicians, giving them a more structured and complete patient history with information like previous diagnoses, past procedures, and medication lists.” But that’s not the only example. To transition from fee-for-service to value-based care, providers and health plans have begun to partner with health data networks to access integrated clinical and claims data: 

Health plan adoption of integrated data strategy

A California health plan is partnering with one of the largest nonprofit health data networks in California, to better integrate clinical and claims data. …

Providers leveraging claims data to understand patient medication patterns 

Doctors using advanced health data networks typically see a full list of patients’ medications, derived from claims, when they treat them. With this information available, doctors can avoid dangerous drug-to-drug interactions when they prescribe new medications. After a visit, they can also follow up and see if a patient actually filled a prescription and is still taking it….(More)”.

Guide to Mobile Data Analytics in Refugee Scenarios


Book edited by Albert Ali Salah, Alex Pentland, Bruno Lepri and Emmanuel Letouzé: “After the start of the Syrian Civil War in 2011–12, increasing numbers of civilians sought refuge in neighboring countries. By May 2017, Turkey had received over 3 million refugees — the largest refugee population in the world. Some lived in government-run camps near the Syrian border, but many have moved to cities looking for work and better living conditions. They faced problems of integration, income, welfare, employment, health, education, language, social tension, and discrimination. In order to develop sound policies to solve these interlinked problems, a good understanding of refugee dynamics is necessary.

This book summarizes the most important findings of the Data for Refugees (D4R) Challenge, which was a non-profit project initiated to improve the conditions of the Syrian refugees in Turkey by providing a database for the scientific community to enable research on urgent problems concerning refugees. The database, based on anonymized mobile call detail records (CDRs) of phone calls and SMS messages of one million Turk Telekom customers, indicates the broad activity and mobility patterns of refugees and citizens in Turkey for the year 1 January to 31 December 2017. Over 100 teams from around the globe applied to take part in the challenge, and 61 teams were granted access to the data.

This book describes the challenge, and presents selected and revised project reports on the five major themes: unemployment, health, education, social integration, and safety, respectively. These are complemented by additional invited chapters describing related projects from international governmental organizations, technological infrastructure, as well as ethical aspects. The last chapter includes policy recommendations, based on the lessons learned.

The book will serve as a guideline for creating innovative data-centered collaborations between industry, academia, government, and non-profit humanitarian agencies to deal with complex problems in refugee scenarios. It illustrates the possibilities of big data analytics in coping with refugee crises and humanitarian responses, by showcasing innovative approaches drawing on multiple data sources, information visualization, pattern analysis, and statistical analysis. It will also provide researchers and students working with mobility data with an excellent coverage across data science, economics, sociology, urban computing, education, migration studies, and more….(More)”.

#Kremlin: Using Hashtags to Analyze Russian Disinformation Strategy and Dissemination on Twitter


Paper by Sarah Oates and John Gray: “Reports of Russian interference in U.S. elections have raised grave concerns about the spread of foreign disinformation on social media sites, but there is little detailed analysis that links traditional political communication theory to social media analytics. As a result, it is difficult for researchers and analysts to gauge the nature or level of the threat that is disseminated via social media. This paper leverages both social science and data science by using traditional content analysis and Twitter analytics to trace how key aspects of Russian strategic narratives were distributed via #skripal, #mh17, #Donetsk, and #russophobia in late 2018.

This work will define how key Russian international communicative goals are expressed through strategic narratives, describe how to find hashtags that reflect those narratives, and analyze user activity around the hashtags. This tests both how Twitter amplifies specific information goals of the Russians as well as the relative success (or failure) of particular hashtags to spread those messages effectively. This research uses Mentionmapp, a system co-developed by one of the authors (Gray) that employs network analytics and machine intelligence to identify the behavior of Twitter users as well as generate profiles of users via posting history and connections. This study demonstrates how political communication theory can be used to frame the study of social media; how to relate knowledge of Russian strategic priorities to labels on social media such as Twitter hashtags; and to test this approach by examining a set of Russian propaganda narratives as they are represented by hashtags. Our research finds that some Twitter users are consistently active across multiple Kremlin-linked hashtags, suggesting that knowledge of these hashtags is an important way to identify Russian propaganda online influencers. More broadly, we suggest that Twitter dichotomies such as bot/human or troll/citizen should be used with caution and analysis should instead address the nuances in Twitter use that reflect varying levels of engagement or even awareness in spreading foreign disinformation online….(More)”.

Complex Systems Change Starts with Those Who Use the Systems


Madeleine Clarke & John Healy at Stanford Social Innovation Review: “Philanthropy, especially in the United States and Europe, is increasingly espousing the idea that transformative shifts in social care, education, and health systems are needed. Yet successful examples of systems-level reform are rare. Concepts such as collective impact (funder-driven, cross-sector collaboration), implementation science (methods to promote the systematic uptake of research findings), and catalytic philanthropy (funders playing a powerful role in mobilizing fundamental reforms) have gained prominence as pathways to this kind of change. These approaches tend to characterize philanthropy—usually foundations—as the central, heroic actor. Meanwhile, research on change within social and health services continues to indicate that deeply ingrained beliefs and practices, such as overly medicalized models of care for people with intellectual disabilities, and existing resource distribution, which often maintains the pay and conditions of professional groups, inhibits the introduction of reform into complex systems. A recent report by RAND, for example, showed that a $1 billion, seven-year initiative to improve teacher performance failed, and cited the complexity of the system and practitioners’ resistance to change as possible explanations. 

We believe the most effective way to promote systems-level social change is to place the voices of people who use social services—the people for whom change matters most—at the center of change processes. But while many philanthropic organizations tout the importance of listening to the “end beneficiaries” or “service users,” the practice nevertheless remains an underutilized methodology for countering systemic obstacles to change and, ultimately, reforming complex systems….(More)”.

Data-Sharing in IoT Ecosystems From a Competition Law Perspective: The Example of Connected Cars


Paper by Wolfgang Kerber: “…analyses whether competition law can help to solve problems of access to data and interoperability in IoT ecosystems, where often one firm has exclusive control of the data produced by a smart device (and of the technical access to this device). Such a gatekeeper position can lead to the elimination of competition for aftermarket and other complementary services in such IoT ecosystems. This problem is analysed both from an economic and a legal perspective, and also generally for IoT ecosystems as well as for the much discussed problems of “access to in-vehicle data and resources” in connected cars, where the “extended vehicle” concept of the car manufacturers leads to such positions of exclusive control. The paper analyses, in particular, the competition rules about abusive behavior of dominant firms (Art. 102 TFEU) and of firms with “relative market power” (§ 20 (1) GWB) in German competition law. These provisions might offer (if appropriately applied and amended) at least some solutions for these data access problems. Competition law, however, might not be sufficient for dealing with all or most of these problems, i.e. that also additional solutions might be needed (data portability, direct data (access) rights, or sector-specific regulation)….(More)”.

Towards “Government as a Platform”? Preliminary Lessons from Australia, the United Kingdom and the United States


Paper by J. Ramon Gil‐Garcia, Paul Henman, and Martha Alicia Avila‐Maravilla: “In the last two decades, Internet portals have been used by governments around the world as part of very diverse strategies from service provision to citizen engagement. Several authors propose that there is an evolution of digital government reflected in the functionality and sophistication of these portals and other technologies. More recently, scholars and practitioners are proposing different conceptualizations of “government as a platform” and, for some, this could be the next stage of digital government. However, it is not clear what are the main differences between a sophisticated Internet portal and a platform. Therefore, based on an analysis of three of the most advanced national portals, this ongoing research paper explores to what extent these digital efforts clearly represent the basic characteristics of platforms. So, this paper explores questions such as: (1) to what extent current national portals reflect the characteristics of what has been called “government as a platform”?; and (2) Are current national portals evolving towards “government as a platform”?…(More)”.

JPMorgan Creates ‘Volfefe’ Index to Track Trump Tweet Impact


Tracy Alloway at Bloomberg: “Two of the largest Wall Street banks are trying to measure the market impact of Donald Trump’s tweets.

Analysts at JPMorgan Chase & Co. have created an index to quantify what they say are the growing effects on U.S. bond yields. Citigroup Inc.’s foreign exchange team, meanwhile, report that these micro-blogging missives are also becoming “increasingly relevant” to foreign-exchange moves.

JPMorgan’s “Volfefe Index,” named after Trump’s mysterious covfefe tweet from May 2017, suggests that the president’s electronic musings are having a statistically significant impact on Treasury yields. The number of market-moving Trump tweets has ballooned in the past month, with those including words such as “China,” “billion,” “products,” “Democrats” and “great” most likely to affect prices, the analysts found….

JPMorgan’s analysis looked at Treasury yields in the five minutes after a Trump tweet, and the index shows the rolling one-month probability that each missive is market-moving.

They found that the Volfefe Index can account for a “measurable fraction” of moves in implied volatility, seen in interest rate derivatives known as swaptions. That’s particularly apparent at the shorter end of the curve, with two- and five-year rates more impacted than 10-year securities.

Meanwhile, Citi’s work shows that the president’s tweets are generally followed by a stretch of higher volatility across global currency markets. And there’s little sign traders are growing numb to these messages….(More)”
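
The excerpt describes the index as a rolling one-month probability that a tweet is market-moving, but does not give JPMorgan's method. As a rough illustration only, the sketch below assumes each tweet has already been labelled with a hypothetical boolean `moved_market` flag (for example, from the yield change in the five minutes after posting) and computes the share of flagged tweets in a trailing 30-day window. The function name, the 30-day window, and the sample data are all assumptions, not the bank's implementation.

```python
from datetime import datetime, timedelta


def rolling_market_moving_share(tweets, window_days=30):
    """Return, for each tweet, the share of flagged tweets in the trailing window.

    `tweets` is a list of (timestamp, moved_market) pairs sorted by timestamp.
    The window for a tweet at time t covers (t - window_days, t].
    """
    window = timedelta(days=window_days)
    shares = []
    start = 0      # index of the oldest tweet still inside the window
    flagged = 0    # number of flagged tweets inside the window
    for i, (ts, moved) in enumerate(tweets):
        flagged += int(moved)
        # Drop tweets that have fallen out of the trailing window.
        while tweets[start][0] <= ts - window:
            flagged -= int(tweets[start][1])
            start += 1
        shares.append(flagged / (i - start + 1))
    return shares


# Hypothetical labelled tweets, for illustration only.
sample = [
    (datetime(2019, 1, 1), True),
    (datetime(2019, 1, 10), False),
    (datetime(2019, 1, 20), True),
    (datetime(2019, 2, 15), False),
]
shares = rolling_market_moving_share(sample)
print([round(s, 2) for s in shares])
```

A real index would also need a rule for deciding what counts as "market-moving", which the excerpt leaves unspecified.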