Statistics and data science degrees: Overhyped or the real deal?


At The Conversation: “Data science” is hot right now. The number of undergraduate degrees in statistics has tripled in the past decade, and as a statistics professor, I can tell you that it isn’t because freshmen love statistics.

Way back in 2009, economist Hal Varian of Google dubbed statistician the “next sexy job.” Since then, statistician, data scientist and actuary have topped various “best jobs” lists. Not to mention the enthusiastic press coverage of industry applications: Machine learning! Big data! AI! Deep learning!

But is it good advice? I’m going to voice an unpopular opinion for the sake of starting a conversation. Stats is indeed useful, but not in the way that the popular media – and all those online data science degree programs – seem to suggest….

While all the press tends to go to the sensationalist applications – computers that watch cat videos, anyone? – the data science boom reflects a broad increase in demand for data literacy, as a baseline requirement for modern jobs.

The “big data era” doesn’t just mean large amounts of data; it also means increased ease and ability to collect data of all types, in all walks of life. Although the big five tech companies – Google, Apple, Amazon, Facebook and Microsoft – represent about 10 percent of the U.S. market cap and dominate the public imagination, they employ only one-half of one percent of all employees.

Therefore, to be a true revolution, data science will need to infiltrate nontech industries. And it is. The U.S. has seen its impact on political campaigns. I myself have consulted in the medical devices sector. A few years back, Walmart held a data analysis competition as a recruiting tool. The need for people who can dig into the data and parse it is everywhere.

In a speech at the National Academy of Sciences in 2015, Steven “Freakonomics” Levitt related his insights about the need for data-savvy workers, based on his experience as a sought-after consultant in fields ranging from the airline industry to fast food….(More)”.

A Doctor’s Prescription: Data May Finally Be Good for Your Health


Interview by Art Kleiner: “In 2015, Robert Wachter published The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age, a skeptical account of digitization in hospitals. Despite the promise offered by the digital transformation of healthcare, electronic health records had not delivered better care and greater efficiency. The cumbersome design, legacy procedures, and resistance from staff were frustrating everyone — administrators, nurses, consultants, and patients. Costs continued to rise, and preventable medical mistakes were not spotted. One patient at Wachter’s own hospital, one of the nation’s finest, was given 39 times the correct dose of antibiotics by an automated system that nobody questioned. The teenager survived, but it was clear that there needed to be a new approach to the management and use of data.

Wachter has for decades considered the delivery of healthcare through a lens focused on patient safety and quality. In 1996, he coauthored a paper in the New England Journal of Medicine that coined the term hospitalist in describing and promoting a new way of managing patients in hospitals: having one doctor — the hospitalist — “own” the patient journey from admission to discharge. The primary goal was to improve outcomes and save lives. Wachter argued it would also reduce costs and increase efficiency, making the business case for better healthcare. And he was right. Today there are more than 50,000 hospitalists, and it took just two years from the article’s publication to have the first data proving his point. In 2016, Wachter was named chair of the Department of Medicine at the University of California, San Francisco (UCSF), where he has worked since 1990.

Today, Wachter is, to paraphrase the title of a recent talk, less grumpy than he used to be about health tech. The hope part of his book’s title has materialized in some areas faster than he predicted. AI’s advances in imaging are already helping the detection of cancers become more accurate. As data collection has become better systematized, big technology firms such as Google, Amazon, and Apple are entering (in Google’s case, reentering) the field and having more success focusing their problem-solving skills on healthcare issues. In his San Francisco office, Wachter sat down with strategy+business to discuss why the healthcare system may finally be about to change….

Systems for Fresh Thinking

S+B: The changes you appreciate seem to have less to do with technological design and more to do with people getting used to the new systems, building their own variations, and making them work.
WACHTER: The original electronic health record was just a platform play to get the data in digital form. It didn’t do anything particularly helpful in terms of helping the physicians make better decisions or helping to connect one kind of doctor with another kind of doctor. But it was a start.

I remember that when we were starting to develop our electronic health record at UCSF, 12 or 13 years ago, I hired a physician who is now in charge of our health computer system. I said to him, “We don’t have our electronic health record in yet, but I’m pretty sure we will in seven or eight years. What will your job be when that’s done?” I actually thought once the system was fully implemented, we’d be done with the need to innovate and evolve in health IT. That, of course, was asinine.

S+B: That’s like saying to an auto mechanic, “What will your job be when we have automatic transmissions?”
WACHTER: Right, but even more so, because many of us saw electronic health records as the be-all and end-all of digitally facilitated medicine. But putting in the electronic health record is just step one of 10. Then you need to start connecting all the pieces, and then you add analytics that make sense of the data and make predictions. Then you build tools and apps to fit into the workflow and change the way you work.

One of my biggest epiphanies was this: When you digitize, in any industry, nobody is clever enough to actually change anything. All they know how to do is digitize the old practice. You only start seeing real progress when smart people come in, begin using the new system, and say, “Why the hell do we do it that way?” And then you start thinking freshly about the work. That’s when you have a chance to reimagine the work in a digital environment…(More)”.

Can the UN Include Indigenous Peoples in its Development Goals?: There’s An App For That


Article by Jacquelyn Kovarik at NACLA: “…Last year, during a high-level event of the General Assembly, a coalition of states along with the European Union and the International Labour Organization announced a new technology for monitoring the rights of Indigenous people. The proposal was a web application called “Indigenous Navigator,” designed to enable native peoples to monitor their rights from within their communities. The project is extremely seductive: why rely on the General Assembly to represent Indigenous peoples when they can represent themselves—remotely and via cutting-edge data-collecting technology? Could an app be the answer to over a decade of failed attempts to include Indigenous peoples in the international body?

The web application, which officially launched in 11 countries early this year, comprises four “community-based monitoring tools” designed to bridge the gap between Indigenous rights implementation and the UN development goals. The toolbox, which is available open-access to anyone with an internet connection, consists of: a set of two impressively comprehensive surveys designed to collect data on Indigenous rights at a community and national level; a comparative matrix that illustrates the links between the UN Declaration on the Rights of Indigenous Peoples and the UN development goals; an index designed to quickly compare Indigenous realities across communities, regions, or states; and a set of indicators designed to measure the realization of Indigenous rights in communities or states. The surveys are divided into sections based on the UN Declaration on the Rights of Indigenous Peoples, and include such categories as cultural integrity, land rights, access to justice, health, cross-border contacts, freedom of expression and media, education, and economic and social development. The surveys also include tips for methodological administration. For example, for questions about poverty rates in the community, one tip reads: “Most people/communities have their own criteria for defining who are poor and who are not poor. Here you are asked to estimate how many of the men of your people/community are considered poor, according to your own criteria for poverty.” It then suggests that it may be helpful to first discuss the perceived characteristics of a poor person within the community before answering the question….(More)”.

Human Rights in the Big Data World


Paper by Francis Kuriakose and Deepa Iyer: “An ethical approach to human rights conceives and evaluates law through its underlying value concerns. This paper examines human rights after the introduction of big data, using an ethical approach to rights. First, central value concerns such as equity, equality, sustainability and security are derived from the history of the digital technological revolution. Then, the properties and characteristics of big data are analyzed to understand emerging value concerns such as accountability, transparency, traceability, explainability and disprovability.

Using these value points, this paper argues that big data calls for two types of evaluations regarding human rights. The first is the reassessment of existing human rights in the digital sphere, predominantly through the right to equality and the right to work. The second is the conceptualization of new digital rights, such as the right to privacy and the right against propensity-based discrimination. The paper concludes that as we increasingly share the world with intelligent systems, these new values expand and modify the existing human rights paradigm….(More)”.

The free flow of non-personal data


Joint statement by Vice-President Ansip and Commissioner Gabriel on the European Parliament’s vote on the new EU rules facilitating the free flow of non-personal data: “The European Parliament adopted today a Regulation on the free flow of non-personal data proposed by the European Commission in September 2017. …

We welcome today’s vote at the European Parliament. A digital economy and society cannot exist without data, and this Regulation completes another key pillar of the Digital Single Market. Only if data flows freely can Europe get the best from the opportunities offered by digital progress and technologies such as artificial intelligence and supercomputers.

This Regulation does for non-personal data what the General Data Protection Regulation has already done for personal data: free and safe movement across the European Union. 

With its vote, the European Parliament has sent a clear signal to all businesses of Europe: it makes no difference where in the EU you store and process your data – data localisation requirements within the Member States are a thing of the past. 

The new rules will provide a major boost to the European data economy, as they open up potential for European start-ups and SMEs to create new services through cross-border data innovation. This could add 4% – or €739 billion – to EU GDP by 2020.

Together with the General Data Protection Regulation, the Regulation on the free flow of non-personal data will allow the EU to fully benefit from today’s and tomorrow’s data-based global economy.” 

Background

Since the Communication on the European Data Economy was adopted in January 2017 as part of the Digital Single Market strategy, the Commission has run a public online consultation, organised structured dialogues with Member States and undertaken several workshops with different stakeholders. These evidence-gathering initiatives have led to the publication of an impact assessment…. The Regulation on the free flow of non-personal data has no impact on the application of the General Data Protection Regulation (GDPR), as it does not cover personal data. However, the two Regulations will function together to enable the free flow of any data – personal and non-personal – thus creating a single European space for data. In the case of a mixed dataset, the GDPR provision guaranteeing free flow of personal data will apply to the personal data part of the set, and the free flow of non-personal data principle will apply to the non-personal part. …(More)”.

Text Analysis Systems Mine Workplace Emails to Measure Staff Sentiments


Alan Rothman at LLRX: “…For all of these good, bad or indifferent workplaces, a key question is whether any of management’s actions to engage the staff and listen to their concerns ever resulted in improved working conditions and higher levels of job satisfaction.

The answer is most often “yes”. Just having a say in, and some sense of control over, our jobs and workflows can indeed have a demonstrable impact on morale, camaraderie and the bottom line. This is the essence of the Hawthorne Effect, also termed the “Observer Effect”, first identified during studies in the 1920s and 1930s when the management of a factory made improvements to the lighting and work schedules. In turn, worker satisfaction and productivity temporarily increased. This was not so much because there was more light, but rather that the workers sensed management was paying attention to, and then acting upon, their concerns. The workers perceived they were no longer just cogs in a machine.

Perhaps, too, the Hawthorne Effect is in some ways the workplace equivalent of Heisenberg’s uncertainty principle in physics. To vastly oversimplify this slippery concept, the mere act of observing a subatomic particle can change its position.¹

Giving the processes of observation, analysis and change at the enterprise level a modern (but non-quantum) spin is a fascinating new article in the September 2018 issue of The Atlantic entitled What Your Boss Could Learn by Reading the Whole Company’s Emails, by Frank Partnoy. I highly recommend a click-through and full read if you have an opportunity. I will summarize and annotate it, and then, considering my own thorough lack of understanding of the basics of y=f(x), pose some of my own physics-free questions….

Today the text analytics business, like the work done by KeenCorp, is thriving. Long established as the processing behind email spam filters, it is now finding other applications, including monitoring corporate reputations on social media and other sites.²
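
To make the underlying mechanics concrete, here is a minimal sketch of the kind of statistical text classification that powers a basic spam filter: a naive Bayes classifier over word counts. The toy training data and function names are illustrative assumptions, not any vendor’s actual system; production filters are far more sophisticated.

```python
# Minimal naive Bayes text classifier, the classic spam-filter technique.
# Toy data only; real filters use far larger corpora and richer features.
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

training = [
    ("win a free prize now", "spam"),
    ("free money click here", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the team", "ham"),
]
word_counts, label_counts = train(training)
print(classify("claim your free prize", word_counts, label_counts))  # -> 'spam'
```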

The finance industry is another growth sector, as investment banks and hedge funds scan a wide variety of information sources to locate “slight changes in language” that may point towards pending increases or decreases in share prices. Financial research providers are using artificial intelligence to mine “insights” from their own selections of news and analytical sources.

But is this technology effective?

In a paper entitled Lazy Prices, by Lauren Cohen (Harvard Business School and NBER), Christopher Malloy (Harvard Business School and NBER), and Quoc Nguyen (University of Illinois at Chicago), in a draft dated February 22, 2018, the researchers found that the share price of a company – in this case NetApp – measurably went down after the firm “subtly changed” the “descriptions of certain risks” in its 2010 annual report. Algorithms can detect such changes more quickly and effectively than humans. The company subsequently clarified, in its 2011 annual report, its “failure to comply” with reporting requirements in 2010. A highly skilled stock analyst “might have missed that phrase”, but once again it was captured by the “researchers’ algorithms”.
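
The Lazy Prices approach rests on simple document-similarity measures – the paper compares consecutive filings using metrics such as cosine and Jaccard similarity. A minimal, self-contained sketch of those two measures follows; the snippets of “risk factor” text are invented for illustration and drastically shorter than real filings.

```python
# Sketch of the document-similarity idea behind "Lazy Prices": compare
# consecutive annual reports and flag a year whose text drifts unusually
# far from the last. Word-level cosine and Jaccard similarity are a
# simplification of what the paper actually computes.
import math
from collections import Counter

def cosine_similarity(a, b):
    words_a, words_b = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(words_a[w] * words_b[w] for w in set(words_a) & set(words_b))
    norm = math.sqrt(sum(c * c for c in words_a.values())) * \
           math.sqrt(sum(c * c for c in words_b.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a, b):
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical excerpts from two consecutive "risk factors" sections.
report_2009 = "we may face risks related to competition and product demand"
report_2010 = "we may face risks related to regulatory compliance and government sales"

print(round(cosine_similarity(report_2009, report_2010), 3))   # ~0.667
print(round(jaccard_similarity(report_2009, report_2010), 3))  # 0.5
# An unusually low similarity score for one year flags that filing for review.
```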

In the hands of a “skeptical investor”, this information might well have prompted questions about the differences between the 2010 and 2011 annual reports and, in turn, saved him or her a great deal of money. This detection was an early signal of a looming decline in NetApp’s stock. Half a year after the 2011 report’s publication, it was reported that the Syrian government had bought the company’s equipment and “used that equipment to spy on its citizens”, causing further declines.

Now text analytics is being deployed at a new target: the composition of employees’ communications. Although workers have been found to have no expectation of privacy in the workplace, some companies remain reluctant to scan staff communications because of privacy concerns. Still, as text analysis systems continue to improve, companies are finding it ever more challenging to resist the “urge to mine employee information”.

Among the evolving enterprise applications is the use of these systems by human resources departments to assess overall employee morale. For example, Vibe is an app that scans through communications on Slack, a widely used enterprise platform. Vibe’s algorithm measures the positive and negative emotions of a work team and reports them in real time….(More)”.
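
Vibe’s actual algorithm is proprietary, but a lexicon-based scorer illustrates the general idea of measuring positive and negative emotion in team messages. Everything in the sketch below – the word lists, the averaging, the sample messages – is a hypothetical assumption, not Vibe’s method.

```python
# Hypothetical lexicon-based sentiment scoring over team chat messages.
# Illustrative word lists only; real systems use trained models.
POSITIVE = {"great", "thanks", "love", "excited", "well"}
NEGATIVE = {"blocked", "frustrated", "late", "broken", "sorry"}

def message_score(text):
    """+1 per positive word, -1 per negative word."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def team_mood(messages):
    """Average per-message sentiment; > 0 leans positive, < 0 negative."""
    scores = [message_score(m) for m in messages]
    return sum(scores) / len(scores) if scores else 0.0

slack_messages = [
    "great work on the release, thanks everyone",
    "still blocked on the build, frustrated",
    "demo went well, the client is excited",
]
print(round(team_mood(slack_messages), 2))  # positive overall
```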

Open Government Data Report: Enhancing Policy Maturity for Sustainable Impact


Report by the OECD: “This report provides an overview of the state of open data policies across OECD member and partner countries, based on data collected through the OECD Open Government Data survey (2013, 2014, 2016/17), country reviews and comparative analysis. The report analyses open data policies using an analytical framework that is in line with the OECD OURdata Index and the International Open Data Charter. It assesses governments’ efforts to enhance the availability, accessibility and re-use of open government data. It makes the case that, beyond countries’ commitment to open up good-quality government data, the creation of public value requires engaging user communities from the entire ecosystem, such as journalists, civil society organisations, entrepreneurs, major tech private companies and academia. The report also underlines how open data policies are elements of broader digital transformations, and how public sector data policies require interaction with other public sector agendas such as open government, innovation, employment, integrity, public budgeting, sustainable development, urban mobility and transport. It stresses the relevance of measuring open data impacts in order to support the business case for open government data….(More)”.

Craft metrics to value co-production


Liz Richardson and Beth Perry at Nature: “Advocates of co-production encourage collaboration between professional researchers and those affected by that research, to ensure that the resulting science is relevant and useful. Opening up science beyond scientists is essential, particularly where problems are complex, solutions are uncertain and values are salient. For example, patients should have input into research on their conditions, and first-hand experience of local residents should shape research on environmental-health issues.

But what constitutes success on these terms? Without a better understanding of this, it is harder to incentivize co-production in research. A key way to support co-production is reconfiguring that much-derided feature of academic careers: metrics.

Current indicators of research output (such as paper counts or the h-index) conceptualize the value of research narrowly. They are already roundly criticized as poor measures of quality or usefulness. Less appreciated is the fact that these metrics also leave out the societal relevance of research and omit diverse approaches to creating knowledge about social problems.
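
For concreteness, here is the h-index computation itself – the largest number h such that h of a researcher’s papers have at least h citations each. The minimal sketch makes plain how little the metric sees: citation counts only, with no notion of who uses the work or to what societal effect.

```python
# The h-index, one of the narrow indicators criticized above:
# the largest h such that h papers have at least h citations each.
def h_index(citations):
    cited = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(cited, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers with >= 4 citations each
print(h_index([25, 8, 5, 3, 3]))  # 3: heavy citation of one paper adds nothing
```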

Peer review also has trouble assessing the value of research that sits at disciplinary boundaries or that addresses complex social challenges. It denies broader social accountability by giving scientists a monopoly on determining what is legitimate knowledge¹. Relying on academic peer review as a means of valuing research can discourage broader engagement.

This privileges abstract and theoretical research over work that is localized and applied. For example, research on climate-change adaptation, conducted in the global south by researchers embedded in affected communities, can make real differences to people’s lives. Yet it is likely to be valued less highly by conventional evaluation than research that is generalized from afar and then published in a high-impact English-language journal….(More)”.

Desire paths: the illicit trails that defy the urban planners


So goes the logic of “desire paths” – described by Robert Macfarlane as “paths & tracks made over time by the wishes & feet of walkers, especially those paths that run contrary to design or planning”; he calls them “free-will ways”. The New Yorker offers other names: “cow paths, pirate paths, social trails, kemonomichi (beast trails), chemins de l’âne (donkey paths), and Olifantenpad (elephant trails)”. JM Barrie described them as “Paths that have Made Themselves”….

Desire paths have been described as illustrating “the tension between the native and the built environment and our relationship to them”. Because they often form in areas where there are no pavements, they can be seen to “indicate [the] yearning” of those wishing to walk, a way for “city dwellers to ‘write back’ to city planners, giving feedback with their feet”.

But as well as revealing the path of least resistance, they can also reveal where people refuse to tread. If you’ve been walking the same route for years, an itchy-footed urge to go off-piste, even just a few metres, is probably something you’ll identify with. It’s this idea that led one academic journal to describe them as a record of “civil disobedience”.

Rather than dismiss or even chastise the naughty pedestrian by placing fences or railings to block off “illicit” wanderings, some planners work to incorporate them into urban environments. This chimes with the thinking of Jane Jacobs, an advocate of configuring cities around desire lines, who said: “There is no logic that can be superimposed on the city; people make it, and it is to them … that we must fit our plans.”…(More)”.

Whither large International Non-Governmental Organisations?


Working Paper by Penny Lawrence: “Large international non-governmental organisations (INGOs) seem to be in an existential crisis in their role in the fight for social justice. Many, such as Save the Children or Oxfam, have become big, well-known brands with compliance expectations similar to big businesses. Yet the public still imagine them to be run by volunteers. Their context is changing so fast, and so unpredictably, that they are struggling to keep up. It is a time of extraordinary disruptive change, including digital transformation, changing societal norms and engagement expectations, and political upheaval and challenge. Fifteen years ago the political centre-ground in the UK seemed firm, with expanding space for civil society organisations to operate. Space for civil society voice now seems more threatened and challenged (Kenny 2015).

There has been a decline in trust in large charities in particular. This is partly a result of their own complacency, acting as if the argument for aid has been won; partly a result of questioned practices, e.g. the fundraising scandal of 2016/17 (where repeated mail drops to individuals requesting funds caused public backlash) and the safeguarding scandal of 2018 (where historic cases of sexual abuse by INGO staff, including Oxfam’s, were revisited by media in the wake of the #MeToo movement); and partly a result of political challenge to INGOs’ advocacy and influencing role, their bias and their voice:

‘Some government ministers regard the charity sector with suspicion because it largely employs senior people with a left-wing perspective on life and because of other unfair criticisms of government it means there is regularly a tension between big charities and the conservative party’ – Richard Wilson (former Minister for Civil Society), 2018

On the other hand many feel that charities who have taken significant contracts to deliver services for the state have forfeited their independent voice and lost their way:

‘The voluntary sector risks declining over the next ten years into a mere instrument of a shrunken state, voiceless and toothless, unless it seizes the agenda and creates its own vision.’ – Professor Nicholas Deakin, 2014

It’s a tough context to be leading an INGO through, but INGOs have appeared ill-prepared and slow to respond to the threats and opportunities, not realising how much they may need to change to respond to the fast-evolving context and expectations. Large INGOs spend most of their energy exploiting present grant and contract business models, rather than exploring the opportunities to overcome poverty offered by such disruptive change. Their size and structures do not enable agility. They are too internally focused and self-referencing at a time when the world around them is changing so fast, and when political sands have shifted. Focusing on the internationalisation of structures and decision-making means large INGOs are ‘defeated by our own complexity’, as one INGO interviewee put it.

The purpose of this paper is to stimulate thinking amongst large INGOs at a time of such extraordinary disruptive change. The paper explores options for large INGOs, in terms of function and structure. After outlining large INGOs’ history, changing context, value and current thinking, it explores learning from others outside the development sector before suggesting the emerging options. It reflects on what’s encouraging and what’s stopping change and offers possible choices and pathways forwards….(More)”.