Cambridge Analytica scandal: legitimate researchers using Facebook data could be collateral damage


Article at The Conversation: “The scandal that has erupted around Cambridge Analytica’s alleged harvesting of 50m Facebook profiles, assembled from data provided by a UK-based academic and his company, is a worrying development for legitimate researchers.

Political data analytics company Cambridge Analytica – which is affiliated with Strategic Communication Laboratories (SCL) – reportedly used Facebook data after it was handed over by Aleksandr Kogan, a lecturer at the University of Cambridge’s department of psychology.

Kogan, through his company Global Science Research (GSR) – separate from his university work – gleaned the data from a personality test app named “thisisyourdigitallife”. Roughly 270,000 US-based Facebook users voluntarily responded to the test in 2014. But the app also collected data on those participants’ Facebook friends without their consent.

This was possible due to Facebook rules at the time that allowed third-party apps to collect data about a Facebook user’s friends. The Mark Zuckerberg-run company has since changed its policy to prevent developers from gaining such access….

Social media data is a rich source of information for many areas of research in psychology, technology, business and the humanities. Some recent examples include using Facebook to predict riots, comparing Facebook use with body image concern in adolescent girls, and investigating whether Facebook can lower levels of stress responses, with research suggesting that it may both enhance and undermine psycho-social constructs related to well-being.

It is right to believe that researchers and their employers value research integrity. But instances where an academic betrays that trust – even if data used for university research purposes wasn’t caught in the crossfire – will have a negative impact on whether participants continue to trust researchers. It also has implications for research governance and for companies’ willingness to share data with researchers in the first place.

Universities, research organisations and funders govern the integrity of research with clear and strict ethics procedures designed to protect participants in studies, such as those where social media data is used. Harvesting data without users’ permission is considered unethical under commonly understood research standards.

The fallout from the Cambridge Analytica controversy is potentially huge for researchers who rely on social networks for their studies, where data is routinely shared with them for research purposes. Tech companies could become more reluctant to share data with researchers. Facebook is already extremely protective of its data – the worry is that it could become doubly difficult for researchers to legitimately access this information in light of what has happened with Cambridge Analytica….(More)”.

Artificial Intelligence and the Need for Data Fairness in the Global South


Medium blog by Yasodara Cordova: “…The data collected by industry represents AI opportunities for governments, to improve their services through innovation. Data-based intelligence promises to increase the efficiency of resource management by improving transparency, logistics, social welfare distribution — and virtually every government service. E-government enthusiasm took off with the realization of the possible applications, such as using AI to fight corruption by automating the fraud-tracking capabilities of cost-control tools. Controversially, the AI enthusiasm has spread to the distribution of social benefits, optimization of tax oversight and control, credit scoring systems, crime prediction systems, and other applications based on personal and sensitive data collection, especially in countries that do not have comprehensive privacy protections.

There are so many potential applications that society may operate very differently in ten years, when “datafication” has advanced beyond citizen data and into other applications such as energy and natural resource management. However, many countries in the Global South are not being given the necessary access to their own data.

Useful data are everywhere, but only some can take advantage. Beyond smartphones, data can be collected from IoT components in common spaces. Not restricted to urban spaces, data collection includes rural technology like sensors installed in tractors. However, even when the information relates to issues of public importance in developing countries — like data taken from the road mesh or from vital resources like water and land — it stays hidden under contract rules, and citizens cannot access it, and therefore cannot benefit from it. This arrangement keeps the public uninformed about their country’s operations. The data collection and distribution frameworks are not built towards healthy partnerships between industry and government, preventing countries from realizing the potential outlined in the previous paragraph.

The data necessary for the development of better cities, public policies, and the common interest cannot be leveraged if kept in closed silos, yet access often costs more than is justifiable. Data are a primordial resource at all stages of new technology, especially tech adoption and integration, so the necessary long-term investment in innovation needs a common ground to start with. The mismatch between the pace of data collection among big established companies and small, new, and local businesses will likely increase with time, assuming no regulation is introduced for equal access to collected data….

Currently, data independence remains restricted to discussions of the technological infrastructure that supports data extraction. Privacy discussions focus on personal data rather than on the accumulation of strategic data in closed silos — a necessary discussion that is not yet taking place. The national interest in data is not being addressed within a framework of economic and social fairness. Access to data, from a policy-making standpoint, needs to find a balance between the extremes of public, open access and limited, commercial use.

A final but important note: the vast majority of social media platforms act like silos. APIs play an important role in corporate business models, where industry controls the data it collects without rewarding users, let alone offering them transparency. Negotiating API specifications to make data a common resource should be considered, for such an effort may align with citizens’ interests….(More)”.

Truth Decay: An Initial Exploration of the Diminishing Role of Facts and Analysis in American Public Life


Report by Jennifer Kavanagh and Michael D. Rich: “Over the past two decades, national political and civil discourse in the United States has been characterized by “Truth Decay,” defined as a set of four interrelated trends: an increasing disagreement about facts and analytical interpretations of facts and data; a blurring of the line between opinion and fact; an increase in the relative volume, and resulting influence, of opinion and personal experience over fact; and lowered trust in formerly respected sources of factual information. These trends have many causes, but this report focuses on four: characteristics of human cognitive processing, such as cognitive bias; changes in the information system, including social media and the 24-hour news cycle; competing demands on the education system that diminish time spent on media literacy and critical thinking; and polarization, both political and demographic. The most damaging consequences of Truth Decay include the erosion of civil discourse, political paralysis, alienation and disengagement of individuals from political and civic institutions, and uncertainty over national policy.

This report explores the causes and consequences of Truth Decay and how they are interrelated, and examines past eras of U.S. history to identify evidence of Truth Decay’s four trends and observe similarities with and differences from the current period. It also outlines a research agenda, a strategy for investigating the causes of Truth Decay and determining what can be done to address its causes and consequences….(More)”.

How tech used to track the flu could change the game for public health response


Cathie Anderson in the Sacramento Bee: “Tech entrepreneurs and academic researchers are tracking the spread of flu in real-time, collecting data from social media and internet-connected devices that show startling accuracy when compared against surveillance data that public health officials don’t report until a week or two later….

Smart devices and mobile apps have the potential to reshape public health alerts and responses… For instance, the staff of smart thermometer maker Kinsa were receiving temperature readings that augured the surge of flu patients in emergency rooms.

Kinsa thermometers are part of the movement toward the Internet of Things – devices that automatically transmit information to a database. No personal information is shared, unless users decide to input information such as age and gender. Using data from more than 1 million devices in U.S. homes, the staff is able to track fever as it hits and use an algorithm to estimate impact for a broader population….

Computational researcher Aaron Miller worked with an epidemiological team at the University of Iowa to assess the feasibility of using Kinsa data to forecast the spread of flu. He said the team first built a model using surveillance data from the CDC and used it to forecast the spread of influenza. The team then created a second model that integrated the Kinsa data alongside the CDC’s.

“We got predictions that were … 10 to 50 percent better at predicting the spread of flu than when we used CDC data alone,” Miller said. “Potentially, in the future, if you had granular information from the devices and you had enough information, you could imagine doing analysis on a really local level to inform things like school closings.”
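
The article doesn’t reproduce the Iowa team’s model, but the approach it describes – a nowcast fitted on lagged official surveillance, with a near-real-time device signal added as an extra predictor – can be sketched. Everything below (the synthetic data, the two-week reporting lag, the plain linear regression) is an illustrative assumption, not the study’s method:

```python
# Illustrative sketch only (not the Iowa study's actual model): compare a
# flu nowcast fitted on lagged CDC-style surveillance alone with one that
# also sees a near-real-time thermometer signal. All data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
weeks = 156  # three synthetic flu seasons

# Synthetic weekly influenza-like-illness (ILI) rate with seasonality.
ili = (2.0 + 1.5 * np.sin(np.arange(weeks) * 2 * np.pi / 52)
       + rng.normal(0, 0.2, weeks))

lag = 2                      # assume official reports arrive two weeks late
target = ili[lag:]           # what we want to estimate: this week's ILI
cdc_lagged = ili[:-lag]      # the newest surveillance we would actually have
device = target + rng.normal(0, 0.15, len(target))  # noisy real-time proxy

X_cdc = cdc_lagged[:, None]
X_both = np.column_stack([cdc_lagged, device])

split = 104                  # train on two seasons, evaluate on the third
base = LinearRegression().fit(X_cdc[:split], target[:split])
augmented = LinearRegression().fit(X_both[:split], target[:split])

print("MAE, surveillance only:  %.3f" % mean_absolute_error(
    target[split:], base.predict(X_cdc[split:])))
print("MAE, with device signal: %.3f" % mean_absolute_error(
    target[split:], augmented.predict(X_both[split:])))
```

In this toy setup the device stream improves the nowcast for the same reason the article describes: it carries information about the current week that the lagged official numbers cannot.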

While Kinsa uses readings taken in homes, academic researchers and companies such as sickweather.com are using crowdsourcing from social media networks to provide information on the spread of flu. Siddharth Shah, a transformational health industry analyst at Frost & Sullivan, pointed to an award-winning international study led by researchers at Northeastern University that tracked flu through Twitter posts and other key parameters of flu.

When compared with official influenza surveillance systems, the researchers said, the model accurately forecast the evolution of influenza up to six weeks in advance, much earlier than prior models. Such advance warnings would give health agencies significantly more time to expand medical resources or to alert the public to measures they can take to prevent transmission of the disease….

For now, Shah said, technology will probably only augment or complement traditional public data streams. However, he added, innovations are already changing how diseases are tracked. Chronic disease management, for instance, is going digital with services such as Omada Health, which helps people with Type 2 diabetes better manage health challenges, and Noom, a mobile app that helps people stop dieting and instead work toward true lifestyle change….(More)”.

Sub-National Democracy and Politics Through Social Media


Book edited by Mehmet Zahid Sobacı and İbrahim Hatipoğlu: “This book analyzes the impact of social media on democracy and politics at the subnational level in developed and developing countries. Over the last decade or so, social media has transformed politics. Offering political actors opportunities to organize, mobilize, and connect with constituents, voters, and supporters, social media has become an important tool in global politics as well as a force for democracy. Most of the available research literature focuses on the impact of social media at the national level; this book fills that gap by analyzing the political uses of social media at the sub-national level.

The book is divided into two parts. Part One, “Social Media for Democracy”, includes chapters that analyze the potential contributions of social media tools to realizing basic values of democracy, such as public engagement, transparency, accountability, participation and collaboration, at the sub-national level. Part Two, “Social Media in Politics”, focuses on the use of social media tools by political actors in political processes and activities (online campaigns, protests etc.) at the local, regional and state government levels during election and non-election periods. Combining theoretical and empirical analysis, each chapter provides evaluations of overarching issues, questions, and problems as well as real-world experiences with social media, politics, and democracy in a diverse sample of municipalities…(More)”.

Nudging the city and residents of Cape Town to save water


Leila Harris, Jiaying Zhao and Martine Visser in The Conversation: “Cape Town could become the world’s first major city to run out of water – what’s been termed Day Zero….To its credit, the city has worked with researchers at the University of Cape Town to test strategies to nudge domestic users into reducing their water use. Nudges are interventions to encourage behaviour change for better outcomes, or in this context, to achieve environmental or conservation goals.

What key insights could help inform the city’s strategies? Research from psychology and behavioural economics could prove useful to refine efforts and help to achieve further water savings.

The most effective tactics

Research suggests the following types of nudges could be effective in promoting conservation behaviours.

Social norms: International research, as well as studies conducted in Cape Town, suggests that effective conservation can be promoted by giving consumers feedback on how they perform relative to their neighbours. To this end, Cape Town introduced a water map that highlights homes that are compliant with targets.

The city has also been bundling information on usage with easy-to-implement water-saving tips, something that research has shown to be particularly effective.

Research also suggests that combining behavioural interventions with traditional measures – such as tariff increases and restrictions – is often effective in reducing use in the short term.

Real-time feedback: Cape Town is presenting the daily water level in major dams on a dashboard. This approach is consistent with research that shows that real-time information can effectively reduce water and energy consumption.

Such efforts could be even more effective if the information were highlighted in relation to the critical level that’s been set for Day Zero, in this case 13.5%.

In the early days of a drought, it is also advisable to make information like this readily accessible through news outlets, social media, or even text messages. The water tracker produced by eighty20, a private Cape Town-based company, provides an example.

Social recognition: There’s evidence that efforts to celebrate successes or encourage competition can be effective – for instance, recognising neighbourhoods for meeting conservation targets. Prizes needn’t be monetary. Sometimes simple recognition, such as a certificate, can be effective.

Social recognition was found to be the most successful intervention among nine other nudges tested in research conducted in Cape Town in 2016. In this experiment, households that reduced consumption by 10% were recognised on the city’s website.

Another study showed that competition between the various floors of a government building in the Western Cape led to energy savings of up to 14%.

Cooperation: In the months ahead, the city would also do well to consider the support it might offer to encourage cooperation, particularly as the situation becomes more acute and as tensions rise.

Past studies have shown that social reputation and efforts to promote reciprocity can go a long way towards encouraging cooperation. The point is argued in a recent article on the importance of cooperation among Capetonians across different income groups.

Some residents of Cape Town are already pushing for a cooperative approach, such as helping neighbours who might have difficulty travelling to collection points. Support for these efforts should be an important part of policies in the run-up to Day Zero. These are often the examples that provide bright spots in challenging times.

Research also suggests that to navigate moments of crisis effectively, clear and trustworthy communication is critical. This also needs to be a priority….(More)“.

Infection forecasts powered by big data


Michael Eisenstein at Nature: “…The good news is that the present era of widespread access to the Internet and digital health has created a rich reservoir of valuable data for researchers to dive into….By harvesting and combining these streams of big data with conventional ways of monitoring infectious diseases, the public-health community could gain fresh powers to catch and curb emerging outbreaks before they rage out of control.

Going viral

Data scientists at Google were the first to make a major splash using data gathered online to track infectious diseases. The Google Flu Trends algorithm, launched in November 2008, combed through hundreds of billions of users’ queries on the popular search engine to look for small increases in flu-related terms such as symptoms or vaccine availability. Initial data suggested that Google Flu Trends could accurately map the incidence of flu with a lag of roughly one day. “It was a very exciting use of these data for the purpose of public health,” says Brownstein. “It really did start a whole revolution and new field of work in query data.”
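
Google never released the system, but the accompanying paper (J. Ginsberg et al. Nature 457, 1012–1014; 2009) describes a univariate linear model relating the log-odds of the flu-related share of queries to the log-odds of physician-visit rates. A minimal sketch of that flavour, on entirely synthetic numbers:

```python
# The flavour of the published Google Flu Trends model: a univariate linear
# fit between the logit of the flu-related query share and the logit of the
# ILI physician-visit rate. All numbers below are synthetic.
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
query_share = rng.uniform(0.001, 0.02, 120)  # flu-related fraction of queries
# Pretend the true relationship is linear in logit space, plus noise.
ili_rate = inv_logit(0.9 * logit(query_share) + 0.5 + rng.normal(0, 0.1, 120))

beta1, beta0 = np.polyfit(logit(query_share), logit(ili_rate), 1)
print("Predicted ILI visit rate at a 1.5%% flu-query share: %.2f%%"
      % (100 * inv_logit(beta0 + beta1 * logit(0.015))))
```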

Unfortunately, Google Flu Trends faltered when it mattered the most, completely missing the onset in April 2009 of the H1N1 pandemic. The algorithm also ran into trouble later on in the pandemic. It had been trained against seasonal fluctuations of flu, says Viboud, but people’s behaviour changed in the wake of panic fuelled by media reports — and that threw off Google’s data. …

Nevertheless, its work with Internet usage data was inspirational for infectious-disease researchers. A subsequent study from a team led by Cecilia Marques-Toledo at the Federal University of Minas Gerais in Belo Horizonte, Brazil, used Twitter to get high-resolution data on the spread of dengue fever in the country. The researchers could quickly map new cases to specific cities and even predict where the disease might spread to next (C. A. Marques-Toledo et al. PLoS Negl. Trop. Dis. 11, e0005729; 2017). Similarly, Brownstein and his colleagues were able to use search data from Google and Twitter to project the spread of Zika virus in Latin America several weeks before formal outbreak declarations were made by public-health officials. Both Internet services are used widely, which makes them data-rich resources. But they are also proprietary systems for which access to data is controlled by a third party; for that reason, Generous and his colleagues have opted instead to make use of search data from Wikipedia, which is open source. “You can get the access logs, and how many people are viewing articles, which serves as a pretty good proxy for search interest,” he says.
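
Those access logs are genuinely public: the Wikimedia REST API serves per-article page-view counts with no authentication. A small sketch of pulling a disease article’s daily views as a search-interest proxy – the article and date range here are arbitrary examples, not choices drawn from the research described above:

```python
# Sketch: daily page views for a disease article from the public Wikimedia
# pageviews API, as a proxy for search interest. The endpoint is real; the
# article and date range are arbitrary examples.
import requests

API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "{project}/all-access/all-agents/{article}/daily/{start}/{end}")

def daily_views(article, start, end, project="en.wikipedia"):
    url = API.format(project=project, article=article, start=start, end=end)
    # Wikimedia asks API clients to identify themselves in the User-Agent.
    resp = requests.get(url, headers={"User-Agent": "flu-interest-sketch/0.1"})
    resp.raise_for_status()
    return {item["timestamp"][:8]: item["views"]
            for item in resp.json()["items"]}

views = daily_views("Influenza", "20180101", "20180131")
for day in sorted(views)[:5]:
    print(day, views[day])
```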

However, the problems that sank Google Flu Trends still exist….Additionally, online activity differs for infectious conditions with a social stigma such as syphilis or AIDS, because people who are or might be affected are more likely to be concerned about privacy. Appropriate search-term selection is essential: Generous notes that initial attempts to track flu on Twitter were confounded by irrelevant tweets about ‘Bieber fever’ — a decidedly non-fatal condition affecting fans of Canadian pop star Justin Bieber.

Alternatively, researchers can go straight to the source — by using smartphone apps to ask people directly about their health. Brownstein’s team has partnered with the Skoll Global Threats Fund to develop an app called Flu Near You, through which users can voluntarily report symptoms of infection and other information. “You get more detailed demographics about age and gender and vaccination status — things that you can’t get from other sources,” says Brownstein. Ten European Union member states are involved in a similar surveillance programme known as Influenzanet, which has generally maintained 30,000–40,000 active users for seven consecutive flu seasons. These voluntary reporting systems are particularly useful for diseases such as flu, for which many people do not bother going to the doctor — although it can be hard to persuade people to participate for no immediate benefit, says Brownstein. “But we still get a good signal from the people that are willing to be a part of this.”…(More)”.

Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It?


The New York Times: “The idea has been around for a bit. Jaron Lanier, the tech philosopher and virtual-reality pioneer who now works for Microsoft Research, proposed it in his 2013 book, “Who Owns the Future?,” as a needed corrective to an online economy mostly financed by advertisers’ covert manipulation of users’ consumer choices.

It is being picked up in “Radical Markets,” a book due out shortly from Eric A. Posner of the University of Chicago Law School and E. Glen Weyl, principal researcher at Microsoft. And it is playing into European efforts to collect tax revenue from American internet giants.

In a report obtained last month by Politico, the European Commission proposes to impose a tax on the revenue of digital companies based on their users’ location, on the grounds that “a significant part of the value of a business is created where the users are based and data is collected and processed.”

Users’ data is a valuable commodity. Facebook offers advertisers precisely targeted audiences based on user profiles. YouTube, too, uses users’ preferences to tailor its feed. Still, this pales in comparison with how valuable data is about to become, as the footprint of artificial intelligence extends across the economy.

Data is the crucial ingredient of the A.I. revolution. Training systems to perform even relatively straightforward tasks like voice translation, voice transcription or image recognition requires vast amounts of data — like tagged photos, to identify their content, or recordings with transcriptions.

“Among leading A.I. teams, many can likely replicate others’ software in, at most, one to two years,” notes the technologist Andrew Ng. “But it is exceedingly difficult to get access to someone else’s data. Thus data, rather than software, is the defensible barrier for many businesses.”

We may think we get a fair deal, offering our data as the price of sharing puppy pictures. By other metrics, we are being victimized: in the largest technology companies, the share of income going to labor is only about 5 to 15 percent, Mr. Posner and Mr. Weyl write. That’s way below Walmart’s 80 percent. Consumer data amounts to work that they get for free….

The big question, of course, is how we get there from here. My guess is that it would be naïve to expect Google and Facebook to start paying for user data of their own accord, even if that improved the quality of the information. Could policymakers step in, somewhat the way the European Commission did, demanding that technology companies compute the value of consumer data?…(More)”.

Journalism and artificial intelligence


Notes by Charlie Beckett (at LSE’s Media Policy Project Blog) : “…AI and machine learning is a big deal for journalism and news information. Possibly as important as the other developments we have seen in the last 20 years such as online platforms, digital tools and social media. My 2008 book on how journalism was being revolutionised by technology was called SuperMedia because these technologies offered extraordinary opportunities to make journalism much more efficient and effective – but also to transform what we mean by news and how we relate to it as individuals and communities. Of course, that can be super good or super bad.

Artificial intelligence and machine learning can help the news media with its three core problems:

  1. The overabundance of information and sources that leave the public confused
  2. The credibility of journalism in a world of disinformation and falling trust and literacy
  3. The business model crisis – how can journalism become more efficient (avoiding duplication), be more engaged, add value, and be relevant to individuals’ and communities’ need for quality, accurate information and informed, useful debate.

But like any technology, they can also be used by bad people or for bad purposes: in journalism that can mean clickbait, misinformation, propaganda, and trolling.

Some caveats about using AI in journalism:

  1. Narratives are difficult to program. Trusted journalists are needed to understand and write meaningful stories.
  2. Artificial Intelligence needs human inputs. Skilled journalists are required to double check results and interpret them.
  3. Artificial Intelligence increases quantity, not quality. It’s still up to the editorial team and developers to decide what kind of journalism the AI will help create….(More)”.

Citicafe: conversation-based intelligent platform for citizen engagement


Paper by Amol Dumrewal et al in the Proceedings of the ACM India Joint International Conference on Data Science and Management of Data: “Community civic engagement is a new and emerging trend in urban cities driven by the mission of developing responsible citizenship. The recognition of civic potential in every citizen goes a long way in creating sustainable societies. Technology is playing a vital role in helping this mission, and over the last couple of years there has been a plethora of social media avenues to report civic issues. Sites like Twitter, Facebook, and other online portals help citizens report issues and register complaints. These complaints are analyzed by the public services to help understand and in turn address these issues. However, once a complaint is registered, often no formal or informal feedback is given back to the citizens. This demotivates citizens and may deter them from registering further complaints. In addition, these sites offer no holistic information about a neighborhood to the citizens. It is useful for people to know whether similar complaints have been posted by other people in the same area, to see the profile of all complaints, and to know how and when these complaints will be addressed.

In this paper, we create a conversation-based platform, CitiCafe, for enhancing citizen engagement, front-ended by a virtual agent with a Twitter interface. The platform’s back end stores and processes information pertaining to civic complaints in a city. A Twitter-based conversation service allows citizens to correspond directly with CitiCafe via “tweets” and direct messages. The platform also helps citizens to (a) report problems and (b) gather information related to civic issues in different neighborhoods. This can also help, in the long run, to develop civic conversations among citizens and also between citizens and public services….(More)”.
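
The paper doesn’t include code, but the back-end idea it describes – storing complaints by neighborhood so a citizen can ask what else has been reported nearby – can be caricatured in a few lines. Every name below is hypothetical, not taken from the CitiCafe implementation:

```python
# Toy sketch of a CitiCafe-style complaint store: register civic complaints
# by neighborhood and let a citizen see similar reports filed nearby.
# All class and field names are hypothetical, not from the paper.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Complaint:
    category: str        # e.g. "pothole", "streetlight", "garbage"
    description: str
    status: str = "open"

class ComplaintStore:
    """Toy in-memory store keyed by neighborhood."""
    def __init__(self):
        self._by_area = defaultdict(list)

    def report(self, neighborhood: str, complaint: Complaint) -> None:
        self._by_area[neighborhood].append(complaint)

    def similar(self, neighborhood: str, category: str) -> list:
        """Complaints of the same category already filed in the area."""
        return [c for c in self._by_area[neighborhood]
                if c.category == category]

store = ComplaintStore()
store.report("Indiranagar", Complaint("pothole", "Deep pothole near 100ft Road"))
store.report("Indiranagar", Complaint("pothole", "Road caved in at 12th Main"))
print(len(store.similar("Indiranagar", "pothole")), "similar complaints nearby")
```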