How Democracy Can Survive Big Data


Colin Koopman in The New York Times: “…The challenge of designing ethics into data technologies is formidable. This is in part because it requires overcoming a century-long ethos of data science: Develop first, question later. Datafication first, regulation afterward. A glimpse at the history of data science shows as much.

The techniques that Cambridge Analytica uses to produce its psychometric profiles are the cutting edge of data-driven methodologies first devised 100 years ago. The science of personality research was born in 1917. That year, in the midst of America’s fevered entry into war, Robert Sessions Woodworth of Columbia University created the Personal Data Sheet, a questionnaire that promised to assess the personalities of Army recruits. The war ended before Woodworth’s psychological instrument was ready for deployment, but the Army had envisioned its use according to the precedent set by the intelligence tests it had been administering to new recruits under the direction of Robert Yerkes, a professor of psychology at Harvard at the time. The data these tests could produce would help decide who should go to the fronts, who was fit to lead and who should stay well behind the lines.

The stakes of those wartime decisions were particularly stark, but the aftermath of those psychometric instruments is even more unsettling. As the century progressed, such tests — I.Q. tests, college placement exams, predictive behavioral assessments — would affect the lives of millions of Americans. Schoolchildren who may have once or twice acted out in such a way as to prompt a psychometric evaluation could find themselves labeled, setting them on an inescapable track through the education system.

Researchers like Woodworth and Yerkes (or their Stanford colleague Lewis Terman, who formalized the first SAT) did not anticipate the deep consequences of their work; they were too busy pursuing the great intellectual challenges of their day, much like Mr. Zuckerberg in his pursuit of the next great social media platform. Or like Cambridge Analytica’s Christopher Wylie, the twentysomething data scientist who helped build psychometric profiles of two-thirds of all Americans by leveraging personal information gained through uninformed consent. All of these researchers were, quite understandably, obsessed with the great data science challenges of their generation. Their failure to consider the consequences of their pursuits, however, is not so much their fault as it is our collective failing.

For the past 100 years we have been chasing visions of data with a singular passion. Many of the best minds of each new generation have devoted themselves to delivering on the inspired data science promises of their day: intelligence testing, building the computer, cracking the genetic code, creating the internet, and now this. We have in the course of a single century built an entire society, economy and culture that runs on information. Yet we have hardly begun to engineer data ethics appropriate for our extraordinary information carnival. If we do not do so soon, data will drive democracy, and we may well lose our chance to do anything about it….(More)”.

How the government will operate in 2030


Darrell West at the Hill: “Imagine it is 2030 and you are a U.S. government employee working from home. With the assistance of the latest technology, you participate in video calls with clients and colleagues, augment your job activities through artificial intelligence and a personal digital assistant, work through collaboration software, and regularly get rated on a one-to-five scale by clients regarding your helpfulness, follow-through, and task completion.

How did you — and the government — get here? The sharing economy that unfolded in 2018 has revolutionized the public-sector workforce. The days when federal employees were subject to a centrally directed Office of Personnel Management that oversaw permanent, full-time workers sitting in downtown office buildings are long gone. In their place is a remote workforce staffed by a mix of short- and long-term employees. This has dramatically improved worker productivity and satisfaction.

In the new digital world that has emerged, the goal is to use technology to make employees accountable. Gone are 20- or 30-year careers in the federal bureaucracy. Political leaders have always preached the virtue of running government like a business, and the success of Uber, Airbnb, and WeWork has persuaded them to focus on accountability and performance.

Companies such as Facebook demonstrated they could run large and complex organizations with less than 20,000 employees, and the federal government followed suit in the late 2020s. Now, workers deploy the latest tools of artificial intelligence, virtual reality, data analytics, robots, driverless cars, and digital assistants to improve the government. Unlike the widespread mistrust and cynicism that had poisoned attitudes in the decades before, the general public now sees government as a force for achieving positive results.

Many parts of the federal government are decentralized and mid-level employees are given greater authority to make decisions — but are subject to digital ratings that keep them accountable for their performance. The U.S. government borrowed this technique from China, where airport authorities in 2018 installed digital devices that allowed visitors to rate the performance of individual passport officers after every encounter. The reams of data have enabled Chinese authorities to fire poor performers and make sure foreign visitors see a friendly and competent face at the Beijing International Airport.

Alexa-like devices are given to all federal employees. The devices are used to keep track of leave time, file reimbursement requests, request time off, and complete a range of routine tasks that used to take employees hours. Through voice-activated commands, they navigate these mundane tasks quickly and efficiently. No one can believe the mountains of paperwork required just a decade ago….(More)”.

The People vs. Democracy: Why Our Freedom Is in Danger and How to Save It


Book by Yascha Mounk: “The world is in turmoil. From India to Turkey and from Poland to the United States, authoritarian populists have seized power. As a result, Yascha Mounk shows, democracy itself may now be at risk.

Two core components of liberal democracy—individual rights and the popular will—are increasingly at war with each other. As the role of money in politics soared and important issues were taken out of public contestation, a system of “rights without democracy” took hold. Populists who rail against this say they want to return power to the people. But in practice they create something just as bad: a system of “democracy without rights.”

The consequence, Mounk shows in The People vs. Democracy, is that trust in politics is dwindling. Citizens are falling out of love with their political system. Democracy is wilting away. Drawing on vivid stories and original research, Mounk identifies three key drivers of voters’ discontent: stagnating living standards, fears of multiethnic democracy, and the rise of social media. To reverse the trend, politicians need to enact radical reforms that benefit the many, not the few.

The People vs. Democracy is the first book to go beyond a mere description of the rise of populism. In plain language, it describes both how we got here and where we need to go. For those unwilling to give up on either individual rights or the popular will, Mounk shows, there is little time to waste: this may be our last chance to save democracy….(More)”

Can Social Media Help Build Communities?


Paper by Eric Forbush and Nicol Turner-Lee: “In June 2017, Mark Zuckerberg proclaimed a new mission for Facebook, which was to “[g]ive people the power to build community and bring the world closer together” during the company’s first Community Summit. Yet, his declaration comes in the backdrop of a politically polarized America. While research has indicated that ideological polarization (the alignment and divergence of ideologies) has remained relatively unchanged, affective polarization (the degree to which Democrats and Republicans dislike each other) has skyrocketed (Gentzkow, 2016; Lelkes, 2016). This dislike for members of the opposite party may be amplified on social media platforms.

Social media have been accused of making our social networks increasingly insular, resulting in “echo chambers,” wherein individuals select information and friends who support their already held beliefs (Quattrociocchi, Scala, and Sunstein, 2016; Williams, McMurray, Kurz, and Lambert, 2015). However, the implicit message in Zuckerberg’s comments, and those of other leaders in this space, is that social media can provide users with a means for brokering relationships with other users who hold different values and beliefs from them. Yet little is known about the extent to which social media platforms enable these opportunities.

Theories of prejudice reduction (Paluck and Green, 2009) partially explain an idealistic outcome of improved online relationships. In his seminal contact theory, Gordon Allport (1954) argued that under certain optimal conditions, all that is needed to reduce prejudice is for members of different groups to spend more time interacting with each other. However, contemporary social media platforms may not be doing enough to increase intergroup engagements, especially between politically polarized communities on issues of importance.

In this paper, we use Twitter data collected over a 20-day period, following the Day of Action for Net Neutrality on July 12, 2017. In support of a highly polarized regulatory issue, the Day of Action was organized by advocacy groups and corporations in support of an open internet, which does not discriminate against online users when accessing their preferred content. Analyzing 81,316 tweets about #netneutrality from 40,502 distinct users, we use social network analysis to develop network visualizations and conduct discrete content analysis of central tweets. Our research also divides the content by those in support and those opposed to any type of repeal of net neutrality rules by the FCC.

Our analysis of this particular issue reveals that social media is merely replicating, and potentially strengthening, polarization on issues by party affiliations and online associations. Mediators who are able to bridge online conversations or beliefs on charged issues appear to be nonexistent on both sides of the issue. Consequently, our findings suggest that social media companies may not be doing enough to bring communities together through meaningful conversations on their platforms….(More)”.

Lessons from Cambridge Analytica: one way to protect your data


Julia Apostle in the Financial Times: “The unsettling revelations about how data firm Cambridge Analytica surreptitiously exploited the personal information of Facebook users is yet another demoralising reminder of how much data has been amassed about us, and of how little control we have over it.

Unfortunately, the General Data Protection Regulation privacy laws that are coming into force across Europe — with more demanding consent, transparency and accountability requirements, backed by huge fines — may improve practices, but they will not change the governing paradigm: the law labels those who gather our data as “controllers”. We are merely “subjects”.

But if the past 20 years have taught us anything, it is that when business and legislators have been too slow to adapt to public demand — for goods and services that we did not even know we needed, such as Amazon, Uber and bitcoin — computer scientists have stepped in to fill the void. And so it appears that the realms of data privacy and security are deserving of some disruption. This might come in the form of “self-sovereign identity” systems.

The theory behind self-sovereign identity is that individuals should control the data elements that form the basis of their digital identities, and not centralised authorities such as governments and private companies. In the current online environment, we all have multiple log-ins, usernames, customer IDs and personal data spread across countless platforms and stored in myriad repositories.

Instead of this scattered approach, we should each possess the digital equivalent of a wallet that contains verified pieces of our identities. We can then choose which identification to share, with whom, and when. Self-sovereign identity systems are currently being developed.

They involve the creation of a unique and persistent identifier attributed to an individual (called a decentralised identity), which cannot be taken away. The systems use public/private key cryptography, which enables a user with a private key (a string of numbers) to share information with unlimited recipients who can access the encrypted data if they possess a corresponding public key.
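One way to picture the key-pair idea described above is a signed claim: the holder of the private key signs a piece of identity data, and anyone holding the matching public key can check that it really came from that holder and was not altered. The following is a minimal sketch, assuming textbook RSA with small well-known primes and no padding. It is a toy for illustration only, not secure, and not how uPort or any other real self-sovereign identity system is implemented. The function names `make_keypair`, `sign`, and `verify`, and the sample claim, are hypothetical.

```python
import hashlib

# Toy textbook RSA: illustration only, NOT secure (small primes, no padding).

def make_keypair(p, q, e=65537):
    """Build a (public, private) key pair from two primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (n, e), (n, d)

def digest(message, n):
    """Hash the message and reduce it to a number below n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message, private_key):
    """Only the private-key holder can produce this value."""
    n, d = private_key
    return pow(digest(message, n), d, n)

def verify(message, signature, public_key):
    """Anyone with the public key can check the signature."""
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)

# Two Mersenne primes, chosen only because they are well known.
public_key, private_key = make_keypair(2**61 - 1, 2**31 - 1)

claim = b"resident of Zug: yes"  # hypothetical identity claim
signature = sign(claim, private_key)

print(verify(claim, signature, public_key))
print(verify(b"resident of Zug: no", signature, public_key))
```

The first check should succeed and the second should fail, because the altered claim no longer matches the signature. A real system would rely on a vetted cryptography library and a standard signature scheme rather than hand-rolled arithmetic.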

The systems also rely on decentralised ledger applications like blockchain. While key cryptography has been around for a long time, it is the development of decentralised ledger technology, which also supports the trading of cryptocurrencies without the involvement of intermediaries, that will allow self-sovereign identity systems to take off. The potential uses for decentralised identity are legion and small-scale implementation is already happening. The Swiss municipality of Zug started using a decentralised identity system called uPort last year, to allow residents access to certain government services. The municipality announced it will also use the system for voting this spring….

Decentralised identity is more difficult to access and therefore there is less financial incentive for hackers to try. Self-sovereign identity systems could eliminate many of our data privacy concerns while empowering individuals in the online world and turning the established data order on its head. But the success of the technology depends on its widespread adoption….(More)”.

Psychographics: the behavioural analysis that helped Cambridge Analytica know voters’ minds


Michael Wade at The Conversation: “Much of the discussion has been on how Cambridge Analytica was able to obtain data on more than 50m Facebook users – and how it allegedly failed to delete this data when told to do so. But there is also the matter of what Cambridge Analytica actually did with the data. In fact the data crunching company’s approach represents a step change in how analytics can today be used as a tool to generate insights – and to exert influence.

For example, pollsters have long used segmentation to target particular groups of voters, such as through categorising audiences by gender, age, income, education and family size. Segments can also be created around political affiliation or purchase preferences. The data analytics machine that presidential candidate Hillary Clinton used in her 2016 campaign – named Ada after the 19th-century mathematician and early computing pioneer – used state-of-the-art segmentation techniques to target groups of eligible voters in the same way that Barack Obama had done four years previously.

Cambridge Analytica was contracted to the Trump campaign and provided an entirely new weapon for the election machine. While it also used demographic segments to identify groups of voters, as Clinton’s campaign had, Cambridge Analytica also segmented using psychographics. As definitions of class, education, employment, age and so on, demographics are informational. Psychographics are behavioural – a means to segment by personality.

This makes a lot of sense. It’s obvious that two people with the same demographic profile (for example, white, middle-aged, employed, married men) can have markedly different personalities and opinions. We also know that adapting a message to a person’s personality – whether they are open, introverted, argumentative, and so on – goes a long way to help getting that message across….

There have traditionally been two routes to ascertaining someone’s personality. You can either get to know them really well – usually over an extended time. Or you can get them to take a personality test and ask them to share it with you. Neither of these methods is realistically open to pollsters. Cambridge Analytica found a third way, with the assistance of two University of Cambridge academics.

The first, Aleksandr Kogan, sold them access to 270,000 personality tests completed by Facebook users through an online app he had created for research purposes. Providing the data to Cambridge Analytica was, it seems, against Facebook’s internal code of conduct, but only now in March 2018 has Kogan been banned by Facebook from the platform. In addition, Kogan’s data also came with a bonus: he had reportedly collected Facebook data from the test-takers’ friends – and, at an average of 200 friends per person, that added up to some 50m people.
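The arithmetic behind the 50m figure can be checked quickly. This is a rough sketch using only the rounded numbers quoted above. The naive product treats every friend list as entirely distinct, which is an assumption: in practice friend lists overlap, so the product is better read as a ceiling than as a count.

```python
test_takers = 270_000        # personality tests, as quoted above
average_friends = 200        # average friends per test-taker, as quoted above
reported_reach = 50_000_000  # "some 50m people"

# Naive ceiling: assumes no two test-takers share any friends.
naive_reach = test_takers * average_friends
print(f"{naive_reach:,}")  # 54,000,000

# Share of the naive ceiling that the reported figure represents.
share = reported_reach / naive_reach
print(f"{share:.0%}")  # 93%
```

The reported figure sits a little under the naive ceiling, which is consistent with some overlap among friend lists and with the rounding of the inputs.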

However, these 50m people had not all taken personality tests. This is where the second Cambridge academic, Michal Kosinski, came in. Kosinski – who is said to believe that micro-targeting based on online data could strengthen democracy – had figured out a way to reverse engineer a personality profile from Facebook activity such as likes. Whether you choose to like pictures of sunsets, puppies or people apparently says a lot about your personality. So much, in fact, that on the basis of 300 likes, Kosinski’s model is able to predict someone’s personality profile with the same accuracy as a spouse….(More)”

Cambridge Analytica scandal: legitimate researchers using Facebook data could be collateral damage


 at The Conversation: “The scandal that has erupted around Cambridge Analytica’s alleged harvesting of 50m Facebook profiles assembled from data provided by a UK-based academic and his company is a worrying development for legitimate researchers.

Political data analytics company Cambridge Analytica – which is affiliated with Strategic Communication Laboratories (SCL) – reportedly used Facebook data, after it was handed over by Aleksandr Kogan, a lecturer at the University of Cambridge’s department of psychology.

Kogan, through his company Global Science Research (GSR) – separate from his university work – gleaned the data from a personality test app named “thisisyourdigitallife”. Roughly 270,000 US-based Facebook users voluntarily responded to the test in 2014. But the app also collected data on those participants’ Facebook friends without their consent.

This was possible due to Facebook rules at the time that allowed third-party apps to collect data about a Facebook user’s friends. The Mark Zuckerberg-run company has since changed its policy to prevent such access to developers….

Social media data is a rich source of information for many areas of research in psychology, technology, business and humanities. Some recent examples include using Facebook to predict riots, comparing the use of Facebook with body image concern in adolescent girls and investigating whether Facebook can lower levels of stress responses, with research suggesting that it may enhance and undermine psycho-social constructs related to well-being.

It is right to believe that researchers and their employers value research integrity. But instances where trust has been betrayed by an academic – even if it’s the case that data used for university research purposes wasn’t caught in the crossfire – will have a negative impact on whether participants will continue to trust researchers. It also has implications for research governance and for companies to share data with researchers in the first place.

Universities, research organisations and funders govern the integrity of research with clear and strict ethics procedures designed to protect participants in studies, such as where social media data is used. The harvesting of data without permission from users is considered an unethical activity under commonly understood research standards.

The fallout from the Cambridge Analytica controversy is potentially huge for researchers who rely on social networks for their studies, where data is routinely shared with them for research purposes. Tech companies could become more reluctant to share data with researchers. Facebook is already extremely protective of its data – the worry is that it could become doubly difficult for researchers to legitimately access this information in light of what has happened with Cambridge Analytica….(More)”.

Artificial Intelligence and the Need for Data Fairness in the Global South


Medium blog by Yasodara Cordova: “…The data collected by industry represents AI opportunities for governments, to improve their services through innovation. Data-based intelligence promises to increase the efficiency of resource management by improving transparency, logistics, social welfare distribution — and virtually every government service. E-government enthusiasm took off with the realization of the possible applications, such as using AI to fight corruption by automating the fraud-tracking capabilities of cost-control tools. Controversially, the AI enthusiasm has spread to the distribution of social benefits, optimization of tax oversight and control, credit scoring systems, crime prediction systems, and other applications based on personal and sensitive data collection, especially in countries that do not have comprehensive privacy protections.

There are so many potential applications that society may operate very differently in ten years, when the “datafixation” has advanced beyond citizen data and into other applications such as energy and natural resource management. However, many countries in the Global South are not being given necessary access to their countries’ own data.

Useful data are everywhere, but only some can take advantage. Beyond smartphones, data can be collected from IoT components in common spaces. Not restricted to urban spaces, data collection includes rural technology like sensors installed in tractors. However, even when the information is related to issues of public importance in developing countries, such as data taken from road mesh or vital resources like water and land, it stays hidden under contract rules, and citizens cannot access it or benefit from it. This arrangement keeps the public uninformed about their country’s operations. The data collection and distribution frameworks are not built towards healthy partnerships between industry and government, preventing countries from realizing the potential outlined in the previous paragraph.

The data necessary to the development of better cities, public policies, and common interest cannot be leveraged if kept in closed silos, yet access often costs more than is justifiable. Data are a primordial resource to all stages of new technology, especially tech adoption and integration, so the necessary long term investment in innovation needs a common ground to start with. The mismatch between the pace of the data collection among big established companies and small, new, and local businesses will likely increase with time, assuming no regulation is introduced for equal access to collected data….

Currently, data independence remains restricted to discussions on the technological infrastructure that supports data extraction. Privacy discussions focus on personal data rather than the digital accumulation of strategic data in closed silos — a necessary discussion not yet addressed. The national interest of data is not being addressed in a framework of economic and social fairness. Access to data, from a policy-making standpoint, needs to find a balance between the extremes of public, open access and limited, commercial use.

A final, but important note: the vast majority of social media act like silos. APIs play an important role in corporate business models, where industry controls the data it collects without reward, let alone user transparency. Negotiation of the specification of APIs to make data a common resource should be considered, for such an effort may align with the citizens’ interest….(More)”.

Truth Decay: An Initial Exploration of the Diminishing Role of Facts and Analysis in American Public Life


Report by Jennifer Kavanagh and Michael D. Rich: “Over the past two decades, national political and civil discourse in the United States has been characterized by “Truth Decay,” defined as a set of four interrelated trends: an increasing disagreement about facts and analytical interpretations of facts and data; a blurring of the line between opinion and fact; an increase in the relative volume, and resulting influence, of opinion and personal experience over fact; and lowered trust in formerly respected sources of factual information. These trends have many causes, but this report focuses on four: characteristics of human cognitive processing, such as cognitive bias; changes in the information system, including social media and the 24-hour news cycle; competing demands on the education system that diminish time spent on media literacy and critical thinking; and polarization, both political and demographic. The most damaging consequences of Truth Decay include the erosion of civil discourse, political paralysis, alienation and disengagement of individuals from political and civic institutions, and uncertainty over national policy.

This report explores the causes and consequences of Truth Decay and how they are interrelated, and examines past eras of U.S. history to identify evidence of Truth Decay’s four trends and observe similarities with and differences from the current period. It also outlines a research agenda, a strategy for investigating the causes of Truth Decay and determining what can be done to address its causes and consequences….(More)”.

How tech used to track the flu could change the game for public health response


Cathie Anderson in the Sacramento Bee: “Tech entrepreneurs and academic researchers are tracking the spread of flu in real-time, collecting data from social media and internet-connected devices that show startling accuracy when compared against surveillance data that public health officials don’t report until a week or two later….

Smart devices and mobile apps have the potential to reshape public health alerts and responses…. For instance, the staff of smart thermometer maker Kinsa were receiving temperature readings that augured the surge of flu patients in emergency rooms there.

Kinsa thermometers are part of the movement toward the Internet of Things – devices that automatically transmit information to a database. No personal information is shared, unless users decide to input information such as age and gender. Using data from more than 1 million devices in U.S. homes, the staff is able to track fever as it hits and use an algorithm to estimate impact for a broader population….

Computational researcher Aaron Miller worked with an epidemiological team at the University of Iowa to assess the feasibility of using Kinsa data to forecast the spread of flu. He said the team first built a model using surveillance data from the CDC and used it to forecast the spread of influenza. Then the team created a model where they integrated the data from Kinsa along with that from the CDC.

“We got predictions that were … 10 to 50 percent better at predicting the spread of flu than when we used CDC data alone,” Miller said. “Potentially, in the future, if you had granular information from the devices and you had enough information, you could imagine doing analysis on a really local level to inform things like school closings.”
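A statement such as "10 to 50 percent better" is usually a relative reduction in forecast error. The excerpt does not say which error measure the Iowa team used, so the sketch below assumes mean absolute error, and every number in it is made up purely to show the calculation. None of it is CDC or Kinsa data.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the gap between forecast and observed values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_improvement(baseline_error, new_error):
    """Fraction by which the new model's error is lower than the baseline's."""
    return (baseline_error - new_error) / baseline_error

# Hypothetical weekly flu activity values, invented for illustration.
observed = [2.0, 3.0, 5.0, 4.0]
baseline_forecast = [3.0, 4.0, 3.0, 5.0]  # stand-in for a surveillance-only model
combined_forecast = [2.5, 3.5, 4.0, 4.5]  # stand-in for a model with device data

baseline_error = mean_absolute_error(observed, baseline_forecast)
combined_error = mean_absolute_error(observed, combined_forecast)

print(baseline_error)  # 1.25
print(combined_error)  # 0.625
print(f"{relative_improvement(baseline_error, combined_error):.0%}")  # 50%
```

With these invented inputs the combined model halves the error, which would sit at the top of the quoted range. Real results depend on the error measure, the season, and the region being forecast.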

While Kinsa uses readings taken in homes, academic researchers and companies such as sickweather.com are using crowdsourcing from social media networks to provide information on the spread of flu. Siddharth Shah, a transformational health industry analyst at Frost & Sullivan, pointed to an award-winning international study led by researchers at Northeastern University that tracked flu through Twitter posts and other key parameters of flu.

When compared with official influenza surveillance systems, the researchers said, the model accurately forecast the evolution of influenza up to six weeks in advance, much earlier than prior models. Such advance warnings would give health agencies significantly more time to expand upon medical resources or to alert the public to measures they can take to prevent transmission of the disease….

For now, Shah said, technology will probably only augment or complement traditional public data streams. However, he added, innovations already are changing how diseases are tracked. Chronic disease management, for instance, is going digital with tools such as Omada Health, which helps people with Type 2 diabetes better manage health challenges, and Noom, a mobile app that helps people stop dieting and instead work toward true lifestyle change….(More)”.