When the Big Lie Meets Big Data


Peter Bruce in Scientific American: “…The science of predictive modeling has come a long way since 2004. Statisticians now build “personality” models and tie them into other predictor variables. … One such model bears the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism. Using Big Data at the individual level, machine learning methods might classify a person as, for example, “closed, introverted, neurotic, not agreeable, and conscientious.”
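The final labeling step Bruce describes can be sketched in a few lines: given model-estimated Big Five trait scores, threshold each one to its low or high pole. The scores, threshold, and pole names below are invented for illustration, not drawn from any real profiling system.

```python
# Toy sketch of the labeling step: the trait scores, 0.5 threshold, and
# pole names are invented; a real system would estimate scores with a
# trained model over behavioral data.
TRAITS = {
    "openness": ("closed", "open"),
    "conscientiousness": ("careless", "conscientious"),
    "extroversion": ("introverted", "extroverted"),
    "agreeableness": ("not agreeable", "agreeable"),
    "neuroticism": ("emotionally stable", "neurotic"),
}

def label(scores, threshold=0.5):
    """Map each trait score in [0, 1] to its low or high pole."""
    return [TRAITS[trait][score >= threshold] for trait, score in scores.items()]

profile = {"openness": 0.2, "conscientiousness": 0.8, "extroversion": 0.1,
           "agreeableness": 0.3, "neuroticism": 0.9}
print(", ".join(label(profile)))
# → closed, conscientious, introverted, not agreeable, neurotic
```

The hypothetical `profile` reproduces the kind of label string quoted above; everything interesting in a real system lies in estimating those scores from behavioral data, not in this trivial thresholding.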

Alexander Nix, CEO of Cambridge Analytica (owned by Trump’s chief donor, Rebekah Mercer), says he has thousands of data points on you, and every other voter: what you buy or borrow, where you live, what you subscribe to, what you post on social media, etc. At a recent Concordia Summit, using the example of gun rights, Nix described how messages will be crafted to appeal specifically to you, based on your personality profile. Are you highly neurotic and conscientious? Nix suggests the image of a sinister gloved hand reaching through a broken window.

In his presentation, Nix noted that the goal is to induce behavior, not communicate ideas. So where does truth fit in? Johan Ugander, Assistant Professor of Management Science at Stanford, suggests that, for Nix and Cambridge Analytica, it doesn’t. In counseling the hypothetical owner of a private beach how to keep people off his property, Nix eschews the merely factual “Private Beach” sign, advocating instead a lie: “Sharks sighted.” Ugander, in his critique, cautions all data scientists against “building tools for unscrupulous targeting.”

The warning is needed, but may be too late. What Nix described in his presentation involved carefully crafted messages aimed at his target personalities. His messages pulled subtly on various psychological strings to manipulate us, and they obeyed no boundary of truth, but they required humans to create them. The next phase will be the gradual replacement of human “craftsmanship” with machine learning algorithms that can supply targeted voters with a steady stream of content (from whatever source, true or false) designed to elicit desired behavior. Cognizant of the Pandora’s box that data scientists have opened, the scholarly journal Big Data has issued a call for papers for a future issue devoted to “Computational Propaganda.”…(More)”

Open Government Data in Africa: A preference elicitation analysis of media practitioners


Eric Afful-Dadzie and Anthony Afful-Dadzie in Government Information Quarterly: “Open Government Data (OGD) continues to gain considerable traction around the world. In particular, there have been a growing number of OGD establishments in the developed world, sparking expectations of similar trends in growing democracies. To understand the readiness of OGD stakeholders in Africa, especially the media, this paper (1) reviews current infrastructure at OGD web portals in Africa and (2) conducts a preference elicitation analysis among media practitioners in 5 out of the 7 OGD country centers in Africa regarding the desired structure of OGD in developing countries. The analysis gives a view of the relative importance media practitioners ascribe to a selected set of OGD attributes in anticipation of a more functional OGD in their respective countries. Using conjoint analysis, the result indicates that media practitioners put a premium on ‘metadata’ and ‘data format’, in that order of importance. Results from the review also reveal that features of current OGD web portals in Africa are not consistent with the desired preferences of users. Overall, the study provides general insight into media expectations of OGD in Africa, and also serves as foundational knowledge for authorities and practitioners to manage expectations of the media in connection with OGD in Africa….(More)”.
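Conjoint analysis of the kind the paper describes can be sketched as a dummy-coded regression: respondents rate hypothetical portal profiles, the fitted coefficients are the attributes' part-worth utilities, and relative importance is each attribute's share of the total utility range. The attributes and ratings below are invented for illustration, not the study's data.

```python
import numpy as np

# Hypothetical OGD portal profiles, dummy-coded on two attributes:
# has_metadata (0/1) and machine_readable_format (0/1).
# The ratings are invented, not the study's data.
profiles = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
    [0, 0],
])
ratings = np.array([9.0, 7.0, 5.0, 2.0])

# Fit ratings ~ intercept + part-worths by ordinary least squares.
X = np.column_stack([np.ones(len(profiles)), profiles])
coefs, *_ = np.linalg.lstsq(X, ratings, rcond=None)
intercept, pw_metadata, pw_format = coefs

# Relative importance = each attribute's utility range over the total range.
ranges = np.abs([pw_metadata, pw_format])
importance = ranges / ranges.sum()
print(importance)  # with these toy ratings, metadata outweighs data format
```

With these invented numbers the metadata part-worth dominates, mirroring the ordering the study reports; a real conjoint design would use many more respondents, attribute levels, and a fractional-factorial profile set.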

Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media


Paper by Rupinder Paul Khandpur, Taoran Ji, Steve Jan, Gang Wang, Chang-Tien Lu, Naren Ramakrishnan: “Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDoS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods….(More)”
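The seed-trigger idea can be caricatured in a few lines: flag tweets containing any of a fixed set of attack-related terms, then grow the vocabulary from words that repeatedly co-occur with the seeds. The seed list and tweets below are invented; the paper's actual expansion uses convolutional kernels over dependency parses, which this naive co-occurrence count only gestures at.

```python
import re
from collections import Counter

# Fixed seed event triggers (illustrative, not the paper's actual list).
seeds = {"ddos", "breach", "hijacked"}

tweets = [
    "Massive DDoS attack takes bank website offline",
    "Customer data breach reported, attack traced to botnet",
    "Weather is lovely today",
    "Account hijacked after credential leak",
]

def tokens(text):
    """Lowercase word tokens, punctuation stripped."""
    return re.findall(r"[a-z]+", text.lower())

def is_attack_report(tweet, triggers):
    """Flag a tweet if it contains any trigger term."""
    return bool(set(tokens(tweet)) & triggers)

flagged = [t for t in tweets if is_attack_report(t, seeds)]

# Naive expansion: adopt non-seed words co-occurring with seeds in >= 2 tweets.
cooc = Counter(w for t in flagged for w in set(tokens(t)) - seeds)
expanded = seeds | {w for w, n in cooc.items() if n >= 2}
print(len(flagged), sorted(expanded - seeds))
# → 3 ['attack']
```

Here "attack" is promoted into the trigger set because it co-occurs with seeds in two flagged tweets; the unsupervised loop would then re-scan the stream with the expanded vocabulary.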

Social Media for Government


Book by Gohar Feroz Khan: “This book provides practical know-how on understanding, implementing, and managing mainstream social media tools (e.g., blogs and micro-blogs, social network sites, and content communities) from a public sector perspective. Through social media, government organizations can inform citizens, promote their services, seek public views and feedback, and monitor satisfaction with the services they offer so as to improve their quality. Given the exponential growth of social media in contemporary society, it has become an essential tool for communication, content sharing, and collaboration. This growth and these tools also present an unparalleled opportunity to implement a transparent, open, and collaborative government. However, many government organizations, particularly those in the developing world, are still somewhat reluctant to leverage social media, as it requires significant policy and governance changes, as well as specific know-how, skills, and resources to plan, implement, and manage social media tools. As a result, governments around the world ignore or mishandle the opportunities and threats presented by social media. To help policy makers and governments implement a social media-driven government, this book provides guidance in developing an effective social media policy and strategy. It also addresses issues such as those related to security and privacy….(More)”

Why We Make Free, Public Information More Accessible


Gabi Fitz and Lisa Brooks in PhilanTopic: “One of the key roles the nonprofit sector plays in civil society is providing evidence about social problems and their solutions. Given recent changes to policies regarding the sharing of knowledge and evidence by federal agencies, that function is more critical than ever.

Nonprofits deliver more than direct services such as running food banks or providing shelter to people who are homeless. They also collect and share data, evidence, and lessons learned so as to help all of us understand complex and difficult problems.

Those efforts not only serve to illuminate and benchmark our most pressing social problems; they also inform the actions we take, whether at the individual, organizational, community, or policy level. Often, they provide the evidence in “evidence-based” decision making, not to mention the knowledge that social sector organizations and policy makers rely on when shaping their programs and services, and that individual citizens turn to when informing their own engagement.

In January 2017, several U.S. government agencies, including the Environmental Protection Agency and the Departments of Health and Human Services and Agriculture, were ordered by officials of the incoming Trump administration not to share anything that could be construed as controversial through official communication channels such as websites and social media channels. (See “Federal Agencies Told to Halt External Communications.”) Against that backdrop, the nonprofit sector’s interest in generating and sharing evidence has become more urgent than ever…..

Providing access to evidence and lessons learned is always important, but in light of recent events, we believe it’s more necessary than ever. That’s why we are asking for your help in providing — and preserving — access to this critical knowledge base.

Over the next few months, we will be updating and maintaining special collections of non-academic research on the following topics and need lead curators with issue expertise to lend us a hand. IssueLab special collections are an effort to contextualize important segments of the growing evidence base we curate, and are one of the ways we help visitors to the platform learn about nonprofit organizations and resources that may be useful to their work and knowledge-gathering efforts.

Possible special collection topics to be updated or curated:

→ Access to reproductive services (new)
→ Next steps for ACA
→ Race and policing
→ Immigrant detention and deportation
→ Climate change and extractive mining (new)
→ Veterans affairs
→ Gun violence

If you are a researcher, knowledge broker, or service provider in any of these fields of practice, please consider volunteering as a lead curator. …(More)”

Corporate Social Responsibility for a Data Age


Stefaan G. Verhulst in the Stanford Social Innovation Review: “Proprietary data can help improve and save lives, but fully harnessing its potential will require a cultural transformation in the way companies, governments, and other organizations treat and act on data….

We live, as it is now common to point out, in an era of big data. The proliferation of apps, social media, and e-commerce platforms, as well as of sensor-rich consumer devices like mobile phones, wearable devices, commercial cameras, and even cars, generates zettabytes of data about the environment and about us.

Yet much of the most valuable data resides with the private sector—for example, in the form of click histories, online purchases, sensor data, and call data records. This limits its potential to benefit the public and to turn data into a social asset. Consider how data held by business could help improve policy interventions (such as better urban planning) or resiliency at a time of climate change, or help design better public services to increase food security.

Data responsibility suggests steps that organizations can take to break down these private barriers and foster so-called data collaboratives, or ways to share their proprietary data for the public good. For the private sector, data responsibility represents a new type of corporate social responsibility for the 21st century.

While Nepal’s Ncell belongs to a relatively small group of corporations that have shared their data, there are a few encouraging signs that the practice is gaining momentum. In Jakarta, for example, Twitter exchanged some of its data with researchers who used it to gather and display real-time information about massive floods. The resulting website, PetaJakarta.org, enabled better flood assessment and management processes. And in Senegal, the Data for Development project has brought together leading cellular operators to share anonymous data to identify patterns that could help improve health, agriculture, urban planning, energy, and national statistics.

Examples like this suggest that proprietary data can help improve and save lives. But to fully harness the potential of data, data holders need to fulfill at least three conditions. I call these “the three pillars of data responsibility.”…

The difficulty of translating insights into results points to some of the larger social, political, and institutional shifts required to achieve the vision of data responsibility in the 21st century. The move from data shielding to data sharing will require a cultural transformation in the way companies, governments, and other organizations treat and act on data. We must incorporate new levels of proactiveness and make often-unfamiliar commitments to transparency and accountability.

By way of conclusion, here are four immediate steps—essential but not exhaustive—we can take to move forward:

  1. Data holders should issue a public commitment to data responsibility so that it becomes the default—an expected, standard behavior within organizations.
  2. Organizations should hire data stewards to determine what and when to share, and how to protect and act on data.
  3. We must develop a data responsibility decision tree to assess the value and risk of corporate data along the data lifecycle.
  4. Above all, we need a data responsibility movement; it is time to demand data responsibility to ensure data improves and safeguards people’s lives…(More)”

Why big data may be having a big effect on how our politics plays out


In The Conversation: “…big data… is an inconceivably vast mass of information, which at first glance would seem a giant mess: just white noise.

Unless you know how to decipher it.

According to a story first published in Zurich-based Das Magazin in December and more recently taken up by Motherboard, events such as Brexit and Trump’s ascendancy may have been made possible through just such deciphering. The argument is that technology combining psychological profiling and data analysis may have played a pivotal part in exploiting unconscious bias at the individual voter level. The theory is this was used in the recent US election to increase or suppress votes to benefit particular candidates in crucial locations. It is claimed that the company behind this may be active in numerous countries.

The technology at play is based on the integration of a model of psychological profiling known as OCEAN. This uses the details contained within individuals’ digital footprints to create user-specific profiles. These map to the level of the individual, identifiable voter, who can then be manipulated by exploiting beliefs, preferences and biases that they might not even be aware of, but which their data has revealed about them in glorious detail.

As well as enabling the creation of tailored media content, this can also be used to create scripts of relevant talking points for campaign doorknockers to focus on, according to the address and identity of the householder to whom they are speaking.

This goes well beyond the scope and detail of previous campaign strategies. If the theory about the role of these techniques is correct, it signals a new landscape of political strategising. An active researcher in the field, when writing about the company behind this technology (whose services Trump paid for during his election campaign), described the potential scale of such technologies:

Marketers have long tailored their placement of advertisements based on their target group, for example by placing ads aimed at conservative consumers in magazines read by conservative audiences. What is new about the psychological targeting methods implemented by Cambridge Analytica, however, is their precision and scale. According to CEO Alexander Nix, the company holds detailed psycho-demographic profiles of more than 220 million US citizens and used over 175,000 different ad messages to meet the unique motivations of their recipients….(More)”

Code-Dependent: Pros and Cons of the Algorithm Age


At the Pew Research Center: “Algorithms are instructions for solving a problem or completing a task. Recipes are algorithms, as are math equations. Computer code is algorithmic. The internet runs on algorithms and all online searching is accomplished through them. Email knows where to go thanks to algorithms. Smartphone apps are nothing but algorithms. Computer and video games are algorithmic storytelling. Online dating and book-recommendation and travel websites would not function without algorithms. GPS mapping systems get people from point A to point B via algorithms. Artificial intelligence (AI) is naught but algorithms. The material people see on social media is brought to them by algorithms. In fact, everything people see and do on the web is a product of algorithms. Every time someone sorts a column in a spreadsheet, algorithms are at play, and most financial transactions today are accomplished by algorithms. Algorithms help gadgets respond to voice commands, recognize faces, sort photos and build and drive cars. Hacking, cyberattacks and cryptographic code-breaking exploit algorithms. Self-learning and self-programming algorithms are now emerging, so it is possible that in the future algorithms will write many if not most algorithms.

Algorithms are often elegant and incredibly useful tools used to accomplish tasks. They are mostly invisible aids, augmenting human lives in increasingly incredible ways. However, sometimes the application of algorithms created with good intentions leads to unintended consequences. Recent news items tie to these concerns:

A City Is Not a Computer


At Places Journal: “…Modernity is good at renewing metaphors, from the city as machine, to the city as organism or ecology, to the city as cyborgian merger of the technological and the organic. Our current paradigm, the city as computer, appeals because it frames the messiness of urban life as programmable and subject to rational order. Anthropologist Hannah Knox explains, “As technical solutions to social problems, information and communications technologies encapsulate the promise of order over disarray … as a path to an emancipatory politics of modernity.” And there are echoes of the pre-modern, too. The computational city draws power from an urban imaginary that goes back millennia, to the city as an apparatus for record-keeping and information management.

We’ve long conceived of our cities as knowledge repositories and data processors, and they’ve always functioned as such. Lewis Mumford observed that when the wandering rulers of the European Middle Ages settled in capital cities, they installed a “regiment of clerks and permanent officials” and established all manner of paperwork and policies (deeds, tax records, passports, fines, regulations), which necessitated a new urban apparatus, the office building, to house its bureaus and bureaucracy. The classic example is the Uffizi (Offices) in Florence, designed by Giorgio Vasari in the mid-16th century, which provided an architectural template copied in cities around the world. “The repetitions and regimentations of the bureaucratic system” — the work of data processing, formatting, and storage — left a “deep mark,” as Mumford put it, on the early modern city.

Yet the city’s informational role began even earlier than that. Writing and urbanization developed concurrently in the ancient world, and those early scripts — on clay tablets, mud-brick walls, and landforms of various types — were used to record transactions, mark territory, celebrate ritual, and embed contextual information in landscape. Mumford described the city as a fundamentally communicative space, rich in information:

Through its concentration of physical and cultural power, the city heightened the tempo of human intercourse and translated its products into forms that could be stored and reproduced. Through its monuments, written records, and orderly habits of association, the city enlarged the scope of all human activities, extending them backwards and forwards in time. By means of its storage facilities (buildings, vaults, archives, monuments, tablets, books), the city became capable of transmitting a complex culture from generation to generation, for it marshaled together not only the physical means but the human agents needed to pass on and enlarge this heritage. That remains the greatest of the city’s gifts. As compared with the complex human order of the city, our present ingenious electronic mechanisms for storing and transmitting information are crude and limited.

Mumford’s city is an assemblage of media forms (vaults, archives, monuments, physical and electronic records, oral histories, lived cultural heritage); agents (architectures, institutions, media technologies, people); and functions (storage, processing, transmission, reproduction, contextualization, operationalization). It is a large, complex, and varied epistemological and bureaucratic apparatus. It is an information processor, to be sure, but it is also more than that.

Were he alive today, Mumford would reject the creeping notion that the city is simply the internet writ large. He would remind us that the processes of city-making are more complicated than writing parameters for rapid spatial optimization. He would inject history and happenstance. The city is not a computer. This seems an obvious truth, but it is being challenged now (again) by technologists (and political actors) who speak as if they could reduce urban planning to algorithms.

Why should we care about debunking obviously false metaphors? It matters because the metaphors give rise to technical models, which inform design processes, which in turn shape knowledges and politics, not to mention material cities. The sites and systems where we locate the city’s informational functions — the places where we see information-processing, storage, and transmission “happening” in the urban landscape — shape larger understandings of urban intelligence….(More)”

‘Collective intelligence’ is not necessarily present in virtual groups


Jordan B. Barlow and Alan R. Dennis at LSE: “Do groups of smart people perform better than groups of less intelligent people?

Research published in Science magazine in 2010 reported that groups, like individuals, have a certain level of “collective intelligence,” such that some groups perform consistently well across many different types of tasks, while other groups perform consistently poorly. Collective intelligence is similar to individual intelligence, but at the group level.

Interestingly, the Science study found that collective intelligence was not related to the individual intelligence of group members; groups of people with higher intelligence did not perform better than groups with lower intelligence. Instead, the study found that high performing teams had members with higher social sensitivity – the ability to read the emotions of others using visual facial cues.

Social sensitivity is important when we sit across a table from each other. But what about online, when we exchange emails or text messages? Does social sensitivity matter when I can’t see your face?

We examined collective intelligence in an online environment in which groups used text-based computer-mediated communication. We followed the same procedures as the original Science study, which relied on the approach typically used to measure individual intelligence. In individual intelligence tests, a person completes several small “tasks” or problems. An analysis of task scores typically demonstrates that the scores are correlated, meaning that if a person does well on one problem, it is likely that they did well on other problems….

The results were not what we expected. The correlations between our groups’ performance scores were either not statistically significant or significantly negative, as shown in Table 1. The average correlation between any two tasks was -0.05, indicating that performance on one task was not correlated with performance on other tasks. In other words, groups who performed well on one of the tasks were unlikely to perform well on the other tasks…
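The analysis the authors describe, correlating groups' scores across tasks, can be sketched with invented numbers (the scores below are illustrative, not the study's data): a general collective-intelligence factor would show up as consistently positive pairwise correlations between task columns.

```python
import statistics

# Invented scores for six groups on three tasks (not the study's data).
scores = {
    "task_a": [8, 6, 7, 3, 5, 4],
    "task_b": [2, 9, 4, 8, 3, 7],
    "task_c": [5, 4, 9, 2, 8, 3],
}

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# All pairwise task-to-task correlations, plus their average.
tasks = list(scores)
pairs = [(t1, t2) for i, t1 in enumerate(tasks) for t2 in tasks[i + 1:]]
corrs = {(t1, t2): pearson(scores[t1], scores[t2]) for t1, t2 in pairs}
avg = statistics.mean(corrs.values())
print(corrs, avg)
```

An average pairwise correlation near zero or negative, like the -0.05 the authors report for their online groups, is evidence against a single underlying factor; the original Science study's face-to-face groups showed the opposite pattern.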

Our findings challenge the conclusion reported in Science that groups have a general collective intelligence analogous to individual intelligence. Our study shows that no collective intelligence factor emerged when groups used a popular commercial text-based online tool. That is, when using tools with limited visual cues, groups that performed well on one task were no more likely to perform well on a different task. Thus the “collective intelligence” factor related to social sensitivity that was reported in Science is not collective intelligence; it is instead a factor associated with the ability to work well using face-to-face communication, and does not transcend media….(More)”