Memex Human Trafficking


MEMEX is a DARPA program that explores how next-generation search and extraction systems can help with real-world use cases. The initial application is the fight against human trafficking. In this application, the input is a portion of the public and dark web in which human traffickers are likely to (surreptitiously) post supply and demand information about illegal labor, sex workers, and more. DeepDive processes such documents to extract evidential data, such as names, addresses, phone numbers, job types, job requirements, information about rates of service, etc. Some of these data items are difficult for trained human annotators to accurately extract and have never been previously available, but DeepDive-based systems have high accuracy (Precision and Recall in the 90s, which may exceed non-experts). Together with provenance information, such structured, evidential data are then passed on both to other collaborators on the MEMEX program and to law enforcement for analysis and consumption in operational applications. MEMEX has been featured extensively in the media and is supporting actual investigations. For example, every human trafficking investigation pursued by the Human Trafficking Response Unit in New York City involves MEMEX. DeepDive is the main extracted data provider for MEMEX. See also 60 Minutes, Scientific American, Wall St. Journal, BBC, and Wired. MEMEX is supporting actual investigations and perhaps new use cases in the war on terror.

Here is a detailed description of DeepDive’s role in MEMEX.
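For readers unfamiliar with the accuracy figures quoted above, the following is a minimal sketch of how precision and recall are computed for an extraction task. It is not DeepDive code; the function name and the sample phone numbers are hypothetical and exist only for illustration.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted items against a gold-standard set.

    precision = correct extractions / all extractions
    recall    = correct extractions / all gold items
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: phone numbers marked by a human annotator (gold)
# versus phone numbers produced by an automated extractor.
gold_phones = {"555-0101", "555-0102", "555-0103", "555-0104"}
extracted_phones = {"555-0101", "555-0102", "555-0103", "555-0199"}

precision, recall = precision_recall(extracted_phones, gold_phones)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A system with "Precision and Recall in the 90s" would score above 0.90 on both measures when evaluated this way against a trusted annotated sample.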

Dissecting the Spirit of Gezi: Influence vs. Selection in the Occupy Gezi Movement


New study by Ceren Budak and Duncan J. Watts in Sociological Science: “Do social movements actively shape the opinions and attitudes of participants by bringing together diverse groups that subsequently influence one another? Ethnographic studies of the 2013 Gezi uprising seem to answer “yes,” pointing to solidarity among groups that were traditionally indifferent, or even hostile, to one another. We argue that two mechanisms with differing implications may generate this observed outcome: “influence” (change in attitude caused by interacting with other participants); and “selection” (individuals who participated in the movement were generally more supportive of other groups beforehand).

We tease out the relative importance of these mechanisms by constructing a panel of over 30,000 Twitter users and analyzing their support for the main Turkish opposition parties before, during, and after the movement. We find that although individuals changed in significant ways, becoming in general more supportive of the other opposition parties, those who participated in the movement were also significantly more supportive of the other parties all along. These findings suggest that both mechanisms were important, but that selection dominated. In addition to our substantive findings, our paper also makes a methodological contribution that we believe could be useful to studies of social movements and mass opinion change more generally. In contrast with traditional panel studies, which must be designed and implemented prior to the event of interest, our method relies on ex post panel construction, and hence can be used to study unanticipated or otherwise inaccessible events. We conclude that despite the well known limitations of social media, their “always on” nature and their widespread availability offer an important source of public opinion data….(More)”
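The distinction between selection and influence can be illustrated with a toy calculation. This is a simplified sketch and not the authors' method: it assumes a hypothetical panel in which each user has a participation flag and a support score (0 to 100) for other opposition parties before and after the movement. Selection is read as the pre-movement gap between participants and non-participants; influence is read as the extra change among participants relative to non-participants.

```python
from statistics import mean

# Hypothetical panel rows: (participated, support_before, support_after),
# with support expressed on a 0-100 scale. Values are made up for illustration.
panel = [
    (True, 60, 70),
    (True, 50, 60),
    (False, 20, 25),
    (False, 30, 35),
]


def selection_and_influence(rows):
    """Return (selection, influence) for a panel of users.

    selection: how much more supportive participants already were beforehand.
    influence: how much more participants changed than non-participants did.
    """
    part_before = mean(b for p, b, a in rows if p)
    non_before = mean(b for p, b, a in rows if not p)
    part_change = mean(a - b for p, b, a in rows if p)
    non_change = mean(a - b for p, b, a in rows if not p)
    return part_before - non_before, part_change - non_change


selection, influence = selection_and_influence(panel)
print(f"selection gap: {selection}, influence effect: {influence}")
```

In this made-up panel the pre-existing gap is larger than the extra change among participants, which mirrors the qualitative pattern the excerpt describes (both mechanisms present, selection dominant).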

How Africa can benefit from the data revolution


 in The Guardian: “….The modern information infrastructure is about movement of data. From data we derive information and knowledge, and that knowledge can be propagated rapidly across the country and throughout the world. Facebook and Google have both made massive investments in machine learning, the mainstay technology for converting data into knowledge. But the potential for these technologies in Africa is much larger: instead of simply advertising products to people, we can imagine modern distributed health systems, distributed markets, knowledge systems for disease intervention. The modern infrastructure should be data driven and deployed across the mobile network. A single good idea can then be rapidly implemented and distributed via the mobile phone app ecosystems.

The information infrastructure does not require large scale thinking and investment to deliver. In fact, it requires just the reverse. It requires agility and innovation. Larger companies cannot react quickly enough to exploit technological advances. Small companies with a good idea can grow quickly. From IBM to Microsoft, Google and now Facebook. All these companies now agree on one thing: data is where the value lies. Modern internet companies are data-driven from the ground up. Could the same thing happen in Africa’s economies? Can entire countries reformulate their infrastructures to be data-driven from the ground up?

Maybe, or maybe not, but it isn’t necessary to have a grand plan to give it a go. It is already natural to use data and communication to solve real world problems. In Silicon Valley these are the challenges of getting a taxi or reserving a restaurant. In Africa they are often more fundamental. John Quinn has been in Kampala, Uganda at Makerere University for eight years now targeting these challenges. In June this year, John and other researchers from across the region came together for Africa’s first workshop on data science at Dedan Kimathi University of Technology. The objective was to spread knowledge of technologies, ideas and solutions. For the modern information infrastructure to be successful software solutions need to be locally generated. African apps to solve African problems. With this in mind the workshop began with a three day summer school on data science which was then followed by two days of talks on challenges in African data science.

The ideas and solutions presented were cutting edge. The Umati project uses social media to understand the use of ethnic hate speech in Kenya (Sidney Ochieng, iHub, Nairobi). The use of social media for monitoring the evolution and effects of Ebola in west Africa (Nuri Pashwani, IBM Research Africa). The Kudu system for market making in Ugandan farm produce distribution via SMS messages (Kenneth Bwire, Makerere University, Kampala). Telecommunications data for inferring the source and spread of a typhoid outbreak in Kampala (UN Pulse Lab, Kampala). The Punya system for prototyping and deployment of mobile phone apps to deal with emerging crises or market opportunities (Julius Adebayor, MIT) and large scale systems for collating and sharing data resources Open Data Kenya and UN OCHA Human Data Exchange….(More)”

A New Kind of Media Using Government Data


Eric Newburger at the Department of Commerce: “MSNBC has published a data-heavy story collection that takes advantage of the internet’s power to communicate not only faster, but in different and meaningful ways.  “The Geography of Poverty” combines narrative, data graphics, and photo-essay content through an interface so seamless as to be almost invisible.

So far they have released three of what will eventually be five parts, but already they have tapped datasets from BLS, Census, the Department of Agriculture, and EPA.  They combined these federal sources with private data: factory data from Randy Peterson and Chemplants.com; displacement information from news sources; Mary Sternberg’s “Along the River Road”; and Steve Lerner’s Diamond and Kate Orff’s research in “Petrochemical America.”

These layers of data feed visualizations which provide a deeper understanding of the highly personal stories the photos tell; the text weaves the elements into a cohesive whole.  Today’s web tools make this kind of reporting not only possible, but fairly simple to assemble.

The result is a new kind of media that mixes the personal and the societal, the social and the environmental, fitting small scale stories of individuals and local communities into the broader context of our whole nation….(More)”

Social Media and Local Governments


Book edited by Sobaci, Mehmet Zahid: “Today, social media have attracted the attention of political actors and administrative institutions to inform citizens as a prerequisite of open and transparent administration, deliver public services, contact stakeholders, revitalize democracy, encourage the cross-agency cooperation, and contribute to knowledge management. In this context, the social media tools can contribute to the emergence of citizen-oriented, open, transparent and participatory public administration. Taking advantage of the opportunities offered by social media is not limited to central government. Local governments deploy internet-based innovative technologies that complement traditional methods in implementing different functions. This book focuses on the relationship between the local governments and social media, deals with the change that social media have caused in the organization, understanding of service provision, performance of local governments and in the relationships between local governments and their partners, and aims to advance our theoretical and empirical understanding of the growing use of social media by local governments. This book will be of interest to researchers and students in e-government, public administration, political science, communication, information science, and social media. Government officials and public managers will also find practical use recommendations for social media in several aspects of local governance…(More)”

Science Isn’t Broken


Christie Aschwanden at FiveThirtyEight: “Yet even in the face of overwhelming evidence, it’s hard to let go of a cherished idea, especially one a scientist has built a career on developing. And so, as anyone who’s ever tried to correct a falsehood on the Internet knows, the truth doesn’t always win, at least not initially, because we process new evidence through the lens of what we already believe. Confirmation bias can blind us to the facts; we are quick to make up our minds and slow to change them in the face of new evidence.

A few years ago, Ioannidis and some colleagues searched the scientific literature for references to two well-known epidemiological studies suggesting that vitamin E supplements might protect against cardiovascular disease. These studies were followed by several large randomized clinical trials that showed no benefit from vitamin E and one meta-analysis finding that at high doses, vitamin E actually increased the risk of death.

Human fallibilities send the scientific process hurtling in fits, starts and misdirections instead of in a straight line from question to truth.

Despite the contradictory evidence from more rigorous trials, the first studies continued to be cited and defended in the literature. Shaky claims about beta carotene’s ability to reduce cancer risk and estrogen’s role in staving off dementia also persisted, even after they’d been overturned by more definitive studies. Once an idea becomes fixed, it’s difficult to remove from the conventional wisdom.

Sometimes scientific ideas persist beyond the evidence because the stories we tell about them feel true and confirm what we already believe. It’s natural to think about possible explanations for scientific results — this is how we put them in context and ascertain how plausible they are. The problem comes when we fall so in love with these explanations that we reject the evidence refuting them.

The media is often accused of hyping studies, but scientists are prone to overstating their results too.

Take, for instance, the breakfast study. Published in 2013, it examined whether breakfast eaters weigh less than those who skip the morning meal and if breakfast could protect against obesity. Obesity researcher Andrew Brown and his colleagues found that despite more than 90 mentions of this hypothesis in published media and journals, the evidence for breakfast’s effect on body weight was tenuous and circumstantial. Yet researchers in the field seemed blind to these shortcomings, overstating the evidence and using causative language to describe associations between breakfast and obesity. The human brain is primed to find causality even where it doesn’t exist, and scientists are not immune.

As a society, our stories about how science works are also prone to error. The standard way of thinking about the scientific method is: ask a question, do a study, get an answer. But this notion is vastly oversimplified. A more common path to truth looks like this: ask a question, do a study, get a partial or ambiguous answer, then do another study, and then do another to keep testing potential hypotheses and homing in on a more complete answer. Human fallibilities send the scientific process hurtling in fits, starts and misdirections instead of in a straight line from question to truth.

Media accounts of science tend to gloss over the nuance, and it’s easy to understand why. For one thing, reporters and editors who cover science don’t always have training on how to interpret studies. And headlines that read “weak, unreplicated study finds tenuous link between certain vegetables and cancer risk” don’t fly off the newsstands or bring in the clicks as fast as ones that scream “foods that fight cancer!”

People often joke about the herky-jerky nature of science and health headlines in the media — coffee is good for you one day, bad the next — but that back and forth embodies exactly what the scientific process is all about. It’s hard to measure the impact of diet on health, Nosek told me. “That variation [in results] occurs because science is hard.” Isolating how coffee affects health requires lots of studies and lots of evidence, and only over time and in the course of many, many studies does the evidence start to narrow to a conclusion that’s defensible. “The variation in findings should not be seen as a threat,” Nosek said. “It means that scientists are working on a hard problem.”

The scientific method is the most rigorous path to knowledge, but it’s also messy and tough. Science deserves respect exactly because it is difficult — not because it gets everything correct on the first try. The uncertainty inherent in science doesn’t mean that we can’t use it to make important policies or decisions. It just means that we should remain cautious and adopt a mindset that’s open to changing course if new data arises. We should make the best decisions we can with the current evidence and take care not to lose sight of its strength and degree of certainty. It’s no accident that every good paper includes the phrase “more study is needed” — there is always more to learn….(More)”

Review Federal Agencies on Yelp…and Maybe Get a Response


Yelp Official Blog: “We are excited to announce that Yelp has concluded an agreement with the federal government that will allow federal agencies and offices to claim their Yelp pages, read and respond to reviews, and incorporate that feedback into service improvements.

We encourage Yelpers to review any of the thousands of agency field offices, TSA checkpoints, national parks, Social Security Administration offices, landmarks and other places already listed on Yelp if you have good or bad feedback to share about your experiences. Not only is it helpful to others who are looking for information on these services, but you can actually make an impact by sharing your feedback directly with the source.

It’s clear Washington is eager to engage with people directly through social media. Earlier this year a group of 46 lawmakers called for the creation of a “Yelp for Government” in order to boost transparency and accountability, and Representative Ron Kind reiterated this call in a letter to the General Services Administration (GSA). Luckily for them, there’s no need to create a new platform now that government agencies can engage directly on Yelp.

As this agreement is fully implemented in the weeks and months ahead, we’re excited to help the federal government more directly interact with and respond to the needs of citizens and to further empower the millions of Americans who use Yelp every day.

In addition to working with the federal government, last week we announced our partnership with ProPublica to incorporate health care statistics and consumer opinion survey data onto the Yelp business pages of more than 25,000 medical treatment facilities. We’ve also partnered with local governments in expanding the LIVES open data standard to show restaurant health scores on Yelp….(More)”

Using Technology, Building Democracy


Book by Jessica Baldwin-Philippi: “The days of “revolutionary” campaign strategies are gone. The extraordinary has become ordinary, and campaigns at all levels, from the federal to the municipal, have realized the necessity of incorporating digital media technologies into their communications strategies. Still, little is understood about how these practices have been taken up and routinized on a wide scale, or the ways in which the use of these technologies is tied to new norms and understandings of political participation and citizenship in the digital age. The vocabulary that we do possess for speaking about what counts as citizenship in a digital age is limited.

Drawing on ethnographic fieldwork in a federal-level election, interviews with communications and digital media consultants, and textual analysis of campaign materials, this book traces the emergence and solidification of campaign strategies that reflect what it means to be a citizen in the digital era. It identifies shifting norms and emerging trends to build new theories of citizenship in contemporary democracy. Baldwin-Philippi argues that these campaign practices foster engaged and skeptical citizens. But, rather than assess the quality or level of participation and citizenship due to the use of technologies, this book delves into the way that digital strategies depict what “good” citizenship ought to be and the goals and values behind the tactics….(More)”

Four things policy-makers need to know about social media data and real time analytics.


Ella McPherson at LSE’s Impact Blog: “I recently gave evidence to the House of Commons Science and Technology Select Committee. This was based on written evidence co-authored with my colleague, Anne Alexander, and submitted to their ongoing inquiry into social media data and real time analytics. Both Anne and I research the use of social media during contested times; Anne looks at its use by political activists and labour movement organisers in the Arab world, and I look at its use in human rights reporting. In both cases, the need to establish facticity is high, as is the potential for the deliberate or inadvertent falsification of information. Similarly to the case that Carruthers makes about war reporting, we believe that the political-economic, methodological, and ethical issues raised by media dynamics in the context of crisis are bellwethers for the dynamics in more peaceful and mundane contexts.

From our work we have learned four crucial lessons that policy-makers considering this issue should understand:

1.  Social media information is vulnerable to a variety of distortions – some typical of all information, and others more specific to the characteristics of social media communications….

2.  If social media information is used to establish events, it must be verified; while technology can hasten this process, it is unlikely to ever occur real time due to the subjective, human element of judgment required….

3.  Verifying social media information may require identifying its source, which has ethical implications related to informed consent and anonymisation….

4.  Another way to think about social media information is as what Hermida calls an ‘awareness system,’ which reduces the need to collect source identities; under this approach, researchers look at volume rather than veracity to recognise information of interest… (More)”

Transforming public services the right way with impact management


Emily Bazalgette at FutureGov: “…Impact evaluation involves using a range of research methodologies to investigate whether our products and services are having an impact on users’ lives. ….Rigorous academic impact evaluation wasn’t really designed for rapidly iterating products made by a fast-moving digital and design company like FutureGov. Our products can change significantly over short periods of time — for instance, in a single workshop Doc Ready evolved from a feature-rich social media platform to a stripped-down checklist builder — and that can create a tension between our agile process and traditional evaluation methodologies, which tend to require a fixed product to support a long-term evaluation plan.

We’ve decided to embrace this tension by using Theories of Change, a useful evaluation tool recommended to us by our investors and partners Nesta Impact Investments. To give you a flavour (excuse the pun), below we have Casserole Club’s Theory of Change.

[Figure: Casserole Club’s Theory of Change]

The problem we’re trying to solve (reducing social isolation) doesn’t tend to change, but the way we solve it might (the inputs and short to medium-term outcomes). In future, we may find that we need to adapt to serve new user groups, or operate in different channels, or that there are mediating outcomes for social isolation that Casserole Club produces other than social contact with a Casserole Club cook. Theories of Change allow us to stay focused on big-picture outcomes, while being flexible about how the product delivers on these outcomes.

Another lesson is to make evaluation everyone’s business. Like many young-ish companies, FutureGov is not at the stage where we have the resources to support a full-time, dedicated Head of Impact. But we’ve found that you can get pretty far if you’ve got a flat structure and lots of passionate people (both of which, luckily, we have). Our lack of hierarchy means that anyone can take up a project and run with it, and collaboration across the company is encouraged. Product impact evaluation is owned by the product teams who manage the product over time. This means we can get more done, that research design benefits from the deep knowledge of our product teams, and that evaluation skills (like how to design a decent survey or depth interview) have started to spread across the organisation….(More)”