The Case for Sharing All of America’s Data on Mosquitoes


Ed Yong in The Atlantic: “The U.S. is sitting on one of the largest data sets on any animal group, but most of it is inaccessible and restricted to local agencies….For decades, agencies around the United States have been collecting data on mosquitoes. Biologists set traps, dissect captured insects, and identify which species they belong to. They’ve done this for millions of mosquitoes, creating an unprecedented trove of information—easily one of the biggest long-term attempts to monitor any group of animals, if not the very biggest.

The problem, according to Micaela Elvira Martinez from Princeton University and Samuel Rund from the University of Notre Dame, is that this treasure trove of data isn’t all in the same place, and only a small fraction of it is public. The rest is inaccessible, hoarded by local mosquito-control agencies around the country.

Currently, these agencies can use their data to check if their attempts to curtail mosquito populations are working. Are they doing enough to remove stagnant water, for example? Do they need to spray pesticides? But if they shared their findings, Martinez and Rund say that scientists could do much more. They could better understand the ecology of these insects, predict the spread of mosquito-borne diseases like dengue fever or Zika, coordinate control efforts across states and counties, and quickly spot the arrival of new invasive species.

That’s why Martinez and Rund are now calling for the creation of a national database of mosquito records that anyone can access. “There’s a huge amount of taxpayer investment and human effort that goes into setting traps, checking them weekly, dissecting all those mosquitoes under a microscope, and tabulating the data,” says Martinez. “It would be a big bang for our buck to collate all that data and make it available.”

Martinez is a disease modeler—someone who uses real-world data to build simulations that reveal how infections rise, spread, and fall. She typically works with childhood diseases like measles and polio, where researchers are almost spoiled for data. Physicians are legally bound to report any cases, and the Centers for Disease Control and Prevention (CDC) compiles and publishes this information as a weekly report.

The same applies to cases of mosquito-borne diseases like dengue and Zika, but not to populations of the insects themselves. So, during last year’s Zika epidemic, when Martinez wanted to study the Aedes aegypti mosquito that spreads the disease, she had a tough time. “I was really surprised that I couldn’t find data on Aedes aegypti numbers,” she says. Her colleagues explained that scientists use climate variables like temperature and humidity to predict where mosquitoes are going to be abundant. That seemed ludicrous to her, especially since organizations collect information on the actual insects. It’s just that no one ever gathers those figures together….

Together with Rund and a team of undergraduate students, she found that there are more than 1,000 separate agencies in the United States that collect mosquito data—at least one in every county or jurisdiction. Only 152 agencies make their data publicly available in some way. The team collated everything they could find since 2009, and ended up with information about more than 15 million mosquitoes. Imagine what they’d have if all the datasets were open, especially since some go back decades.
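The preprint argues for pooling these records in a common format. Purely as an illustration (Martinez and Rund’s actual proposed schema is not reproduced in the article), a single standardized trap observation might look like the following Python sketch; every field name here is a hypothetical stand-in, not the authors’ design:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TrapRecord:
    """One trap-night observation: the unit a national database would collate.

    Field names are illustrative assumptions, not the authors' schema.
    """
    agency_id: str        # e.g., a county mosquito-control district
    latitude: float       # trap site coordinates
    longitude: float
    collection_date: date
    species: str          # e.g., "Aedes aegypti"
    count: int            # mosquitoes of that species found in the trap

record = TrapRecord("TX-HARRIS-001", 29.76, -95.37, date(2017, 8, 25), "Aedes aegypti", 12)
print(record)
```

With millions of such rows in one place, the kind of analysis Martinez could not do during the Zika epidemic becomes a database query rather than a scavenger hunt.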

A few mosquito-related databases do exist, but none are quite right. ArboNET, which is managed by the CDC and state health departments, mainly stores data about mosquito-borne diseases, and whatever information it has on the insects themselves isn’t precise enough in either time or space to be useful for modeling. MosquitoNET, which was developed by the CDC, does track mosquitoes, but “it’s a completely closed system, and hardly anyone has access to it,” says Rund. The Smithsonian Institution’s VectorMap is better in that it’s accessible, “but it lacks any real-time data from the continental United States,” says Rund. “When I checked a few months ago, it had just one record of Aedes aegypti since 2013.”…

Some scientists who work on mosquito control apparently disagree, and negative reviews have stopped Martinez and Rund from publishing their ideas in prominent academic journals. (For now, they’ve uploaded a paper describing their vision to the preprint repository bioRxiv.) “Some control boards say: What if people want to sue us because we’re showing that they have mosquito vectors near their homes, or if their house prices go down?” says Martinez. “And one mosquito-control scientist told me that no one should be able to work with mosquito data unless they’ve gone out and trapped mosquitoes themselves.”…

“Data should be made available without having to justify exactly what’s going to be done with it,” Martinez says. “We should put it out there for scientists to start unlocking it. I think there are a ton of biologists who will come up with cool things to do.”…(More)”.

Debating big data: A literature review on realizing value from big data


Wendy Arianne Günther et al. in The Journal of Strategic Information Systems: “Big data has been considered to be a breakthrough technological development over recent years. Notwithstanding, we have as yet limited understanding of how organizations translate its potential into actual social and economic value. We conduct an in-depth systematic review of IS literature on the topic and identify six debates central to how organizations realize value from big data, at different levels of analysis. Based on this review, we identify two socio-technical features of big data that influence value realization: portability and interconnectivity. We argue that, in practice, organizations need to continuously realign work practices, organizational models, and stakeholder interests in order to reap the benefits from big data. We synthesize the findings by means of an integrated model….(More)”.

Algorithms in the Criminal Justice System: Assessing the Use of Risk Assessments in Sentencing


Priscilla Guo, Danielle Kehl, and Sam Kessler at Responsive Communities (Harvard): “In the summer of 2016, some unusual headlines began appearing in news outlets across the United States. “Secret Algorithms That Predict Future Criminals Get a Thumbs Up From the Wisconsin Supreme Court,” read one. Another declared: “There’s software used across the country to predict future criminals. And it’s biased against blacks.” These news stories (and others like them) drew attention to a previously obscure but fast-growing area in the field of criminal justice: the use of risk assessment software, powered by sophisticated and sometimes proprietary algorithms, to predict whether individual criminals are likely candidates for recidivism. In recent years, these programs have spread like wildfire throughout the American judicial system. They are now being used in a broad capacity, in areas ranging from pre-trial risk assessment to sentencing and probation hearings. This paper focuses on the latest—and perhaps most concerning—use of these risk assessment tools: their incorporation into the criminal sentencing process, a development which raises fundamental legal and ethical questions about fairness, accountability, and transparency. The goal is to provide an overview of these issues and offer a set of key considerations and questions for further research that can help local policymakers who are currently implementing or considering implementing similar systems. We start by putting this trend in context: the history of actuarial risk in the American legal system and the evolution of algorithmic risk assessments as the latest incarnation of a much broader trend. We go on to discuss how these tools are used in sentencing specifically and how that differs from other contexts like pre-trial risk assessment. We then delve into the legal and policy questions raised by the use of risk assessment software in sentencing decisions, including the potential for constitutional challenges under the Due Process and Equal Protection clauses of the Fourteenth Amendment. Finally, we summarize the challenges that these systems create for law and policymakers in the United States, and outline a series of possible best practices to ensure that these systems are deployed in a manner that promotes fairness, transparency, and accountability in the criminal justice system….(More)”.

Who Falls for Fake News? The Roles of Analytic Thinking, Motivated Reasoning, Political Ideology, and Bullshit Receptivity


Paper by Gordon Pennycook and David G. Rand: “Inaccurate beliefs pose a threat to democracy and fake news represents a particularly egregious and direct avenue by which inaccurate beliefs have been propagated via social media. Here we investigate the cognitive psychological profile of individuals who fall prey to fake news. We find a consistent positive correlation between the propensity to think analytically – as measured by the Cognitive Reflection Test (CRT) – and the ability to differentiate fake news from real news (“media truth discernment”). This was true regardless of whether the article’s source was indicated (which, surprisingly, also had no main effect on accuracy judgments). Contrary to the motivated reasoning account, CRT was just as positively correlated with media truth discernment, if not more so, for headlines that aligned with individuals’ political ideology relative to those that were politically discordant. The link between analytic thinking and media truth discernment was driven both by a negative correlation between CRT and perceptions of fake news accuracy (particularly among Hillary Clinton supporters), and a positive correlation between CRT and perceptions of real news accuracy (particularly among Donald Trump supporters). This suggests that factors that undermine the legitimacy of traditional news media may exacerbate the problem of inaccurate political beliefs among Trump supporters, who engaged in less analytic thinking and were overall less able to discern fake from real news (regardless of the news’ political valence). We also found consistent evidence that pseudo-profound bullshit receptivity negatively correlates with perceptions of fake news accuracy; a correlation that is mediated by analytic thinking. Finally, analytic thinking was associated with an unwillingness to share both fake and real news on social media. Our results indicate that the propensity to think analytically plays an important role in the recognition of misinformation, regardless of political valence – a finding that opens up potential avenues for fighting fake news….(More)”.

From Katrina To Harvey: How Disaster Relief Is Evolving With Technology


Cale Guthrie Weissman at Fast Company: “Open data may sound like a nerdy thing, but this weekend has proven it’s also a lifesaver in more ways than one.

As Hurricane Harvey pelted the southern coast of Texas, a local open-data resource helped provide accurate and up-to-date information to the state’s residents. Inside Harris County’s intricate bayou system–intended to both collect water and effectively drain it–gauges were installed to sense when water is overflowing. The sensors transmit the data to a website, which has become a vital go-to for Houston residents….
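As a rough illustration of what consuming such an open feed involves (the excerpt does not name Harris County’s actual API, so the endpoint and field names below are hypothetical), a few lines of Python suffice to poll a gauge network and flag overflowing bayous:

```python
import requests

# Placeholder URL and schema: NOT the real Harris County feed.
GAUGE_FEED = "https://example.org/api/flood-gauges"

def overflowing_gauges(threshold_ft=0.0):
    """Return the names of gauges reading above the top of the bank."""
    resp = requests.get(GAUGE_FEED, timeout=10)
    resp.raise_for_status()
    return [
        gauge["name"]
        for gauge in resp.json()["gauges"]
        if gauge["level_above_bank_ft"] > threshold_ft
    ]

print(overflowing_gauges())
```

The point is less the code than the design choice it reflects: because the county publishes machine-readable readings, residents and third parties can build their own views of the data instead of waiting on official channels.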

This open access to flood gauges is just one of the many ways new tech-driven projects have helped improve responses to disasters over the years. “There’s no question that technology has played a much more significant role,” says Lemaitre, “since even Hurricane Sandy.”

While Sandy was noted in 2012 for its ability to connect people with Twitter hashtags and other relatively nascent social apps like Instagram, the last few years have brought a paradigm shift in terms of how emergency relief organizations integrate technology into their responses….

Social media isn’t just for the residents. Local and national agencies–including FEMA–rely on this information and are using it to help create faster and more effective disaster responses. Following the Hurricane Katrina disaster, FEMA has worked over the last decade to revamp its culture and methods for reacting to these sorts of situations. “You’re seeing the federal government adapt pretty quickly,” says Lemaitre.

There are a few examples of this. For instance, FEMA now has an app to push necessary information about disaster preparedness. The agency also employs people to cull the open web for information that would help make its efforts more effective. These “social listeners” look at all the available Facebook, Snapchat, and other social media posts in aggregate. Crews are brought on during disasters to gather intelligence, and then report on areas that need relief efforts–getting “the right information to the right people,” says Lemaitre.

There’s also been a change in how this information is used. Often, when disasters are predicted, people send supplies to the affected areas as a way to try to help out. Yet they don’t know exactly where they should send them, and local organizations sometimes become inundated. This creates a huge logistical nightmare for relief organizations that are sitting on thousands of blankets and tarps in one place when they should be actively dispersing them across hundreds of miles.

“Before, you would just have a deluge of things dropped on top of a disaster that weren’t particularly helpful at times,” says Lemaitre. Now people are using sites like Facebook to ask where they should direct the supplies. For example, after a bad flood in Louisiana last year, a woman announced she had food and other necessities on Facebook and was able to direct the supplies to an area in need. This, says Lemaitre, is “the most effective way.”

Put together, Lemaitre has seen agencies evolve with technology to help create better systems for quicker disaster relief. This has also created a culture of learning and reacting in real time. Meanwhile, more data is becoming open, which is helping both people and agencies alike. (The National Weather Service, which has long trumpeted its open data for all, has become a revered stalwart for such information, and has already proven indispensable in Houston.)

Most important, the pace of technology has caused organizations to change their own procedures. Twelve years ago, during Katrina, the protocol was to wait for an assessment before deploying any assistance. Now organizations like FEMA know that just doesn’t work. “You can’t afford to lose time,” says Lemaitre. “Deploy as much as you can and be fast about it–you can always scale back.”

It’s important to note that, even with rapid technological improvements, there’s no way to compare one disaster response to another–it’s simply not apples to apples. All the same, organizations are still learning about where they should be looking and how to react, connecting people to their local communities when they need them most….(More)”.

From ‘Opening Up’ to Democratic Renewal: Deepening Public Engagement in Legislative Committees


Carolyn M. Hendriks and Adrian Kay in Government and Opposition: “Many legislatures around the world are undergoing a ‘participatory makeover’. Parliaments are hosting open days and communicating the latest parliamentary updates via websites and social media. Public activities such as these may make parliaments more informative and accessible, but much more could be done to foster meaningful democratic renewal. In particular, participatory efforts ought to be engaging citizens in a central task of legislatures – to deliberate and make decisions on collective issues. In this article, the potential of parliamentary committees to bring the public closer to legislative deliberations is considered. Drawing on insights from the practice and theory of deliberative democracy, the article discusses why and how deeper and more inclusive forms of public engagement can strengthen the epistemic, representative and deliberative capacities of parliamentary committees. Practical examples are considered to illustrate the possibilities and challenges of broadening public involvement in committee work….(More)”

Bridging Governments’ Borders


Robyn Scott & Lisa Witter at SSIR: “…Our research found that “disconnection” falls into five negatively reinforcing categories in the public sector; a closer look at these categories may help policymakers see the challenge before them more clearly:

1. Disconnected Governments

There is a truism in politics and government that all policy is local and context-dependent. Whether this was ever an accurate statement is questionable; it is certainly no longer accurate. While all policy must ultimately be customized for local conditions, it is absurd to assume there is little or nothing to learn from other countries. Three trends, in fact, indicate that solutions will become increasingly fungible between countries…..

2. Disconnected Issues

What climate change policy can endure without a job-creation strategy? What sensible criminal justice reform does not consider education? Yet even within countries, departments and their employees often remain as foreign to each other as do nations….

3. Disconnected Public Servants

The isolation of governments, and of government departments, is caused by and reinforces the isolation of people working in government, who have few incentives—and plenty of disincentives—to share what they are working on…..

4. Disconnected Citizens

…There are areas of increasingly visible progress in bridging the disconnections of government, citizen engagement being one. We’re still in the early stages, but private sector fashions such as human-centered design and design thinking have become government buzzwords. And platforms enabling new types of citizen engagement—from participatory budgeting to apps that people use to report potholes—are increasingly popping up around the world…..

5. Disconnected Ideas

According to the World Bank’s own data, one third of its reports are never read, even once. Foundations and academia pour tens of millions of dollars into policy research with few targeted channels to reach policymakers; they also tend to produce and deliver information in formats that policymakers don’t find useful. People in government, like everyone else, are frequently on their mobile phones, and short of time….(More)”


What does it mean to be differentially private?


Paul Francis at IAPP: “Back in June 2016, Apple announced it will use differential privacy to protect individual privacy for certain data that it collects. Though differential privacy had already been a hot research topic for over a decade, this announcement introduced it to the broader public. Before that announcement, Google had already been using differential privacy for collecting Chrome usage statistics. And within the last month, Uber announced that they too are using differential privacy.

If you’ve done a little homework on differential privacy, you may have learned that it provides provable guarantees of privacy and concluded that a database that is differentially private is, well, private — in other words, that it protects individual privacy. But that isn’t necessarily the case. When someone says, “a database is differentially private,” they don’t mean that the database is private. Rather, they mean, “the privacy of the database can be measured.”

Really, it is like saying that “a bridge is weight limited.” If you know the weight limit of a bridge, then yes, you can use the bridge safely. But the bridge isn’t safe under all conditions. You can exceed the weight limit and hurt yourself.

The weight limit of bridges is expressed in tons, kilograms or number of people. Simplifying here a bit, the amount of privacy afforded by a differentially private database is expressed as a number, by convention labeled ε (epsilon). Lower ε means more private.

All bridges have a weight limit. Everybody knows this, so it sounds dumb to say, “a bridge is weight limited.” And guess what? All databases are differentially private. Or, more precisely, all databases have an ε. A database with no privacy protections at all has an ε of infinity. It is pretty misleading to call such a database differentially private, but mathematically speaking, it is not incorrect to do so. A database that can’t be queried at all has an ε of zero. Private, but useless.
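For readers who want the formal statement behind those two endpoints, the standard definition of ε-differential privacy (due to Dwork and colleagues, and not spelled out in the article itself) can be written as:

```latex
% A randomized mechanism M is \varepsilon-differentially private if,
% for every pair of databases D and D' that differ in a single record,
% and for every set S of possible outputs,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```

At ε = 0, e^ε = 1 and every query must be answered with the same output distribution whether or not any individual’s record is present, which is why such a database is perfectly private but useless; at ε = ∞ the inequality constrains nothing, matching the no-protection case above.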

In their paper on differential privacy for statistics, Cynthia Dwork and Adam Smith write, “The choice of ε is essentially a social question. We tend to think of ε as, say, 0.01, 0.1, or in some cases, ln 2 or ln 3.” The natural logarithm of 3 (ln 3) is around 1.1….(More)”.
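To make ε concrete, here is a minimal sketch of the classic Laplace mechanism for a counting query, written in Python. The article contains no code, so the function, the toy data, and the ε values below are illustrative assumptions, not anyone’s production system:

```python
import numpy as np

def laplace_count(data, predicate, epsilon):
    """Answer a counting query with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the true count by at most 1), so Laplace noise with scale
    1/epsilon is enough to achieve epsilon-DP.
    """
    true_count = sum(1 for row in data if predicate(row))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical data: ages of seven users in a database.
ages = [23, 35, 45, 52, 61, 29, 41]

# Lower epsilon -> larger noise -> more privacy, less accuracy.
print(laplace_count(ages, lambda a: a > 40, epsilon=0.01))       # very noisy
print(laplace_count(ages, lambda a: a > 40, epsilon=np.log(3)))  # ln 3 ≈ 1.1, per Dwork and Smith
```

Run repeatedly, the ε = 0.01 answers swing wildly around the true count of 4, while the ln 3 answers stay close; that trade-off is exactly the bridge’s weight limit in the analogy above.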

Crowdsourcing the Charlottesville Investigation


Internet sleuths got to work, and by Monday morning they were naming names and calling for arrests.

The name of the helmeted man went viral after New York Daily News columnist Shaun King posted a series of photos on Twitter and Facebook that more clearly showed his face and connected him to photos from a Facebook account. “Neck moles gave it away,” King wrote in his posts, which were shared more than 77,000 times. But the name of the red-bearded assailant was less clear: some on Twitter claimed it was a Texas man who goes by a Nordic alias online. Others were sure it was a Michigan man who, according to Facebook, attended high school with other white nationalist demonstrators depicted in photos from Charlottesville.

After being contacted for comment by The Marshall Project, the Michigan man removed his Facebook page from public view.

Such speculation, especially when it is not conclusive, has created new challenges for law enforcement. There is the obvious risk of false identification. In 2013, internet users wrongly identified university student Sunil Tripathi as a suspect in the Boston Marathon bombing, prompting the internet forum Reddit to issue an apology for fostering “online witch hunts.” Already, an Arkansas professor was misidentified as a torch-bearing protester, though not a criminal suspect, at the Charlottesville rallies.

Beyond the cost to misidentified suspects, the crowdsourced identification of criminal suspects is both a benefit and burden to investigators.

“If someone says: ‘hey, I have a picture of someone assaulting another person, and committing a hate crime,’ that’s great,” said Sgt. Sean Whitcomb, the spokesman for the Seattle Police Department, which used social media to help identify the pilot of a drone that crashed into a 2015 Pride Parade. (The man was convicted in January.) “But saying, ‘I am pretty sure that this person is so-and-so.’ Well, ‘pretty sure’ is not going to cut it.”

Still, credible information can help police establish probable cause, which means they can ask a judge to sign off on a search warrant, an arrest warrant, or both….(More)”.

Gaming for Infrastructure


Nilmini Rubin & Jennifer Hara at the Stanford Social Innovation Review: “…the American Society of Civil Engineers (ASCE) estimates that the United States needs $4.56 trillion to keep its deteriorating infrastructure current but only has funding to cover less than half of necessary infrastructure spending—leaving the country at least $2.0 trillion short over the next decade. Globally, the picture is bleak as well: the World Economic Forum estimates that the infrastructure gap is $1 trillion each year.

What can be done? Some argue that public-private partnerships (PPPs or P3s) are the answer. We agree that they can play an important role—if done well. In a PPP, a private party provides a public asset or service for a government entity, bears significant risk, and is paid on performance. The upside for governments and their citizens is that the private sector can be incentivized to deliver projects on time, within budget, and with reduced construction risk. The private sector can benefit by earning a steady stream of income from a long-term investment from a secure client. From the Grand Parkway Project in Texas to the Queen Alia International Airport in Jordan, PPPs have succeeded domestically and internationally.

The problem is that PPPs can be very hard to design and implement. And since they can involve commitments of millions or even billions of dollars, a PPP failure can be awful. For example, the Berlin Airport is a PPP that is six years behind schedule, and its cost overruns total roughly $3.8 billion to date.

In our experience, it can be useful for would-be partners to practice engaging in a PPP before they dive into a live project. At our organization, Tetra Tech’s Institute for Public-Private Partnerships, for example, we use an online and multiplayer game—the P3 Game—to help make PPPs work.

The game is played with 12 to 16 people who are divided into two teams: a Consortium and a Contracting Authority. In each of four rounds, players mimic the activities they would engage in during the course of a real PPP, and as in real life, they are confronted with unexpected events: The Consortium fails to comply with a routine road inspection; how should the Contracting Authority team respond? The cost of materials skyrockets; how should the Consortium team manage when it has a fixed-price contract?

Players from government ministries, legislatures, construction companies, financial institutions, and other entities get to swap roles and experience a PPP from different vantage points. They think through challenges and solve problems together—practicing, failing, learning, and growing—within the confines of the game and with no real-world cost.

More than 1,000 people have participated to date, including representatives of the US Army Corps of Engineers, the World Bank, and Johns Hopkins University, using a variety of scenarios. PPP team members who work on part of the Schiphol-Amsterdam-Almere Project, a $5.6-billion road project in the Netherlands, played the game using their actual contract document….(More)”.