The Big Easy Budget Game uses open data from the city to let players create their own version of an operating budget. Players are given a digital $602 million and must balance the budget, keeping in mind the government’s responsibilities, the previous year’s spending, and their personal priorities.
Each department in the game has a minimum funding level (players can’t just quit funding public schools if they feel like it), and restricted funding, such as state or federal dollars, is off limits.
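The rule system described above is simple enough to sketch in code. The following is a hypothetical illustration of how the game's constraints might be checked; department names, minimums, and restricted amounts are invented for the example, not taken from the actual game:

```python
# Illustrative sketch of the Big Easy Budget Game's rules: every department
# must receive at least its minimum, and the total must fit within the
# $602M discretionary pot. Restricted (state/federal) dollars are noted but
# off limits to the player, so they are excluded from the allocation check.

TOTAL_BUDGET = 602_000_000

# (minimum funding level, restricted funding) per department -- invented figures
DEPARTMENTS = {
    "police":       (150_000_000, 0),
    "public_works": (40_000_000, 25_000_000),  # 25M is restricted money
    "schools":      (60_000_000, 0),
}

def validate_allocation(allocation: dict) -> list:
    """Return a list of rule violations for a player's proposed allocation."""
    errors = []
    for dept, (minimum, _restricted) in DEPARTMENTS.items():
        if allocation.get(dept, 0) < minimum:
            errors.append(f"{dept}: below minimum funding level")
    if sum(allocation.values()) > TOTAL_BUDGET:
        errors.append("total allocation exceeds the $602M budget")
    return errors
```

A player who tried to "quit funding public schools" would see a violation for that department rather than a saved budget.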
CBNO hopes to attract 600 players this year, and plans to compile the data from each player into a crowdsourced meta-budget called “The People’s Budget.” Next fall, the People’s Budget will be released along with the city’s proposed 2017 budget.
Along with the budgeting game, CBNO released a more detailed website, also using the city’s open data, that breaks down the city’s budgeted versus actual spending from 2007 to now and is filterable. The goal is to allow users without big data experience to easily research funding relevant to their neighborhoods.
Many cities have been releasing interactive websites to make their data more accessible to residents. Checkbook NYC updates more than $70 billion in city expenses daily and breaks them down by transaction. Fiscal Focus Pittsburgh is an online visualization tool that outlines revenues and expenses in the city’s budget….(More)”
Joe McKendrick at ZDNet: “Open data is one of those refreshing trends that flows in the opposite direction of the culture of fear that has developed around data security. Instead of putting data under lock and key, surrounded by firewalls and sandboxes, some organizations see value in making data available to all comers — especially developers.
The GovLab.org, a nonprofit advocacy group, published an overview of the benefits governments and organizations are realizing from open data, as well as some of the challenges. The group defines open data as “publicly available data that can be universally and readily accessed, used and redistributed free of charge. It is structured for usability and computability.”…
For enterprises, an open-data stance may be the fuel to build a vibrant ecosystem of developers and business partners. Scott Feinberg, API architect for The New York Times, is one of the people helping to lead the charge to open-data ecosystems. In a recent CXOTalk interview with ZDNet colleague Michael Krigsman, he explains how, through the NYT APIs program, developers can sign up for access to 165 years’ worth of content.
But it requires a lot more than simply throwing some APIs out into the market. Establishing such a comprehensive effort across APIs requires a change in mindset that many organizations may not be ready for, Feinberg cautions. “You can’t be stingy,” he says. “You have to just give it out. When we launched our developer portal there were a lot of questions like, are people going to be stealing our data, questions like that. Just give it away. You don’t have to give it all but don’t be stingy, and you will find that first off not that many people are going to use it at first. You’re going to find that out, but the people who do, you’re going to find those passionate people who are really interested in using your data in new ways.”
Feinberg clarifies that the NYT’s APIs are not giving out articles for free. Rather, he explains, “What we give is everything but article content. You can search for articles. You can find out what’s trending. You can almost do anything you want with our data through our APIs with the exception of actually reading all of the content. It’s really about giving people the opportunity to really interact with your content in ways that you’ve never thought of, and empowering your community to figure out what they want. You know, while we don’t give our actual article text away, we give pretty much everything else and people build a lot of really cool stuff on top of that.”
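As a rough sketch of what working with such an API looks like, the snippet below builds a request URL for the Times' public Article Search endpoint. The endpoint path follows NYT's public developer documentation but may have changed since; the API key is a placeholder, and no request is actually sent:

```python
from urllib.parse import urlencode

# Hedged sketch: builds (but does not send) a query URL for the NYT Article
# Search API. This endpoint returns metadata -- headlines, snippets, URLs --
# not full article text, matching Feinberg's description above.

def article_search_url(query: str, api_key: str, page: int = 0) -> str:
    """Build a request URL for the Article Search endpoint."""
    base = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
    params = urlencode({"q": query, "page": page, "api-key": api_key})
    return f"{base}?{params}"
```

A developer who has signed up for a key would pass it in place of the placeholder and fetch the URL with any HTTP client.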
Open data sets, of course, have to be worthy of the APIs that offer them. In his post, data scientist Kirk Borne outlines the seven qualities open data needs to have to be of value to developers and consumers. (Yes, they’re also “Vs” like big data.)
Validity: It’s “critical to pay attention to these data validity concerns when your organization’s data are exposed to scrutiny and inspection by others,” Borne states.
Value: The data needs to be the font of new ideas, new businesses, and innovations.
Variety: Exposing the wide variety of data available can be “a scary proposition for any data scientist,” Borne observes, but nonetheless is essential.
Voice: Remember that “your open data becomes the voice of your organization to your stakeholders.”
Vocabulary: “The semantics and schema (data models) that describe your data are more critical than ever when you provide the data for others to use,” says Borne. “Search, discovery, and proper reuse of data all require good metadata, descriptions, and data modeling.”
Vulnerability: Accept that open data, because it is so open, will be subjected to “misuse, abuse, manipulation, or alteration.”
proVenance: This is the governance requirement behind open data offerings. “Provenance includes ownership, origin, chain of custody, transformations that have been made to it, processing that has been applied to it (including which versions of processing software were used), the data’s uses and their context, and more,” says Borne….(More)”
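To make Borne's provenance elements concrete, here is a hypothetical machine-readable provenance record that could travel with an open dataset. The field names are illustrative, not drawn from any particular metadata standard:

```python
# Hypothetical provenance record covering the elements Borne lists:
# ownership, origin, chain of custody, transformations, processing
# software versions, and context of use. Values are invented.
provenance = {
    "owner": "City Budget Office",
    "origin": "FY2016 general-ledger export",
    "chain_of_custody": ["finance department", "open-data portal team"],
    "transformations": ["aggregated to department level", "PII columns dropped"],
    "processing_software": {"etl_pipeline": "v2.3.1"},
    "context_of_use": "public budget transparency",
}

# A consumer can check that the minimum governance fields are present
# before reusing the data.
required = {"owner", "origin", "chain_of_custody", "transformations"}
missing = required - provenance.keys()
```

Publishing such a record alongside the data itself is one way to satisfy the "search, discovery, and proper reuse" needs Borne describes under Vocabulary as well.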
The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data and humanitarian response was originally published in 2016.
Data, when used well and in a trusted manner, allows humanitarian organizations to innovate how they respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations to respond to emergencies better and more quickly. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, which poses a number of risks to the security of victims’ data. Furthermore, the complex power dynamics that exist within humanitarian spaces may be further exacerbated by the introduction of new technologies and big data collection mechanisms. Below we share:
Selected Reading List (summaries and hyperlinks)
Annotated Selected Reading List
Additional Readings
Selected Reading List (summaries in alphabetical order)
Francesco Mancini, International Peace Institute – New Technology and the Prevention of Violence and Conflict – Explores the ways in which new communications technologies can assist humanitarian workers in preventing violence and conflict.
Andrew Robertson and Steve Olson (USIP) – Using Data Sharing to Improve Coordination in Peacebuilding – Summarizes the findings of a United States Institute of Peace workshop that investigated the use of data-sharing systems between government and non-government actors in conflict zones. It identifies some of the challenges and benefits of data-sharing in peacebuilding efforts.
United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development – A World That Counts, Mobilizing the Data Revolution – Compiled by a group of 20 international experts, this report proposes ways to improve data management and monitoring, whilst mitigating some of the risks data poses.
Katie Whipkey and Andrej Verity – Guidance for Incorporating Big Data into Humanitarian Operations – Created as part of the Digital Humanitarian Network with the support of UN-OCHA, this is a manual for humanitarian organizations looking to strategically incorporate Big Data into their work.
Katja Lindskov Jacobsen – Making design safe for citizens: A hidden history of humanitarian experimentation – Argues that the UNHCR’s use of iris recognition technology in 2002 and 2007 during the repatriation of Afghan refugees from Pakistan constitutes a case of “humanitarian experimentation.” It questions this sort of experimentation, which compromises the security of refugees in the pursuit of safer technologies for the rest of the world.
Responsible Data Forum – Responsible Data Reflection Stories: An Overview – Compiles various stories sourced by the Responsible Data Forum blog relating to data challenges faced by advocacy organizations, and draws recommendations based on these cases.
Kristin Bergtora Sandvik – The humanitarian cyberspace: shrinking space or an expanding frontier? – Provides a detailed account of the development of a “humanitarian cyberspace” and how information and communication technologies have been further integrated into humanitarian operations since the mid-1990s.
Annotated Selected Reading List (in alphabetical order)
Karlsrud, John. “Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies.” Cyberspace and International Relations, 2013. http://bit.ly/235Qb3e
This chapter from the book “Cyberspace and International Relations” suggests that advances in big data give humanitarian organizations unprecedented opportunities to prevent and mitigate natural disasters and humanitarian crises. However, the sheer amount of unstructured data necessitates effective “data mining” strategies for multinational organizations to make the most of this data.
By profiling some civil-society organizations who use big data in their peacekeeping efforts, Karlsrud suggests that these community-focused initiatives are leading the movement toward analyzing and using big data in countries vulnerable to crisis.
The chapter concludes by offering ten recommendations to UN peacekeeping forces to best realize the potential of big data and new technology in supporting their operations.
Mancini, Francesco. “New Technology and the Prevention of Violence and Conflict.” International Peace Institute, 2013. http://bit.ly/1ltLfNV
This report from the International Peace Institute looks at five case studies to assess how information and communications technologies (ICTs) can help prevent humanitarian conflicts and violence. Its findings suggest that context significantly shapes the effectiveness of these ICTs for conflict prevention, and that any strategy must take into account the specific contingencies of the region to be successful.
The report draws seven lessons from the five case studies, including:
New technologies are just one in a variety of tools to combat violence. Consequently, organizations must investigate a variety of complementary strategies to prevent conflicts, and not simply rely on ICTs.
Not every community or social group will have the same relationship to technology, and their ability to adopt new technologies is similarly influenced by their context. Therefore, a detailed needs assessment must take place before any new technologies are implemented.
New technologies may be co-opted by violent groups seeking to maintain conflict in the region. Consequently, humanitarian groups must be sensitive to existing political actors and be aware of possible negative consequences these new technologies may spark.
Local input is integral to conflict prevention measures, and there is a need for collaboration and awareness-raising with communities to ensure new technologies are sustainable and effective.
Information shared among civil-society organizations has greater potential to support early-warning systems. This horizontal distribution of information can also help communities hold local leaders accountable.
Meier, Patrick. “Digital Humanitarians: How Big Data Is Changing the Face of Humanitarian Response.” CRC Press, 2015. http://amzn.to/1RQ4ozc
This book traces the emergence of “Digital Humanitarians”—people who harness new digital tools and technologies to support humanitarian action. Meier suggests that this has created a “nervous system” to connect people from disparate parts of the world, revolutionizing the way we respond to humanitarian crises.
Meier argues that such technology is reconfiguring the structure of the humanitarian space, where victims are not simply passive recipients of aid but can contribute alongside other global citizens. This, in turn, makes us more humane and engaged people.
Robertson, Andrew and Olson, Steve. “Using Data Sharing to Improve Coordination in Peacebuilding.” United States Institute of Peace, 2012. http://bit.ly/235QuLm
This report functions as an overview of a roundtable workshop on Technology, Science and Peace Building held at the United States Institute of Peace. The workshop aimed to investigate how data-sharing techniques can be developed for use in peacebuilding and conflict management.
Four main themes emerged from discussions during the workshop:
“Data sharing requires working across a technology-culture divide”—Data sharing needs the foundation of a strong relationship, which can depend on sociocultural, rather than technological, factors.
“Information sharing requires building and maintaining trust”—These relationships are often built on trust, which can include both technological and social perspectives.
“Information sharing requires linking civilian-military policy discussions to technology”—Even when sophisticated data-sharing technologies exist, continuous engagement between different stakeholders is necessary. Therefore, procedures used to maintain civil-military engagement should be broadened to include technology.
“Collaboration software needs to be aligned with user needs”—Technology providers need to keep the needs of their users, in this case peacebuilders, in mind in order to ensure sustainability.
United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution.” 2014. https://bit.ly/2Cb3lXq
This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.
Katie Whipkey and Andrej Verity. “Guidance for Incorporating Big Data into Humanitarian Operations.” Digital Humanitarian Network, 2015. http://bit.ly/1Y2BMkQ
This report produced by the Digital Humanitarian Network provides an overview of big data and how humanitarian organizations can integrate it into their humanitarian response. It primarily functions as a guide for organizations, providing concise outlines of what big data is and how it can benefit humanitarian groups.
The report puts forward four main benefits acquired through the use of big data by humanitarian organizations: 1) the ability to leverage real-time information; 2) the ability to make more informed decisions; 3) the ability to learn new insights; 4) the ability for organizations to be more prepared.
It goes on to assess seven challenges big data poses for humanitarian organizations, including: 1) geography, and the unequal access to technology across regions; 2) the potential for user error when processing data; 3) limited technology; 4) questionable validity of data; 5) underdeveloped policies and ethics relating to data management; and 6) limitations relating to staff knowledge.
Risks of Using Big Data in Humanitarian Contexts

Crawford, Kate, and Megan Finn. “The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters.” GeoJournal 80.4, 2015. http://bit.ly/1X0F7AI
Crawford & Finn present a critical analysis of the use of big data in disaster management, taking a more skeptical tone to the data revolution facing humanitarian response.
They argue that though social and mobile data analysis can yield important insights and tools in crisis events, it also presents a number of limitations which can lead to oversights being made by researchers or humanitarian response teams.
Crawford & Finn explore the ethical concerns the use of big data in disaster events introduces, including issues of power, privacy, and consent.
The paper concludes by recommending that critical data studies, such as those presented in the paper, be integrated into crisis event research in order to analyze some of the assumptions which underlie mobile and social data.
Jacobsen, Katja Lindskov. “Making design safe for citizens: A hidden history of humanitarian experimentation.” Citizenship Studies 14.1 (2010): 89-103. http://bit.ly/1YaRTwG
This paper explores the phenomenon of “humanitarian experimentation,” where victims of disaster or conflict are the subjects of experiments to test the application of technologies before they are administered in greater civilian populations.
By analyzing the use of iris recognition technology during the repatriation of Afghan refugees from Pakistan between 2002 and 2007, Jacobsen suggests that this “humanitarian experimentation” compromises the security of already vulnerable refugees in order to better deliver biometric products to the rest of the world.
Responsible Data Forum. “Responsible Data Reflection Stories: An Overview.” http://bit.ly/1Rszrz1
This piece from the Responsible Data forum is primarily a compilation of “war stories” which follow some of the challenges in using big data for social good. By drawing on these crowdsourced cases, the Forum also presents an overview which makes key recommendations to overcome some of the challenges associated with big data in humanitarian organizations.
It finds that most of these challenges occur when organizations are ill-equipped to manage data and new technologies, or are unaware of how different groups interact in digital spaces in different ways.
Sandvik, Kristin Bergtora. “The humanitarian cyberspace: shrinking space or an expanding frontier?” Third World Quarterly 37.1 (2016): 17-32. http://bit.ly/1PIiACK
This paper analyzes the shift toward technology-driven humanitarian work, which increasingly takes place online in cyberspace, reshaping the definition and application of aid. This shift has occurred alongside what many suggest is a shrinking of the humanitarian space.
Sandvik provides three interpretations of this phenomenon:
First, traditional threats remain in the humanitarian space, which are both modified and reinforced by technology.
Second, new threats are introduced by the increasing use of technology in humanitarianism, and consequently the humanitarian space may be broadening, not shrinking.
Finally, if the shrinking humanitarian space theory holds, cyberspace offers one example of this, where the increasing use of digital technology to manage disasters leads to a contraction of space through the proliferation of remote services.
Additional Readings on Data and Humanitarian Response
Kristin Bergtora Sandvik, et al. – Humanitarian technology: a critical research agenda – Takes a critical look at the field of humanitarian technology, analyzing the challenges it poses in post-disaster and conflict environments.
Kristin Bergtora Sandvik – The Risks of Technological Innovation – Suggests that despite its evident benefits, such technology can also undermine humanitarian action and lead to “catastrophic events” that themselves require a new type of humanitarian response.
Kate Crawford – Is Data a Danger to the Developing World? – Argues that data poses more than privacy risks to developing countries: “data discrimination” can affect even the basic human rights of individuals and introduce problematic power hierarchies between those who can access data and those who cannot.
Paul Currion – Eyes Wide Shut: The challenge of humanitarian biometrics – Examines the use of biometrics by humanitarian organizations and national governments, and suggests stronger accountability is needed to ensure data from marginalized groups remains protected.
Yves-Alexandre de Montjoye, Jake Kendall and Cameron F. Kerry – Enabling Humanitarian Use of Mobile Phone Data – Analyzes how data from mobile communication can provide insights into the spread of infectious disease, and how such data can also compromise individual privacy.
Gus Hosein and Carly Nyst – Aiding Surveillance – Suggests that the unregulated use of technologies and surveillance systems by humanitarian organizations creates systems that pose serious threats to individuals’ rights, particularly their right to privacy.
Mary K. Pratt – Big Data’s role in humanitarian aid – A Computerworld article that provides an overview of big data and how it is improving the efficiency and efficacy of humanitarian response, especially in conflict zones.
Róisín Read, Bertrand Taithe and Roger Mac Ginty – Data hubris? Humanitarian information systems and the mirage of technology – Looks specifically at visual technology, crisis mapping, and big data, and suggests an over-enthusiasm in the claims made on behalf of technologically advanced humanitarian information systems.
UN Office for the Coordination of Humanitarian Affairs (UN-OCHA) – Big data and humanitarianism: 5 things you need to know – Briefly outlines five issues that face humanitarian organizations as they integrate big data into their operations.
* Thanks to: Kristin B. Sandvik; Zara Rahman; Jennifer Schulte; Sean McDonald; Paul Currion; Dinorah Cantú-Pedraza and the Responsible Data listserv for valuable input.
“The Berkman Center is pleased to announce the publication of a new paper from the Privacy Tools for Sharing Research Data project team. In this paper, Effy Vayena, Urs Gasser, Alexandra Wood, and David O’Brien from the Berkman Center, with Micah Altman from MIT Libraries, outline elements of a new ethical framework for big data research.
Emerging large-scale data sources hold tremendous potential for new scientific research into human biology, behaviors, and relationships. At the same time, big data research presents privacy and ethical challenges that the current regulatory framework is ill-suited to address. In light of the immense value of large-scale research data, the central question moving forward is not whether such data should be made available for research, but rather how the benefits can be captured in a way that respects fundamental principles of ethics and privacy.
The authors argue that a framework with the following elements would support big data utilization and help harness the value of big data in a sustainable and trust-building manner:
Oversight should aim to provide universal coverage of human subjects research, regardless of funding source, across all stages of the information lifecycle.
New definitions and standards should be developed based on a modern understanding of privacy science and the expectations of research subjects.
Researchers and review boards should be encouraged to incorporate systematic risk-benefit assessments and new procedural and technological solutions from the wide range of interventions that are available.
Oversight mechanisms and the safeguards implemented should be tailored to the intended uses, benefits, threats, harms, and vulnerabilities associated with a specific research activity.
Development of a new ethical framework with these elements should be the product of a dynamic multistakeholder process that is designed to capture the latest scientific understanding of privacy, analytical methods, available safeguards, community and social norms, and best practices for research ethics as they evolve over time.
The full paper is available for download through the Washington and Lee Law Review Online as part of a collection of papers featured at the Future of Privacy Forum workshop Beyond IRBs: Designing Ethical Review Processes for Big Data Research held on December 10, 2015, in Washington, DC….(More)”
Payal Arora at the International Journal of Communication: “To date, little attention has been given to the impact of big data in the Global South, about 60% of whose residents are below the poverty line. Big data manifests in novel and unprecedented ways in these neglected contexts. For instance, India has created biometric national identities for her 1.2 billion people, linking them to welfare schemes, and social entrepreneurial initiatives like the Ushahidi project have leveraged crowdsourcing to provide real-time crisis maps for humanitarian relief.
While these projects are indeed inspirational, this article argues that in the context of the Global South there is a bias in the framing of big data as an instrument of empowerment. Here, the poor, or the “bottom of the pyramid” populace are the new consumer base, agents of social change instead of passive beneficiaries. This neoliberal outlook of big data facilitating inclusive capitalism for the common good sidelines critical perspectives urgently needed if we are to channel big data as a positive social force in emerging economies. This article proposes to assess these new technological developments through the lens of databased democracies, databased identities, and databased geographies to make evident normative assumptions and perspectives in this under-examined context….(More)”.
Karen E. C. Levy and David Merritt Johns in Big Data and Society: “Openness and transparency are becoming hallmarks of responsible data practice in science and governance. Concerns about data falsification, erroneous analysis, and misleading presentation of research results have recently strengthened the call for new procedures that ensure public accountability for data-driven decisions. Though we generally count ourselves in favor of increased transparency in data practice, this Commentary highlights a caveat. We suggest that legislative efforts that invoke the language of data transparency can sometimes function as “Trojan Horses” through which other political goals are pursued. Framing these maneuvers in the language of transparency can be strategic, because approaches that emphasize open access to data carry tremendous appeal, particularly in current political and technological contexts. We illustrate our argument through two examples of pro-transparency policy efforts, one historical and one current: industry-backed “sound science” initiatives in the 1990s, and contemporary legislative efforts to open environmental data to public inspection. Rules that exist mainly to impede science-based policy processes weaponize the concept of data transparency. The discussion illustrates that, much as Big Data itself requires critical assessment, the processes and principles that attend it—like transparency—also carry political valence, and, as such, warrant careful analysis….(More)”
LIMN issue edited by Boris Jardine and Christopher Kelty: “Vast accumulations saturate our world: phone calls and emails stored by security agencies; every preference of every individual collected by advertisers; ID numbers, and maybe an iris scan, for every Indian; hundreds of thousands of whole genome sequences; seed banks of all existing plants, and of course, books… all of them. Just what is the purpose of these optimistically total archives, and how are they changing us?
This issue of Limn asks authors and artists to consider how these accumulations govern us, where this obsession with totality came from and how we might think differently about big data and algorithms, by thinking carefully through the figure of the archive.
Alison Powell at LSE Media Policy Project Blog: “Algorithms are everywhere, or so we are told, and the black boxes of algorithmic decision-making make it more difficult than in the past to oversee processes that regulators and activists argue ought to be transparent. But when, and where, and which machines do we wish to make accountable, and for what purpose? In this post I discuss how the algorithms discussed by scholars are most commonly those at work on media platforms whose main products are the social networks and attention of individuals. Algorithms, in this case, construct individual identities through patterns of behaviour and provide the opportunity for finely targeted products and services. While there are serious concerns about, for instance, price discrimination, algorithmic systems for communicating and consuming are, in my view, less inherently problematic than processes that impact on our collective participation and belonging as citizens. In this second sphere, algorithmic processes – especially machine learning – combine with processes of governance that focus on individual identity performance to profoundly transform how citizenship is understood and undertaken.
Communicating and consuming
In the communications sphere, algorithms are what make it possible to make money from the web, for example through advertising brokerage platforms that help companies bid for ads on major newspaper websites. IP address monitoring, which tracks clicks and web activity, creates detailed consumer profiles and transforms the everyday experience of communication into a constantly updated production of consumer information. This process of personal profiling is at the heart of many of the concerns about algorithmic accountability. The perpetual production of data by individuals, and the increasing capacity to analyse it even when it doesn’t appear to relate, has certainly revolutionised advertising by allowing more precise targeting, but what has it done for areas of public interest?
John Cheney-Lippold identifies how the categories of identity are now developed algorithmically, since a category like gender is not based on self-disclosure but instead on patterns of behaviour that fit with expectations set by previous alignment to a norm. In assessing ‘algorithmic identities’, he notes that these produce identity profiles which are narrower and more behaviour-based than the identities that we perform. This is a result of the fact that many of the systems that inspired the design of algorithmic systems were based on using behaviour and other markers to optimise consumption. Algorithmic identity construction has spread from the world of marketing to the broader world of citizenship – as evidenced by the Citizen Ex experiment shown at the Web We Want Festival in 2015.
Individual consumer-citizens
What’s really at stake is that the expansion of algorithmic assessment of commercially derived big data has extended the frame of the individual consumer into all kinds of other areas of experience. In a supposed ‘age of austerity’ when governments believe it’s important to cut costs, this connects with the view of citizens as primarily consumers of services, and furthermore, with the idea that a citizen is an individual subject whose relation to a state can be disintermediated given enough technology. So, with sensors on your garbage bins you don’t need to even remember to take them out. With pothole reporting platforms like FixMyStreet, a city government can be responsive to an aggregate of individual reports. But what aspects of our citizenship are collective? When, in the algorithmic state, can we expect to be together?
Put another way, is there any algorithmic process to value the long term education, inclusion, and sustenance of a whole community for example through library services?…
Seeing algorithms – machine learning in particular – as supporting decision-making for broad collective benefit rather than as part of ever more specific individual targeting and segmentation might make them more accountable. But more importantly, this would help algorithms support society – not just individual consumers….(More)”
Jessica Leber at Fast Co-Exist: “Cities have long seen the potential in big data to improve government and the lives of citizens, and it is now being put into action in areas where governments touch citizens’ lives in very sensitive ways. New York City’s Department of Homelessness Services is mining apartment eviction filings to see if it can understand who is at risk of becoming homeless and intervene early. And police departments all over the country have adopted predictive policing software that guides where officers should deploy, and at what time, leading to reduced crime in some cities.
In one study in Los Angeles, police officers deployed to certain neighborhoods by predictive policing software prevented 4.3 crimes per week, compared to 2 crimes per week when assigned to patrol a specific area by human crime analysts. Surely, a reduction in crime is a good thing. But community activists in places such as Bellingham, Washington, have grave doubts. They worry that outsiders can’t examine how the algorithms work, since the software is usually proprietary, and so citizens have no way of knowing what data the government is using to target them. They also worry that predictive policing is just exacerbating existing patterns of racial profiling. If the underlying crime data being used is the result of years of over-policing minority communities for minor offenses, then the predictions based on this biased data could create a feedback loop and lead to yet more over-policing.
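The feedback-loop worry can be made concrete with a toy simulation. The numbers and the proportional-allocation rule below are illustrative assumptions, not a description of any deployed system: two areas have identical true crime, but one starts with inflated recorded counts from past over-policing.

```python
# A minimal sketch of the feedback loop the activists describe: if recorded
# crime reflects past patrol intensity rather than true offending, and patrols
# are allocated by recorded counts, the initial bias never corrects itself.

def allocate_patrols(recorded, total_patrols=10):
    """Assign patrols to areas in proportion to recorded crime counts."""
    total = sum(recorded.values())
    return {area: total_patrols * n / total for area, n in recorded.items()}

def observe(true_crime, patrols, detection_per_patrol=0.08):
    """Recorded crime scales with patrol presence, not with true crime alone."""
    return {area: true_crime[area] * min(1.0, detection_per_patrol * patrols[area])
            for area in true_crime}

true_crime = {"A": 100, "B": 100}   # identical underlying weekly offenses
recorded = {"A": 60, "B": 30}       # area A historically over-policed

for week in range(5):
    patrols = allocate_patrols(recorded)
    recorded = observe(true_crime, patrols)

# The 2:1 disparity in recorded crime persists indefinitely, even though
# true crime in the two areas is identical.
```

Under these toy assumptions the system settles exactly where the biased data put it; the extra recorded crime in area A is an artifact of extra patrols, yet it continues to justify those patrols.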
Smaller in scale but even more sensitive is the area of child protection services. Though the data isn’t really as “big” as in other examples, a few agencies are carefully exploring statistical models to make decisions in several areas, such as which children in the system are most in danger of violence, which children are most in need of a trauma screening, and which are at risk of entering the criminal justice system.
In Hillsborough County, Florida, where a series of child homicides occurred, a private provider selected to manage the county’s child welfare system in 2012 came in and analyzed the data. Cases with the highest probability of serious injury or death had a few factors in common, they found: a child under the age of three, a “paramour” in the home, a substance abuse or domestic violence history, and a parent previously in the foster care system. They identified nine practices to use in these cases and hired a software provider to create a dashboard that allowed real-time feedback on them. Their success has led to the program being implemented statewide….
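As a rough illustration of how factors like those could be turned into a screening rule, here is a sketch. The field names and the simple factor-counting threshold are hypothetical assumptions for illustration, not the provider's actual model.

```python
# Hypothetical screening sketch based on the four risk factors reported in the
# Hillsborough analysis. Field names and the counting rule are illustrative.

def risk_factors(case):
    """Return the list of reported risk factors present in a case record."""
    factors = []
    if case["child_age"] < 3:
        factors.append("child under three")
    if case["paramour_in_home"]:
        factors.append("paramour in the home")
    if case["substance_abuse_history"] or case["domestic_violence_history"]:
        factors.append("substance abuse or domestic violence history")
    if case["parent_former_foster_child"]:
        factors.append("parent previously in foster care")
    return factors

def flag_for_enhanced_practices(case, threshold=3):
    """Flag a case for the enhanced case-practice protocol (threshold assumed)."""
    return len(risk_factors(case)) >= threshold
```

In a real agency setting a rule this crude would only be a triage aid; as the next excerpt notes, the consequences of a wrong call are large, which is why such models are being adopted cautiously.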
“I think the opportunity is a rich one. At the same time, the ethical considerations need to be guiding us,” says Jesse Russell, chief program officer at the National Council on Crime and Delinquency, who has followed the use of predictive analytics in child protective services. Officials, he says, are treading carefully before using data to make decisions about individuals, especially when the consequences of being wrong—such as taking a child out of his or her home unnecessarily—are huge. And while caseworker decision-making can be flawed or biased, so can the programs that humans design. When you rely too much on data—if the data is flawed or incomplete, as could be the case in predictive policing—you risk further validating bad decisions or existing biases….
On the other hand, big data does have the potential to vastly expand our understanding of who we are and why we do what we do. A decade ago, serious scientists would have laughed out of the room anyone who proposed a study of “the human condition,” a topic so broad and so resistant to measurement. But perhaps the most important manifestation of big data in people’s lives could come from the ability for scientists to study huge, unwieldy questions they couldn’t before.
A massive scientific undertaking to study the human condition is set to launch in January of 2017. The Kavli Human Project, funded by the Kavli Foundation, plans to recruit 10,000 New Yorkers from all walks of life to be measured for 10 years. And by measured, they mean everything: all financial transactions, tax returns, GPS coordinates, genomes, chemical exposure, IQ, Bluetooth sensors around the house, whom subjects text and call—and that’s just the beginning. In all, the large team of academics expects to collect about a billion data points per person per year, at an unprecedented low cost per data point compared to other large research surveys.
The hope is that with so much continuous data, researchers can for the first time start to disentangle the complex, seemingly unanswerable questions that have plagued our society, from what is causing the obesity epidemic to how to disrupt the poverty-to-prison cycle….(More)
Colum Lynch at Foreign Policy: “Can the wizards of Silicon Valley develop a set of killer apps to monitor the fragile Syria cease-fire without putting foreign boots on the ground in one of the world’s most dangerous countries?
They’re certainly going to try. The “cessation of hostilities” in Syria brokered by the United States and Russia last month has sharply reduced the levels of violence in the war-torn country and sparked a rare burst of optimism that it could lead to a broader cease-fire. But if the two sides lay down their weapons, the international community will face the challenge of monitoring the battlefield to ensure compliance without deploying peacekeepers or foreign troops. The emerging solution: using crowdsourcing, drones, satellite imaging, and other high-tech tools.
The high-level interest in finding a technological solution to the monitoring challenge was on full display last month at a closed-door meeting convened by the White House that brought together U.N. officials, diplomats, digital cartographers, and representatives of Google, DigitalGlobe, and other technology companies. Their assignment was to brainstorm ways of using high-tech tools to keep track of any future cease-fires from Syria to Libya and Yemen.
The off-the-record event came as the United States, the U.N., and other key powers struggle to find ways of enforcing cease-fires in places like Syria at a time when there is little political will to run the risk of sending foreign forces or monitors to such dangerous places. The United States has turned to high-tech systems like armed drones as weapons of war; it now wants to use similar technology to help enforce peace.
Take the Syria Conflict Mapping Project, a geomapping program developed by the Atlanta-based Carter Center, a nonprofit founded by former U.S. President Jimmy Carter and his wife, Rosalynn, to resolve conflict and promote human rights. The project has developed an interactive digital map that tracks military formations by government forces, Islamist extremists, and more moderate armed rebels in virtually every disputed Syrian town. It is now updating its technology to monitor cease-fires.
The project began in January 2012 with a single 25-year-old intern, Christopher McNaboe, who realized it was possible to track the state of the conflict by compiling disparate strands of publicly available information — including the shelling and aerial bombardment of towns and rebel positions — from YouTube, Twitter, and other social media sites. It has since developed a mapping program using software provided by Palantir Technologies, a Palo Alto-based big data company that does contract work for U.S. intelligence and defense agencies, from the CIA to the FBI….
Walter Dorn, an expert on technology in U.N. peace operations who attended the White House event, said he had promoted what he calls a “coalition of the connected.”
The U.N. or other outside powers could start by tracking social media sites, including Twitter and YouTube, for reports of possible cease-fire violations. That information could then be verified by “seeded crowdsourcing” — that is, reaching out to networks of known advocates on the ground — and technological monitoring through satellite imagery or drones.
Matthew McNabb, the founder of First Mile Geo, a start-up that develops geolocation technology for gathering data in conflict zones, has another idea. McNabb, who also attended the White House event, believes “on-demand” technologies like SurveyMonkey, which gives users a form to create their own surveys, can be applied in conflict zones to collect data on cease-fire violations….(More)