The Risk to Civil Liberties of Fighting Crime With Big Data


in the New York Times: “…Sharing data, both among the parts of a big police department and between the police and the private sector, ‘is a force multiplier,’ he said.

Companies working with the military and intelligence agencies have long practiced these kinds of techniques, which the companies are bringing to domestic policing, in much the way surplus military gear has beefed up American SWAT teams.

Palantir first built up its business by offering products like maps of social networks of extremist bombers and terrorist money launderers, and figuring out efficient driving routes to avoid improvised explosive devices.

Palantir used similar data-sifting techniques in New Orleans to spot individuals most associated with murders. Law enforcement departments around Salt Lake City used Palantir to allow common access to 40,000 arrest photos, 520,000 case reports and information like highway and airport data — building human maps of suspected criminal networks.

People in the predictive business sometimes compare what they do to controlling the other side’s “OODA loop,” a term first developed by a fighter pilot and military strategist named John Boyd.

OODA stands for “observe, orient, decide, act” and is a means of managing information in battle.

“Whether it’s war or crime, you have to get inside the other side’s decision cycle and control their environment,” said Robert Stasio, a project manager for cyberanalysis at IBM, and a former United States government intelligence official. “Criminals can learn to anticipate what you’re going to do and shift where they’re working, employ more lookouts.”

IBM sells tools that also enable police to become less predictable, for example, by taking different routes into an area identified as a crime hotspot. It has also conducted studies that show changing tastes among online criminals — for example, a move from hacking retailers’ computers to stealing health care data, which can be used to file for federal tax refunds.

But there are worries about what military-type data analysis means for civil liberties, even among the companies that get rich on it.

“It definitely presents challenges to the less sophisticated type of criminal, but it’s creating a lot of what is called ‘Big Brother’s little helpers,’” Mr. Bowman said. For now, he added, much of the data abundance problem is that “most police aren’t very good at this.”…(More)”
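The “human maps of suspected criminal networks” described above are, at bottom, social network analysis over shared records. The sketch below shows only the general idea, not Palantir’s actual method: the (person, incident) records and names are invented, and centrality is computed with off-the-shelf NetworkX functions.

```python
# Minimal illustration of building a "human map" from shared-incident records.
# Records and names are hypothetical; this is not Palantir's method.
import itertools
import networkx as nx

# (person, incident) pairs, e.g. drawn from arrest photos or case reports
records = [
    ("A", "case-101"), ("B", "case-101"), ("C", "case-101"),
    ("B", "case-102"), ("D", "case-102"),
    ("C", "case-103"), ("D", "case-103"), ("E", "case-103"),
]

# Group people by the incident they share, then link every pair within an incident
G = nx.Graph()
by_incident = {}
for person, incident in records:
    by_incident.setdefault(incident, []).append(person)
for people in by_incident.values():
    for a, b in itertools.combinations(people, 2):
        G.add_edge(a, b)

# Centrality scores surface the individuals most connected within the network
centrality = nx.betweenness_centrality(G)
for person, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(person, round(score, 3))
```

Part of the civil-liberties worry raised above starts here: centrality scores computed over arrest photos and case reports can surface people who were never charged with, let alone convicted of, a crime.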

Big Data Is Not a Monolith


Book edited by Cassidy R. Sugimoto, Hamid R. Ekbia and Michael Mattioli: “Big data is ubiquitous but heterogeneous. Big data can be used to tally clicks and traffic on web pages, find patterns in stock trades, track consumer preferences, identify linguistic correlations in large corpuses of texts. This book examines big data not as an undifferentiated whole but contextually, investigating the varied challenges posed by big data for health, science, law, commerce, and politics. Taken together, the chapters reveal a complex set of problems, practices, and policies.

The advent of big data methodologies has challenged the theory-driven approach to scientific knowledge in favor of a data-driven one. Social media platforms and self-tracking tools change the way we see ourselves and others. The collection of data by corporations and government threatens privacy while promoting transparency. Meanwhile, politicians, policy makers, and ethicists are ill-prepared to deal with big data’s ramifications. The contributors look at big data’s effect on individuals as it exerts social control through monitoring, mining, and manipulation; big data and society, examining both its empowering and its constraining effects; big data and science, considering issues of data governance, provenance, reuse, and trust; and big data and organizations, discussing data responsibility, “data harm,” and decision making….(More)”

Civic Crowd Analytics: Making sense of crowdsourced civic input with big data tools


Paper by  that: “… examines the impact of crowdsourcing on a policymaking process by using a novel data analytics tool called Civic CrowdAnalytics, applying Natural Language Processing (NLP) methods such as concept extraction, word association and sentiment analysis. By drawing on data from a crowdsourced urban planning process in the City of Palo Alto in California, we examine the influence of civic input on the city’s Comprehensive City Plan update. The findings show that the impact of citizens’ voices depends on the volume and the tone of their demands: a higher volume of demands, delivered in a stronger tone, results in more policy changes. We also found an interesting and unexpected result: the city government in Palo Alto more or less mirrors the online crowd’s voice, while citizen representatives filter rather than mirror the crowd’s will. While NLP methods show promise in making the analysis of crowdsourced input more efficient, several issues remain: accuracy rates need to improve, and a considerable amount of human work is still required to train the algorithm….(More)”
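For a sense of what such a pipeline involves, here is a minimal sketch applying off-the-shelf sentiment scoring (NLTK’s VADER) and a very crude form of concept extraction to a few invented civic comments. It is an illustration only, not the Civic CrowdAnalytics tool, and the comments and stopword list are assumptions made for the example.

```python
# Toy sketch of sentiment analysis and naive concept extraction over civic input.
# Example comments are invented; this is not the Civic CrowdAnalytics tool itself.
from collections import Counter
from nltk import download
from nltk.sentiment.vader import SentimentIntensityAnalyzer

download("vader_lexicon", quiet=True)  # one-time lexicon download

comments = [
    "We urgently need protected bike lanes on El Camino Real.",
    "Bike lanes would be great, but please fix the potholes first.",
    "Traffic near the new development is unbearable and unsafe.",
]

analyzer = SentimentIntensityAnalyzer()
for text in comments:
    score = analyzer.polarity_scores(text)["compound"]  # -1 (negative) .. +1 (positive)
    print(f"{score:+.2f}  {text}")

# Crude "concept extraction": most frequent non-trivial words across comments
stopwords = {"we", "need", "on", "the", "but", "please", "would", "be", "and", "is", "near", "new", "first"}
words = [w.strip(".,").lower() for c in comments for w in c.split()]
print(Counter(w for w in words if w not in stopwords and len(w) > 3).most_common(5))
```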

Ten Actions to Implement Big Data Initiatives: A Study of 65 Cities


IBM Center for the Business of Government: “Professor Ho conducted a survey and phone interviews with city officials responsible for Big Data initiatives. Based on his research, the report presents a framework for Big Data initiatives which consists of two major cycles: the data cycle and the decision-making cycle. Each cycle is described in the report.

The trend toward Big Data initiatives is likely to accelerate in future years. In anticipation of the increased use of Big Data, Professor Ho identified factors that are likely to influence its adoption by local governments. He identified three organizational factors that influence adoption: leadership attention, adequate staff capacity, and pursuit of partners. In addition, he identified four organizational strategies that influence adoption: governance structures, team approach, incremental initiatives, and Big Data policies.

Based on his research findings, Professor Ho sets forth 10 recommendations for those responsible for implementing cities’ Big Data initiatives—five recommendations are directed to city leaders and five to city executives. A key recommendation is that city leaders should think about a “smart city system,” not just data. Another key recommendation is that city executives should develop a multi-year strategic data plan to enhance the effectiveness of Big Data initiatives….(More)”

Artificial Intelligence can streamline public comment for federal agencies


John Davis at the Hill: “…What became immediately clear to me was that — although not impossible to overcome — the lack of consistency and shared best practices across all federal agencies in accepting and reviewing public comments was a serious impediment. The promise of Natural Language Processing and cognitive computing to make the public comment process light years faster and more transparent becomes that much more difficult without a consensus among federal agencies on what type of data is collected – and how.

“There is a whole bunch of work we have to do around getting government to be more customer friendly and making it at least as easy to file your taxes as it is to order a pizza or buy an airline ticket,” President Obama recently said in an interview with WIRED. “Whether it’s encouraging people to vote or dislodging Big Data so that people can use it more easily, or getting their forms processed online more simply — there’s a huge amount of work to drag the federal government and state governments and local governments into the 21st century.”

…expanding the discussion around Artificial Intelligence and regulatory processes to include how the technology should be leveraged to ensure fairness and responsiveness in the very basic processes of rulemaking – in particular public notices and comments. These technologies could also enable us to consider not just public comments formally submitted to an agency, but the entire universe of statements made through social media posts, blogs, chat boards — and conceivably every other electronic channel of public communication.

Obviously, an anonymous comment on the Internet should not carry the same credibility as a formally submitted, personally signed statement, just as sworn testimony in court holds far greater weight than a grapevine rumor. But so much public discussion today occurs on Facebook pages, in Tweets, on news website comment sections, etc. Anonymous speech enjoys explicit protection under the Constitution, based on a justified expectation that certain sincere statements of sentiment might result in unfair retribution from the government.

Should we simply ignore the valuable insights about actual public sentiment on specific issues made possible through the power of Artificial Intelligence, which can ascertain meaning from an otherwise unfathomable ocean of relevant public conversations? With certain qualifications, I believe Artificial Intelligence, or AI, should absolutely be employed in the critical effort to gain insights from public comments – signed or anonymous.

“In the criminal justice system, some of the biggest concerns with Big Data are the lack of data and the lack of quality data,” the NSTC report authors state. “AI needs good data. If the data is incomplete or biased, AI can exacerbate problems of bias.” As a former federal criminal prosecutor and defense attorney, I am well familiar with the absolute necessity to weigh the relative value of various forms of evidence – or in this case, data…(More)”
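One concrete way NLP can help reviewers cope with that “unfathomable ocean” of input is to collapse near-duplicate comments, such as form-letter campaigns, before human reading. The sketch below is a hedged illustration using TF-IDF vectors and cosine similarity over invented comments; the similarity threshold is arbitrary, and a production pipeline would need far more care (and, as the NSTC authors note, far better data).

```python
# Sketch: flag near-duplicate public comments so reviewers read each argument once.
# Comments are invented; the threshold and approach are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

comments = [
    "I oppose the proposed rule because it burdens small businesses.",
    "I oppose this proposed rule; it burdens small businesses.",
    "The rule should include stronger privacy protections for consumers.",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(comments)
sim = cosine_similarity(tfidf)

# Pairs above an (arbitrary) similarity threshold are treated as near-duplicates
THRESHOLD = 0.8
for i in range(len(comments)):
    for j in range(i + 1, len(comments)):
        if sim[i, j] >= THRESHOLD:
            print(f"Comments {i} and {j} look like near-duplicates (sim={sim[i, j]:.2f})")
```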

Privacy Preservation in the Age of Big Data


A survey and primer by John S. Davis II and Osonde Osoba at Rand: “Anonymization or de-identification techniques are methods for protecting the privacy of subjects in sensitive data sets while preserving the utility of those data sets. The efficacy of these methods has come under repeated attacks as the ability to analyze large data sets becomes easier. Several researchers have shown that anonymized data can be reidentified to reveal the identity of the data subjects via approaches such as so-called “linking.” In this report, we survey the anonymization landscape of approaches for addressing re-identification and we identify the challenges that still must be addressed to ensure the minimization of privacy violations. We also review several regulatory policies for disclosure of private data and tools to execute these policies….(More)”.
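The “linking” re-identification the report surveys can be shown with a toy example: join a nominally anonymized data set to a public one (say, a voter roll) on quasi-identifiers such as ZIP code, birth year and sex. All records below are fabricated, and real attacks and defenses (k-anonymity, l-diversity, differential privacy) are considerably more involved.

```python
# Toy "linking" attack: re-identify people in an anonymized data set by joining
# it to a public data set on quasi-identifiers. All records are fabricated.
import pandas as pd

# "Anonymized" health records: names removed, quasi-identifiers retained
anonymized = pd.DataFrame({
    "zip": ["94301", "94301", "94040"],
    "birth_year": [1985, 1972, 1985],
    "sex": ["F", "M", "F"],
    "diagnosis": ["asthma", "diabetes", "hypertension"],
})

# Public data set (e.g., a voter roll) with names and the same attributes
public = pd.DataFrame({
    "name": ["Alice Example", "Carol Example"],
    "zip": ["94301", "94040"],
    "birth_year": [1985, 1985],
    "sex": ["F", "F"],
})

# Joining on quasi-identifiers links names back to sensitive diagnoses
linked = anonymized.merge(public, on=["zip", "birth_year", "sex"])
print(linked[["name", "zip", "birth_year", "diagnosis"]])
```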

Remote Data Collection: Three Ways to Rethink How You Collect Data in the Field


Magpi: “As mobile devices have gotten less and less expensive – and as millions worldwide have climbed out of poverty – it’s become quite common to see a mobile phone in every person’s hand, or at least in every family. This means we can utilize additional approaches to data collection that were simply not possible before….

In our Remote Data Collection Guide, we discuss these new technologies and:

  • The key benefits of remote data collection in each of three different situations.
  • The direct impact of remote data collection on reducing the cost of your efforts.
  • How to start the process of choosing the right option for your needs….(More)”

Three Use Cases How Big Data Helps Save The Earth


DataFloq: “The earth has been having a difficult time for quite some time now. Deforestation is still happening at a large scale across the globe: in Brazil alone, 40,200 hectares were deforested in the past year. The great Pacific garbage patch is still growing, and smog in Beijing is more common than a normal bright day. Unfortunately, none of this is new. A possible solution, however, is. In recent years, scientists, companies and governments have turned to Big Data to solve such problems or even prevent them from happening. It turns out that Big Data can help save the earth and, if done correctly, could lead to significant results in the coming years. Let’s have a look at some fascinating use cases of how Big Data can contribute:

Monitoring Biodiversity Across the Globe

Conservation International, a non-profit environmental organization with a mission to protect nature and its biodiversity, crunches vast amounts of data from images to monitor biodiversity around the world. At 16 important sites across the continents, they have installed over 1,000 smart cameras. Thanks to their motion sensors, these cameras capture images as soon as an animal passes by. Per site, these cameras cover approximately 2,000 square kilometres…. They automatically determine which species appear in the images and enrich the data with other information, such as climate data, flora and fauna data and land-use data, to better understand how animal populations change over time…. the Wildlife Picture Index (WPI) Analytics System, a project dashboard and analytics tool for visualizing user-friendly, near real-time, data-driven insights on biodiversity. The WPI monitors ground-dwelling tropical medium and large mammals and birds, species that are important economically, aesthetically and ecologically.
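As a simplified illustration of the monitoring step described above, the sketch below turns a small, invented table of camera-trap detections into per-species, per-year detection rates. This is only a stand-in for the real analysis; the actual Wildlife Picture Index is based on occupancy modelling rather than raw detection rates.

```python
# Simplified sketch: turn camera-trap detections into per-species yearly trends.
# Records are invented; the real Wildlife Picture Index uses occupancy modelling.
import pandas as pd

detections = pd.DataFrame({
    "site":        ["Manu", "Manu", "Manu", "Bwindi", "Bwindi", "Bwindi"],
    "year":        [2014, 2015, 2014, 2014, 2015, 2015],
    "species":     ["jaguar", "jaguar", "tapir", "elephant", "elephant", "elephant"],
    "camera_days": [120, 130, 120, 200, 210, 190],
    "images":      [14, 9, 30, 22, 25, 31],
})

# Detection rate (images per 100 camera-days) as a crude index of relative abundance
detections["rate"] = 100 * detections["images"] / detections["camera_days"]
trend = (detections
         .groupby(["site", "species", "year"])["rate"]
         .mean()
         .unstack("year"))
print(trend)
```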

Using Satellite Imagery to Combat Deforestation

Mapping deforestation is becoming a lot easier today thanks to Big Data. Imagery analytics allows environmentalists and policy makers to monitor, almost in real time, the status of forests around the globe with the help of satellite imagery. New tools like Global Forest Watch use massive amounts of high-resolution NASA satellite imagery to help conservation organizations, governments and concerned citizens monitor deforestation in “near-real time.”…

But that’s not all. Planet Labs has developed a tiny satellite that it is currently sending into space, dozens at a time. Each satellite measures only 10 by 10 by 30 centimeters but is outfitted with the latest technology. The company aims to create a high-resolution image of every spot on earth, updated daily. Once available, this will generate massive amounts of data that it will open-source so others can develop applications that improve the earth.
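At its simplest, the imagery analytics behind tools like Global Forest Watch comes down to change detection on a vegetation index computed from satellite bands. The sketch below uses synthetic arrays standing in for red and near-infrared reflectance and an arbitrary NDVI-drop threshold; operational systems rely on far more sophisticated time-series methods and validation.

```python
# Minimal NDVI change-detection sketch with synthetic 4x4 "satellite" bands.
# Real forest-monitoring pipelines are far more sophisticated; this shows only the idea.
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red + 1e-9)

# Synthetic reflectance bands for the same area at two dates
red_t0, nir_t0 = np.full((4, 4), 0.05), np.full((4, 4), 0.45)   # dense forest
red_t1, nir_t1 = red_t0.copy(), nir_t0.copy()
red_t1[1:3, 1:3], nir_t1[1:3, 1:3] = 0.20, 0.22                 # cleared patch

# Pixels whose NDVI dropped sharply between the dates are flagged as possible loss
drop = ndvi(nir_t0, red_t0) - ndvi(nir_t1, red_t1)
possible_deforestation = drop > 0.3
print(possible_deforestation.astype(int))
```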

Monitoring and Predicting with Smart Oceans

Over two-thirds of the world’s surface consists of oceans, and these oceans, too, can be monitored. Earlier this year, IBM Canada and Ocean Networks Canada announced a three-year program to better understand British Columbia’s oceans. Using the latest technology and sensors, they want to predict offshore accidents, natural disasters and tsunamis, and forecast the impact of these incidents. Using hundreds of cabled marine sensors, they are capable of monitoring waves, currents, water quality and vessel traffic in some of the major shipping channels…. These are just three examples of how Big Data can help save the planet. There are, of course, many more fascinating examples, and here is a list of 10 such use cases….(More)”

Seeing Cities Through Big Data


Book edited by Piyushimita (Vonu) Thakuriah, Nebiyou Tilahun and Moira Zellner: “… introduces the latest thinking on the use of Big Data in the context of urban systems, including research and insights on human behavior, urban dynamics, resource use, sustainability and spatial disparities, where it promises improved planning, management and governance in the urban sectors (e.g., transportation, energy, smart cities, crime, housing, urban and regional economies, public health, public engagement, urban governance and political systems), as well as Big Data’s utility in decision-making and in the development of indicators to monitor economic and social activity, and urban sustainability, transparency, livability, social inclusion, place-making, accessibility and resilience…(More)”

The challenges and limits of big data algorithms in technocratic governance


Paper by Marijn Janssen and George Kuk in Government Information Quarterly: “Big data is driving the use of algorithms in governing mundane but mission-critical tasks. Algorithms seldom operate on their own, and their (dis)utilities depend on the everyday aspects of data capture, processing and utilization. However, as algorithms become increasingly autonomous and invisible, it becomes harder for the public to detect them and to scrutinize their impartiality. Algorithms can systematically introduce inadvertent bias, reinforce historical discrimination, favor a political orientation or reinforce undesired practices. Yet it is difficult to hold algorithms accountable, as they continuously evolve with technologies, systems, data and people, the ebb and flow of policy priorities, and the clashes between new and old institutional logics. Greater openness and transparency do not necessarily improve understanding. In this editorial we argue that by unraveling the imperceptibility, materiality and governmentality of how algorithms work, we can better tackle the inherent challenges in the curatorial practice of data and algorithms. Fruitful avenues for further research on using algorithms to harness the merits and utilities of a computational form of technocratic governance are presented….(More)”