Big Data in U.S. Agriculture


Megan Stubbs at the Congressional Research Service: “Recent media and industry reports have employed the term big data as a key to the future of increased food production and sustainable agriculture. A recent hearing on the private elements of big data in agriculture suggests that Congress too is interested in potential opportunities and challenges big data may hold. While there appears to be great interest, the subject of big data is complex and often misunderstood, especially within the context of agriculture.

There is no commonly accepted definition of the term big data. It is often used to describe a modern trend in which the combination of technology and advanced analytics creates a new way of processing information that is more useful and timely. In other words, big data is just as much about new methods for processing data as about the data themselves. It is dynamic, and when analyzed can provide a useful tool in a decisionmaking process. Most see big data in agriculture at the end use point, where farmers use precision tools to potentially create positive results like increased yields, reduced inputs, or greater sustainability. While this is certainly the more intriguing part of the discussion, it is but one aspect and does not necessarily represent a complete picture.

Both private and public big data play a key role in the use of technology and analytics that drive a producer’s evidence-based decisions. Public-level big data represent records collected, maintained, and analyzed through publicly funded sources, specifically by federal agencies (e.g., farm program participant records and weather data). Private big data represent records generated at the production level and originate with the farmer or rancher (e.g., yield, soil analysis, irrigation levels, livestock movement, and grazing rates). While discussed separately in this report, public and private big data are typically combined to create a more complete picture of an agricultural operation and therefore better decisionmaking tools.

Big data may significantly affect many aspects of the agricultural industry, although the full extent and nature of its eventual impacts remain uncertain. Many observers predict that the growth of big data will bring positive benefits through enhanced production, resource efficiency, and improved adaptation to climate change. While lauded for its potentially revolutionary applications, big data is not without issues. From a policy perspective, issues related to big data involve nearly every stage of its existence, including its collection (how it is captured), management (how it is stored and managed), and use (how it is analyzed and used). It is still unclear how big data will progress within agriculture due to technical and policy challenges, such as privacy and security, for producers and policymakers. As Congress follows the issue a number of questions may arise, including a principal one—what is the federal role?…(More)”

Predictive Analytics


Revised book by Eric Siegel: “Prediction is powered by the world’s most potent, flourishing unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.

Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future drives millions of decisions more effectively, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.

In this lucid, captivating introduction — now in its Revised and Updated edition — former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction:

    • What type of mortgage risk Chase Bank predicted before the recession.
    • Predicting which people will drop out of school, cancel a subscription, or get divorced before they even know it themselves.
    • Why early retirement predicts a shorter life expectancy and vegetarians miss fewer flights.
    • Five reasons why organizations predict death — including one health insurance company.
    • How U.S. Bank and Obama for America calculated — and Hillary for America 2016 plans to calculate — the way to most strongly persuade each individual.
    • Why the NSA wants all your data: machine learning supercomputers to fight terrorism.
    • How IBM’s Watson computer used predictive modeling to answer questions and beat the human champs on TV’s Jeopardy!
    • How companies ascertain untold, private truths — how Target figures out you’re pregnant and Hewlett-Packard deduces you’re about to quit your job.
    • How judges and parole boards rely on crime-predicting computers to decide how long convicts remain in prison.
    • 183 examples from Airbnb, the BBC, Citibank, ConEd, Facebook, Ford, Google, the IRS, LinkedIn, Match.com, MTV, Netflix, PayPal, Pfizer, Spotify, Uber, UPS, Wikipedia, and more….(More)”

Big Data: A Tool for Inclusion or Exclusion? Understanding the Issues


Press Release: “A new report from the Federal Trade Commission outlines a number of questions for businesses to consider to help ensure that their use of big data analytics, while producing many benefits for consumers, avoids outcomes that may be exclusionary or discriminatory.

“Big data’s role is growing in nearly every area of business, affecting millions of consumers in concrete ways,” said FTC Chairwoman Edith Ramirez. “The potential benefits to consumers are significant, but businesses must ensure that their big data use does not lead to harmful exclusion or discrimination.”

The report, Big Data: A Tool for Inclusion or Exclusion? Understanding the Issues, looks specifically at big data at the end of its lifecycle – how it is used after being collected and analyzed, and draws on information from the FTC’s 2014 workshop, “Big Data: A Tool for Inclusion or Exclusion?,” as well as the Commission’s seminar on Alternative Scoring Products. The Commission also considered extensive public comments and additional public research in compiling the report.

The report highlights a number of innovative uses of big data that are providing benefits to underserved populations, including increased educational attainment, access to credit through non-traditional methods, specialized health care for underserved communities, and better access to employment.

In addition, the report looks at possible risks that could result from biases or inaccuracies about certain groups, including more individuals mistakenly denied opportunities based on the actions of others, exposing sensitive information, creating or reinforcing existing disparities, assisting in the targeting of vulnerable consumers for fraud, creating higher prices for goods and services in lower-income communities and weakening the effectiveness of consumer choice.

The report outlines some of the various laws that apply to the use of big data, especially in regards to possible issues of discrimination or exclusion, including the Fair Credit Reporting Act, FTC Act and equal opportunity laws. It also provides a range of questions for businesses to consider when they examine whether their big data programs comply with these laws.

The report also proposes four key policy questions that are drawn from research into the ways big data can both present and prevent harms. The policy questions are designed to help companies determine how best to maximize the benefit of their use of big data while limiting possible harms, by examining both practical questions of accuracy and built-in bias as well as whether the company’s use of big data raises ethical or fairness concerns….(More)”

Privacy, security and data protection in smart cities: a critical EU law perspective


CREATe Working Paper by Lilian Edwards: “Smart cities” are a buzzword of the moment. Although legal interest is growing, most academic responses, at least in the EU, are still from the technological, urban studies, environmental and sociological rather than legal sectors and have primarily laid emphasis on the social, urban, policing and environmental benefits of smart cities, rather than their challenges, often in a rather uncritical fashion. However, a growing backlash from the privacy and surveillance sectors warns of the potential threat to personal privacy posed by smart cities. A key issue is the lack of opportunity in an ambient or smart city environment for the giving of meaningful consent to processing of personal data; other crucial issues include the degree to which smart cities collect private data from inevitable public interactions, the “privatisation” of ownership of both infrastructure and data, the repurposing of “big data” drawn from IoT in smart cities and the storage of that data in the Cloud.

This paper, drawing on author engagement with smart city development in Glasgow as well as the results of an international conference in the area curated by the author, argues that smart cities combine the three greatest current threats to personal privacy, with which regulation has so far failed to deal effectively: the Internet of Things (IoT) or “ubiquitous computing”; “Big Data”; and the Cloud. While these three phenomena have been examined extensively in much privacy literature (particularly the last two), both in the US and EU, the combination is under-explored. Furthermore, US legal literature and solutions (if any) are not simply transferable to the EU because of the US’s lack of an omnibus data protection (DP) law. I will discuss how and if EU DP law controls possible threats to personal privacy from smart cities and suggest further research on two possible solutions: one, a mandatory holistic privacy impact assessment (PIA) exercise for smart cities; two, code solutions for flagging the need for, and consequences of, giving consent to collection of data in ambient environments….(More)

Toward WSIS 3.0: Adopting Next-Gen Governance Solutions for Tomorrow’s Information Society


Lea Kaspar & Stefaan G. Verhulst at CircleID: “… Collectively, this process has been known as the “World Summit on the Information Society” (WSIS). During December 2015 in New York, twelve years after that first meeting in Geneva and with more than 3 billion people now online, member states of the United Nations unanimously adopted the final outcome document of the WSIS ten-year Review process.

The document (known as the WSIS+10 document) reflects on the progress made over the past decade and outlines a set of recommendations for shaping the information society in coming years. Among other things, it acknowledges the role of different stakeholders in achieving the WSIS vision, reaffirms the centrality of human rights, and calls for a number of measures to ensure effective follow-up.

For many, these represent significant achievements, leading observers to proclaim the outcome a diplomatic victory. However, as is the case with most non-binding international agreements, the WSIS+10 document will remain little more than a hollow guidepost until it is translated into practice. Ultimately, it is up to the national policy-makers, relevant international agencies, and the WSIS community as a whole to deliver meaningful progress towards achieving the WSIS vision.

Unfortunately, the WSIS+10 document provides little actual guidance for practitioners. What is even more striking, it reveals little progress in its understanding of emerging governance trends and methods since Geneva and Tunis, or how these could be leveraged in our efforts to harness the benefits of information and communication technologies (ICT).

As such, the WSIS remains a 20th-century approach to 21st-century challenges. In particular, the document fails to seek ways to make WSIS post 2015:

  • evidence-based in how to make decisions;
  • collaborative in how to measure progress; and
  • innovative in how to solve challenges.

Three approaches toward WSIS 3.0

Drawing on lessons in the field of governance innovation, we suggest in what follows three approaches, accompanied by practical recommendations, that could allow the WSIS to address the challenges raised by the information society in a more evidence-based, innovative and participatory way:

1. Adopt an evidence-based approach to WSIS policy making and implementation.

Since 2003, we have had massive experimentation in both developed and developing countries in a number of efforts to increase access to the Internet. We have seen some failures and some successes; above all, we have gained insight into what works, what doesn’t, and why. Unfortunately, much of the evidence remains scattered and ad-hoc, poorly translated into actionable guidance that would be effective across regions; nor is there any reflection on what we don’t know, and how we can galvanize the research and funding community to address information gaps. A few practical steps we could take to address this:….

2. Measure progress towards WSIS goals in a more open, collaborative way, founded on metrics and data developed through a bottom-up approach

The current WSIS+10 document has many lofty goals, many of which will remain effectively meaningless unless we are able to measure progress in concrete and specific terms. This requires the development of clear metrics, a process which is inevitably subjective and value-laden. Metrics and indicators must therefore be chosen with great care, particularly as they become points of reference for important decisions and policies. Having legitimate, widely-accepted indicators is critical. The best way to do this is to develop a participatory process that engages those actors who will be affected by WSIS-related actions and decisions. …These could include:…

3. Experiment with governance innovations to achieve WSIS objectives.

Over the last few years, we have seen a variety of innovations in governance that have provided new and often improved ways to solve problems and make decisions. They include, for instance:

  • The use of open and big data to generate new insights in both the problem and the solution space. We live in the age of abundant data — why aren’t we using it to inform our decision making? Data on the current landscape and the potential implications of policies could make our predictions and correlations more accurate.
  • The adoption of design thinking, agile development and user-focused research in developing more targeted and effective interventions. A linear approach to policy making with a fixed set of objectives and milestones allows little room for dealing with unforeseen or changing circumstances, making it difficult to adapt and change course. Applying lessons from software engineering — including the importance of feedback loops, continuous learning, and agile approach to project design — would allow policies to become more flexible and solutions more robust.
  • The application of behavioral sciences — for example, the concept of ‘nudging’ individuals to act in their own best interest or adopt behaviors that benefit society. How choices (e.g. to use new technologies) are presented and designed can be more powerful in informing adoption than laws, rules or technical standards.
  • The use of prizes and challenges to tap into the wisdom of the crowd to solve complex problems and identify new ideas. Resource constraints can be addressed by creating avenues for people/volunteers to act as a resource in creating solutions, rather than being only their passive beneficiaries….(More)”

Daedalus Issue on “The Internet”


Press release: “Thirty years ago, the Internet was a network that primarily delivered email among academic and government employees. Today, it is rapidly evolving into a control system for our physical environment through the Internet of Things, as mobile and wearable technology more tightly integrate the Internet into our everyday lives.

How will the future Internet be shaped by the design choices that we are making today? Could the Internet evolve into a fundamentally different platform than the one to which we have grown accustomed? As an alternative to big data, what would it mean to make ubiquitously collected data safely available to individuals as small data? How could we attain both security and privacy in the face of trends that seem to offer neither? And what role do public institutions, such as libraries, have in an environment that becomes more privatized by the day?

These are some of the questions addressed in the Winter 2016 issue of Daedalus on “The Internet.” As guest editors David D. Clark (Senior Research Scientist at the MIT Computer Science and Artificial Intelligence Laboratory) and Yochai Benkler (Berkman Professor of Entrepreneurial Legal Studies at Harvard Law School and Faculty Co-Director of the Berkman Center for Internet and Society at Harvard University) have observed, the Internet “has become increasingly privately owned, commercial, productive, creative, and dangerous.”

Some of the themes explored in the issue include:

  • The conflicts that emerge among governments, corporate stakeholders, and Internet users through choices that are made in the design of the Internet
  • The challenges—including those of privacy and security—that materialize in the evolution from fixed terminals to ubiquitous computing
  • The role of public institutions in shaping the Internet’s privately owned open spaces
  • The ownership and security of data used for automatic control of connected devices, and
  • Consumer demand for “free” services—developed and supported through the sale of user data to advertisers….

Essays in the Winter 2016 issue of Daedalus include:

  • The Contingent Internet by David D. Clark (MIT)
  • Degrees of Freedom, Dimensions of Power by Yochai Benkler (Harvard Law School)
  • Edge Networks and Devices for the Internet of Things by Peter T. Kirstein (University College London)
  • Reassembling Our Digital Selves by Deborah Estrin (Cornell Tech and Weill Cornell Medical College) and Ari Juels (Cornell Tech)
  • Choices: Privacy and Surveillance in a Once and Future Internet by Susan Landau (Worcester Polytechnic Institute)
  • As Pirates Become CEOs: The Closing of the Open Internet by Zeynep Tufekci (University of North Carolina at Chapel Hill)
  • Design Choices for Libraries in the Digital-Plus Era by John Palfrey (Phillips Academy)…(More)”

See also: Introduction

Digital Weberianism: Towards a reconceptualization of bureaucratic social order in the digital age


Working Paper by Chris Muellerleile & Susan Robertson: “The social infrastructures that the global economy relies upon are becoming dependent on digital code, big data, and algorithms. At the same time the digital is also changing the very nature of economic and social institutions. In this paper we attempt to make sense of the relationships between the emergence of digitalism, and transformations in both capitalism, and the ways that capitalism is regulated by digitized social relations. We speculate that the logic, rationalities, and techniques of Max Weber’s bureau, a foundational concept in his theory of modernity, help explain the purported efficiency, objectivity, and rationality of digital technologies. We argue that digital rationality constitutes a common thread of social infrastructure that is increasingly overdetermining the nature of sociality. We employ the example of the smart city and the digitizing university to expose some of the contradictions of digital order, and we end by questioning what digital order might mean after the end of modernity….(More)”

Big Data Analysis: New Algorithms for a New Society


Book edited by Nathalie Japkowicz and Jerzy Stefanowski: “This edited volume is devoted to Big Data Analysis from a Machine Learning standpoint as presented by some of the most eminent researchers in this area.

It demonstrates that Big Data Analysis opens up new research problems which were either never considered before, or were only considered within a limited range. In addition to providing methodological discussions on the principles of mining Big Data and the difference between traditional statistical data analysis and newer computing frameworks, this book presents recently developed algorithms affecting such areas as business, financial forecasting, human mobility, the Internet of Things, information networks, bioinformatics, medical systems and life science. It explores, through a number of specific examples, how the study of Big Data Analysis has evolved and how it has started and will most likely continue to affect society. While the benefits brought about by Big Data Analysis are underlined, the book also discusses some of the warnings that have been issued concerning the potential dangers of Big Data Analysis along with its pitfalls and challenges….(More)”

Privacy by design in big data


An overview of privacy enhancing technologies in the era of big data analytics by the European Union Agency for Network and Information Security (ENISA): “The extensive collection and further processing of personal information in the context of big data analytics has given rise to serious privacy concerns, especially relating to wide scale electronic surveillance, profiling, and disclosure of private data. In order to allow for all the benefits of analytics without invading individuals’ private sphere, it is of utmost importance to draw the limits of big data processing and integrate the appropriate data protection safeguards in the core of the analytics value chain. ENISA, with the current report, aims at supporting this approach, taking the position that, with respect to the underlying legal obligations, the challenges of technology (for big data) should be addressed by the opportunities of technology (for privacy). To this end, in the present study we first explain the need to shift the discussion from “big data versus privacy” to “big data with privacy”, adopting the privacy and data protection principles as an essential value of big data, not only for the benefit of the individuals, but also for the very prosperity of big data analytics. In this respect, the concept of privacy by design is key in identifying the privacy requirements early at the big data analytics value chain and in subsequently implementing the necessary technical and organizational measures. Therefore, after an analysis of the proposed privacy by design strategies in the different phases of the big data value chain, we provide an overview of specific identified privacy enhancing technologies that we find of special interest for the current and future big data landscape.

In particular, we discuss anonymization, the “traditional” analytics technique, the emerging area of encrypted search and privacy preserving computations, granular access control mechanisms, policy enforcement and accountability, as well as data provenance issues. Moreover, new transparency and access tools in big data are explored, together with techniques for user empowerment and control. Following the aforementioned work, one immediate conclusion that can be derived is that achieving “big data with privacy” is not an easy task and a lot of research and implementation is still needed. Yet, we find that this task can be possible, as long as all the involved stakeholders take the necessary steps to integrate privacy and data protection safeguards in the heart of big data, by design and by default. To this end, ENISA makes the following recommendations:

  • Privacy by design applied …
  • Decentralised versus centralised data analytics …
  • Support and automation of policy enforcement
  • Transparency and control….
  • User awareness and promotion of PETs …
  • A coherent approach towards privacy and big data ….(More)”

Big Data for Development: A Review of Promises and Challenges


Martin Hilbert in the Development Policy Review: “The article uses a conceptual framework to review empirical evidence and some 180 articles related to the opportunities and threats of Big Data Analytics for international development. The advent of Big Data delivers a cost-effective prospect for improved decision-making in critical development areas such as healthcare, economic productivity and security. At the same time, the well-known caveats of the Big Data debate, such as privacy concerns and human resource scarcity, are aggravated in developing countries by long-standing structural shortages in the areas of infrastructure, economic resources and institutions. The result is a new kind of digital divide: a divide in the use of data-based knowledge to inform intelligent decision-making. The article systematically reviews several available policy options in terms of fostering opportunities and minimising risks…(More)”