The trouble with Big Data? It is called the “recency bias”.


One of the problems with such a rate of information increase is that the present moment will always loom far larger than even the recent past. Imagine looking back over a photo album representing the first 18 years of your life, from birth to adulthood. Let’s say that you have two photos for your first two years. Assuming a rate of information increase matching that of the world’s data, you will have an impressive 2,000 photos representing the years six to eight; 200,000 for the years 10 to 12; and a staggering 200,000,000 for the years 16 to 18. That’s more than three photographs for every single second of those final two years.
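The figures in the photo-album analogy can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "matching that of the world's data" means a tenfold increase every two years, which is the rate that reproduces the numbers quoted above. The constant and function names are illustrative, not taken from the article.

```python
# Assumption: growth is tenfold per two-year period, the rate that
# reproduces the article's figures (2 -> 2,000 -> 200,000 -> 200,000,000).
GROWTH_PER_PERIOD = 10
PERIOD_YEARS = 2
FIRST_PERIOD_PHOTOS = 2


def photos_in_period(start_age: int) -> int:
    """Photos taken in the two-year period that begins at start_age."""
    periods_elapsed = start_age // PERIOD_YEARS
    return FIRST_PERIOD_PHOTOS * GROWTH_PER_PERIOD ** periods_elapsed


# One entry per two-year period from birth to age 18.
counts = {age: photos_in_period(age) for age in range(0, 18, PERIOD_YEARS)}

# Ignoring leap days, two years contain 63,072,000 seconds.
seconds_in_final_period = PERIOD_YEARS * 365 * 24 * 60 * 60
photos_per_second = counts[16] / seconds_in_final_period

# Fraction of the whole album that comes from the final two years.
share_final_period = counts[16] / sum(counts.values())

for age, n in counts.items():
    print(f"ages {age:>2}-{age + PERIOD_YEARS:<2}: {n:>11,} photos")
print(f"photos per second, ages 16-18: {photos_per_second:.2f}")
print(f"share of the album from ages 16-18: {share_final_period:.1%}")
```

Under this assumption the final two years work out to roughly 3.17 photos per second and about 90 percent of the whole album, which is the proportionality problem the analogy is meant to illustrate.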

This isn’t a perfect analogy with global data, of course. For a start, much of the world’s data increase is due to more sources of information being created by more people, along with far larger and more detailed formats. But the point about proportionality stands. If you were to look back over a record like the one above, or try to analyse it, the more distant past would shrivel into meaningless insignificance. How could it not, with so many times less information available?

Here’s the problem with much of the big data currently being gathered and analysed. The moment you start looking backwards to seek the longer view, you have far too much of the recent stuff and far too little of the old. Short-sightedness is built into the structure, in the form of an overwhelming tendency to over-estimate short-term trends at the expense of history.

To understand why this matters, consider the findings from social science about ‘recency bias’, which describes the tendency to assume that future events will closely resemble recent experience. It’s a version of what is also known as the availability heuristic: the tendency to base your thinking disproportionately on whatever comes most easily to mind. It’s also a universal psychological attribute. If the last few years have seen exceptionally cold summers where you live, for example, you might be tempted to state that summers are getting colder – or that your local climate may be cooling. In fact, you shouldn’t read anything whatsoever into the data. You would need to take a far, far longer view to learn anything meaningful about climate trends. In the short term, you’d be best not speculating at all – but who among us can manage that?

The same tends to be true of most complex phenomena in real life: stock markets, economies, the success or failure of companies, war and peace, relationships, the rise and fall of empires. Short-term analyses aren’t only invalid – they’re actively unhelpful and misleading. Just look at the legions of economists who lined up to pronounce events like the 2009 financial crisis unthinkable right until it happened. The very notion that valid predictions could be made on that kind of scale was itself part of the problem.

It’s also worth remembering that novelty tends to be a dominant consideration when deciding what data to keep or delete. Out with the old and in with the new: that’s the digital trend in a world where search algorithms are intrinsically biased towards freshness, and where so-called link rot infests everything from Supreme Court decisions to entire social media services. A bias towards the present is structurally engrained in almost all the technology surrounding us, not least thanks to our habit of ditching most of our once-shiny machines after about five years.

What to do? This isn’t just a question of being better at preserving old data – although this wouldn’t be a bad idea, given just how little is currently able to last decades rather than years. More importantly, it’s about determining what is worth preserving in the first place – and what it means meaningfully to cull information in the name of knowledge.

What’s needed is something that I like to think of as “intelligent forgetting”: teaching our tools to become better at letting go of the immediate past in order to keep its larger continuities in view. It’s an act of curation akin to organising a photograph album – albeit with more maths….(More)

White House Challenges Artificial Intelligence Experts to Reduce Incarceration Rates


Jason Shueh at GovTech: “The U.S. spends $270 billion on incarceration each year, has a prison population of about 2.2 million and an incarceration rate that’s spiked 220 percent since the 1980s. But with the advent of data science, White House officials are asking experts for help.

On Tuesday, June 7, the White House Office of Science and Technology Policy’s Lynn Overmann, who also leads the White House Police Data Initiative, stressed the severity of the nation’s incarceration crisis while asking a crowd of data scientists and artificial intelligence specialists for aid.

“We have built a system that is too large, and too unfair and too costly — in every sense of the word — and we need to start to change it,” Overmann said, speaking at a Computing Community Consortium public workshop.

She argued that the U.S., a country that has the highest number of incarcerated citizens in the world, is in need of systematic reforms with both data tools to process alleged offenders and at the policy level to ensure fair and measured sentences. As a longtime counselor, advisor and analyst for the Justice Department and at the city and state levels, Overmann said she has studied and witnessed an alarming number of issues in terms of bias and unwarranted punishments.

For instance, she said that statistically, while drug use is about equal between African-Americans and Caucasians, African-Americans are more likely to be arrested and convicted. They also receive longer prison sentences compared to Caucasian inmates convicted of the same crimes….

Data and digital tools can help curb such pitfalls by increasing efficiency, transparency and accountability, she said.

“We think these types of data exchanges [between officials and technologists] can actually be hugely impactful if we can figure out how to take this information and operationalize it for the folks who run these systems,” Overmann noted.

The opportunities to apply artificial intelligence and data analytics, she said, might include using it to improve questions on parole screenings, using it to analyze police body camera footage, and applying it to criminal justice data for legislators and policy workers….

If the private sector is any indication, artificial intelligence and machine learning techniques could be used to interpret this new and vast supply of law enforcement data. In an earlier presentation by Eric Horvitz, the managing director at Microsoft Research, Horvitz showcased how the company has applied artificial intelligence to vision and language to interpret live video content for the blind. The app, titled SeeingAI, can translate live video footage, captured from an iPhone or a pair of smart glasses, into instant audio messages for the seeing impaired. Twitter’s live-streaming app Periscope has employed similar technology to guide users to the right content….(More)”

AI lawyer speeds up legal research


Springwise: “Lawyers have to maintain and recall vast amounts of information in the form of legislation, case law and secondary cases, and they spend up to a fifth of their time on legal research. But an AI app called Ross Intelligence could soon help with that. The program, which is built on IBM’s super-computer Watson, uses natural language processing to answer legal questions in a fraction of the time that it would take a legal assistant.

To begin, legal professionals can ask Ross a question as they would ask a colleague. Then the program reads through the entire body of law and returns a cited answer as well as topical readings. Ross also monitors the law constantly to keep the user updated about changes that might affect their case, so they don’t need to sift through the mass of legal news….(More)”

Value and Vulnerability: The Internet of Things in a Connected State Government


Press release: “The National Association of State Chief Information Officers (NASCIO) today released a policy brief on the Internet of Things (IoT) in state government. The paper focuses on the different ways state governments are using IoT now and in the future and the policy considerations involved.

“In NASCIO’s 2015 State CIO Survey, we asked state CIOs to what extent IoT was on their agenda. Just over half said they were in informal discussions; however, only one in five had moved to the formal discussion phase. We believe IoT needs to be a formal part of each state’s policy considerations,” explained NASCIO Executive Director Doug Robinson.

The paper encourages state CIOs to make IoT part of the enterprise architecture discussions on asset management and risk assessment and to develop an IoT roadmap.

“Cities and municipalities have been working toward the designation of ‘smart city’ for a while now,” said Darryl Ackley, cabinet secretary for the New Mexico Department of Information Technology and NASCIO president. “While states provide different services than cities, we are seeing a lot of activity around IoT to improve citizen services and we see great potential for growth. The more organized and methodical states can be about implementing IoT, the more successful and useful the outcomes.”

Read the policy brief at www.NASCIO.org/ValueAndVulnerability 

Legal Aid With a Digital Twist


Tina Rosenberg in the New York Times: “Matthew Stubenberg was a law student at the University of Maryland in 2010 when he spent part of a day doing expungements. It was a standard law school clinic where students learn by helping clients — in this case, he helped them to fill out and file petitions to erase parts of their criminal records. (Last week I wrote about the lifelong effects of these records, even if there is no conviction, and the expungement process that makes them go away.)

Although Maryland has a public database called Case Search, using that data to fill out the forms was tedious. “We spent all this time moving data from Case Search onto our forms,” Stubenberg said. “We spent maybe 30 seconds on the legal piece. Why could this not be easier? This was a problem that could be fixed by a computer.”

Stubenberg knew how to code. After law school, he set out to build software that automatically did that tedious work. By September 2014 he had a prototype for MDExpungement, which went live in January 2015. (The website is not pretty — Stubenberg is a programmer, not a designer.)

With MDExpungement, entering a case number brings it up on Case Search. The software then determines whether the case is expungeable. If so, the program automatically transfers the information from Case Search to the expungement form. All that’s left is to print, sign and file it with the court.
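The workflow described above (look up a case, check eligibility, pre-fill the form) can be sketched in a few lines. This is an illustration only, not MDExpungement's actual code: the `CaseRecord` fields, the `prepare_petition` function, the in-memory stand-in for Case Search and the toy eligibility rule are all hypothetical, and the rule does not reflect Maryland law.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CaseRecord:
    """Hypothetical subset of the fields a case database might return."""
    case_number: str
    defendant: str
    disposition: str


def prepare_petition(
    case_number: str,
    lookup: Callable[[str], Optional[CaseRecord]],
    is_expungeable: Callable[[CaseRecord], bool],
) -> Optional[Dict[str, str]]:
    """Return pre-filled form fields, or None if the case is missing or ineligible."""
    record = lookup(case_number)
    if record is None or not is_expungeable(record):
        return None
    # Copy the looked-up data onto the form so nobody has to retype it.
    return {
        "case_number": record.case_number,
        "petitioner": record.defendant,
        "disposition": record.disposition,
    }


# Toy stand-in for the public database (made-up records).
FAKE_CASES = {
    "TOY-001": CaseRecord("TOY-001", "Jane Doe", "dismissed"),
    "TOY-002": CaseRecord("TOY-002", "John Roe", "guilty"),
}


def toy_rule(record: CaseRecord) -> bool:
    """Placeholder eligibility rule for the demo; NOT the real legal test."""
    return record.disposition == "dismissed"


form = prepare_petition("TOY-001", FAKE_CASES.get, toy_rule)
rejected = prepare_petition("TOY-002", FAKE_CASES.get, toy_rule)
print(form)
print(rejected)
```

Passing the lookup and the eligibility rule in as functions keeps the legal test separate from the form-filling step, so a change in the law like the one described below would only touch the rule.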

In October 2015 a change in Maryland law made more cases eligible for expungement. Between then and March 2016, people filed 7,600 petitions to have their criminal records removed in Baltimore City District Court. More than two-thirds of them came from MDExpungement.

“With the ever-increasing amount of expungements we’re all doing, the app has just made it a lot easier,” said Mary-Denise Davis, a public defender in Baltimore. “I put in a case number and it fills the form out for me. Like magic.”

The rise of online legal forms may not be a gripping subject, but it matters. Tens of millions of Americans need legal help for civil problems — they need a divorce, child support or visitation, protection from abuse or a stay of eviction. They must hold off debt collectors or foreclosure, or get government benefits….(more)

These Online Platforms Make Direct Democracy Possible


Tom Ladendorf in InTheseTimes: “….Around the world, organizations from political parties to cooperatives are experimenting with new modes of direct democracy made possible by the internet.

“The world has gone through extraordinary technological innovation,” says Agustín Frizzera of Argentina’s Net Party. “But governments and political institutions haven’t innovated enough.”

The founders of the four-year-old party have also built an online platform, DemocracyOS, that lets users discuss and vote on proposals being considered by their legislators.

Anyone can adopt the technology, but the Net Party uses it to let Buenos Aires residents debate City Council measures. A 2013 thread, for example, concerned a plan to require bars and restaurants to make bathrooms free and open to the public.

“I recognize the need for freely available facilities, but it is the state who should be offering this service,” reads the top comment, voted most helpful by users. Others argued that private bathrooms open the door to discrimination. Ultimately, 56.9 percent of participants supported the proposal, while 35.3 percent voted against and 7.8 percent abstained….

A U.S. company called PlaceAVote, launched in 2014, takes what it calls a more pragmatic approach. According to cofounder Job Melton, PlaceAVote’s goal is to “work within the system we have now and fix it from the inside out” instead of attempting the unlikely feat of building a third U.S. party.

Like the Net Party and its brethren, PlaceAVote offers an online tool that lets voters participate in decision making. Right now, the technology is in public beta at PlaceAVote.com, allowing users nationwide to weigh in on legislation before Congress….

But digital democracy has applications that extend beyond electoral politics. A wide range of groups are using web-based decision-making tools internally. The Mexican government, for example, has used DemocracyOS to gather citizen feedback on a data-protection law, and Brazilian civil society organizations are using it to encourage engagement with federal and municipal policy-making.

Another direct-democracy tool in wide use is Loomio, developed by a cooperative in New Zealand. Ben Knight, one of Loomio’s cofounders, sums up his experience with Occupy as one of “seeing massive potential of collective decision making, and then realizing how difficult it could be in person.” After failing to find an online tool to facilitate the process, the Loomio team created a platform that enables online discussion with a personal element: Votes are by name and voters can choose to “disagree” with or even “block” proposals. Provo, Utah, uses Loomio for public consultation, and a number of political parties use Loomio for local decision making, including the Brazilian Pirate Party, several regional U.K. Green Party chapters and Spain’s Podemos. Podemos has enthusiastically embraced digital-democracy tools for everything from its selection of European Parliament candidates to the creation of its party platform….(More)”

Big data: big power shifts?


Special issue of Internet Policy Review: “Facing general conceptions of the power effects of big data, this thematic edition is interested in studies that scrutinise big data and power in concrete fields of application. It brings together scholars from different disciplines who analyse the fields of agriculture, education, border control and consumer policy. As will be made explicit in the following, each of the articles tells us something about, firstly, what big data is and how it relates to power. Secondly, they also shed light on how we should shape “the big data society” and what research questions need to be answered to be able to do so….

The ethics of big data in big agriculture
Isabelle M. Carbonell, University of California, Santa Cruz

Regulating “big data education” in Europe: lessons learned from the US
Yoni Har Carmel, University of Haifa

The borders, they are a-changin’! The emergence of socio-digital borders in the EU
Magdalena König, Maastricht University

Beyond consent: improving data protection through consumer protection law
Michiel Rhoen, Leiden University…

(More)”

Nudging – Possibilities, Limitations and Applications in European Law and Economics


Book edited by Mathis, Klaus and Tor, Avishalom: “This anthology provides an in-depth analysis and discusses the issues surrounding nudging and its use in legislation, regulation, and policy making more generally. The 17 essays in this anthology provide startling insights into the multifaceted debate surrounding the use of nudges in European Law and Economics.

Nudging is a tool aimed at altering people’s behaviour in a predictable way without forbidding any option or significantly changing economic incentives. It can be used to help people make better decisions, influencing human behaviour without forcing them, because they can opt out. Its use has sparked lively debates in academia as well as in the public sphere. This book explores who decides which behaviour is desired. It looks at whether or not the state has sufficient information for debiasing, and if there are clear-cut boundaries between paternalism, manipulation and indoctrination. The first part of this anthology discusses the foundations of nudging theory and the problems associated, as well as outlining possible solutions to the problems raised. The second part is devoted to the wide scope of applications of nudges in contract law, tax law and health claim regulations, among others.

This volume is a result of the flourishing annual Law and Economics Conference held at the law faculty of the University of Lucerne. The conferences have been instrumental in establishing a strong and ever-growing Law and Economics movement in Europe, providing unique insights into the challenges faced by Law and Economics when applied in European legal traditions….(More)”

Reining in the Big Promise of Big Data: Transparency, Inequality, and New Regulatory Frontiers


Paper by Philipp Hacker and Bilyana Petkova: “The growing differentiation of services based on Big Data harbors the potential for both greater societal inequality and for greater equality. Anti-discrimination law and transparency alone, however, cannot do the job of curbing Big Data’s negative externalities while fostering its positive effects.

To rein in Big Data’s potential, we adapt regulatory strategies from behavioral economics, contracts and criminal law theory. Four instruments stand out: First, active choice may be mandated between data collecting services (paid by data) and data free services (paid by money). Our suggestion provides concrete estimates for the price range of a data free option, sheds new light on the monetization of data collecting services, and proposes an “inverse predatory pricing” instrument to limit excessive pricing of the data free option. Second, we propose using the doctrine of unconscionability to prevent contracts that unreasonably favor data collecting companies. Third, we suggest democratizing data collection by regular user surveys and data compliance officers partially elected by users. Finally, we trace back new Big Data personalization techniques to the old Hartian precept of treating like cases alike and different cases – differently. If it is true that a speeding ticket over $50 is less of a disutility for a millionaire than for a welfare recipient, the income and wealth-responsive fines powered by Big Data that we suggest offer a glimpse into the future of the mitigation of economic and legal inequality by personalized law. Throughout these different strategies, we show how salience of data collection can be coupled with attempts to prevent discrimination against and exploitation of users. Finally, we discuss all four proposals in the context of different test cases: social media, student education software and credit and cell phone markets.

Many more examples could and should be discussed. In the face of increasing unease about the asymmetry of power between Big Data collectors and dispersed users, about differential legal treatment, and about the unprecedented dimensions of economic inequality, this paper proposes a new regulatory framework and research agenda to put the powerful engine of Big Data to the benefit of both the individual and societies adhering to basic notions of equality and non-discrimination….(More)”

Moneyballing Criminal Justice


Anne Milgram in the Atlantic: “…One area in which the potential of data analysis is still not adequately realized, however, is criminal justice. This is somewhat surprising given the success of CompStat, a law enforcement management tool that uses data to figure out how police resources can be used to reduce crime and hold law enforcement officials accountable for results. CompStat is widely credited with contributing to New York City’s dramatic reduction in serious crime over the past two decades. Yet data-driven decision-making has not expanded to the whole of the criminal justice system.

But it could. And, in this respect, the front end of the system — the part of the process that runs from arrest through sentencing — is particularly important. At this stage, police, prosecutors, defenders, and courts make key choices about how to deal with offenders — choices that, taken together, have an enormous impact on crime. Yet most jurisdictions do not collect or analyze the data necessary to know whether these decisions are being made in a way that accomplishes the most important goals of the criminal justice system: increased public safety, decreased recidivism, reduced cost, and the fair, efficient administration of justice.

Even in jurisdictions where good data exists, a lack of technology is often an obstacle to using it effectively. Police, jails, courts, district attorneys, and public defenders each keep separate information systems, the data from which is almost never pulled together and analyzed in a way that could answer the questions that matter most: Who is in our criminal justice system? What crimes have been charged? What risks do individual offenders pose? And which option would best protect the public and make the best use of our limited resources?

While debates about prison over-crowding, three strikes laws, and mandatory minimum sentences have captured public attention, the importance of what happens between arrest and sentencing has gone largely unnoticed. Even though I ran the criminal justice system in New Jersey, one of the largest states in the country, I had not realized the magnitude of the pretrial issues until I was tasked by the Laura and John Arnold Foundation with figuring out which aspects of criminal justice had the most need and presented the greatest opportunity for reform….

Technology could help us leverage data to identify offenders who will pose unacceptable risks to society if they are not behind bars and distinguish them from those defendants who will have lower recidivism rates if they are supervised in the community or given alternatives to incarceration before trial. Likewise, it could help us figure out which terms of imprisonment, alternatives to incarceration, and other interventions work best–and for whom. And the list does not end there.

The truth is our criminal justice system already makes these decisions every day. But it makes them without knowing whether they’re the right ones. That needs to change. If data is powerful enough to transform baseball, health care, and education, it can do the same for criminal justice….(More)”
