Paying Farmers to Welcome Birds


Jim Robbins in The New York Times: “The Central Valley was once one of North America’s most productive wildlife habitats, a 450-mile-long expanse marbled with meandering streams and lush wetlands that provided an ideal stop for migratory shorebirds on their annual journeys from South America and Mexico to the Arctic and back.

Farmers and engineers have long since tamed the valley. Of the wetlands that existed before the valley was settled, about 95 percent are gone, and the number of migratory birds has declined drastically. But now an unusual alliance of conservationists, bird watchers and farmers has joined in an innovative plan to restore essential habitat for the migrating birds.

The program, called BirdReturns, starts with data from eBird, the pioneering citizen science project that asks birders to record sightings on a smartphone app and send the information to the Cornell Lab of Ornithology in upstate New York.

By crunching data from the Central Valley, eBird can generate maps showing where virtually every species congregates in the remaining wetlands. Then, by overlaying those maps on aerial views of existing surface water, it can determine where the birds’ need for habitat is greatest….
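The article doesn't publish the underlying code, but the overlay logic it describes is easy to picture. Below is a minimal sketch, using hypothetical rasters and invented numbers, of how predicted bird abundance might be combined with existing surface water to score where habitat need is greatest:

```python
# Illustrative sketch (not BirdReturns' actual pipeline): overlay a grid of
# predicted shorebird abundance on a grid of existing surface water and
# score each cell by unmet habitat need. All numbers are invented.
import numpy as np

# Hypothetical rasters over the same 3x3 patch of the Central Valley:
# predicted abundance (birds per cell) and surface-water fraction (0 to 1).
abundance = np.array([[120.0, 40.0, 5.0],
                      [300.0, 10.0, 80.0],
                      [60.0, 220.0, 15.0]])
water_fraction = np.array([[0.80, 0.10, 0.00],
                           [0.05, 0.90, 0.30],
                           [0.20, 0.10, 0.60]])

# Habitat-need score: expected birds not served by existing water.
need = abundance * (1.0 - water_fraction)

# Rank cells so that payments can target the largest gaps first.
order = np.argsort(need, axis=None)[::-1]
rows, cols = np.unravel_index(order, need.shape)
for r, c in list(zip(rows, cols))[:3]:
    print(f"cell ({r},{c}): need score {need[r, c]:.0f}")
```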

BirdReturns is an example of the growing movement called reconciliation ecology, in which ecosystems dominated by humans are managed to increase biodiversity.

“It’s a new ‘Moneyball,’ ” said Eric Hallstein, an economist with the Nature Conservancy and a designer of the auctions, referring to the book and movie about the Oakland Athletics’ data-driven approach to baseball. “We’re disrupting the conservation industry by taking a new kind of data, crunching it differently and contracting differently.”

The Transformative Impact of Data and Communication on Governance


Steven Livingston at Brookings: “How do digital technologies affect governance in areas of limited statehood – places and circumstances characterized by the absence of state provisioning of public goods and the enforcement of binding rules with a monopoly of legitimate force?  In the first post in this series I introduced the limited statehood concept and then described the tremendous growth in mobile telephony, GIS, and other technologies in the developing world.   In the second post I offered examples of the use of ICT in initiatives intended to fill at least some of the governance vacuum created by limited statehood.  With mobile phones, for example, farmers are informed of market conditions, have access to liquidity through M-Pesa and similar mobile money platforms….
This brings to mind another type of ICT governance initiative. Rather than fill in for or even displace the state, some ICT initiatives can strengthen governance capacity. Digital government — the use of digital technology by the state itself — is one important possibility. Other initiatives strengthen the state by exerting pressure. Weak governance sometimes takes the form of an extractive state: one that caters to the needs of an elite, leaving the majority of the population in poverty and without basic public services. This is what Daron Acemoglu and James A. Robinson call extractive political and economic institutions. Inclusive states, on the other hand, are pluralistic, bound by the rule of law, respectful of property rights, and, in general, accountable. Accountability mechanisms such as a free press and competitive multiparty elections are instrumental in discouraging extractive institutions. What ICT-based initiatives might lend a hand in strengthening accountability? We can point to three examples.

Example One: Using ICT to Protect Human Rights

Nonstate actors now use commercial, high-resolution remote sensing satellites to monitor weapons programs and human rights violations.  Amnesty International’s Remote Sensing for Human Rights offers one example, and Satellite Sentinel offers another.  Both use imagery from DigitalGlobe, an American remote sensing and geospatial content company.   Other organizations have used commercially available remote sensing imagery to monitor weapons proliferation.  The Institute for Science and International Security, a Washington-based NGO, revealed the Iranian nuclear weapons program in 2003 using commercial satellite imagery…

Example Two: Crowdsourcing Election Observation

Others have used mobile phones and GIS to crowdsource election observation. For the 2011 elections in Nigeria, the Community Life Project, a civil society organization, created ReclaimNaija, an election-monitoring system that relied on GIS and amateur observers with mobile phones. On the project's live map, each red dot represents an aggregation of geo-located incidents reported to the ReclaimNaija platform; clicking on a dot disaggregates the reports, eventually taking the reader to individual reports. Rigorous statistical analysis of ReclaimNaija results and the elections suggests the system contributed to the effectiveness of the election process.

ReclaimNaija: Election Incident Reporting System Map
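ReclaimNaija's own implementation isn't public in this excerpt, but the aggregate-then-disaggregate pattern the map uses can be sketched in a few lines. Everything below, reports included, is invented for illustration:

```python
# Minimal sketch of the aggregate/disaggregate pattern behind the live map:
# geo-located reports are binned into "red dots," and each dot expands back
# into its individual reports. The reports below are invented examples.
from collections import defaultdict

reports = [
    {"lat": 6.5244, "lon": 3.3792, "text": "polling unit opened late"},
    {"lat": 6.5248, "lon": 3.3788, "text": "missing ballot papers"},
    {"lat": 9.0765, "lon": 7.3986, "text": "voter intimidation reported"},
]

def bin_key(lat, lon, precision=2):
    """Snap coordinates to a grid cell (roughly 1 km at precision=2)."""
    return (round(lat, precision), round(lon, precision))

clusters = defaultdict(list)
for report in reports:
    clusters[bin_key(report["lat"], report["lon"])].append(report)

# Each cluster is one dot on the map; "clicking" it lists the raw reports.
for key, items in clusters.items():
    print(key, "->", [item["text"] for item in items])
```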

Example Three: Using Genetic Analysis to Identify War Crimes

In recent years, more powerful computers have led to major breakthroughs in biomedical science.  The reduction in cost of analyzing the human genome has actually outpaced Moore’s Law.  This has opened up new possibilities for the use of genetic analysis in forensic anthropology.   In Guatemala, the Balkans, Argentina, Peru and in several other places where mass executions and genocides took place, forensic anthropologists are using genetic analysis to find evidence that is used to hold the killers – often state actors – accountable…”

The Data Mining Techniques That Reveal Our Planet's Cultural Links and Boundaries


Emerging Technology From the arXiv: “The habits and behaviors that define a culture are complex and fascinating. But measuring them is a difficult task. What’s more, understanding the way cultures change from one part of the world to another is a task laden with challenges.
The gold standard in this area of science is known as the World Values Survey, a global network of social scientists studying values and their impact on social and political life. Between 1981 and 2008, this survey conducted over 250,000 interviews in 87 societies. That’s a significant amount of data and the work has continued since then. This work is hugely valuable but it is also challenging, time-consuming and expensive.
Today, Thiago Silva at the Universidade Federal de Minas Gerais in Brazil and a few buddies reveal another way to collect data that could revolutionize the study of global culture. These guys study cultural differences around the world using data generated by check-ins on the location-based social network, Foursquare.
That allows these researchers to gather huge amounts of data, cheaply and easily in a short period of time. “Our one-week dataset has a population of users of the same order of magnitude of the number of interviews performed in [the World Values Survey] in almost three decades,” they say.
Food and drink are fundamental aspects of society, and so the behaviors and habits associated with them are important indicators. The basic question that Silva and co attempt to answer is: what are your eating and drinking habits? And how do these differ from those of a typical individual in another part of the world, such as Japan, Malaysia, or Brazil?
Foursquare is ideally set up to explore this question. Users “check in” by indicating when they have reached a particular location that might be related to eating and drinking but also to other activities such as entertainment, sport and so on.
Silva and co are only interested in the food and drink preferences of individuals and, in particular, in the way these preferences change according to time of day and geographical location.
So their basic approach is to compare a large number of individual preferences from different parts of the world and see how closely they match or how they differ.
Because Foursquare does not share its data, Silva and co downloaded almost five million tweets containing Foursquare check-ins: URLs pointing to pages on the Foursquare website with information about each venue. They discarded check-ins that were unrelated to food or drink.
That left them with some 280,000 check-ins related to drink from 160,000 individuals; over 400,000 check-ins related to fast food from 230,000 people; and some 400,000 check-ins relating to ordinary restaurant food or what Silva and co call slow food.
They then divide each of these classes into subcategories. For example, the drink class has 21 subcategories such as brewery, karaoke bar, pub, and so on. The slow food class has 53 subcategories such as Chinese restaurant, steakhouse, Greek restaurant, and so on.
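A hedged sketch of that classification step, using a handful of invented subcategory names rather than the authors' full 21- and 53-entry lists, might look like this:

```python
# Hedged sketch of the classification step: map a venue's Foursquare
# subcategory to one of the three top-level classes. These few category
# names are illustrative, not the authors' full lists.
CLASS_OF = {
    "Brewery": "drink", "Karaoke Bar": "drink", "Pub": "drink",
    "Burger Joint": "fast food", "Fried Chicken Joint": "fast food",
    "Chinese Restaurant": "slow food", "Steakhouse": "slow food",
    "Greek Restaurant": "slow food",
}

checkins = [
    {"user": "u1", "venue_category": "Pub", "country": "GB", "hour": 19},
    {"user": "u2", "venue_category": "Steakhouse", "country": "BR", "hour": 21},
    {"user": "u3", "venue_category": "Gym", "country": "US", "hour": 7},
]

# Keep only food- and drink-related check-ins, tagging each with its class.
kept = [{**c, "class": CLASS_OF[c["venue_category"]]}
        for c in checkins if c["venue_category"] in CLASS_OF]
print(kept)  # the Gym check-in is discarded, mirroring the paper's filtering
```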
Each check-in gives the time and geographical location, which allows the team to compare behaviors from all over the world. They compare, for example, eating and drinking times in different countries, both during the week and at the weekend. They compare the choices of restaurants, fast food habits and drinking habits by continent and country. They even compare eating and drinking habits in New York, London, and Tokyo.
The results are a fascinating insight into humanity's differing habits. Many places have similar behaviors, Malaysia and Singapore or Argentina and Chile, for example, just as one would expect given the similarities between them.
But other resemblances are more unexpected. A comparison of drinking habits shows greater similarity between Brazil and France, separated by the Atlantic Ocean, than between France and England, separated only by the English Channel…
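The excerpt doesn't specify the exact similarity measure, but a common choice for comparing such category distributions is cosine similarity over per-country check-in shares. The sketch below uses invented counts, chosen so the output mirrors that finding:

```python
# Sketch of one plausible comparison (the excerpt doesn't specify the exact
# measure): represent each country as its share of check-ins across drink
# subcategories, then compute cosine similarity. Counts are invented.
import numpy as np

subcategories = ["Pub", "Wine Bar", "Brewery", "Karaoke Bar"]
country_counts = {
    "France":  np.array([30.0, 120.0, 10.0, 5.0]),
    "Brazil":  np.array([40.0, 100.0, 15.0, 8.0]),
    "England": np.array([200.0, 20.0, 30.0, 5.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Normalize counts to shares so country size doesn't dominate the comparison.
shares = {name: v / v.sum() for name, v in country_counts.items()}
print("France vs Brazil: ", round(cosine(shares["France"], shares["Brazil"]), 3))
print("France vs England:", round(cosine(shares["France"], shares["England"]), 3))
```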
They point out only two major differences from the World Values Survey's cultural clusters. The first is that no Islamic cluster appears in the Foursquare data: countries such as Turkey are similar to Russia, while Indonesia seems related to Malaysia and Singapore.
The second is that the U.S. and Mexico make up a cluster of their own in the Foursquare data, whereas the World Values Survey places them in the “English-speaking” and “Latin American” clusters, respectively.
That’s exciting data-mining work that has the potential to revolutionize the way sociologists and anthropologists study human culture around the world. Expect to hear more about it.
Ref: http://arxiv.org/abs/1404.1009: You Are What You Eat (and Drink): Identifying Cultural Boundaries By Analyzing Food & Drink Habits In Foursquare”.

Politics and the Internet


Edited book by William H. Dutton (Routledge, 2014, 1,888 pages): “It is commonplace to observe that the Internet—and the dizzying technologies and applications which it continues to spawn—has revolutionized human communications. But, while the medium’s impact has apparently been immense, the nature of its political implications remains highly contested. To give but a few examples, the impact of networked individuals and institutions has prompted serious scholarly debates in political science and related disciplines on: the evolution of ‘e-government’ and ‘e-politics’ (especially after recent US presidential campaigns); electronic voting and other citizen participation; activism; privacy and surveillance; and the regulation and governance of cyberspace.
As research in and around politics and the Internet flourishes as never before, this new four-volume collection from Routledge’s acclaimed Critical Concepts in Political Science series meets the need for an authoritative reference work to make sense of a rapidly growing—and ever more complex—corpus of literature. Edited by William H. Dutton, Director of the Oxford Internet Institute (OII), the collection gathers foundational and canonical work, together with innovative and cutting-edge applications and interventions.
With a full index and comprehensive bibliographies, together with a new introduction by the editor, which places the collected material in its historical and intellectual context, Politics and the Internet is an essential work of reference. The collection will be particularly useful as a database allowing scattered and often fugitive material to be easily located. It will also be welcomed as a crucial tool permitting rapid access to less familiar—and sometimes overlooked—texts. For researchers, students, practitioners, and policy-makers, it is a vital one-stop research and pedagogic resource.”

Eight (No, Nine!) Problems With Big Data


Gary Marcus and Ernest Davis in the New York Times: “Big data is suddenly everywhere. Everyone seems to be collecting it, analyzing it, making money from it and celebrating (or fearing) its powers. Whether we’re talking about analyzing zillions of Google search queries to predict flu outbreaks, or zillions of phone records to detect signs of terrorist activity, or zillions of airline stats to find the best time to buy plane tickets, big data is on the case. By combining the power of modern computing with the plentiful data of the digital era, it promises to solve virtually any problem — crime, public health, the evolution of grammar, the perils of dating — just by crunching the numbers.

Or so its champions allege. “In the next two decades,” the journalist Patrick Tucker writes in the latest big data manifesto, “The Naked Future,” “we will be able to predict huge areas of the future with far greater accuracy than ever before in human history, including events long thought to be beyond the realm of human inference.” Statistical correlations have never sounded so good.

Is big data really all it’s cracked up to be? There is no doubt that big data is a valuable tool that has already had a critical impact in certain areas. For instance, almost every successful artificial intelligence computer program in the last 20 years, from Google’s search engine to the I.B.M. “Jeopardy!” champion Watson, has involved the substantial crunching of large bodies of data. But precisely because of its newfound popularity and growing use, we need to be levelheaded about what big data can — and can’t — do.

The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful. A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two. Likewise, from 1998 to 2007 the number of new cases of autism diagnosed was extremely well correlated with sales of organic food (both went up sharply), but identifying the correlation won’t by itself tell us whether diet has anything to do with autism.
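The point is easy to verify numerically: any two series that merely trend in the same direction will show a high Pearson correlation. The figures below are invented stand-ins, not the actual murder-rate or browser-share data:

```python
# Two series that merely trend together can correlate almost perfectly.
# These figures are invented stand-ins, not the real murder-rate or
# browser-share numbers.
import numpy as np

murder_rate = np.array([5.8, 5.7, 5.4, 5.0, 4.8, 4.7])     # per 100,000
ie_share = np.array([68.0, 62.0, 55.0, 48.0, 43.0, 39.0])  # percent

r = np.corrcoef(murder_rate, ie_share)[0, 1]
print(f"Pearson r = {r:.3f}")  # near 1.0, yet no causal link exists
```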

Second, big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement. Molecular biologists, for example, would very much like to be able to infer the three-dimensional structure of proteins from their underlying DNA sequence, and scientists working on the problem use big data as one tool among many. But no scientist thinks you can solve this problem by crunching data alone, no matter how powerful the statistical analysis; you will always need to start with an analysis that relies on an understanding of physics and biochemistry.

Third, many tools that are based on big data can be easily gamed. For example, big data programs for grading student essays often rely on measures like sentence length and word sophistication, which are found to correlate well with the scores given by human graders. But once students figure out how such a program works, they start writing long sentences and using obscure words, rather than learning how to actually formulate and write clear, coherent text. Even Google’s celebrated search engine, rightly seen as a big data success story, is not immune to “Google bombing” and “spamdexing,” wily techniques for artificially elevating website search placement.
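To see how gameable such proxies are, consider a toy scorer in the spirit of the systems the authors describe, one that rewards sentence length and rare words (the weights and the tiny common-word list here are invented). A nonsense essay stuffed with obscure vocabulary handily outscores a short, clear one:

```python
# Toy scorer rewarding average sentence length and "sophisticated"
# (uncommon) words. Weights and the common-word list are invented.
import re

COMMON_WORDS = {"the", "a", "and", "is", "was", "of", "to", "in", "it", "that"}

def naive_score(essay: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[a-zA-Z'-]+", essay.lower())
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)
    rare_ratio = sum(w not in COMMON_WORDS for w in words) / len(words)
    return 2.0 * avg_sentence_len + 10.0 * rare_ratio

print(naive_score("The cat sat. It was nice."))  # short, clear: low score
print(naive_score("Perspicacious interlocutors habitually promulgate "
                  "grandiloquent, sesquipedalian obfuscation unceasingly."))
# The second, incoherent essay wins: one long sentence, all obscure words.
```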

Fourth, even when the results of a big data analysis aren’t intentionally gamed, they often turn out to be less robust than they initially seem. Consider Google Flu Trends, once the poster child for big data. In 2009, Google reported — to considerable fanfare — that by analyzing flu-related search queries, it had been able to detect the spread of the flu as accurately as, and more quickly than, the Centers for Disease Control and Prevention. A few years later, though, Google Flu Trends began to falter; for the last two years it has made more bad predictions than good ones.

As a recent article in the journal Science explained, one major contributing cause of the failures of Google Flu Trends may have been that the Google search engine itself constantly changes, such that patterns in data collected at one time do not necessarily apply to data collected at another time. As the statistician Kaiser Fung has noted, collections of big data that rely on web hits often merge data that was collected in different ways and with different purposes — sometimes to ill effect. It can be risky to draw conclusions from data sets of this kind.

A fifth concern might be called the echo-chamber effect, which also stems from the fact that much of big data comes from the web. Whenever the source of information for a big data analysis is itself a product of big data, opportunities for vicious cycles abound. Consider translation programs like Google Translate, which draw on many pairs of parallel texts from different languages — for example, the same Wikipedia entry in two different languages — to discern the patterns of translation between those languages. This is a perfectly reasonable strategy, except for the fact that with some of the less common languages, many of the Wikipedia articles themselves may have been written using Google Translate. In those cases, any initial errors in Google Translate infect Wikipedia, which is fed back into Google Translate, reinforcing the error.

A sixth worry is the risk of too many correlations. If you look 100 times for correlations between two variables, you risk finding, purely by chance, about five bogus correlations that appear statistically significant — even though there is no actual meaningful connection between the variables. Absent careful supervision, the magnitudes of big data can greatly amplify such errors.
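That arithmetic (100 tests at a 5 percent significance threshold yields about five false positives) is simple to check by simulation:

```python
# Simulating the multiple-comparisons trap: test 100 pairs of independent
# random variables at p < 0.05 and roughly five "significant" correlations
# appear by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
false_positives = 0
for _ in range(100):
    x = rng.normal(size=50)
    y = rng.normal(size=50)   # independent of x by construction
    r, p = stats.pearsonr(x, y)
    if p < 0.05:
        false_positives += 1

print(false_positives, "spurious 'significant' correlations out of 100")
```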

Seventh, big data is prone to giving scientific-sounding solutions to hopelessly imprecise questions. In the past few months, for instance, there have been two separate attempts to rank people in terms of their “historical importance” or “cultural contributions,” based on data drawn from Wikipedia. One is the book “Who’s Bigger? Where Historical Figures Really Rank,” by the computer scientist Steven Skiena and the engineer Charles Ward. The other is an M.I.T. Media Lab project called Pantheon.

Both efforts get many things right — Jesus, Lincoln and Shakespeare were surely important people — but both also make some egregious errors. “Who’s Bigger?” claims that Francis Scott Key was the 19th most important poet in history; Pantheon has claimed that Nostradamus was the 20th most important writer in history, well ahead of Jane Austen (78th) and George Eliot (380th). Worse, both projects suggest a misleading degree of scientific precision with evaluations that are inherently vague, or even meaningless. Big data can reduce anything to a single number, but you shouldn’t be fooled by the appearance of exactitude.

Finally, big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common. For instance, programs that use big data to deal with text, such as search engines and translation programs, often rely heavily on something called trigrams: sequences of three words in a row (like “in a row”). Reliable statistical information can be compiled about common trigrams, precisely because they appear frequently. But no existing body of data will ever be large enough to include all the trigrams that people might use, because of the continuing inventiveness of language.

To select an example more or less at random, a book review that the actor Rob Lowe recently wrote for this newspaper contained nine trigrams such as “dumbed-down escapist fare” that had never before appeared anywhere in all the petabytes of text indexed by Google. To witness the limitations that big data can have with novelty, Google-translate “dumbed-down escapist fare” into German and then back into English: out comes the incoherent “scaled-flight fare.” That is a long way from what Mr. Lowe intended — and from big data’s aspirations for translation.
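Mr. Lowe's phrase makes the limitation concrete. As a hedged sketch (a toy corpus, not Google's actual pipeline), here is what trigram counting involves and why a never-before-seen phrase defeats it:

```python
# What trigram counting involves: slide a three-word window over the text
# and tally. Common trigrams accumulate reliable statistics; a phrase the
# corpus has never seen has a count of zero. The corpus here is a toy.
from collections import Counter

def trigrams(text):
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

corpus = "the cat sat on the mat and the dog sat on the mat"
counts = Counter(trigrams(corpus))

print(counts.most_common(2))                # "sat on the", "on the mat": seen twice
print(counts["dumbed-down escapist fare"])  # 0: no data, hence no statistics
```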

Wait, we almost forgot one last problem: the hype….

Smart cities are here today — and getting smarter


Computer World: “Smart cities aren’t a science-fiction, far-off-in-the-future concept. They’re here today, with municipal governments already using technologies that include wireless networks, big data/analytics, mobile applications, Web portals, social media, sensors/tracking products and other tools.
These smart city efforts have lofty goals: Enhancing the quality of life for citizens, improving government processes and reducing energy consumption, among others. Indeed, cities are already seeing some tangible benefits.
But creating a smart city comes with daunting challenges, including the need to provide effective data security and privacy, and to ensure that myriad departments work in harmony.

The global urban population is expected to grow approximately 1.5% per year between 2025 and 2030, mostly in developing countries, according to the World Health Organization.

What makes a city smart? As with any buzz term, the definition varies. But in general, it refers to using information and communications technologies to deliver sustainable economic development and a higher quality of life, while engaging citizens and effectively managing natural resources.
Making cities smarter will become increasingly important. For the first time ever, the majority of the world’s population resides in a city, and this proportion continues to grow, according to the World Health Organization, the coordinating authority for health within the United Nations.
A hundred years ago, two out of every 10 people lived in an urban area, the organization says. As recently as 1990, less than 40% of the global population lived in a city — but by 2010 more than half of all people lived in an urban area. By 2050, the proportion of city dwellers is expected to rise to 70%.
As many city populations continue to grow, here’s what five U.S. cities are doing to help manage it all:

Scottsdale, Ariz.

The city of Scottsdale, Ariz., has several initiatives underway.
One is MyScottsdale, a mobile application the city deployed in the summer of 2013 that allows citizens to report cracked sidewalks, broken street lights and traffic lights, road and sewer issues, graffiti and other problems in the community….”

Public interest labs to test open governance solutions


Kathleen Hickey in GCN: “The Governance Lab at New York University (GovLab) and the MacArthur Foundation Research Network have formed a new network, Open Governance, to study how to enhance collaboration and decision-making in the public interest.
The MacArthur Foundation provided a three-year grant of $5 million for the project; Google’s philanthropic arm, Google.org, also contributed. Google.org’s technology will be used to develop platforms to solve problems more openly and to run agile, real-world experiments with governments and NGOs to discover ways to enhance decision-making in the public interest, according to the GovLab announcement.
Network members include 12 experts in computer science, political science, policy informatics, social psychology and philosophy, law, and communications. This group is supported by an advisory network of academics, technologists, and current and former government officials. The network will assess existing government programs and experiment with ways to improve decision-making at the local, national and international government levels.
The Network’s efforts focus on three areas that members say have the potential to make governance more effective and legitimate: getting expertise in, pushing data out and distributing responsibility.
Through smarter governance, they say, institutions can seek input from lay and expert citizens via expert networking, crowdsourcing or challenges.  With open data governance, institutions can publish machine-readable data so that citizens can easily analyze and use this information to detect and solve problems. And by shared governance, institutions can help citizens develop solutions through participatory budgeting, peer production or digital commons.
“Recognizing that we cannot solve today’s challenges with yesterday’s tools, this interdisciplinary group will bring fresh thinking to questions about how our governing institutions operate and how they can develop better ways to help address seemingly intractable social problems for the common good,” said MacArthur Foundation President Robert Gallucci.
GovLab’s mission is to study and launch “experimental, technology-enabled solutions that advance a collaborative, networked approach to re-invent existing institutions and processes of governance to improve people’s lives.” Earlier this year GovLab released a preview of its Open Data 500 study of 500 companies using open government data as a key business resource.”

Infomediary Business Models for Connecting Open Data Providers and Users


Paper by Marijn Janssen and Anneke Zuiderwijk in Social Science Computer Review: “Many public organizations are opening their data to the general public and embracing social media in order to stimulate innovation. These developments have resulted in the rise of new, infomediary business models, positioned between open data providers and users. Yet the variation among types of infomediary business models is little understood. The aim of this article is to contribute to the understanding of the diversity of existing infomediary business models that are driven by open data and social media. Cases presenting different modes of open data utilization in the Netherlands are investigated and compared. Six types of business models are identified: single-purpose apps, interactive apps, information aggregators, comparison models, open data repositories, and service platforms. The investigated cases differ in their levels of access to raw data and in how much they stimulate dialogue between different stakeholders involved in open data publication and use. Apps often are easy to use and provide predefined views on data, whereas service platforms provide comprehensive functionality but are more difficult to use. In the various business models, social media is sometimes used for rating and discussion purposes, but it is rarely used for stimulating dialogue or as input to policy making. Hybrid business models were identified in which both public and private organizations contribute to value creation. Distinguishing between different types of open data users was found to be critical in explaining different business models.”

Behavioural economics and public policy


Tim Harford in the Financial Times:  “The past decade has been a triumph for behavioural economics, the fashionable cross-breed of psychology and economics. First there was the award in 2002 of the Nobel Memorial Prize in economics to a psychologist, Daniel Kahneman – the man who did as much as anything to create the field of behavioural economics. Bestselling books were launched, most notably by Kahneman himself (Thinking, Fast and Slow , 2011) and by his friend Richard Thaler, co-author of Nudge (2008). Behavioural economics seems far sexier than the ordinary sort, too: when last year’s Nobel was shared three ways, it was the behavioural economist Robert Shiller who grabbed all the headlines.

Behavioural economics is one of the hottest ideas in public policy. The UK government’s Behavioural Insights Team (BIT) uses the discipline to craft better policies, and in February was part-privatised with a mission to advise governments around the world. The White House announced its own behavioural insights team last summer.

So popular is the field that behavioural economics is now often misapplied as a catch-all term to refer to almost anything that’s cool in popular social science, from the storycraft of Malcolm Gladwell, author of The Tipping Point (2000), to the empirical investigations of Steven Levitt, co-author of Freakonomics (2005).
Yet, as with any success story, the backlash has begun. Critics argue that the field is overhyped, trivial, unreliable, a smokescreen for bad policy, an intellectual dead-end – or possibly all of the above. Is behavioural economics doomed to reflect the limitations of its intellectual parents, psychology and economics? Or can it build on their strengths and offer a powerful set of tools for policy makers and academics alike?…”

Building a More Open Government


Corinna Zarek at the White House: “It’s Sunshine Week again—a chance to celebrate transparency and participation in government and freedom of information. Every year in mid-March, we take stock of our progress and where we are headed to make our government more open for the benefit of citizens.
In December 2013, the Administration announced 23 ambitious commitments to further open up government over the next two years in the U.S. Government’s second Open Government National Action Plan. Those commitments are now all underway or in development, including:
• Launching an improved Data.gov: The updated Data.gov debuted in January 2014, and continues to grow with thousands of updated or new government data sets being proactively made available to the public.
• Increasing public collaboration: Through crowdsourcing, citizen science, and other methods, Federal agencies continue to expand the ways they collaborate with the public. For example, the National Aeronautics and Space Administration recently launched its third Asteroid Grand Challenge, a broad call to action seeking the best and brightest ideas from non-traditional partners to enhance and accelerate the work NASA is already doing for planetary defense.
• Improving We the People: The online petition platform We the People gives the public a direct way to participate in their government and is currently incorporating improvements to make it easier for the public to submit petitions and signatures.”