Five Headlines from a Big Month for the Data Revolution


Sarah T. Lucas at Post2015.org: “If the history of the data revolution were written today, it would include three major dates. May 2013, when the High Level Panel on the Post-2015 Development Agenda first coined the phrase “data revolution.” November 2014, when the UN Secretary-General’s Independent Expert Advisory Group (IEAG) set a vision for it. And April 2015, when five headliner stories pushed the data revolution from great idea to a concrete roadmap for action.

The April 2015 Data Revolution Headlines

1. The Africa Data Consensus puts Africa in the lead on bringing the data revolution to the regional level. The Africa Data Consensus (ADC) envisions “a profound shift in the way that data is harnessed to impact on development decision-making, with a particular emphasis on building a culture of usage.” The ADC finds consensus across 15 “data communities,” ranging from open data to official statistics to geospatial data, and is endorsed by Africa’s ministers of finance. The ADC gets top billing in my book, as the first contribution that truly reflects a large diversity of voices and creates a political hook for action. (Stay tuned for a blog from my colleague Rachel Quint on the ADC).

2. The Sustainable Development Solutions Network (SDSN) gets our minds (and wallets) around the data needed to measure the SDGs. The SDSN Needs Assessment for SDG Monitoring and Statistical Capacity Development maps the investments needed to improve official statistics. My favorite parts are the clear typology of data (see pg. 12), and that the authors are very open about the methods, assumptions, and leaps of faith they had to take in the costing exercise. They also start an important discussion about how advances in information and communications technology, satellite imagery, and other new technologies have the potential to expand coverage, increase analytic capacity, and reduce the cost of data systems.

3. The Overseas Development Institute (ODI) calls on us to find the “missing millions.” ODI’s The Data Revolution: Finding the Missing Millions presents the stark reality of data gaps and what they mean for understanding and addressing development challenges. The authors highlight that even that most fundamental of measures—of poverty levels—could be understated by as much as a quarter. And that’s just the beginning. The report also pushes us to think beyond the costs of data, and focus on how much good data can save. With examples of data lowering the cost of doing government business, the authors remind us to think about data as an investment with real economic and social returns.

4. Paris21 offers a roadmap for putting national statistical offices (NSOs) at the heart of the data revolution. Paris21’s Roadmap for a Country-Led Data Revolution does not mince words. It calls on the data revolution to “turn a vicious cycle of [NSO] underperformance and inadequate resources into a virtuous one where increased demand leads to improved performance and an increase in resources and capacity.” It makes the case for why NSOs are central and need more support, while also pushing them to modernize, innovate, and open up. The roadmap gets my vote for best design. This ain’t your grandfather’s statistics report!

5. The Cartagena Data Festival features real-live data heroes and fosters new partnerships. The Festival featured data innovators (such as terra-i using satellite data to track deforestation), NSOs on the leading edge of modernization and reform (such as Colombia and the Philippines), traditional actors using old data in new ways (such as the Inter-American Development Bank’s fantastic energy database), groups focused on citizen-generated data (such as The Data Shift and UN My World), private firms working with big data for social good (such as Telefónica), and many others, all reminding us that the data revolution is well underway and will not be stopped. Most importantly, it brought these actors together in one place. You could see the sparks flying as folks learned from each other and hatched plans together. The Festival gets my vote for best conference of a lifetime, with the perfect blend of substantive sessions, intense debate, learning, inspiration, new connections, and a lot of fun. (Stay tuned for a post from my colleague Kristen Stelljes and me for more on Cartagena).

This month full of headlines leaves no room for doubt—momentum is building fast on the data revolution. And just in time.

With the Financing for Development (FFD) conference in Addis Ababa in July, the agreement of Sustainable Development Goals in New York in September, and the Climate Summit in Paris in December, this is a big political year for global development. Data revolutionaries must seize this moment to push past vision, past roadmaps, to actual action and results…..(More)”

How Data Mining could have prevented Tunisia’s Terror attack in Bardo Museum


Wassim Zoghlami at Medium: “…Data mining is the process of posing queries and extracting useful patterns or trends, often previously unknown, from large amounts of data using various techniques such as those from pattern recognition and machine learning. Lately there has been great interest in leveraging data mining for counter-terrorism applications.

Using data on more than 50,000 ISIS-connected Twitter accounts, I was able to establish an understanding of some of the factors that determine how often ISIS attacks occur, what types of terror strikes are used in which geopolitical situations, and many other criteria, through graphs of the frequency of hashtag usage and of particular groups of words used in the tweets.

A simple data mining project of some of the repetitive hashtags and sequences of words typically used by ISIS militants in their tweets yielded surprising results. The results show a rise in some keywords in the tweets starting from March 15, three days before the Bardo museum attack.

Some of the frequent keywords and hashtags that showed an unusual peak from March 15, three days before the attack:

#طواغيت تونس : Tyrants of Tunisia = a reference to the military

بشرى تونس : Good news for Tunisia.

قريبا تونس : Soon in Tunisia.

#إفريقية_للإعلام : The head of social media of Afriqiyah

#غزوة_تونس : The foray of Tunis…
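The kind of keyword-frequency peak described above can be illustrated with a minimal sketch. This is not the author's method, which the excerpt does not describe; it assumes tweets are available as (date, text) pairs and flags days whose keyword count rises well above a baseline window. The function names, the threshold, and the synthetic counts below are all hypothetical.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev

KEYWORD = "قريبا تونس"  # "Soon in Tunisia", one of the phrases listed above


def daily_counts(tweets, keyword):
    """Count, per day, the tweets whose text contains the keyword."""
    counts = Counter()
    for day, text in tweets:
        if keyword.lower() in text.lower():
            counts[day] += 1
    return counts


def spike_days(counts, days, baseline_len, threshold=3.0):
    """Return days after the baseline window whose count exceeds
    baseline mean + threshold * baseline standard deviation."""
    series = [counts.get(d, 0) for d in days]
    baseline = series[:baseline_len]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1.0  # avoid a zero threshold on a flat baseline
    limit = mu + threshold * sigma
    return [d for d, c in zip(days[baseline_len:], series[baseline_len:]) if c > limit]


# Synthetic data (NOT real tweets): a quiet week, then a rise from March 15.
days = [date(2015, 3, d) for d in range(8, 18)]
per_day = [1, 0, 1, 1, 0, 1, 1, 6, 8, 9]
tweets = [(d, f"placeholder text {KEYWORD}") for d, n in zip(days, per_day) for _ in range(n)]
tweets += [(d, "unrelated placeholder text") for d in days]

counts = daily_counts(tweets, KEYWORD)
flagged = spike_days(counts, days, baseline_len=7)
print(flagged)
```

A simple mean-plus-deviation rule like this is only a starting point; a real analysis would need to account for overall tweet volume, retweets, and false alarms.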

Big Data and Data Mining should be used for national security intelligence

Tunisia’s national security services have to leverage big data to predict such attacks and to achieve their objectives as the volume of digital data grows. One of the challenges facing data mining techniques is that, to carry out effective data mining and extract useful information for counterterrorism and national security, we need to gather all kinds of information about individuals. However, this information could be a threat to the individuals’ privacy and civil liberties…(More)”

Wicked Opportunities


Essay by William D. Eggers & Anna Muoio: “Wicked problems”—ranging from malaria to dwindling water supplies—are being reframed as “wicked opportunities” and tackled by networks of nongovernmental organizations, social entrepreneurs, governments, and big businesses.

As a killer disease, malaria is the world’s third biggest, after only HIV/AIDS and tuberculosis. In 2013, an estimated 584,000 people died of it, with 90 percent of these deaths in Africa, mostly among children under five years of age. And because 3.2 billion people (almost half the world’s population) live in regions where malaria spreads easily, it is very hard to fight. Scores of organizations are embroiled in the complex search for solutions, sometimes pursuing conflicting priorities, always competing for scarce resources. Despite the daunting challenges, here’s how Bill Gates, who has already spent more than $2 billion of Gates Foundation money on the problem, characterizes the situation: “This is one of the greatest opportunities the global health world has ever had.”

Opportunity? It’s a surprising word even for an optimistic mega-philanthropist to describe a scourge that people have been trying to eliminate, unsuccessfully, for hundreds of years. It’s also, however, a fair statement about what is possible in the 21st century. We’re seeing a trend by which many kinds of “wicked problems”—complex, dynamic, and seemingly intractable social challenges—are being reframed and attacked with renewed vigor through solution ecosystems. Unprecedented networks of non-governmental organizations (NGOs), social entrepreneurs, health professionals, governments, and international development institutions—and yes, businesses—are coalescing around them, and recasting them as wicked opportunities….(More)”

The road to better data


Johannes Jütting at OECD Insights: “Tradition tells us that more than 3,000 years ago, Moses went to the top of Mount Sinai and came back down with 10 commandments. When the world’s presidents and prime ministers go to the top of the Sustainable Development Goals (SDGs) mountain in New York late this summer they will come down with not 10 commandments but 169. Too many?

Some people certainly think so. “Stupid development goals,” The Economist said recently. It argued that the 17 SDGs and roughly 169 targets should “honour Moses and be pruned to ten goals”. Others disagree. In a report for the Overseas Development Institute, May Miller-Dawkins warned of the dangers of letting practicality “blunt ambition”. She backed SDGs with “high ambition”.

The debate over the “right” number of goals and targets is interesting, important even. But it misses a key point: No matter how many goals and targets are finally agreed, if we can’t measure their real impact on people’s lives, on our societies and on the environment, then they risk becoming irrelevant.

Unfortunately, we already know that many developing countries have problems compiling even basic social and economic statistics, never mind the complex web of data that will be needed to monitor the SDGs. A few examples: In 2013, about 35% of all live births were not officially registered worldwide, rising to two-thirds in developing countries. In Africa, just seven countries have data on their total number of landholders and women landholders, and none have data from before 2004. Last but not least, fast-changing economies and associated measurement challenges mean we are not sure today if we have worldwide a billion people living in extreme poverty, half a billion or more than a billion.

Why does this matter? Without adequate data, we cannot identify the problems that planning and policymaking need to address. We also cannot judge if governments and others are meeting their commitments. As a report from the Centre for Global Development notes, “Data […] serve as a ‘currency’ for accountability among and within governments, citizens, and civil society at large, and they can be used to hold development agencies accountable.”…(More)”

Data Science and Ebola


Inaugural Lecture by Aske Plaat on the acceptance of the position of professor of Data Science at the Universiteit Leiden: “…Today, everybody and everything produces data. People produce large amounts of data in social networks and in commercial transactions. Medical, corporate, and government databases continue to grow. Ten years ago there were a billion Internet users. Now there are more than three billion, most of whom are mobile. Sensors continue to get cheaper and are increasingly connected, creating an Internet of Things. The next three billion users of the Internet will not all be human, and will generate a large amount of data. In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to medicine, to the behavioral sciences, to finance and commerce, to the humanities and to the arts. In every discipline people want to organize, analyze, optimize and understand their data to answer questions and to deepen insights. The availability of so much data and the ability to interpret it are changing the way the world operates. The number of sciences using this approach is increasing. The science that is transforming this ocean of data into a sea of knowledge is called data science. In many sciences the impact on the research methodology is profound; some even call it a paradigm shift.

…I will address the question of why there is so much interest in data. I will answer this question by discussing one of the most visible recent challenges to public health of the moment, the 2014 Ebola outbreak in West Africa…(More)”

New surveys reveal dynamism, challenges of open data-driven businesses in developing countries


Alla Morrison at World Bank Open Data blog: “Was there a class of entrepreneurs emerging to take advantage of the economic possibilities offered by open data, were investors keen to back such companies, were governments tuned to and responsive to the demands of such companies, and what were some of the key financing challenges and opportunities in emerging markets? As we began our work on the concept of an Open Fund, we partnered with Ennovent (India), MDIF (East Asia and Latin America) and Digital Divide Data (Africa) to conduct short market surveys to answer these questions, with a focus on trying to understand whether a financing gap truly existed in these markets. The studies were fairly quick (4-6 weeks) and reached only a small number of companies (193 in India, 70 in Latin America, 63 in South East Asia, and 41 in Africa – and not everybody responded) but the findings were fairly consistent.

  • Open data is still a very nascent concept in emerging markets, and there’s only a small class of entrepreneurs/investors that is aware of the economic possibilities; there’s a lot of work to do in the ‘enabling environment’
    • In many regions the distinction between open data, big data, and private sector generated/scraped/collected data was blurry at best among entrepreneurs and investors (some of our findings consequently are better indicators of data-driven rather than open data-driven businesses)
  • There’s a small but growing number of open data-driven companies in all the markets we surveyed and these companies target a wide range of consumers/users and are active in multiple sectors
    • A large percentage of identified companies operate in sectors with high social impact – health and wellness, environment, agriculture, transport. For instance, in India, after excluding business analytics companies, a third of data companies seeking financing are in healthcare and a fifth in food and agriculture, and some of them have the low-income population or the rural segment of India as an intended beneficiary segment. In Latin America, the number of companies in business services, research and analytics was closely followed by health, environment and agriculture. In Southeast Asia, business, consumer services, and transport came out in the lead.
    • We found the highest number of companies in Latin America and Asia with the following countries leading the way – Mexico, Chile, and Brazil, with Colombia and Argentina closely behind in Latin America; and India, Indonesia, Philippines, and Malaysia in Asia
  • An actionable pipeline of data-driven companies exists in Latin America and in Asia
    • We heard demand for different kinds of financing (equity, debt, working capital) but the majority of the need was for equity and quasi-equity in amounts ranging from $100,000 to $5 million USD, with averages of between $2 and $3 million USD depending on the region.
  • There’s a significant financing gap in all the markets
    • The investment sizes required, while they range up to several million dollars, are generally small. Analysis of more than 300 data companies in Latin America and Asia indicates a total estimated need for financing of more than $400 million
  • Venture capitalists generally don’t recognize data as a separate sector and club data-driven companies with their standard information communication technology (ICT) investments
    • Interviews with founders suggest that moving beyond seed stage is particularly difficult for data-driven startups. While many companies are able to cobble together an initial seed round augmented by bootstrapping to get their idea off the ground, they face a great deal of difficulty when trying to raise a second, larger seed round or Series A investment.
    • From the perspective of startups, investors favor banal e-commerce (e.g., according to Tech in Asia, out of the $645 million in technology investments made public across the region in 2013, 92% were related to fashion and online retail) or consumer service startups and ignore open data-focused startups even if they have a strong business model and solid key performance indicators. The space is ripe for a long-term investor with a generous risk appetite and multiple bottom line goals.
  • Poor data quality was the number one issue these companies reported.
    • Companies reported significant waste and inefficiency in accessing/scraping/cleaning data.

The analysis below borrows heavily from the work done by the partners. We should of course mention that the findings are provisional and should not be considered authoritative (please see the section on methodology for more details)….(More).”

The International Handbook Of Public Administration And Governance


New book edited by Andrew Massey and Karen Johnston: “…Handbook explores key questions around the ways in which public administration and governance challenges can be addressed by governments in an increasingly globalized world. World-leading experts explore contemporary issues of government and governance, as well as the relationship between civil society and the political class. The insights offered will allow policy makers and officials to explore options for policy making in a new and informed way.

Adopting global perspectives of governance and public sector management, the Handbook includes scrutiny of current issues such as: public policy capacity, wicked policy problems, public sector reforms, the challenges of globalization and complexity management. Practitioners and scholars of public administration deliver a range of perspectives on the abiding wicked issues and challenges to delivering public services, and the way that delivery is structured. The Handbook uniquely provides international coverage of perspectives from Africa, Asia, North and South America, Europe and Australia.

Practitioners and scholars of public administration, public policy, public sector management and international relations will learn a great deal from this Handbook about the issues and structures of government and governance in an increasingly complex world. (Full table of contents)… (More).”

Bloomberg Philanthropies Launches $100 Million Data for Health Program in Developing Countries


Press Release: “Bloomberg Philanthropies, in partnership with the Australian government, is launching Data for Health, a $100 million initiative that will enable 20 low- and middle-income countries to vastly improve public health data collection. The World Health Organization estimates that 65% of all deaths worldwide – 35 million each year – go unrecorded. Millions more deaths lack a documented cause. This gap in data creates major obstacles for understanding and addressing public health problems. The Data for Health initiative seeks to provide governments, aid organizations, and public health leaders with tools and systems to better collect data – and use it to prioritize health challenges, develop policies, deploy resources, and measure success. Over the next four years, Data for Health aims to help 1.2 billion people in 20 countries across Africa, Asia, and Latin America live healthier, longer lives….

“Australia’s partnership on Data for Health coincides with the launch of innovationXchange, a new initiative to embrace exploration, experimentation, and risk through a focus on innovation,” said the Hon Julie Bishop MP, Australia’s Minister for Foreign Affairs. “Greater innovation in development assistance will allow us to do a better job of tackling the world’s most daunting problems, such as a lack of credible health data.”

In addition to improving the recording of births and deaths, Data for Health will support new mechanisms for conducting public health surveys. These surveys will monitor major risk factors for early death, including non-communicable diseases (chronic diseases that are not transmitted from person to person such as cancer and diabetes). With information from these surveys, illness caused by day-to-day behaviors such as tobacco use and poor nutrition habits can be targeted, addressed and prevented. Data for Health will take advantage of the widespread use of mobile phone devices in developing countries to enhance the efficiency of traditional household surveys, which are typically time-consuming and expensive…(More)”

Public interest models: a powerful tool for the advocacy agenda


at Open Oil: “Open financial models can clearly put analysis into a genuinely independent public space, and also trigger a rise in public understanding which could enrich the governance debate in many countries.

But there is a third function public models can serve: that of advocacy for targeted disclosure of information.

The stress here is on “targeted”. A lot of transparency debates are generic – the need to disclose data as a matter of principle.

It is striking that as the transparency agenda has advanced, and won many battles, so has a debate about whether it is contributing to an increase in accountability. As Paul Collier said: “transparency has to lead to accountability otherwise we’re just ticking loads of boxes”.

We need all these campaigns to continue, and we need to pursue maximum disclosure. Because while transparency does not guarantee accountability, it is its essential prerequisite. Necessary but not sufficient.

But here’s where modeling can help to provide some examples of how data can be used, in a very specific way, to advance accountability.

Let’s take the example of an oil project in Africa. A financial model has to deal with uncertainty and so provides three scenarios for future production and prices, which all have a radical impact on the revenues the government could expect to see. That’s unavoidable. Under the “God, Exxon and everyone else” principle, future price and to some extent production are hard to foresee.

But then there is a second layer of uncertainty caused specifically by the model having to use public domain data. The company, and the government if it exercised its rights of access to information, does not face this second layer because it has access to real data, whereas the public interest model must use estimates and extrapolations. These can be justified, written out and explained – they can be well-informed guesses, in other words, and in the blog on the analytical power of public models, we argue that you can still arrive at useful analysis and conclusions despite this handicap.

Nevertheless, they are guesses. And unlike the first layer of uncertainty, relating to future prices and the ever-changing global market, this second layer can be directly addressed by information the government already has to hand – or could get under its contractual right of access to information….(More)”
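The two layers of uncertainty described in the excerpt can be made concrete with a toy calculation. The sketch below is not Open Oil's model: the fiscal terms (a flat royalty plus a profit tax), the scenario figures, and the cost estimates are all hypothetical placeholders chosen only to show how scenario inputs and guessed inputs each move the projected government revenue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One assumed future for production and price (first layer of uncertainty)."""
    name: str
    barrels_per_year: float
    price_usd: float


def government_revenue(s, royalty_rate, tax_rate, cost_per_barrel):
    """Royalty on gross sales plus tax on profit after royalty and costs.
    A deliberately simplified, hypothetical fiscal regime."""
    gross = s.barrels_per_year * s.price_usd
    royalty = gross * royalty_rate
    costs = s.barrels_per_year * cost_per_barrel
    taxable_profit = max(gross - royalty - costs, 0.0)
    return royalty + taxable_profit * tax_rate


# Hypothetical scenarios for future production and prices.
scenarios = [
    Scenario("low", 10e6, 40.0),
    Scenario("base", 15e6, 60.0),
    Scenario("high", 20e6, 80.0),
]

# Second layer: the per-barrel cost is not public, so the model brackets it
# with two assumed estimates instead of using the real figure.
cost_estimates = [20.0, 30.0]

for s in scenarios:
    for cost in cost_estimates:
        rev = government_revenue(s, royalty_rate=0.10, tax_rate=0.30, cost_per_barrel=cost)
        print(f"{s.name:>4} scenario, cost ${cost:.0f}/bbl: ${rev / 1e6:,.1f}m per year")
```

Disclosing the actual cost figure would collapse the inner loop to a single value, which is the sense in which targeted disclosure removes the second layer while the scenario spread remains.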

Open-Data Project Adds Transparency to African Elections


Jessica Weiss at the International Center for Journalists: “An innovative tool developed to help people register to vote in Kenya is proving to be a valuable asset to voters across the African continent.

GotToVote was created in 2012 by two software developers under the guidance of ICFJ’s Knight International Journalism Fellow Justin Arenstein for use during Kenya’s general elections. In just 24 hours, the developers took voter registration information in a government PDF and turned it into a simple website with usable data that helped people locate the nearest voting center where they could register for elections. Kenyan media drove a large audience to the site, which resulted in a major boost in voter registrations.

Since then, GotToVote has helped people register to vote in Malawi and Zimbabwe. Now, it is being adapted for use in national elections in Ghana and Uganda in 2016.

Ugandan civic groups led by The African Freedom of Information Centre are planning to use it to help people register, to verify registrations and for SMS registration drives. They are also proposing new features—including digital applications to help citizens post issues of concern and compare political positions between parties and candidates so voters better understand the choices they are being offered.

In Ghana, GotToVote is helping citizens find their nearest registration center to make sure they are eligible to vote in that country’s 2016 national elections. The tool, which is optimized for mobile devices, makes voter information easily accessible to the public. It explains who is eligible to register for the 2016 general elections and gives a simple overview of the voter registration process. It also tells users what documentation to take with them to register…..

Last year, Malawi’s national government used GotToVote to check whether voters were correctly registered. As a result, more than 20,000 were found to be incorrectly registered, because they were not qualified voters or were registered in the wrong constituency. In 2013, thousands used GotToVote via their mobile and tablet devices to find their polling places in Zimbabwe.

The successful experiment provides a number of lessons about the power and feasibility of open data projects, showing that they don’t require large teams, big budgets or a lot of time to build…(More)”